Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments


Yi Sun, Faustino Gomez, and Jürgen Schmidhuber
IDSIA, Galleria 2, Manno, CH-6928, Switzerland

Abstract. To maximize its success, an AGI typically needs to explore its initially unknown world. Is there an optimal way of doing so? Here we derive an affirmative answer for a broad class of environments.

1 Introduction

An intelligent agent is sent to explore an unknown environment. Over the course of its mission, the agent makes observations, carries out actions, and incrementally builds up a model of the environment from this interaction. Since the way in which the agent selects actions may greatly affect the efficiency of the exploration, the following question naturally arises: How should the agent choose its actions such that the knowledge about the environment accumulates as quickly as possible? In this paper, this question is addressed under a classical framework in which the agent improves its model of the environment through probabilistic inference, and learning progress is measured in terms of Shannon information gain. We show that the agent can, at least in principle, optimally choose actions based on previous experiences, such that the cumulative expected information gain is maximized.

The rest of the paper is organized as follows: Section 2 reviews the basic concepts and establishes the terminology; Section 3 elaborates the principle of optimal Bayesian exploration; Section 4 presents a simple experiment; related work is briefly reviewed in Section 5; Section 6 concludes the paper.

2 Preliminaries

Suppose that the agent interacts with the environment in discrete time cycles t = 1, 2, .... In each cycle, the agent performs an action a, then receives a sensory input o. A history h is either the empty string ∅ or a string of the form a_1 o_1 ⋯ a_t o_t for some t; ha and hao refer to the strings resulting from appending a and ao to h, respectively.

2.1 Learning from Sequential Interactions

To facilitate the subsequent discussion under a probabilistic framework, we make the following assumptions:

Assumption I. The models of the environment under consideration are fully described by a random element Θ which depends solely on the environment. Moreover, the agent's initial knowledge about Θ is summarized by a prior density p(θ).

Assumption II. The agent is equipped with a conditional predictor p(o | ha; θ), i.e. the agent is capable of refining its predictions in the light of information about Θ.

Using p(θ) and p(o | ha; θ) as building blocks, it is straightforward to formulate learning in terms of probabilistic inference. From Assumption I, given the history h, the agent's knowledge about Θ is fully summarized by p(θ | h). According to Bayes' rule, p(θ | hao) = p(θ | ha) p(o | ha; θ) / p(o | ha), with p(o | ha) = ∫ p(o | ha; θ) p(θ | ha) dθ. The term p(θ | ha) represents the agent's knowledge about Θ given the history h and an additional action a. Since Θ depends solely on the environment, and, importantly, knowing the action without the subsequent observation cannot change the agent's state of knowledge about Θ, we have p(θ | ha) = p(θ | h), and thus the knowledge about Θ can be updated using

    p(θ | hao) = p(θ | h) p(o | ha; θ) / p(o | ha).    (1)

It is worth pointing out that p(o | ha; θ) is chosen before entering the environment. It is not required to match the true dynamics of the environment, but the effectiveness of the learning certainly depends on the choice of p(o | ha; θ). For example, if Θ ∈ ℝ and p(o | ha; θ) depends on θ only through its sign, then no knowledge other than the sign of Θ can be learned.

2.2 Information Gain as Learning Progress

Let h and h′ be two histories such that h is a prefix of h′. The respective posterior distributions of Θ are p(θ | h) and p(θ | h′). Using h as a reference point, the amount of information gained when the history grows to h′ can be measured by the KL divergence between p(θ | h′) and p(θ | h). This information gain from h to h′ is defined as

    g(h′ ‖ h) = KL(p(θ | h′) ‖ p(θ | h)) = ∫ p(θ | h′) log [p(θ | h′) / p(θ | h)] dθ.

As a special case, if h = ∅, then g(h′) = g(h′ ‖ ∅) is the cumulative information gain with respect to the prior p(θ).
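To make the update of Eq. 1 and the definition of g concrete, the following sketch (ours, not from the paper) discretizes Θ on a grid, so that the Bayes update and the KL divergence reduce to finite sums. The environment here is a single Bernoulli "coin" with unknown bias θ, so there is only one implicit action and the a argument is dropped; the grid size and the observation sequence are arbitrary illustrative choices.

```python
import math

N = 2000
THETAS = [(k + 0.5) / N for k in range(N)]           # grid over theta in (0, 1)
PRIOR = [1.0 / N] * N                                # uniform prior p(theta)

def update(post, o):
    """Bayes update, Eq. (1): p(theta | h, o) is proportional to p(theta | h) p(o | theta)."""
    lik = [th if o == 1 else 1.0 - th for th in THETAS]
    unnorm = [p * l for p, l in zip(post, lik)]
    z = sum(unnorm)                                  # predictive p(o | h)
    return [u / z for u in unnorm]

def kl(p, q):
    """KL(p || q) for two grid distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

post = PRIOR
for o in [1, 1, 0, 1]:                               # a fixed illustrative observation sequence
    post = update(post, o)
print(kl(post, PRIOR))                               # cumulative gain g(h') w.r.t. the prior
```

After observing 1, 1, 0, 1, the grid posterior approximates Beta(4, 2), so the printed cumulative gain approximates KL(Beta(4, 2) ‖ Beta(1, 1)).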
We also write g(ao | h) for g(hao ‖ h), the information gained from an additional action-observation pair. From an information-theoretic point of view, the KL divergence between two distributions p and q represents the additional number of bits required to encode elements sampled from p using an optimal coding strategy designed for q. This can be interpreted as the degree of unexpectedness or surprise caused by observing samples from p when expecting samples from q.

The key property of information gain for the treatment below is the following decomposition: let h be a prefix of h′ and h′ be a prefix of h″; then

    E_{h″ | h′} g(h″ ‖ h) = g(h′ ‖ h) + E_{h″ | h′} g(h″ ‖ h′).    (2)
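The decomposition of Eq. 2 can be checked numerically. In the sketch below (our own illustration, using a discretized single-action Bernoulli model), h = ∅, h′ is the history after one observation o = 1, and h″ ranges over the one-step extensions of h′; the expectation of g(h″ ‖ h) splits exactly as Eq. 2 predicts.

```python
import math

N = 1000
THETAS = [(k + 0.5) / N for k in range(N)]
prior = [1.0 / N] * N                       # p(theta | h), h = empty history

def update(post, o):
    """Return the posterior after observing o, and the predictive p(o | h)."""
    unnorm = [p * (th if o == 1 else 1 - th) for p, th in zip(post, THETAS)]
    z = sum(unnorm)
    return [u / z for u in unnorm], z

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

post1, _ = update(prior, 1)                 # p(theta | h'), h' = one observation
lhs, rhs = 0.0, kl(post1, prior)            # rhs starts with g(h' || h)
for o in (0, 1):                            # h'' ranges over extensions of h'
    post2, pred = update(post1, o)          # pred = p(o | h')
    lhs += pred * kl(post2, prior)          # E_{h''|h'} g(h'' || h)
    rhs += pred * kl(post2, post1)          # ... + E_{h''|h'} g(h'' || h') on the rhs
print(lhs, rhs)                             # the two sides of Eq. (2) agree
```

The agreement is exact up to floating point, because Eq. 2 is an identity of the KL divergence, not an approximation.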

That is, the information gain is additive in expectation.

Having defined the information gain of trajectories ending with observations, one may proceed to define the expected information gain of performing action a, before observing the outcome o. Formally, the expected information gain of performing a with respect to the current history h is given by ḡ(a | h) = E_{o | ha} g(ao | h). A simple derivation gives

    ḡ(a | h) = Σ_o ∫ p(o, θ | ha) log [p(o, θ | ha) / (p(θ | h) p(o | ha))] dθ = I(O; Θ | ha),

which means that ḡ(a | h) is the mutual information between Θ and the random variable O representing the unknown observation, conditioned on the history h and the action a.

3 Optimal Bayesian Exploration

In this section, the general principle of optimal Bayesian exploration in dynamic environments is presented. We first give results obtained by assuming a fixed limited life span for our agent, then discuss the condition required to extend this to infinite time horizons.

3.1 Results for Finite Time Horizon

Suppose that the agent has experienced history h, and is about to choose τ more actions in the future. Let π be a policy mapping the set of histories to the set of actions, such that the agent performs a with probability π(a | h) given h. Define the curiosity Q-value q^τ_π(h, a) as the expected information gained from the additional τ actions, assuming that the agent performs a in the next step and follows policy π in the remaining τ − 1 steps. Formally, for τ = 1,

    q^1_π(h, a) = E_{o | ha} g(ao | h) = ḡ(a | h),

and for τ > 1,

    q^τ_π(h, a) = E_{o | ha} E_{a_1 | hao} E_{o_1 | haoa_1} ⋯ E_{o_{τ−1} | hao a_1 o_1 ⋯ a_{τ−1}} g(hao a_1 o_1 ⋯ a_{τ−1} o_{τ−1} ‖ h)
                = E_{o | ha} E_{a_1 o_1 ⋯ a_{τ−1} o_{τ−1} | hao} g(hao a_1 o_1 ⋯ a_{τ−1} o_{τ−1} ‖ h).

The curiosity Q-value can be defined recursively. Applying Eq. 2, for τ = 2,

    q^2_π(h, a) = E_{o | ha} E_{a_1 o_1 | hao} g(hao a_1 o_1 ‖ h)
                = E_{o | ha} [g(ao | h) + E_{a_1 o_1 | hao} g(a_1 o_1 | hao)]
                = ḡ(a | h) + E_{o | ha} E_{a′ | hao} q^1_π(hao, a′),

and for τ > 2,

    q^τ_π(h, a) = E_{o | ha} E_{a_1 o_1 ⋯ a_{τ−1} o_{τ−1} | hao} g(hao a_1 o_1 ⋯ a_{τ−1} o_{τ−1} ‖ h)
                = E_{o | ha} [g(ao | h) + E_{a_1 o_1 ⋯ a_{τ−1} o_{τ−1} | hao} g(hao a_1 o_1 ⋯ a_{τ−1} o_{τ−1} ‖ hao)]
                = ḡ(a | h) + E_{o | ha} E_{a′ | hao} q^{τ−1}_π(hao, a′).    (3)
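The identity ḡ(a | h) = I(O; Θ | ha) can be confirmed numerically. The sketch below (our own toy model, not from the paper) discretizes Θ on a grid and equips the agent with two hypothetical actions, observing coins of bias θ and θ² respectively; the expected KL gain and the mutual information, computed along two different routes (expected posterior-vs-prior divergence vs. entropy difference), coincide.

```python
import math

N = 1000
THETAS = [(k + 0.5) / N for k in range(N)]
post = [1.0 / N] * N                        # current posterior p(theta | h)

def lik(o, a, th):
    """Toy predictor p(o | a; theta): action 0 flips a theta-coin,
    action 1 flips a theta^2-coin (an arbitrary illustrative choice)."""
    p1 = th if a == 0 else th * th
    return p1 if o == 1 else 1.0 - p1

def gbar(a):
    """Expected information gain, E_{o|ha} KL( p(.|hao) || p(.|h) )."""
    total = 0.0
    for o in (0, 1):
        unnorm = [p * lik(o, a, th) for p, th in zip(post, THETAS)]
        z = sum(unnorm)                     # predictive p(o | ha)
        total += sum(u * math.log(u / (z * p)) for u, p in zip(unnorm, post))
    return total

def mutual_info(a):
    """The same quantity as I(O; Theta | ha) = H(O | ha) - E_theta H(O | a, theta)."""
    pred = [sum(p * lik(o, a, th) for p, th in zip(post, THETAS)) for o in (0, 1)]
    h_marg = -sum(z * math.log(z) for z in pred)
    h_cond = -sum(p * (lik(1, a, th) * math.log(lik(1, a, th))
                       + lik(0, a, th) * math.log(lik(0, a, th)))
                  for p, th in zip(post, THETAS))
    return h_marg - h_cond

for a in (0, 1):
    print(a, gbar(a), mutual_info(a))       # the two computations agree for each action
```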

Noting that Eq. 3 bears a great resemblance to the definition of state-action values Q(s, a) in reinforcement learning, one can similarly define the curiosity value of a particular history as v^τ_π(h) = E_{a | h} q^τ_π(h, a), analogous to state values V(s), which can also be iteratively defined as

    v^1_π(h) = E_{a | h} ḡ(a | h),  and  v^τ_π(h) = E_{a | h} [ḡ(a | h) + E_{o | ha} v^{τ−1}_π(hao)].

The curiosity value v^τ_π(h) is the expected information gain of performing the additional τ steps, assuming that the agent follows policy π. The two notations can be combined to write

    q^τ_π(h, a) = ḡ(a | h) + E_{o | ha} v^{τ−1}_π(hao).    (4)

This equation has an interesting interpretation: since the agent is operating in a dynamic environment, it has to take into account not only the immediate expected information gain of performing the current action, i.e., ḡ(a | h), but also the expected curiosity value of the situation in which the agent ends up due to the action, i.e., v^{τ−1}_π(hao). As a consequence, the agent needs to choose actions that balance the two factors in order to improve its total expected information gain.

Now we show that there is an optimal policy which leads to the maximum cumulative expected information gain given any history h. To obtain the optimal policy, one may work backwards in τ, taking greedy actions with respect to the curiosity Q-values at each time step. Namely, for τ = 1, let

    q^1(h, a) = ḡ(a | h),  π^1(h) = arg max_a ḡ(a | h),  and  v^1(h) = max_a ḡ(a | h),

such that v^1(h) = q^1(h, π^1(h)), and for τ > 1, let

    q^τ(h, a) = ḡ(a | h) + E_{o | ha} [max_{a′} q^{τ−1}(hao, a′)] = ḡ(a | h) + E_{o | ha} v^{τ−1}(hao),

with π^τ(h) = arg max_a q^τ(h, a) and v^τ(h) = max_a q^τ(h, a). We show that π^τ(h) is indeed the optimal policy for any given τ and h, in the sense that the curiosity value when following π^τ is maximized. To see this, take any other policy π. First notice that

    v^1(h) = max_a ḡ(a | h) ≥ E_{a | h} ḡ(a | h) = v^1_π(h).

Moreover, assuming v^τ(h) ≥ v^τ_π(h),

    v^{τ+1}(h) = max_a [ḡ(a | h) + E_{o | ha} v^τ(hao)]
               ≥ max_a [ḡ(a | h) + E_{o | ha} v^τ_π(hao)]
               ≥ E_{a | h} [ḡ(a | h) + E_{o | ha} v^τ_π(hao)] = v^{τ+1}_π(h).

Therefore v^τ(h) ≥ v^τ_π(h) holds for arbitrary τ, h, and π.
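The backward-greedy construction above can be implemented directly for tiny discrete models. In the sketch below (ours), Θ ranges over three candidate environments with a binary observation per action; for such a model the posterior over Θ is a sufficient summary of the history, so we plan over beliefs. The model parameters are arbitrary illustrative choices, and the recursion's cost grows exponentially in the look-ahead τ.

```python
import math

# Three candidate environments (values of Theta); two actions; binary observation.
# P[theta][a] = probability of observing o = 1 under action a in environment theta.
P = [(0.9, 0.5), (0.1, 0.5), (0.5, 0.2)]
ACTIONS, OBS = (0, 1), (0, 1)

def posterior(b, a, o):
    """Eq. (1): update the belief b over Theta after performing a and seeing o."""
    lik = [P[th][a] if o == 1 else 1 - P[th][a] for th in range(len(b))]
    z = sum(bi * li for bi, li in zip(b, lik))          # predictive p(o | ha)
    return tuple(bi * li / z for bi, li in zip(b, lik)), z

def gbar(b, a):
    """Expected one-step information gain: E_o KL(posterior || b)."""
    total = 0.0
    for o in OBS:
        nb, z = posterior(b, a, o)
        total += z * sum(p * math.log(p / q) for p, q in zip(nb, b) if p > 0)
    return total

def q(b, a, tau):
    """Curiosity Q-value via the greedy recursion; branching makes this exponential in tau."""
    if tau == 1:
        return gbar(b, a)
    total = gbar(b, a)
    for o in OBS:
        nb, z = posterior(b, a, o)
        total += z * max(q(nb, a1, tau - 1) for a1 in ACTIONS)
    return total

b0 = (1 / 3, 1 / 3, 1 / 3)
for tau in (1, 2, 3):
    best = max(ACTIONS, key=lambda a: q(b0, a, tau))
    print(tau, best, q(b0, best, tau))
```

Since all immediate gains are non-negative, the optimal curiosity Q-values computed this way are non-decreasing in the look-ahead τ.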
The same can be shown for curiosity Q-values, namely, q^τ(h, a) ≥ q^τ_π(h, a) for all τ, h, a, and π.

Now consider an agent with a fixed life span T. It can be seen that at time t, the agent has to perform π^{T−t+1}(h_{t−1}) to maximize the expected information gain in the remaining T − t + 1 steps. Here h_{t−1} = a_1 o_1 ⋯ a_{t−1} o_{t−1} is the history at time t. However, from Eq. 2,

    E_{h_T | h_{t−1}} g(h_T) = g(h_{t−1}) + E_{h_T | h_{t−1}} g(h_T ‖ h_{t−1}).

Note that at time t, g(h_{t−1}) is a constant, thus maximizing the cumulative expected information gain in the remaining time steps is equivalent to maximizing the expected information gain of the whole trajectory with respect to the prior. The result is summarized in the following proposition:

Proposition 1. Let q^1(h, a) = ḡ(a | h), v^1(h) = max_a q^1(h, a), and

    q^τ(h, a) = ḡ(a | h) + E_{o | ha} v^{τ−1}(hao),  v^τ(h) = max_a q^τ(h, a);

then the policy π^τ(h) = arg max_a q^τ(h, a) is optimal in the sense that v^τ(h) ≥ v^τ_π(h) and q^τ(h, a) ≥ q^τ_π(h, a) for any π, τ, h, and a. In particular, for an agent with fixed life span T, following π^{T−t+1}(h_{t−1}) at time t = 1, ..., T is optimal in the sense that the expected cumulative information gain with respect to the prior is maximized.

The definition of the optimal exploration policy is constructive, which means that it can be readily implemented, provided that the number of actions and possible observations is finite, so that the expectation and maximization can be computed exactly. However, the cost of computing such a policy is O((n_o n_a)^τ), where n_o and n_a are the number of possible observations and actions, respectively. Since the cost is exponential in τ, planning with a large number of look-ahead steps is infeasible, and approximation heuristics must be used in practice.

3.2 Non-triviality of the Result

Intuitively, the recursive definition of the curiosity (Q) value is simple, and bears a clear resemblance to its counterpart in reinforcement learning. It might be tempting to think that the result is nothing more than solving the finite horizon reinforcement learning problem using ḡ(a | h) or g(ao | h) as the reward signals. However, this is not the case.

First, note that the decomposition of Eq. 2 is a direct consequence of the formulation of the KL divergence. The decomposition does not necessarily hold if g is replaced with other types of measures of information gain.
Second, it is worth pointing out that g(ao | h) and ḡ(a | h) behave differently from normal reward signals in the sense that they are additive only in expectation, while in the reinforcement learning setup, the reward signals are usually assumed to be additive, i.e., adding reward signals together is always meaningful. Consider a simple problem with only two actions. If g(ao | h) were a plain reward function, then g(ao | h) + g(a′o′ | hao) should be meaningful no matter whether a′ and o′ are known or not. But this is not the case, since the sum does not have a valid information-theoretic interpretation. On the other hand, the sum is meaningful in expectation. Namely, when o′ has not been observed, from Eq. 2,

    g(ao | h) + E_{o′ | haoa′} g(a′o′ | hao) = E_{o′ | haoa′} g(aoa′o′ | h),

so the sum can be interpreted as the expectation of the information gained from h to haoa′o′. This result shows that g(ao | h) and ḡ(a | h) can be treated as additive reward signals only when one is planning ahead.

To emphasize the difference further, note that all immediate information gains g(ao | h) are non-negative, since they are essentially KL divergences. A natural assumption would be that the information gain g(h), which is the sum of

all g(ao | h) in expectation, grows monotonically as the length of the history increases. However, this is not the case; see Fig. 1 for an example. Although g(ao | h) is always non-negative, some of the gains may pull θ closer to its prior density p(θ), resulting in a decrease of the KL divergence between p(θ | h) and p(θ). This is never the case for normal reward signals in reinforcement learning, where the accumulated reward can never decrease if all rewards are non-negative.

Fig. 1. Illustration of the difference between the sum of one-step information gains and the cumulative information gain with respect to the prior. In this case, 1000 independent samples are generated from a distribution over the finite sample space {1, 2, 3}, with p(x = 1) = 0.1, p(x = 2) = 0.5, and p(x = 3) = 0.4. The task of learning is to recover the mass function from the samples, assuming a Dirichlet prior Dir(50, 50, 50). The KL divergence between two Dirichlet distributions is computed according to [5]. It is clear from the graph that the cumulative information gain fluctuates as the number of samples increases, while the sum of the one-step information gains increases monotonically. It also shows that the difference between the two quantities can be large.

3.3 Extending to Infinite Horizon

Having to restrict the maximum life span of the agent is rather inconvenient. It is tempting to define the curiosity Q-value in the infinite-horizon case as the limit of curiosity Q-values with increasing life spans, T → ∞. However, this cannot be achieved without additional technical constraints. For example, consider a simple coin toss. Assuming a Beta(1, 1) prior over the probability of seeing heads, the expected cumulative information gain for the next T flips is given by v^T(h_1) = I(Θ; X_1, ..., X_T), which grows like log T. With increasing T, v^T(h_1) → ∞. A frequently used approach to simplifying the math is to introduce a discount factor γ (0 ≤ γ < 1), as used in reinforcement learning.
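The divergence in the coin-toss example can be checked directly. The sketch below (ours) computes v(T) = I(Θ; X_1, ..., X_T) for the Beta(1, 1) coin exactly: under this prior the number of heads k in T flips is uniform on 0, ..., T, and for integer Beta parameters the digamma values needed for the KL divergence reduce to harmonic sums.

```python
import math

EULER = 0.5772156649015329                  # Euler-Mascheroni constant

def psi(n):
    """Digamma at a positive integer: psi(n) = -gamma + H_{n-1}."""
    return -EULER + sum(1.0 / k for k in range(1, n))

def kl_beta(a, b):
    """KL( Beta(a, b) || Beta(1, 1) ) for integer a, b."""
    ln_B = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (-ln_B + (a - 1) * (psi(a) - psi(a + b))
                  + (b - 1) * (psi(b) - psi(a + b)))

def v(T):
    """Expected cumulative gain I(Theta; X_1..X_T) for a Beta(1,1) coin.
    Under Beta(1,1), the number of heads k is uniform on 0..T, and the
    posterior after k heads in T flips is Beta(k+1, T-k+1)."""
    return sum(kl_beta(k + 1, T - k + 1) for k in range(T + 1)) / (T + 1)

for T in (1, 10, 100, 1000):
    print(T, v(T), math.log(T))             # v(T) keeps growing without bound
```

For T = 1 the value reduces to the closed form log 2 − 1/2, the mutual information between the bias and a single flip.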
Assume that the agent has a maximum of τ actions left, but before finishing the τ actions it may be forced to leave the environment with probability 1 − γ at each time step. In this case, the curiosity Q-value becomes

q^{γ,1}_π(h, a) = ḡ(a | h), and

    q^{γ,τ}_π(h, a) = (1 − γ) ḡ(a | h) + γ [ḡ(a | h) + E_{o | ha} E_{a′ | hao} q^{γ,τ−1}_π(hao, a′)]
                    = ḡ(a | h) + γ E_{o | ha} E_{a′ | hao} q^{γ,τ−1}_π(hao, a′).

One may also interpret q^{γ,τ}_π(h, a) as a linear combination of curiosity Q-values without the discount:

    q^{γ,τ}_π(h, a) = (1 − γ) Σ_{t=1}^{τ} γ^{t−1} q^t_π(h, a) + γ^τ q^τ_π(h, a).

Note that curiosity Q-values with larger look-ahead steps are weighted exponentially less. The optimal policy in the discounted case is given by

    q^{γ,1}(h, a) = ḡ(a | h),  v^{γ,1}(h) = max_a q^{γ,1}(h, a),

and

    q^{γ,τ}(h, a) = ḡ(a | h) + γ E_{o | ha} v^{γ,τ−1}(hao),  v^{γ,τ}(h) = max_a q^{γ,τ}(h, a).

The optimal actions are given by π^{γ,τ}(h) = arg max_a q^{γ,τ}(h, a). The proof that π^{γ,τ} is optimal is similar to the one for the finite-horizon case (Section 3.1) and thus is omitted here.

Adding the discount enables one to define the curiosity Q-value over an infinite time horizon in a number of cases. However, it is still possible to construct scenarios where such a discount fails. Consider an infinite list of bandits. For bandit n, there are n possible outcomes, with Dirichlet prior Dir(1/n, ..., 1/n). The expected information gain of pulling bandit n for the first time is then

    log n − ψ(2) + ψ(1 + 1/n) ≈ log n,

with ψ(·) being the digamma function. Assume that at time t only the first ⌈e^{e^{t²}}⌉ bandits are available, so that the curiosity Q-value over any finite time horizon is always finite. However, since the largest expected information gain grows at speed e^{t²}, for any given γ > 0, q^{γ,τ} goes to infinity with increasing τ. This example gives the intuition that, to make the curiosity Q-value meaningful, the total information content of the environment (or its growth rate) must be bounded. The following technical lemma gives a sufficient condition for when such an extension is meaningful.

Lemma 1. We have

    0 ≤ q^{γ,τ+1}(h, a) − q^{γ,τ}(h, a) ≤ γ^τ E_{o | ha} max_{a_1} E_{o_1 | haoa_1} ⋯ max_{a_τ} ḡ(a_τ | hao a_1 o_1 ⋯ o_{τ−1}).

Proof. Expand q^{γ,τ} and q^{γ,τ+1}, and note that max X − max Y ≤ max (X − Y); then

    q^{γ,τ+1}(h, a) − q^{γ,τ}(h, a)
      = E_{o | ha} max_{a_1} E_{o_1 | haoa_1} ⋯ max_{a_τ} [ḡ(a | h) + γ ḡ(a_1 | hao) + ⋯ + γ^τ ḡ(a_τ | hao a_1 o_1 ⋯ o_{τ−1})]
        − E_{o | ha} max_{a_1} E_{o_1 | haoa_1} ⋯ max_{a_{τ−1}} [ḡ(a | h) + γ ḡ(a_1 | hao) + ⋯ + γ^{τ−1} ḡ(a_{τ−1} | hao a_1 o_1 ⋯ o_{τ−2})]
      ≤ E_{o | ha} max_{a_1} { E_{o_1 | haoa_1} ⋯ max_{a_τ} [ḡ(a | h) + γ ḡ(a_1 | hao) + ⋯ + γ^τ ḡ(a_τ | hao a_1 o_1 ⋯ o_{τ−1})]
        − E_{o_1 | haoa_1} ⋯ max_{a_{τ−1}} [ḡ(a | h) + γ ḡ(a_1 | hao) + ⋯ + γ^{τ−1} ḡ(a_{τ−1} | hao a_1 o_1 ⋯ o_{τ−2})] }
      ≤ γ^τ E_{o | ha} max_{a_1} E_{o_1 | haoa_1} ⋯ max_{a_τ} ḡ(a_τ | hao a_1 o_1 ⋯ o_{τ−1}).

It can be seen that if E_{o a_1 o_1 ⋯ o_{τ−1} | ha} ḡ(a_τ | hao a_1 o_1 ⋯ o_{τ−1}) grows sub-exponentially in τ, then q^{γ,τ} is a Cauchy sequence, and it makes sense to define the curiosity Q-value over the infinite time horizon.

4 Experiment

The idea presented in the previous section is illustrated through a simple experiment. The environment is an MDP consisting of two groups of densely connected states (cliques) linked by a long corridor. The agent has two actions allowing it to move along the corridor deterministically, whereas the transition probabilities inside each clique are randomly generated. The agent assumes Dirichlet priors over all transition probabilities, and the goal is to learn the transition model of the MDP. In the experiment, each clique consists of 5 states (states 1 to 5 and states 56 to 60), and the corridor is of length 50 (states 6 to 55). The prior over each transition probability is Dir(1/60, ..., 1/60). We compare four different algorithms: i) random exploration, where the agent selects each of the two actions with equal probability at each time step; ii) Q-learning with the immediate information gain g(ao | h) as the reward; iii) greedy exploration, where the agent chooses at each time step the action maximizing ḡ(a | h); and iv) a dynamic-programming (DP) approximation of optimal Bayesian exploration, where at each time step the agent follows a policy which is computed using policy iteration, assuming that the dynamics of the MDP are given by the current posterior, and the reward is the expected information gain ḡ(a | h). The details of this algorithm are described in [11]. Fig. 2 shows the typical behavior of the four algorithms.
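A minimal sketch in the spirit of algorithm iii (greedy exploration) is given below. It is our own scaled-down illustration, not the paper's setup: a random 4-state, 2-action MDP with a uniform Dir(1, ..., 1) prior per state-action pair instead of the 60-state environment with Dir(1/60, ..., 1/60), so that the digamma arguments stay integral. The closed-form expected gain of observing one more transition follows from the Dirichlet KL formula [5].

```python
import math
import random

EULER = 0.5772156649015329

def psi(n):                                   # digamma at a positive integer
    return -EULER + sum(1.0 / k for k in range(1, n))

NS, NA = 4, 2
random.seed(0)
# A fixed, arbitrary "true" MDP: TRUE[s][a] is a distribution over next states.
TRUE = [[[random.random() for _ in range(NS)] for _ in range(NA)] for _ in range(NS)]
for s in range(NS):
    for a in range(NA):
        z = sum(TRUE[s][a])
        TRUE[s][a] = [p / z for p in TRUE[s][a]]

# Dirichlet counts, one vector per (s, a), starting from the Dir(1,...,1) prior.
alpha = [[[1] * NS for _ in range(NA)] for _ in range(NS)]

def gbar(s, a):
    """Closed-form expected info gain of observing one more transition from (s, a):
    sum_j (alpha_j / alpha_0) * KL( Dir(alpha + e_j) || Dir(alpha) )."""
    al = alpha[s][a]
    a0 = sum(al)
    return sum(aj / a0 * (math.log(a0) - math.log(aj) + psi(aj + 1) - psi(a0 + 1))
               for aj in al)

s, gains = 0, []
for step in range(200):                       # greedy exploration loop
    a = max(range(NA), key=lambda a: gbar(s, a))
    gains.append(gbar(s, a))
    s2 = random.choices(range(NS), weights=TRUE[s][a])[0]
    alpha[s][a][s2] += 1                      # Bayes update of the transition model
    s = s2
print(sum(gains))                             # total expected gain collected
```

As the counts grow, the expected gain of every state-action pair shrinks, which is the effect that traps the myopic Q-learning variant in the deterministic corridor of the full experiment.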
The upper four plots in Fig. 2 show how the agent moves in the MDP, starting from one clique. Both greedy exploration and DP move back and forth between the two cliques. Random exploration has difficulty moving between the two cliques due to its random-walk behavior in the corridor. Q-learning exploration, however, gets stuck in the initial clique. The reason is that since movement along the corridor is deterministic, the information gain decreases to virtually zero after only several attempts, therefore the Q-value of jumping into the corridor becomes much lower than the Q-value of jumping inside the clique. The bottom plot shows how the cumulative information gain grows over time, and how the DP approximation clearly outperforms the other algorithms, particularly in the early phase of exploration.

Fig. 2. The exploration process of a typical run of 4000 steps. The upper four plots (random exploration, Q-learning, greedy exploration, DP) show the position of the agent between state 1 (the lowest) and state 60 (the highest) over time. The states at the top and the bottom correspond to the two cliques, and the states in the middle correspond to the corridor. The lowest plot is the cumulative information gain with respect to the prior.

5 Related Work

The idea of actively selecting queries to accelerate a learning process has a long history [1, 2, 7], and has received a lot of attention in recent decades, primarily in the context of active learning [8] and artificial curiosity [6]. In particular, measuring learning progress using KL divergence dates back to the 1950s [2, 4]. In

1995 this was combined with reinforcement learning, with the goal of optimizing future expected information gain [10]. Others later renamed this Bayesian surprise [3]. Our work differs from most previous work in two main points: First, like [10], we consider the problem of exploring a dynamic environment, where actions change the environmental state, while most work on active learning and Bayesian experiment design focuses on queries that do not affect the environment [8]. Second, our result is theoretically sound and directly derived from first principles, in contrast to the more heuristic application [10] of traditional reinforcement learning to maximize the expected information gain. In particular, we pointed out a previously neglected subtlety of using KL divergence as learning progress. Conceptually, however, this work is closely connected to artificial curiosity and intrinsically motivated reinforcement learning [6, 7, 9] for agents that actively explore the environment without an external reward signal. In fact, the very definition of the curiosity (Q) value permits a firm connection between pure exploration and reinforcement learning.

6 Conclusion

We have presented the principle of optimal Bayesian exploration in dynamic environments, centered around the concept of the curiosity (Q) value. Our work provides a theoretically sound foundation for designing more effective exploration strategies. Future work will concentrate on studying the theoretical properties of various approximation strategies inspired by this principle.

7 Acknowledgements

This research was funded in part by Swiss National Science Foundation grant /1 and the EU IM-CLeVeR project (#231722).

References

1. Chaloner, K., Verdinelli, I.: Bayesian experimental design: A review. Statistical Science 10 (1995)
2. Fedorov, V.V.: Theory of Optimal Experiments. Academic Press (1972)
3. Itti, L., Baldi, P.F.: Bayesian surprise attracts human attention. In: NIPS 2005 (2006)
4. Lindley, D.V.: On a measure of the information provided by an experiment. Annals of Mathematical Statistics 27(4) (1956)
5.
Penny, W.: Kullback-Leibler divergences of normal, gamma, Dirichlet and Wishart densities. Tech. rep., Wellcome Department of Cognitive Neurology, University College London (2001)
6. Schmidhuber, J.: Curious model-building control systems. In: IJCNN 1991, vol. 2 (1991)
7. Schmidhuber, J.: Formal theory of creativity, fun, and intrinsic motivation (1990-2010). IEEE Transactions on Autonomous Mental Development 2(3) (2010)
8. Settles, B.: Active learning literature survey. Tech. rep., University of Wisconsin-Madison (2010)
9. Singh, S., Barto, A., Chentanez, N.: Intrinsically motivated reinforcement learning. In: NIPS 2004 (2004)
10. Storck, J., Hochreiter, S., Schmidhuber, J.: Reinforcement driven information acquisition in non-deterministic environments. In: ICANN 1995 (1995)
11. Sun, Y., Gomez, F.J., Schmidhuber, J.: Planning to be surprised: Optimal Bayesian exploration in dynamic environments (2011)


More information

CS 188: Artificial Intelligence Spring 2007

CS 188: Artificial Intelligence Spring 2007 CS 188: Artificil Intelligence Spring 2007 Lecture 3: Queue-Bsed Serch 1/23/2007 Srini Nrynn UC Berkeley Mny slides over the course dpted from Dn Klein, Sturt Russell or Andrew Moore Announcements Assignment

More information

Mapping the delta function and other Radon measures

Mapping the delta function and other Radon measures Mpping the delt function nd other Rdon mesures Notes for Mth583A, Fll 2008 November 25, 2008 Rdon mesures Consider continuous function f on the rel line with sclr vlues. It is sid to hve bounded support

More information

Continuous Random Variables

Continuous Random Variables STAT/MATH 395 A - PROBABILITY II UW Winter Qurter 217 Néhémy Lim Continuous Rndom Vribles Nottion. The indictor function of set S is rel-vlued function defined by : { 1 if x S 1 S (x) if x S Suppose tht

More information

New Expansion and Infinite Series

New Expansion and Infinite Series Interntionl Mthemticl Forum, Vol. 9, 204, no. 22, 06-073 HIKARI Ltd, www.m-hikri.com http://dx.doi.org/0.2988/imf.204.4502 New Expnsion nd Infinite Series Diyun Zhng College of Computer Nnjing University

More information

Review of Calculus, cont d

Review of Calculus, cont d Jim Lmbers MAT 460 Fll Semester 2009-10 Lecture 3 Notes These notes correspond to Section 1.1 in the text. Review of Clculus, cont d Riemnn Sums nd the Definite Integrl There re mny cses in which some

More information

7.2 The Definite Integral

7.2 The Definite Integral 7.2 The Definite Integrl the definite integrl In the previous section, it ws found tht if function f is continuous nd nonnegtive, then the re under the grph of f on [, b] is given by F (b) F (), where

More information

Entropy and Ergodic Theory Notes 10: Large Deviations I

Entropy and Ergodic Theory Notes 10: Large Deviations I Entropy nd Ergodic Theory Notes 10: Lrge Devitions I 1 A chnge of convention This is our first lecture on pplictions of entropy in probbility theory. In probbility theory, the convention is tht ll logrithms

More information

A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H. Thomas Shores Department of Mathematics University of Nebraska Spring 2007

A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H. Thomas Shores Department of Mathematics University of Nebraska Spring 2007 A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H Thoms Shores Deprtment of Mthemtics University of Nebrsk Spring 2007 Contents Rtes of Chnge nd Derivtives 1 Dierentils 4 Are nd Integrls 5 Multivrite Clculus

More information

LECTURE NOTE #12 PROF. ALAN YUILLE

LECTURE NOTE #12 PROF. ALAN YUILLE LECTURE NOTE #12 PROF. ALAN YUILLE 1. Clustering, K-mens, nd EM Tsk: set of unlbeled dt D = {x 1,..., x n } Decompose into clsses w 1,..., w M where M is unknown. Lern clss models p(x w)) Discovery of

More information

MAA 4212 Improper Integrals

MAA 4212 Improper Integrals Notes by Dvid Groisser, Copyright c 1995; revised 2002, 2009, 2014 MAA 4212 Improper Integrls The Riemnn integrl, while perfectly well-defined, is too restrictive for mny purposes; there re functions which

More information

Math 426: Probability Final Exam Practice

Math 426: Probability Final Exam Practice Mth 46: Probbility Finl Exm Prctice. Computtionl problems 4. Let T k (n) denote the number of prtitions of the set {,..., n} into k nonempty subsets, where k n. Argue tht T k (n) kt k (n ) + T k (n ) by

More information

UNIT 1 FUNCTIONS AND THEIR INVERSES Lesson 1.4: Logarithmic Functions as Inverses Instruction

UNIT 1 FUNCTIONS AND THEIR INVERSES Lesson 1.4: Logarithmic Functions as Inverses Instruction Lesson : Logrithmic Functions s Inverses Prerequisite Skills This lesson requires the use of the following skills: determining the dependent nd independent vribles in n exponentil function bsed on dt from

More information

Estimation of Binomial Distribution in the Light of Future Data

Estimation of Binomial Distribution in the Light of Future Data British Journl of Mthemtics & Computer Science 102: 1-7, 2015, Article no.bjmcs.19191 ISSN: 2231-0851 SCIENCEDOMAIN interntionl www.sciencedomin.org Estimtion of Binomil Distribution in the Light of Future

More information

4 The dynamical FRW universe

4 The dynamical FRW universe 4 The dynmicl FRW universe 4.1 The Einstein equtions Einstein s equtions G µν = T µν (7) relte the expnsion rte (t) to energy distribution in the universe. On the left hnd side is the Einstein tensor which

More information

Decision Networks. CS 188: Artificial Intelligence Fall Example: Decision Networks. Decision Networks. Decisions as Outcome Trees

Decision Networks. CS 188: Artificial Intelligence Fall Example: Decision Networks. Decision Networks. Decisions as Outcome Trees CS 188: Artificil Intelligence Fll 2011 Decision Networks ME: choose the ction which mximizes the expected utility given the evidence mbrell Lecture 17: Decision Digrms 10/27/2011 Cn directly opertionlize

More information

How do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when is it unique?

How do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when is it unique? XII. LINEAR ALGEBRA: SOLVING SYSTEMS OF EQUATIONS Tody we re going to tlk bout solving systems of liner equtions. These re problems tht give couple of equtions with couple of unknowns, like: 6 2 3 7 4

More information

5.7 Improper Integrals

5.7 Improper Integrals 458 pplictions of definite integrls 5.7 Improper Integrls In Section 5.4, we computed the work required to lift pylod of mss m from the surfce of moon of mss nd rdius R to height H bove the surfce of the

More information

ECO 317 Economics of Uncertainty Fall Term 2007 Notes for lectures 4. Stochastic Dominance

ECO 317 Economics of Uncertainty Fall Term 2007 Notes for lectures 4. Stochastic Dominance Generl structure ECO 37 Economics of Uncertinty Fll Term 007 Notes for lectures 4. Stochstic Dominnce Here we suppose tht the consequences re welth mounts denoted by W, which cn tke on ny vlue between

More information

APPROXIMATE INTEGRATION

APPROXIMATE INTEGRATION APPROXIMATE INTEGRATION. Introduction We hve seen tht there re functions whose nti-derivtives cnnot be expressed in closed form. For these resons ny definite integrl involving these integrnds cnnot be

More information

Review of basic calculus

Review of basic calculus Review of bsic clculus This brief review reclls some of the most importnt concepts, definitions, nd theorems from bsic clculus. It is not intended to tech bsic clculus from scrtch. If ny of the items below

More information

1B40 Practical Skills

1B40 Practical Skills B40 Prcticl Skills Comining uncertinties from severl quntities error propgtion We usully encounter situtions where the result of n experiment is given in terms of two (or more) quntities. We then need

More information

Lecture 14: Quadrature

Lecture 14: Quadrature Lecture 14: Qudrture This lecture is concerned with the evlution of integrls fx)dx 1) over finite intervl [, b] The integrnd fx) is ssumed to be rel-vlues nd smooth The pproximtion of n integrl by numericl

More information

S. S. Dragomir. 2, we have the inequality. b a

S. S. Dragomir. 2, we have the inequality. b a Bull Koren Mth Soc 005 No pp 3 30 SOME COMPANIONS OF OSTROWSKI S INEQUALITY FOR ABSOLUTELY CONTINUOUS FUNCTIONS AND APPLICATIONS S S Drgomir Abstrct Compnions of Ostrowski s integrl ineulity for bsolutely

More information

Infinite Geometric Series

Infinite Geometric Series Infinite Geometric Series Finite Geometric Series ( finite SUM) Let 0 < r < 1, nd let n be positive integer. Consider the finite sum It turns out there is simple lgebric expression tht is equivlent to

More information

Credibility Hypothesis Testing of Fuzzy Triangular Distributions

Credibility Hypothesis Testing of Fuzzy Triangular Distributions 666663 Journl of Uncertin Systems Vol.9, No., pp.6-74, 5 Online t: www.jus.org.uk Credibility Hypothesis Testing of Fuzzy Tringulr Distributions S. Smpth, B. Rmy Received April 3; Revised 4 April 4 Abstrct

More information

1 1D heat and wave equations on a finite interval

1 1D heat and wave equations on a finite interval 1 1D het nd wve equtions on finite intervl In this section we consider generl method of seprtion of vribles nd its pplictions to solving het eqution nd wve eqution on finite intervl ( 1, 2. Since by trnsltion

More information

Jack Simons, Henry Eyring Scientist and Professor Chemistry Department University of Utah

Jack Simons, Henry Eyring Scientist and Professor Chemistry Department University of Utah 1. Born-Oppenheimer pprox.- energy surfces 2. Men-field (Hrtree-Fock) theory- orbitls 3. Pros nd cons of HF- RHF, UHF 4. Beyond HF- why? 5. First, one usully does HF-how? 6. Bsis sets nd nottions 7. MPn,

More information

CMDA 4604: Intermediate Topics in Mathematical Modeling Lecture 19: Interpolation and Quadrature

CMDA 4604: Intermediate Topics in Mathematical Modeling Lecture 19: Interpolation and Quadrature CMDA 4604: Intermedite Topics in Mthemticl Modeling Lecture 19: Interpoltion nd Qudrture In this lecture we mke brief diversion into the res of interpoltion nd qudrture. Given function f C[, b], we sy

More information

Bernoulli Numbers Jeff Morton

Bernoulli Numbers Jeff Morton Bernoulli Numbers Jeff Morton. We re interested in the opertor e t k d k t k, which is to sy k tk. Applying this to some function f E to get e t f d k k tk d k f f + d k k tk dk f, we note tht since f

More information

How to simulate Turing machines by invertible one-dimensional cellular automata

How to simulate Turing machines by invertible one-dimensional cellular automata How to simulte Turing mchines by invertible one-dimensionl cellulr utomt Jen-Christophe Dubcq Déprtement de Mthémtiques et d Informtique, École Normle Supérieure de Lyon, 46, llée d Itlie, 69364 Lyon Cedex

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 Automt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Problem (II) Chpter II.5.: Properties of Context Free Grmmrs (14) Anton Setzer (Bsed on book drft by J. V. Tucker nd K. Stephenson)

More information

Decision Networks. CS 188: Artificial Intelligence. Decision Networks. Decision Networks. Decision Networks and Value of Information

Decision Networks. CS 188: Artificial Intelligence. Decision Networks. Decision Networks. Decision Networks and Value of Information CS 188: Artificil Intelligence nd Vlue of Informtion Instructors: Dn Klein nd Pieter Abbeel niversity of Cliforni, Berkeley [These slides were creted by Dn Klein nd Pieter Abbeel for CS188 Intro to AI

More information

The Wave Equation I. MA 436 Kurt Bryan

The Wave Equation I. MA 436 Kurt Bryan 1 Introduction The Wve Eqution I MA 436 Kurt Bryn Consider string stretching long the x xis, of indeterminte (or even infinite!) length. We wnt to derive n eqution which models the motion of the string

More information

Continuous Random Variables Class 5, Jeremy Orloff and Jonathan Bloom

Continuous Random Variables Class 5, Jeremy Orloff and Jonathan Bloom Lerning Gols Continuous Rndom Vriles Clss 5, 8.05 Jeremy Orloff nd Jonthn Bloom. Know the definition of continuous rndom vrile. 2. Know the definition of the proility density function (pdf) nd cumultive

More information

Bayesian Networks: Approximate Inference

Bayesian Networks: Approximate Inference pproches to inference yesin Networks: pproximte Inference xct inference Vrillimintion Join tree lgorithm pproximte inference Simplify the structure of the network to mkxct inferencfficient (vritionl methods,

More information

CS 188: Artificial Intelligence Fall 2010

CS 188: Artificial Intelligence Fall 2010 CS 188: Artificil Intelligence Fll 2010 Lecture 18: Decision Digrms 10/28/2010 Dn Klein C Berkeley Vlue of Informtion 1 Decision Networks ME: choose the ction which mximizes the expected utility given

More information

Local orthogonality: a multipartite principle for (quantum) correlations

Local orthogonality: a multipartite principle for (quantum) correlations Locl orthogonlity: multiprtite principle for (quntum) correltions Antonio Acín ICREA Professor t ICFO-Institut de Ciencies Fotoniques, Brcelon Cusl Structure in Quntum Theory, Bensque, Spin, June 2013

More information

Riemann Sums and Riemann Integrals

Riemann Sums and Riemann Integrals Riemnn Sums nd Riemnn Integrls Jmes K. Peterson Deprtment of Biologicl Sciences nd Deprtment of Mthemticl Sciences Clemson University August 26, 2013 Outline 1 Riemnn Sums 2 Riemnn Integrls 3 Properties

More information

221B Lecture Notes WKB Method

221B Lecture Notes WKB Method Clssicl Limit B Lecture Notes WKB Method Hmilton Jcobi Eqution We strt from the Schrödinger eqution for single prticle in potentil i h t ψ x, t = [ ] h m + V x ψ x, t. We cn rewrite this eqution by using

More information

CHM Physical Chemistry I Chapter 1 - Supplementary Material

CHM Physical Chemistry I Chapter 1 - Supplementary Material CHM 3410 - Physicl Chemistry I Chpter 1 - Supplementry Mteril For review of some bsic concepts in mth, see Atkins "Mthemticl Bckground 1 (pp 59-6), nd "Mthemticl Bckground " (pp 109-111). 1. Derivtion

More information

Arithmetic & Algebra. NCTM National Conference, 2017

Arithmetic & Algebra. NCTM National Conference, 2017 NCTM Ntionl Conference, 2017 Arithmetic & Algebr Hether Dlls, UCLA Mthemtics & The Curtis Center Roger Howe, Yle Mthemtics & Texs A & M School of Eduction Relted Common Core Stndrds First instnce of vrible

More information

Numerical integration

Numerical integration 2 Numericl integrtion This is pge i Printer: Opque this 2. Introduction Numericl integrtion is problem tht is prt of mny problems in the economics nd econometrics literture. The orgniztion of this chpter

More information

Module 6 Value Iteration. CS 886 Sequential Decision Making and Reinforcement Learning University of Waterloo

Module 6 Value Iteration. CS 886 Sequential Decision Making and Reinforcement Learning University of Waterloo Module 6 Vlue Itertion CS 886 Sequentil Decision Mking nd Reinforcement Lerning University of Wterloo Mrkov Decision Process Definition Set of sttes: S Set of ctions (i.e., decisions): A Trnsition model:

More information

Frobenius numbers of generalized Fibonacci semigroups

Frobenius numbers of generalized Fibonacci semigroups Frobenius numbers of generlized Fiboncci semigroups Gretchen L. Mtthews 1 Deprtment of Mthemticl Sciences, Clemson University, Clemson, SC 29634-0975, USA gmtthe@clemson.edu Received:, Accepted:, Published:

More information

Riemann Sums and Riemann Integrals

Riemann Sums and Riemann Integrals Riemnn Sums nd Riemnn Integrls Jmes K. Peterson Deprtment of Biologicl Sciences nd Deprtment of Mthemticl Sciences Clemson University August 26, 203 Outline Riemnn Sums Riemnn Integrls Properties Abstrct

More information

Recitation 3: Applications of the Derivative. 1 Higher-Order Derivatives and their Applications

Recitation 3: Applications of the Derivative. 1 Higher-Order Derivatives and their Applications Mth 1c TA: Pdric Brtlett Recittion 3: Applictions of the Derivtive Week 3 Cltech 013 1 Higher-Order Derivtives nd their Applictions Another thing we could wnt to do with the derivtive, motivted by wht

More information

Acceptance Sampling by Attributes

Acceptance Sampling by Attributes Introduction Acceptnce Smpling by Attributes Acceptnce smpling is concerned with inspection nd decision mking regrding products. Three spects of smpling re importnt: o Involves rndom smpling of n entire

More information

Discrete Mathematics and Probability Theory Spring 2013 Anant Sahai Lecture 17

Discrete Mathematics and Probability Theory Spring 2013 Anant Sahai Lecture 17 EECS 70 Discrete Mthemtics nd Proility Theory Spring 2013 Annt Shi Lecture 17 I.I.D. Rndom Vriles Estimting the is of coin Question: We wnt to estimte the proportion p of Democrts in the US popultion,

More information

Extended nonlocal games from quantum-classical games

Extended nonlocal games from quantum-classical games Extended nonlocl gmes from quntum-clssicl gmes Theory Seminr incent Russo niversity of Wterloo October 17, 2016 Outline Extended nonlocl gmes nd quntum-clssicl gmes Entngled vlues nd the dimension of entnglement

More information

Tutorial 4. b a. h(f) = a b a ln 1. b a dx = ln(b a) nats = log(b a) bits. = ln λ + 1 nats. = log e λ bits. = ln 1 2 ln λ + 1. nats. = ln 2e. bits.

Tutorial 4. b a. h(f) = a b a ln 1. b a dx = ln(b a) nats = log(b a) bits. = ln λ + 1 nats. = log e λ bits. = ln 1 2 ln λ + 1. nats. = ln 2e. bits. Tutoril 4 Exercises on Differentil Entropy. Evlute the differentil entropy h(x) f ln f for the following: () The uniform distribution, f(x) b. (b) The exponentil density, f(x) λe λx, x 0. (c) The Lplce

More information

N 0 completions on partial matrices

N 0 completions on partial matrices N 0 completions on prtil mtrices C. Jordán C. Mendes Arújo Jun R. Torregros Instituto de Mtemátic Multidisciplinr / Centro de Mtemátic Universidd Politécnic de Vlenci / Universidde do Minho Cmino de Ver

More information

1.4 Nonregular Languages

1.4 Nonregular Languages 74 1.4 Nonregulr Lnguges The number of forml lnguges over ny lphbet (= decision/recognition problems) is uncountble On the other hnd, the number of regulr expressions (= strings) is countble Hence, ll

More information

Fig. 1. Open-Loop and Closed-Loop Systems with Plant Variations

Fig. 1. Open-Loop and Closed-Loop Systems with Plant Variations ME 3600 Control ystems Chrcteristics of Open-Loop nd Closed-Loop ystems Importnt Control ystem Chrcteristics o ensitivity of system response to prmetric vritions cn be reduced o rnsient nd stedy-stte responses

More information

Coalgebra, Lecture 15: Equations for Deterministic Automata

Coalgebra, Lecture 15: Equations for Deterministic Automata Colger, Lecture 15: Equtions for Deterministic Automt Julin Slmnc (nd Jurrin Rot) Decemer 19, 2016 In this lecture, we will study the concept of equtions for deterministic utomt. The notes re self contined

More information

An approximation to the arithmetic-geometric mean. G.J.O. Jameson, Math. Gazette 98 (2014), 85 95

An approximation to the arithmetic-geometric mean. G.J.O. Jameson, Math. Gazette 98 (2014), 85 95 An pproximtion to the rithmetic-geometric men G.J.O. Jmeson, Mth. Gzette 98 (4), 85 95 Given positive numbers > b, consider the itertion given by =, b = b nd n+ = ( n + b n ), b n+ = ( n b n ) /. At ech

More information

Generation of Lyapunov Functions by Neural Networks

Generation of Lyapunov Functions by Neural Networks WCE 28, July 2-4, 28, London, U.K. Genertion of Lypunov Functions by Neurl Networks Nvid Noroozi, Pknoosh Krimghee, Ftemeh Sfei, nd Hmed Jvdi Abstrct Lypunov function is generlly obtined bsed on tril nd

More information

DIRECT CURRENT CIRCUITS

DIRECT CURRENT CIRCUITS DRECT CURRENT CUTS ELECTRC POWER Consider the circuit shown in the Figure where bttery is connected to resistor R. A positive chrge dq will gin potentil energy s it moves from point to point b through

More information

CS 188: Artificial Intelligence

CS 188: Artificial Intelligence CS 188: Artificil Intelligence Lecture 19: Decision Digrms Pieter Abbeel --- C Berkeley Mny slides over this course dpted from Dn Klein, Sturt Russell, Andrew Moore Decision Networks ME: choose the ction

More information

Lecture 3 ( ) (translated and slightly adapted from lecture notes by Martin Klazar)

Lecture 3 ( ) (translated and slightly adapted from lecture notes by Martin Klazar) Lecture 3 (5.3.2018) (trnslted nd slightly dpted from lecture notes by Mrtin Klzr) Riemnn integrl Now we define precisely the concept of the re, in prticulr, the re of figure U(, b, f) under the grph of

More information

More on automata. Michael George. March 24 April 7, 2014

More on automata. Michael George. March 24 April 7, 2014 More on utomt Michel George Mrch 24 April 7, 2014 1 Automt constructions Now tht we hve forml model of mchine, it is useful to mke some generl constructions. 1.1 DFA Union / Product construction Suppose

More information

The practical version

The practical version Roerto s Notes on Integrl Clculus Chpter 4: Definite integrls nd the FTC Section 7 The Fundmentl Theorem of Clculus: The prcticl version Wht you need to know lredy: The theoreticl version of the FTC. Wht

More information

1 Structural induction, finite automata, regular expressions

1 Structural induction, finite automata, regular expressions Discrete Structures Prelim 2 smple uestions s CS2800 Questions selected for spring 2017 1 Structurl induction, finite utomt, regulr expressions 1. We define set S of functions from Z to Z inductively s

More information

8 Laplace s Method and Local Limit Theorems

8 Laplace s Method and Local Limit Theorems 8 Lplce s Method nd Locl Limit Theorems 8. Fourier Anlysis in Higher DImensions Most of the theorems of Fourier nlysis tht we hve proved hve nturl generliztions to higher dimensions, nd these cn be proved

More information

1 Probability Density Functions

1 Probability Density Functions Lis Yn CS 9 Continuous Distributions Lecture Notes #9 July 6, 28 Bsed on chpter by Chris Piech So fr, ll rndom vribles we hve seen hve been discrete. In ll the cses we hve seen in CS 9, this ment tht our

More information

I1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3

I1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3 2 The Prllel Circuit Electric Circuits: Figure 2- elow show ttery nd multiple resistors rrnged in prllel. Ech resistor receives portion of the current from the ttery sed on its resistnce. The split is

More information

Exam 2, Mathematics 4701, Section ETY6 6:05 pm 7:40 pm, March 31, 2016, IH-1105 Instructor: Attila Máté 1

Exam 2, Mathematics 4701, Section ETY6 6:05 pm 7:40 pm, March 31, 2016, IH-1105 Instructor: Attila Máté 1 Exm, Mthemtics 471, Section ETY6 6:5 pm 7:4 pm, Mrch 1, 16, IH-115 Instructor: Attil Máté 1 17 copies 1. ) Stte the usul sufficient condition for the fixed-point itertion to converge when solving the eqution

More information

different methods (left endpoint, right endpoint, midpoint, trapezoid, Simpson s).

different methods (left endpoint, right endpoint, midpoint, trapezoid, Simpson s). Mth 1A with Professor Stnkov Worksheet, Discussion #41; Wednesdy, 12/6/217 GSI nme: Roy Zho Problems 1. Write the integrl 3 dx s limit of Riemnn sums. Write it using 2 intervls using the 1 x different

More information