Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments
Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments

Yi Sun, Faustino Gomez, and Jürgen Schmidhuber
IDSIA, Galleria 2, Manno, CH-6928, Switzerland

Abstract. To maximize its success, an AGI typically needs to explore its initially unknown world. Is there an optimal way of doing so? Here we derive an affirmative answer for a broad class of environments.

1 Introduction

An intelligent agent is sent to explore an unknown environment. Over the course of its mission, the agent makes observations, carries out actions, and incrementally builds up a model of the environment from this interaction. Since the way in which the agent selects actions may greatly affect the efficiency of the exploration, the following question naturally arises: How should the agent choose its actions such that the knowledge about the environment accumulates as quickly as possible?

In this paper, this question is addressed under a classical framework in which the agent improves its model of the environment through probabilistic inference, and learning progress is measured in terms of Shannon information gain. We show that the agent can, at least in principle, optimally choose actions based on previous experiences, such that the cumulative expected information gain is maximized.

The rest of the paper is organized as follows: Section 2 reviews the basic concepts and establishes the terminology; Section 3 elaborates the principle of optimal Bayesian exploration; Section 4 presents a simple experiment; related work is briefly reviewed in Section 5; Section 6 concludes the paper.

2 Preliminaries

Suppose that the agent interacts with the environment in discrete time cycles t = 1, 2, .... In each cycle, the agent performs an action, a, then receives a sensory input, o. A history, h, is either the empty string, ∅, or a string of the form a₁o₁···aₜoₜ for some t, and ha and hao refer to the strings resulting from appending a and o to h, respectively.

2.1 Learning from Sequential Interactions

To facilitate the subsequent discussion under a probabilistic framework, we make the following assumptions:
Assumption I. The models of the environment under consideration are fully described by a random element Θ which depends solely on the environment. Moreover, the agent's initial knowledge about Θ is summarized by a prior density p(θ).

Assumption II. The agent is equipped with a conditional predictor p(o | ha; θ), i.e. the agent is capable of refining its prediction in the light of information about Θ.

Using p(θ) and p(o | ha; θ) as building blocks, it is straightforward to formulate learning in terms of probabilistic inference. From Assumption I, given the history h, the agent's knowledge about Θ is fully summarized by p(θ | h). According to Bayes' rule, p(θ | hao) = p(θ | ha) p(o | ha; θ) / p(o | ha), with p(o | ha) = ∫ p(o | ha, θ) p(θ | ha) dθ. The term p(θ | ha) represents the agent's current knowledge about Θ given history h and an additional action a. Since Θ depends solely on the environment, and, importantly, knowing the action without the subsequent observation cannot change the agent's state of knowledge about Θ, p(θ | ha) = p(θ | h), and thus the knowledge about Θ can be updated using

p(θ | hao) = p(θ | h) p(o | ha; θ) / p(o | ha).    (1)

It is worth pointing out that p(o | ha; θ) is chosen before entering the environment. It is not required that it match the true dynamics of the environment, but the effectiveness of the learning certainly depends on the choice of p(o | ha; θ). For example, if Θ ∈ ℝ, and p(o | ha; θ) depends on θ only through its sign, then no knowledge other than the sign of Θ can be learned.

2.2 Information Gain as Learning Progress

Let h and h′ be two histories such that h is a prefix of h′. The respective posterior distributions of Θ are p(θ | h) and p(θ | h′). Using h as a reference point, the amount of information gained when the history grows to h′ can be measured using the KL divergence between p(θ | h) and p(θ | h′). This information gain from h to h′ is defined as

g(h′ ‖ h) = KL(p(θ | h′) ‖ p(θ | h)) = ∫ p(θ | h′) log [p(θ | h′) / p(θ | h)] dθ.

As a special case, if h = ∅, then g(h′) = g(h′ ‖ ∅) is the cumulative information gain with respect to the prior p(θ).
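Eq. (1) and the KL-based gain of Section 2.2 can be exercised with a few lines of code. A minimal sketch, assuming a hypothetical model class of three candidate coin biases with a uniform prior (all parameter values and observations are illustrative, not from the paper):

```python
import math

# Hypothetical finite model class: Theta is a coin bias taking one of three
# candidate values; the predictor p(o | ha; theta) ignores the action here.
thetas = [0.2, 0.5, 0.8]
prior = [1 / 3, 1 / 3, 1 / 3]

def likelihood(o, theta):
    return theta if o == "heads" else 1.0 - theta

def bayes_update(posterior, o):
    # Eq. (1): p(theta | hao) is proportional to p(theta | h) p(o | ha; theta).
    unnorm = [p * likelihood(o, th) for p, th in zip(posterior, thetas)]
    z = sum(unnorm)  # p(o | ha), the predictive probability
    return [u / z for u in unnorm]

def info_gain(post_new, post_old):
    # g(h' || h) = KL(p(theta | h') || p(theta | h))
    return sum(pn * math.log(pn / po)
               for pn, po in zip(post_new, post_old) if pn > 0)

posterior = prior
for o in ["heads", "heads", "tails"]:
    posterior = bayes_update(posterior, o)
gain = info_gain(posterior, prior)  # cumulative gain w.r.t. the prior
```

Because the posterior and the predictive come out of the same normalization, one Bayes step yields both quantities needed later for expected gains.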
We also write g(ao ‖ h) for g(hao ‖ h), which denotes the information gained from an additional action-observation pair.

From an information theoretic point of view, the KL divergence between two distributions p and q represents the additional number of bits required to encode elements sampled from p, using the optimal coding strategy designed for q. This can be interpreted as the degree of 'unexpectedness' or 'surprise' caused by observing samples from p when expecting samples from q.

The key property of information gain for the treatment below is the following decomposition: Let h be a prefix of h′ and h′ be a prefix of h″, then

E_{h″ | h′} g(h″ ‖ h) = g(h′ ‖ h) + E_{h″ | h′} g(h″ ‖ h′).    (2)
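Eq. (2) can be checked numerically. The sketch below assumes a three-point coin model (parameter values and uniform prior are illustrative): the expected gain from h to h″ equals the realized gain from h to h′ plus the expected remaining gain, to machine precision.

```python
import math

thetas = [0.2, 0.5, 0.8]
prior = [1 / 3, 1 / 3, 1 / 3]

def update(post, o):
    # One Bayes step; returns the new posterior and the predictive p(o | .).
    unnorm = [p * (th if o == "h" else 1 - th) for p, th in zip(post, thetas)]
    z = sum(unnorm)
    return [u / z for u in unnorm], z

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

post1, _ = update(prior, "h")          # h': one action-observation pair
lhs, rhs = 0.0, kl(post1, prior)       # rhs starts with g(h' || h)
for o2 in ("h", "t"):
    post2, p_o2 = update(post1, o2)    # h'': one more pair
    lhs += p_o2 * kl(post2, prior)     # accumulates E[g(h'' || h)]
    rhs += p_o2 * kl(post2, post1)     # adds E[g(h'' || h')]
```

The identity is exact because the posterior is a martingale: averaging p(θ | h″) over the predictive distribution of the next observation recovers p(θ | h′).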
That is, the information gain is additive in expectation.

Having defined the information gain from trajectories ending with observations, one may proceed to define the expected information gain of performing action a, before observing the outcome o. Formally, the expected information gain of performing a with respect to the current history h is given by ḡ(a | h) = E_{o | ha} g(ao ‖ h). A simple derivation gives

ḡ(a | h) = Σ_o ∫ p(o, θ | ha) log [p(o, θ | ha) / (p(θ | ha) p(o | ha))] dθ = I(O; Θ | ha),

which means that ḡ(a | h) is the mutual information between Θ and the random variable O representing the unknown observation, conditioned on the history h and action a.

3 Optimal Bayesian Exploration

In this section, the general principle of optimal Bayesian exploration in dynamic environments is presented. We first give results obtained by assuming a fixed limited life span for our agent, then discuss the condition required to extend this to infinite time horizons.

3.1 Results for Finite Time Horizon

Suppose that the agent has experienced history h, and is about to choose τ more actions in the future. Let π be a policy mapping the set of histories to the set of actions, such that the agent performs a with probability π(a | h) given h. Define the curiosity Q-value q_π^τ(h, a) as the expected information gained from the additional τ actions, assuming that the agent performs a in the next step and follows policy π in the remaining τ − 1 steps. Formally, for τ = 1,

q_π^1(h, a) = E_{o | ha} g(ao ‖ h) = ḡ(a | h),

and for τ > 1,

q_π^τ(h, a) = E_{o | ha} E_{a₁ | hao} E_{o₁ | haoa₁} ··· E_{o_{τ−1} | hao a₁o₁···a_{τ−1}} g(hao a₁o₁ ··· a_{τ−1}o_{τ−1} ‖ h)
            = E_{o | ha} E_{a₁o₁···a_{τ−1}o_{τ−1} | hao} g(hao a₁o₁ ··· a_{τ−1}o_{τ−1} ‖ h).

The curiosity Q-value can be defined recursively. Applying Eq. 2, for τ = 2,

q_π^2(h, a) = E_{o | ha} E_{a₁o₁ | hao} g(hao a₁o₁ ‖ h)
            = E_{o | ha} [g(ao ‖ h) + E_{a₁o₁ | hao} g(a₁o₁ ‖ hao)]
            = ḡ(a | h) + E_{o | ha} E_{a′ | hao} q_π^1(hao, a′),

and for τ > 2,

q_π^τ(h, a) = E_{o | ha} E_{a₁o₁···a_{τ−1}o_{τ−1} | hao} g(hao a₁o₁ ··· a_{τ−1}o_{τ−1} ‖ h)
            = E_{o | ha} [g(ao ‖ h) + E_{a₁o₁···a_{τ−1}o_{τ−1} | hao} g(hao a₁o₁ ··· a_{τ−1}o_{τ−1} ‖ hao)]
            = ḡ(a | h) + E_{o | ha} E_{a′ | hao} q_π^{τ−1}(hao, a′).    (3)
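That ḡ(a | h) equals the conditional mutual information can be verified directly: averaging the posterior-vs-current KL over the predictive distribution gives the same number as computing I(O; Θ) from the joint. A sketch with an assumed three-point model class and a made-up current posterior (all values illustrative):

```python
import math

thetas = [0.2, 0.5, 0.8]
post_h = [0.5, 0.3, 0.2]   # an assumed current posterior p(theta | h)

def lik(o, th):
    return th if o == "h" else 1 - th

def expected_gain(post):
    # g-bar(a | h) = E_o KL(p(theta | hao) || p(theta | h))
    total = 0.0
    for o in ("h", "t"):
        p_o = sum(p * lik(o, th) for p, th in zip(post, thetas))
        post_o = [p * lik(o, th) / p_o for p, th in zip(post, thetas)]
        total += p_o * sum(pn * math.log(pn / p0)
                           for pn, p0 in zip(post_o, post) if pn > 0)
    return total

def mutual_information(post):
    # I(O; Theta | ha), computed from the joint p(o, theta | ha)
    mi = 0.0
    for o in ("h", "t"):
        p_o = sum(p * lik(o, th) for p, th in zip(post, thetas))
        for p, th in zip(post, thetas):
            joint = p * lik(o, th)
            if joint > 0:
                mi += joint * math.log(joint / (p_o * p))
    return mi
```

Both routines sum the same terms, grouped differently, which is exactly the derivation in the text.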
Noting that Eq. 3 bears great resemblance to the definition of state-action values Q(s, a) in reinforcement learning, one can similarly define the curiosity value of a particular history as v_π^τ(h) = E_{a | h} q_π^τ(h, a), analogous to state values V(s), which can also be iteratively defined as

v_π^1(h) = E_{a | h} ḡ(a | h), and v_π^τ(h) = E_{a | h} [ḡ(a | h) + E_{o | ha} v_π^{τ−1}(hao)].

The curiosity value v_π^τ(h) is the expected information gain of performing the additional τ steps, assuming that the agent follows policy π. The two notations can be combined to write

q_π^τ(h, a) = ḡ(a | h) + E_{o | ha} v_π^{τ−1}(hao).    (4)

This equation has an interesting interpretation: since the agent is operating in a dynamic environment, it has to take into account not only the immediate expected information gain of performing the current action, i.e., ḡ(a | h), but also the expected curiosity value of the situation in which the agent ends up due to the action, i.e., v_π^{τ−1}(hao). As a consequence, the agent needs to choose actions that balance the two factors in order to improve its total expected information gain.

Now we show that there is an optimal policy which leads to the maximum cumulative expected information gain given any history h. To obtain the optimal policy, one may work backwards in τ, taking greedy actions with respect to the curiosity Q-values at each time step. Namely, for τ = 1, let

q^1(h, a) = ḡ(a | h), π^1(h) = arg max_a ḡ(a | h), and v^1(h) = max_a ḡ(a | h),

such that v^1(h) = q^1(h, π^1(h)), and for τ > 1, let

q^τ(h, a) = ḡ(a | h) + E_{o | ha} [max_{a′} q^{τ−1}(hao, a′)] = ḡ(a | h) + E_{o | ha} v^{τ−1}(hao),

with π^τ(h) = arg max_a q^τ(h, a) and v^τ(h) = max_a q^τ(h, a).

We show that π^τ(h) is indeed the optimal policy for any given τ and h in the sense that the curiosity value, when following π^τ, is maximized. To see this, take any other strategy π; first notice that

v^1(h) = max_a ḡ(a | h) ≥ E_{a | h} ḡ(a | h) = v_π^1(h).

Moreover, assuming v^τ(h) ≥ v_π^τ(h) for all h,

v^{τ+1}(h) = max_a [ḡ(a | h) + E_{o | ha} v^τ(hao)]
           ≥ max_a [ḡ(a | h) + E_{o | ha} v_π^τ(hao)]
           ≥ E_{a | h} [ḡ(a | h) + E_{o | ha} v_π^τ(hao)] = v_π^{τ+1}(h).

Therefore v^τ(h) ≥ v_π^τ(h) holds for arbitrary τ, h, and π.
The same can be shown for curiosity Q-values, namely, q^τ(h, a) ≥ q_π^τ(h, a), for all τ, h, a, and π. Now consider an agent with a fixed life span T. It can be seen that at time t, the agent has to perform π^{T−t}(h_{t−1}) to maximize the expected information gain in the remaining T − t steps. Here h_{t−1} = a₁o₁···a_{t−1}o_{t−1} is the history at time t. However, from Eq. 2,

E_{h_T | h_{t−1}} g(h_T) = g(h_{t−1}) + E_{h_T | h_{t−1}} g(h_T ‖ h_{t−1}).
Note that at time t, g(h_{t−1}) is a constant, thus maximizing the cumulative expected information gain in the remaining time steps is equivalent to maximizing the expected information gain of the whole trajectory with respect to the prior. The result is summarized in the following proposition:

Proposition 1. Let q^1(h, a) = ḡ(a | h), v^1(h) = max_a q^1(h, a), and

q^τ(h, a) = ḡ(a | h) + E_{o | ha} v^{τ−1}(hao), v^τ(h) = max_a q^τ(h, a),

then the policy π^τ(h) = arg max_a q^τ(h, a) is optimal in the sense that v^τ(h) ≥ v_π^τ(h) and q^τ(h, a) ≥ q_π^τ(h, a) for any π, τ, h, and a. In particular, for an agent with fixed life span T, following π^{T−t}(h_{t−1}) at time t = 1, ..., T is optimal in the sense that the expected cumulative information gain with respect to the prior is maximized.

The definition of the optimal exploration policy is constructive, which means that it can be readily implemented, provided that the number of actions and possible observations is finite, so that the expectation and maximization can be computed exactly. However, the cost of computing such a policy is O((n_o n_a)^τ), where n_o and n_a are the number of possible observations and actions, respectively. Since the cost is exponential in τ, planning with a large number of look-ahead steps is infeasible, and approximation heuristics must be used in practice.

3.2 Non-triviality of the Result

Intuitively, the recursive definition of the curiosity (Q) value is simple, and bears a clear resemblance to its counterpart in reinforcement learning. It might be tempting to think that the result is nothing more than solving the finite horizon reinforcement learning problem using ḡ(a | h) or g(ao ‖ h) as the reward signals. However, this is not the case. First, note that the decomposition Eq. 2 is a direct consequence of the formulation of the KL divergence. The decomposition does not necessarily hold if g is replaced with other types of measures of information gain.
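The constructive policy of Proposition 1 can be implemented directly by enumerating the action-observation tree, which also makes the O((n_o n_a)^τ) cost visible. A minimal sketch, assuming a hypothetical two-action world where Θ ∈ {0, 1}: action a1 observes Θ through a noisy channel, action a2 is uninformative, and the likelihood table is invented for illustration:

```python
import math

# Invented likelihoods p(o | a; theta): a1 is correlated with Theta,
# a2 is a pure noise channel that can never change the posterior.
LIK = {
    "a1": {0: {"x": 0.9, "y": 0.1}, 1: {"x": 0.2, "y": 0.8}},
    "a2": {0: {"x": 0.5, "y": 0.5}, 1: {"x": 0.5, "y": 0.5}},
}
ACTIONS, OBS = ("a1", "a2"), ("x", "y")

def step(post, a, o):
    # Bayes update; returns the new posterior and the predictive p(o | ha).
    unnorm = {th: post[th] * LIK[a][th][o] for th in post}
    z = sum(unnorm.values())
    return {th: u / z for th, u in unnorm.items()}, z

def gbar(post, a):
    # Expected one-step information gain g-bar(a | h)
    g = 0.0
    for o in OBS:
        new, p_o = step(post, a, o)
        g += p_o * sum(new[th] * math.log(new[th] / post[th])
                       for th in post if new[th] > 0)
    return g

def q_value(post, a, tau):
    # q^tau(h, a) = g-bar(a | h) + E_o max_a' q^{tau-1}(hao, a')
    g = gbar(post, a)
    if tau == 1:
        return g
    for o in OBS:
        new, p_o = step(post, a, o)
        g += p_o * max(q_value(new, b, tau - 1) for b in ACTIONS)
    return g

prior = {0: 0.5, 1: 0.5}
best = max(ACTIONS, key=lambda a: q_value(prior, a, 3))
```

Backward induction prefers a1 here; each extra look-ahead step multiplies the work by n_o · n_a, which is why approximations such as the DP scheme of Section 4 are needed in practice.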
Second, it is worth pointing out that g(ao ‖ h) and ḡ(a | h) behave differently from normal reward signals in the sense that they are additive only in expectation, while in the reinforcement learning setup, the reward signals are usually assumed to be additive, i.e., adding reward signals together is always meaningful. Consider a simple problem with only two actions. If g(ao ‖ h) were a plain reward function, then g(ao ‖ h) + g(a′o′ ‖ hao) should be meaningful, no matter whether a′ and o′ are known or not. But this is not the case, since the sum does not have a valid information theoretic interpretation. On the other hand, the sum is meaningful in expectation. Namely, when o′ has not been observed, from Eq. 2,

g(ao ‖ h) + E_{o′ | haoa′} g(a′o′ ‖ hao) = E_{o′ | haoa′} g(aoa′o′ ‖ h),

so the sum can be interpreted as the expectation of the information gained from h to haoa′o′. This result shows that g(ao ‖ h) and ḡ(a | h) can be treated as additive reward signals only when one is planning ahead.

To emphasize the difference further, note that all immediate information gains g(ao ‖ h) are non-negative, since they are essentially KL divergences. A natural assumption would be that the information gain g(h), which is the sum of
all g(ao ‖ h) in expectation, grows monotonically as the length of the history increases. However, this is not the case; see Figure 1 for an example. Although g(ao ‖ h) is always non-negative, some of the gain may pull θ closer to its prior density p(θ), resulting in a decrease of the KL divergence between p(θ | h) and p(θ). This is never the case for normal reward signals in reinforcement learning, where the accumulated reward never decreases if all rewards are non-negative.

Fig. 1. Illustration of the difference between the sum of one-step information gains and the cumulative information gain with respect to the prior (y-axis: KL divergence; x-axis: number of samples). In this case, 1000 independent samples are generated from a distribution over the finite sample space {1, 2, 3}, with p(x = 1) = 0.1, p(x = 2) = 0.5, and p(x = 3) = 0.4. The task of learning is to recover the mass function from the samples, assuming a Dirichlet prior Dir(50, 50, …). The KL divergence between two Dirichlet distributions is computed according to [5]. It is clear from the graph that the cumulative information gain fluctuates as the number of samples increases, while the sum of the one-step information gains increases monotonically. It also shows that the difference between the two quantities can be large.

3.3 Extending to Infinite Horizon

Having to restrict the maximum life span of the agent is rather inconvenient. It is tempting to define the curiosity Q-value in the infinite time horizon case as the limit of curiosity Q-values with increasing life spans, T → ∞. However, this cannot be achieved without additional technical constraints. For example, consider simple coin tossing. Assuming a Beta(1, 1) prior over the probability of seeing heads, the expected cumulative information gain for the next T flips is given by

v^T(h₁) = I(Θ; X₁, ..., X_T) ∼ log T.

With increasing T, v^T(h₁) → ∞. A frequently used approach to simplifying the math is to introduce a discount factor γ (0 ≤ γ < 1), as used in reinforcement learning.
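The logarithmic growth of v^T(h₁) = I(Θ; X₁, ..., X_T) in the coin-tossing example can be checked numerically. A sketch assuming a Beta(1, 1) prior: the number of heads k is then uniform on {0, ..., T}, the posterior is Beta(1 + k, 1 + T − k), and the mutual information is the expected KL from posterior to prior (the digamma function ψ is approximated here by a finite difference of log-Gamma, which is adequate for illustration):

```python
import math

def digamma(x, eps=1e-6):
    # psi(x) via a central difference of log-Gamma; fine for a sketch.
    return (math.lgamma(x + eps) - math.lgamma(x - eps)) / (2 * eps)

def kl_beta(a1, b1, a2, b2):
    # KL(Beta(a1, b1) || Beta(a2, b2)), the two-dimensional Dirichlet case
    s1, s2 = a1 + b1, a2 + b2
    return (math.lgamma(s1) - math.lgamma(a1) - math.lgamma(b1)
            - math.lgamma(s2) + math.lgamma(a2) + math.lgamma(b2)
            + (a1 - a2) * (digamma(a1) - digamma(s1))
            + (b1 - b2) * (digamma(b1) - digamma(s1)))

def cumulative_gain(T):
    # I(Theta; X_1..X_T) = E_k KL(posterior || prior), k uniform on {0..T}
    return sum(kl_beta(1 + k, 1 + T - k, 1.0, 1.0)
               for k in range(T + 1)) / (T + 1)

gains = {T: cumulative_gain(T) for T in (2, 8, 32)}
```

The gain keeps increasing with T without bound, so no limit of the undiscounted curiosity Q-values exists in this example.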
Assume that the agent has a maximum of τ actions left, but before finishing the τ actions it may be forced to leave the environment with probability 1 − γ at each time step. In this case, the curiosity Q-value becomes
q_π^{γ,1}(h, a) = ḡ(a | h), and

q_π^{γ,τ}(h, a) = (1 − γ) ḡ(a | h) + γ [ḡ(a | h) + E_{o | ha} E_{a′ | hao} q_π^{γ,τ−1}(hao, a′)]
               = ḡ(a | h) + γ E_{o | ha} E_{a′ | hao} q_π^{γ,τ−1}(hao, a′).

One may also interpret q_π^{γ,τ}(h, a) as a linear combination of curiosity Q-values without the discount:

q_π^{γ,τ}(h, a) = (1 − γ) Σ_{t=1}^{τ} γ^{t−1} q_π^t(h, a) + γ^τ q_π^τ(h, a).

Note that curiosity Q-values with larger look-ahead are weighted exponentially less. The optimal policy in the discounted case is given by

q^{γ,1}(h, a) = ḡ(a | h), v^{γ,1}(h) = max_a q^{γ,1}(h, a),

and

q^{γ,τ}(h, a) = ḡ(a | h) + γ E_{o | ha} v^{γ,τ−1}(hao), v^{γ,τ}(h) = max_a q^{γ,τ}(h, a).

The optimal actions are given by π^{γ,τ}(h) = arg max_a q^{γ,τ}(h, a). The proof that π^{γ,τ} is optimal is similar to the one for the finite horizon case (Section 3.1) and thus is omitted here.

Adding the discount enables one to define the curiosity Q-value in infinite time horizon in a number of cases. However, it is still possible to construct scenarios where such a discount fails. Consider an infinite list of bandits. For bandit n, there are n possible outcomes with Dirichlet prior Dir(1/n, ..., 1/n). The expected information gain of pulling bandit n for the first time is then given by

log n + ψ(1 + 1/n) − ψ(2) ∼ log n,

with ψ(·) being the digamma function. Assume that at time t, only the first e^{e^{t²}} bandits are available, so the curiosity Q-value in finite time horizon is always finite. However, since the largest expected information gain grows at speed e^{t²}, for any given γ > 0, q^{γ,τ} goes to infinity with increasing τ. This example gives the intuition that to make the curiosity Q-value meaningful, the total information content of the environment (or its growth rate) must be bounded. The following technical lemma gives a sufficient condition for when such an extension is meaningful.

Lemma 1. We have

0 ≤ q^{γ,τ+1}(h, a) − q^{γ,τ}(h, a) ≤ γ^τ E_{o | ha} max_{a₁} E_{o₁ | haoa₁} ··· max_{a_τ} ḡ(a_τ | hao a₁o₁ ··· o_{τ−1}).
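Before the proof, the bound of Lemma 1 can be checked numerically in a tiny two-action world (the likelihood table and γ are invented for illustration): the increment q^{γ,τ+1} − q^{γ,τ} stays between 0 and the γ^τ-weighted nested term.

```python
import math

# Hypothetical two-action world: "a1" is a noisy channel correlated with
# Theta in {0, 1}; "a2" reveals nothing. All numbers are illustrative.
LIK = {
    "a1": {0: {"x": 0.9, "y": 0.1}, 1: {"x": 0.2, "y": 0.8}},
    "a2": {0: {"x": 0.5, "y": 0.5}, 1: {"x": 0.5, "y": 0.5}},
}
ACTIONS, OBS, GAMMA = ("a1", "a2"), ("x", "y"), 0.9

def step(post, a, o):
    unnorm = {th: post[th] * LIK[a][th][o] for th in post}
    z = sum(unnorm.values())
    return {th: u / z for th, u in unnorm.items()}, z

def gbar(post, a):
    # Expected one-step information gain g-bar(a | h)
    g = 0.0
    for o in OBS:
        new, p_o = step(post, a, o)
        g += p_o * sum(new[t] * math.log(new[t] / post[t])
                       for t in post if new[t] > 0)
    return g

def q_gamma(post, a, tau):
    # Discounted curiosity Q-value with greedy (optimal) continuation
    g = gbar(post, a)
    if tau == 1:
        return g
    return g + GAMMA * sum(
        p_o * max(q_gamma(new, b, tau - 1) for b in ACTIONS)
        for new, p_o in (step(post, a, o) for o in OBS))

def chain(post, j):
    # max_{a1} E_{o1} ... max_{a_j} g-bar(a_j | .), the nested term in Lemma 1
    if j == 1:
        return max(gbar(post, b) for b in ACTIONS)
    return max(sum(p_o * chain(new, j - 1)
                   for new, p_o in (step(post, b, o) for o in OBS))
               for b in ACTIONS)

prior, tau = {0: 0.5, 1: 0.5}, 2
diffs = {a: q_gamma(prior, a, tau + 1) - q_gamma(prior, a, tau)
         for a in ACTIONS}
bounds = {a: GAMMA ** tau * sum(p_o * chain(new, tau)
                                for new, p_o in (step(prior, a, o) for o in OBS))
          for a in ACTIONS}
```

When the nested ḡ-term grows sub-exponentially in τ, these increments are summable, matching the Cauchy-sequence argument that follows.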
Proof. Expand q^{γ,τ} and q^{γ,τ+1}, and note that max_x f(x) − max_x g(x) ≤ max_x [f(x) − g(x)]; then

q^{γ,τ+1}(h, a) − q^{γ,τ}(h, a)
= E_{o|ha} max_{a₁} E_{o₁|haoa₁} ··· max_{a_τ} [ḡ(a | h) + γ ḡ(a₁ | hao) + ··· + γ^τ ḡ(a_τ | hao···o_{τ−1})]
  − E_{o|ha} max_{a₁} E_{o₁|haoa₁} ··· max_{a_{τ−1}} [ḡ(a | h) + γ ḡ(a₁ | hao) + ··· + γ^{τ−1} ḡ(a_{τ−1} | hao···o_{τ−2})]
≤ E_{o|ha} max_{a₁} { E_{o₁|haoa₁} ··· max_{a_τ} [ḡ(a | h) + γ ḡ(a₁ | hao) + ··· + γ^τ ḡ(a_τ | hao···o_{τ−1})]
  − E_{o₁|haoa₁} ··· max_{a_{τ−1}} [ḡ(a | h) + γ ḡ(a₁ | hao) + ··· + γ^{τ−1} ḡ(a_{τ−1} | hao···o_{τ−2})] }
≤ γ^τ E_{o|ha} max_{a₁} E_{o₁|haoa₁} ··· max_{a_τ} ḡ(a_τ | hao···o_{τ−1}).

It can be seen that if this nested expectation grows sub-exponentially in τ, then q^{γ,τ} is a Cauchy sequence, and it makes sense to define the curiosity Q-value for the infinite time horizon. □

4 Experiment

The idea presented in the previous section is illustrated through a simple experiment. The environment is an MDP consisting of two groups of densely connected states (cliques) linked by a long corridor. The agent has two actions allowing it to move along the corridor deterministically, whereas the transition probabilities inside each clique are randomly generated. The agent assumes Dirichlet priors over all transition probabilities, and the goal is to learn the transition model of the MDP. In the experiment, each clique consists of 5 states (states 1 to 5 and states 56 to 60), and the corridor is of length 50 (states 6 to 55). The prior over each transition probability is Dir(1/60, ..., 1/60).

We compare four different algorithms: i) random exploration, where the agent selects each of the two actions with equal probability at each time step; ii) Q-learning with the immediate information gain g(ao ‖ h) as the reward; iii) greedy exploration, where the agent chooses at each time step the action maximizing ḡ(a | h); and iv) a dynamic-programming (DP) approximation of the optimal Bayesian exploration, where at each time step the agent follows a policy which is computed using policy iteration, assuming that the dynamics of the MDP are given by the current posterior, and the reward is the expected information gain ḡ(a | h). The details of this algorithm are described in [11]. Fig. 2 shows the typical behavior of the four algorithms.
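A single greedy step of algorithm (iii) can be sketched numerically. The sketch below assumes Dirichlet beliefs over each action's next-state distribution and computes ḡ(a | h) as the expected KL between the updated and current Dirichlet densities; the closed form for the KL follows [5], with the digamma function approximated by a finite difference of log-Gamma. The action names and transition counts are invented for illustration and far smaller than the paper's 60-state MDP:

```python
import math

def digamma(x, eps=1e-6):
    # Numerical psi(x) via the derivative of log-Gamma; fine for a sketch.
    return (math.lgamma(x + eps) - math.lgamma(x - eps)) / (2 * eps)

def kl_dirichlet(a, b):
    # KL(Dir(a) || Dir(b)), cf. the formulas collected in reference [5].
    A, B = sum(a), sum(b)
    out = math.lgamma(A) - math.lgamma(B)
    out += sum(math.lgamma(bi) - math.lgamma(ai) for ai, bi in zip(a, b))
    out += sum((ai - bi) * (digamma(ai) - digamma(A)) for ai, bi in zip(a, b))
    return out

def expected_gain(alpha):
    # g-bar = E_o KL(posterior || current); next state o has predictive
    # probability alpha_o / sum(alpha) under the Dirichlet belief.
    A = sum(alpha)
    g = 0.0
    for i, ai in enumerate(alpha):
        post = list(alpha)
        post[i] += 1.0   # one more observed transition to state i
        g += (ai / A) * kl_dirichlet(post, alpha)
    return g

beliefs = {"well_explored": [21.0, 35.0, 14.0],  # many observed transitions
           "untried": [1.0, 1.0, 1.0]}           # still near the prior
greedy = max(beliefs, key=lambda a: expected_gain(beliefs[a]))
```

The untried action promises more information than the well-explored one, which is exactly why the greedy agent keeps crossing the corridor once the deterministic jumps stop being informative.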
The upper four plots show how the agent moves in the MDP starting from one clique. Both greedy exploration and DP move back and forth between the two cliques. Random exploration has difficulty moving between the two cliques due to its random walk behavior in the corridor. Q-learning exploration, however, gets stuck in the initial clique. The reason is that since the jump on the corridor is deterministic, the information gain decreases to virtually zero after only several attempts, therefore the Q-value of jumping into the corridor becomes much lower than the
Q-value of jumping inside the clique. The bottom plot shows how the cumulative information gain grows over time, and how the DP approximation clearly outperforms the other algorithms, particularly in the early phase of exploration.

Fig. 2. The exploration process of a typical run of 4000 steps. The upper four plots (Random, Q-learning, Greedy, DP) show the position of the agent between state 1 (the lowest) and state 60 (the highest); the states at the top and the bottom correspond to the two cliques, and the states in the middle correspond to the corridor. The lowest plot is the cumulative information gain of each algorithm with respect to the prior, over time.

5 Related Work

The idea of actively selecting queries to accelerate the learning process has a long history [1, 2, 7], and has received a lot of attention in recent decades, primarily in the context of active learning [8] and artificial curiosity [6]. In particular, measuring learning progress using KL divergence dates back to the 50's [2, 4]. In
1995 this was combined with reinforcement learning, with the goal of optimizing future expected information gain [10]. Others renamed this Bayesian surprise [3]. Our work differs from most previous work in two main points: First, like [10], we consider the problem of exploring a dynamic environment, where actions change the environmental state, while most work on active learning and Bayesian experiment design focuses on queries that do not affect the environment [8]. Second, our result is theoretically sound and directly derived from first principles, in contrast to the more heuristic application [10] of traditional reinforcement learning to maximize the expected information gain. In particular, we pointed out a previously neglected subtlety of using KL divergence as learning progress. Conceptually, however, this work is closely connected to artificial curiosity and intrinsically motivated reinforcement learning [6, 7, 9] for agents that actively explore the environment without an external reward signal. In fact, the very definition of the curiosity (Q) value permits a firm connection between pure exploration and reinforcement learning.

6 Conclusion

We have presented the principle of optimal Bayesian exploration in dynamic environments, centered around the concept of the curiosity (Q) value. Our work provides a theoretically sound foundation for designing more effective exploration strategies. Future work will concentrate on studying the theoretical properties of various approximation strategies inspired by this principle.

7 Acknowledgements

This research was funded in part by a Swiss National Science Foundation grant and the EU IM-CLeVeR project (#231722).

References

1. Chaloner, K., Verdinelli, I.: Bayesian experimental design: A review. Statistical Science 10 (1995)
2. Fedorov, V.V.: Theory of optimal experiments. Academic Press (1972)
3. Itti, L., Baldi, P.F.: Bayesian surprise attracts human attention. In: NIPS'05 (2006)
4. Lindley, D.V.: On a measure of the information provided by an experiment. Annals of Mathematical Statistics 27(4) (1956)
5.
Penny, W.: Kullback-Leibler divergences of normal, gamma, Dirichlet and Wishart densities. Tech. rep., Wellcome Department of Cognitive Neurology, University College London (2001)
6. Schmidhuber, J.: Curious model-building control systems. In: IJCNN'91, vol. 2 (1991)
7. Schmidhuber, J.: Formal theory of creativity, fun, and intrinsic motivation (1990-2010). IEEE Transactions on Autonomous Mental Development 2(3) (2010)
8. Settles, B.: Active learning literature survey. Tech. rep., University of Wisconsin-Madison (2010)
9. Singh, S., Barto, A., Chentanez, N.: Intrinsically motivated reinforcement learning. In: NIPS'04 (2004)
10. Storck, J., Hochreiter, S., Schmidhuber, J.: Reinforcement driven information acquisition in non-deterministic environments. In: ICANN'95 (1995)
11. Sun, Y., Gomez, F.J., Schmidhuber, J.: Planning to be surprised: Optimal Bayesian exploration in dynamic environments (2011)
More informationMAA 4212 Improper Integrals
Notes by Dvid Groisser, Copyright c 1995; revised 2002, 2009, 2014 MAA 4212 Improper Integrls The Riemnn integrl, while perfectly well-defined, is too restrictive for mny purposes; there re functions which
More informationMath 426: Probability Final Exam Practice
Mth 46: Probbility Finl Exm Prctice. Computtionl problems 4. Let T k (n) denote the number of prtitions of the set {,..., n} into k nonempty subsets, where k n. Argue tht T k (n) kt k (n ) + T k (n ) by
More informationUNIT 1 FUNCTIONS AND THEIR INVERSES Lesson 1.4: Logarithmic Functions as Inverses Instruction
Lesson : Logrithmic Functions s Inverses Prerequisite Skills This lesson requires the use of the following skills: determining the dependent nd independent vribles in n exponentil function bsed on dt from
More informationEstimation of Binomial Distribution in the Light of Future Data
British Journl of Mthemtics & Computer Science 102: 1-7, 2015, Article no.bjmcs.19191 ISSN: 2231-0851 SCIENCEDOMAIN interntionl www.sciencedomin.org Estimtion of Binomil Distribution in the Light of Future
More information4 The dynamical FRW universe
4 The dynmicl FRW universe 4.1 The Einstein equtions Einstein s equtions G µν = T µν (7) relte the expnsion rte (t) to energy distribution in the universe. On the left hnd side is the Einstein tensor which
More informationDecision Networks. CS 188: Artificial Intelligence Fall Example: Decision Networks. Decision Networks. Decisions as Outcome Trees
CS 188: Artificil Intelligence Fll 2011 Decision Networks ME: choose the ction which mximizes the expected utility given the evidence mbrell Lecture 17: Decision Digrms 10/27/2011 Cn directly opertionlize
More informationHow do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when is it unique?
XII. LINEAR ALGEBRA: SOLVING SYSTEMS OF EQUATIONS Tody we re going to tlk bout solving systems of liner equtions. These re problems tht give couple of equtions with couple of unknowns, like: 6 2 3 7 4
More information5.7 Improper Integrals
458 pplictions of definite integrls 5.7 Improper Integrls In Section 5.4, we computed the work required to lift pylod of mss m from the surfce of moon of mss nd rdius R to height H bove the surfce of the
More informationECO 317 Economics of Uncertainty Fall Term 2007 Notes for lectures 4. Stochastic Dominance
Generl structure ECO 37 Economics of Uncertinty Fll Term 007 Notes for lectures 4. Stochstic Dominnce Here we suppose tht the consequences re welth mounts denoted by W, which cn tke on ny vlue between
More informationAPPROXIMATE INTEGRATION
APPROXIMATE INTEGRATION. Introduction We hve seen tht there re functions whose nti-derivtives cnnot be expressed in closed form. For these resons ny definite integrl involving these integrnds cnnot be
More informationReview of basic calculus
Review of bsic clculus This brief review reclls some of the most importnt concepts, definitions, nd theorems from bsic clculus. It is not intended to tech bsic clculus from scrtch. If ny of the items below
More information1B40 Practical Skills
B40 Prcticl Skills Comining uncertinties from severl quntities error propgtion We usully encounter situtions where the result of n experiment is given in terms of two (or more) quntities. We then need
More informationLecture 14: Quadrature
Lecture 14: Qudrture This lecture is concerned with the evlution of integrls fx)dx 1) over finite intervl [, b] The integrnd fx) is ssumed to be rel-vlues nd smooth The pproximtion of n integrl by numericl
More informationS. S. Dragomir. 2, we have the inequality. b a
Bull Koren Mth Soc 005 No pp 3 30 SOME COMPANIONS OF OSTROWSKI S INEQUALITY FOR ABSOLUTELY CONTINUOUS FUNCTIONS AND APPLICATIONS S S Drgomir Abstrct Compnions of Ostrowski s integrl ineulity for bsolutely
More informationInfinite Geometric Series
Infinite Geometric Series Finite Geometric Series ( finite SUM) Let 0 < r < 1, nd let n be positive integer. Consider the finite sum It turns out there is simple lgebric expression tht is equivlent to
More informationCredibility Hypothesis Testing of Fuzzy Triangular Distributions
666663 Journl of Uncertin Systems Vol.9, No., pp.6-74, 5 Online t: www.jus.org.uk Credibility Hypothesis Testing of Fuzzy Tringulr Distributions S. Smpth, B. Rmy Received April 3; Revised 4 April 4 Abstrct
More information1 1D heat and wave equations on a finite interval
1 1D het nd wve equtions on finite intervl In this section we consider generl method of seprtion of vribles nd its pplictions to solving het eqution nd wve eqution on finite intervl ( 1, 2. Since by trnsltion
More informationJack Simons, Henry Eyring Scientist and Professor Chemistry Department University of Utah
1. Born-Oppenheimer pprox.- energy surfces 2. Men-field (Hrtree-Fock) theory- orbitls 3. Pros nd cons of HF- RHF, UHF 4. Beyond HF- why? 5. First, one usully does HF-how? 6. Bsis sets nd nottions 7. MPn,
More informationCMDA 4604: Intermediate Topics in Mathematical Modeling Lecture 19: Interpolation and Quadrature
CMDA 4604: Intermedite Topics in Mthemticl Modeling Lecture 19: Interpoltion nd Qudrture In this lecture we mke brief diversion into the res of interpoltion nd qudrture. Given function f C[, b], we sy
More informationBernoulli Numbers Jeff Morton
Bernoulli Numbers Jeff Morton. We re interested in the opertor e t k d k t k, which is to sy k tk. Applying this to some function f E to get e t f d k k tk d k f f + d k k tk dk f, we note tht since f
More informationHow to simulate Turing machines by invertible one-dimensional cellular automata
How to simulte Turing mchines by invertible one-dimensionl cellulr utomt Jen-Christophe Dubcq Déprtement de Mthémtiques et d Informtique, École Normle Supérieure de Lyon, 46, llée d Itlie, 69364 Lyon Cedex
More informationCS 275 Automata and Formal Language Theory
CS 275 Automt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Problem (II) Chpter II.5.: Properties of Context Free Grmmrs (14) Anton Setzer (Bsed on book drft by J. V. Tucker nd K. Stephenson)
More informationDecision Networks. CS 188: Artificial Intelligence. Decision Networks. Decision Networks. Decision Networks and Value of Information
CS 188: Artificil Intelligence nd Vlue of Informtion Instructors: Dn Klein nd Pieter Abbeel niversity of Cliforni, Berkeley [These slides were creted by Dn Klein nd Pieter Abbeel for CS188 Intro to AI
More informationThe Wave Equation I. MA 436 Kurt Bryan
1 Introduction The Wve Eqution I MA 436 Kurt Bryn Consider string stretching long the x xis, of indeterminte (or even infinite!) length. We wnt to derive n eqution which models the motion of the string
More informationContinuous Random Variables Class 5, Jeremy Orloff and Jonathan Bloom
Lerning Gols Continuous Rndom Vriles Clss 5, 8.05 Jeremy Orloff nd Jonthn Bloom. Know the definition of continuous rndom vrile. 2. Know the definition of the proility density function (pdf) nd cumultive
More informationBayesian Networks: Approximate Inference
pproches to inference yesin Networks: pproximte Inference xct inference Vrillimintion Join tree lgorithm pproximte inference Simplify the structure of the network to mkxct inferencfficient (vritionl methods,
More informationCS 188: Artificial Intelligence Fall 2010
CS 188: Artificil Intelligence Fll 2010 Lecture 18: Decision Digrms 10/28/2010 Dn Klein C Berkeley Vlue of Informtion 1 Decision Networks ME: choose the ction which mximizes the expected utility given
More informationLocal orthogonality: a multipartite principle for (quantum) correlations
Locl orthogonlity: multiprtite principle for (quntum) correltions Antonio Acín ICREA Professor t ICFO-Institut de Ciencies Fotoniques, Brcelon Cusl Structure in Quntum Theory, Bensque, Spin, June 2013
More informationRiemann Sums and Riemann Integrals
Riemnn Sums nd Riemnn Integrls Jmes K. Peterson Deprtment of Biologicl Sciences nd Deprtment of Mthemticl Sciences Clemson University August 26, 2013 Outline 1 Riemnn Sums 2 Riemnn Integrls 3 Properties
More information221B Lecture Notes WKB Method
Clssicl Limit B Lecture Notes WKB Method Hmilton Jcobi Eqution We strt from the Schrödinger eqution for single prticle in potentil i h t ψ x, t = [ ] h m + V x ψ x, t. We cn rewrite this eqution by using
More informationCHM Physical Chemistry I Chapter 1 - Supplementary Material
CHM 3410 - Physicl Chemistry I Chpter 1 - Supplementry Mteril For review of some bsic concepts in mth, see Atkins "Mthemticl Bckground 1 (pp 59-6), nd "Mthemticl Bckground " (pp 109-111). 1. Derivtion
More informationArithmetic & Algebra. NCTM National Conference, 2017
NCTM Ntionl Conference, 2017 Arithmetic & Algebr Hether Dlls, UCLA Mthemtics & The Curtis Center Roger Howe, Yle Mthemtics & Texs A & M School of Eduction Relted Common Core Stndrds First instnce of vrible
More informationNumerical integration
2 Numericl integrtion This is pge i Printer: Opque this 2. Introduction Numericl integrtion is problem tht is prt of mny problems in the economics nd econometrics literture. The orgniztion of this chpter
More informationModule 6 Value Iteration. CS 886 Sequential Decision Making and Reinforcement Learning University of Waterloo
Module 6 Vlue Itertion CS 886 Sequentil Decision Mking nd Reinforcement Lerning University of Wterloo Mrkov Decision Process Definition Set of sttes: S Set of ctions (i.e., decisions): A Trnsition model:
More informationFrobenius numbers of generalized Fibonacci semigroups
Frobenius numbers of generlized Fiboncci semigroups Gretchen L. Mtthews 1 Deprtment of Mthemticl Sciences, Clemson University, Clemson, SC 29634-0975, USA gmtthe@clemson.edu Received:, Accepted:, Published:
More informationRiemann Sums and Riemann Integrals
Riemnn Sums nd Riemnn Integrls Jmes K. Peterson Deprtment of Biologicl Sciences nd Deprtment of Mthemticl Sciences Clemson University August 26, 203 Outline Riemnn Sums Riemnn Integrls Properties Abstrct
More informationRecitation 3: Applications of the Derivative. 1 Higher-Order Derivatives and their Applications
Mth 1c TA: Pdric Brtlett Recittion 3: Applictions of the Derivtive Week 3 Cltech 013 1 Higher-Order Derivtives nd their Applictions Another thing we could wnt to do with the derivtive, motivted by wht
More informationAcceptance Sampling by Attributes
Introduction Acceptnce Smpling by Attributes Acceptnce smpling is concerned with inspection nd decision mking regrding products. Three spects of smpling re importnt: o Involves rndom smpling of n entire
More informationDiscrete Mathematics and Probability Theory Spring 2013 Anant Sahai Lecture 17
EECS 70 Discrete Mthemtics nd Proility Theory Spring 2013 Annt Shi Lecture 17 I.I.D. Rndom Vriles Estimting the is of coin Question: We wnt to estimte the proportion p of Democrts in the US popultion,
More informationExtended nonlocal games from quantum-classical games
Extended nonlocl gmes from quntum-clssicl gmes Theory Seminr incent Russo niversity of Wterloo October 17, 2016 Outline Extended nonlocl gmes nd quntum-clssicl gmes Entngled vlues nd the dimension of entnglement
More informationTutorial 4. b a. h(f) = a b a ln 1. b a dx = ln(b a) nats = log(b a) bits. = ln λ + 1 nats. = log e λ bits. = ln 1 2 ln λ + 1. nats. = ln 2e. bits.
Tutoril 4 Exercises on Differentil Entropy. Evlute the differentil entropy h(x) f ln f for the following: () The uniform distribution, f(x) b. (b) The exponentil density, f(x) λe λx, x 0. (c) The Lplce
More informationN 0 completions on partial matrices
N 0 completions on prtil mtrices C. Jordán C. Mendes Arújo Jun R. Torregros Instituto de Mtemátic Multidisciplinr / Centro de Mtemátic Universidd Politécnic de Vlenci / Universidde do Minho Cmino de Ver
More information1.4 Nonregular Languages
74 1.4 Nonregulr Lnguges The number of forml lnguges over ny lphbet (= decision/recognition problems) is uncountble On the other hnd, the number of regulr expressions (= strings) is countble Hence, ll
More informationFig. 1. Open-Loop and Closed-Loop Systems with Plant Variations
ME 3600 Control ystems Chrcteristics of Open-Loop nd Closed-Loop ystems Importnt Control ystem Chrcteristics o ensitivity of system response to prmetric vritions cn be reduced o rnsient nd stedy-stte responses
More informationCoalgebra, Lecture 15: Equations for Deterministic Automata
Colger, Lecture 15: Equtions for Deterministic Automt Julin Slmnc (nd Jurrin Rot) Decemer 19, 2016 In this lecture, we will study the concept of equtions for deterministic utomt. The notes re self contined
More informationAn approximation to the arithmetic-geometric mean. G.J.O. Jameson, Math. Gazette 98 (2014), 85 95
An pproximtion to the rithmetic-geometric men G.J.O. Jmeson, Mth. Gzette 98 (4), 85 95 Given positive numbers > b, consider the itertion given by =, b = b nd n+ = ( n + b n ), b n+ = ( n b n ) /. At ech
More informationGeneration of Lyapunov Functions by Neural Networks
WCE 28, July 2-4, 28, London, U.K. Genertion of Lypunov Functions by Neurl Networks Nvid Noroozi, Pknoosh Krimghee, Ftemeh Sfei, nd Hmed Jvdi Abstrct Lypunov function is generlly obtined bsed on tril nd
More informationDIRECT CURRENT CIRCUITS
DRECT CURRENT CUTS ELECTRC POWER Consider the circuit shown in the Figure where bttery is connected to resistor R. A positive chrge dq will gin potentil energy s it moves from point to point b through
More informationCS 188: Artificial Intelligence
CS 188: Artificil Intelligence Lecture 19: Decision Digrms Pieter Abbeel --- C Berkeley Mny slides over this course dpted from Dn Klein, Sturt Russell, Andrew Moore Decision Networks ME: choose the ction
More informationLecture 3 ( ) (translated and slightly adapted from lecture notes by Martin Klazar)
Lecture 3 (5.3.2018) (trnslted nd slightly dpted from lecture notes by Mrtin Klzr) Riemnn integrl Now we define precisely the concept of the re, in prticulr, the re of figure U(, b, f) under the grph of
More informationMore on automata. Michael George. March 24 April 7, 2014
More on utomt Michel George Mrch 24 April 7, 2014 1 Automt constructions Now tht we hve forml model of mchine, it is useful to mke some generl constructions. 1.1 DFA Union / Product construction Suppose
More informationThe practical version
Roerto s Notes on Integrl Clculus Chpter 4: Definite integrls nd the FTC Section 7 The Fundmentl Theorem of Clculus: The prcticl version Wht you need to know lredy: The theoreticl version of the FTC. Wht
More information1 Structural induction, finite automata, regular expressions
Discrete Structures Prelim 2 smple uestions s CS2800 Questions selected for spring 2017 1 Structurl induction, finite utomt, regulr expressions 1. We define set S of functions from Z to Z inductively s
More information8 Laplace s Method and Local Limit Theorems
8 Lplce s Method nd Locl Limit Theorems 8. Fourier Anlysis in Higher DImensions Most of the theorems of Fourier nlysis tht we hve proved hve nturl generliztions to higher dimensions, nd these cn be proved
More information1 Probability Density Functions
Lis Yn CS 9 Continuous Distributions Lecture Notes #9 July 6, 28 Bsed on chpter by Chris Piech So fr, ll rndom vribles we hve seen hve been discrete. In ll the cses we hve seen in CS 9, this ment tht our
More informationI1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3
2 The Prllel Circuit Electric Circuits: Figure 2- elow show ttery nd multiple resistors rrnged in prllel. Ech resistor receives portion of the current from the ttery sed on its resistnce. The split is
More informationExam 2, Mathematics 4701, Section ETY6 6:05 pm 7:40 pm, March 31, 2016, IH-1105 Instructor: Attila Máté 1
Exm, Mthemtics 471, Section ETY6 6:5 pm 7:4 pm, Mrch 1, 16, IH-115 Instructor: Attil Máté 1 17 copies 1. ) Stte the usul sufficient condition for the fixed-point itertion to converge when solving the eqution
More informationdifferent methods (left endpoint, right endpoint, midpoint, trapezoid, Simpson s).
Mth 1A with Professor Stnkov Worksheet, Discussion #41; Wednesdy, 12/6/217 GSI nme: Roy Zho Problems 1. Write the integrl 3 dx s limit of Riemnn sums. Write it using 2 intervls using the 1 x different
More information