Temporally-Biased Sampling for Online Model Management


Brian Hentschel (Harvard University), Peter J. Haas* (University of Massachusetts), Yuanyuan Tian (IBM Research Almaden)

ABSTRACT

To maintain the accuracy of supervised learning models in the presence of evolving data streams, we provide temporally-biased sampling schemes that weight recent data most heavily, with inclusion probabilities for a given data item decaying exponentially over time. We then periodically retrain the models on the current sample. This approach speeds up the training process relative to training on all of the data. Moreover, time-biasing lets the models adapt to recent changes in the data while, unlike in a sliding-window approach, still keeping some old data to ensure robustness in the face of temporary fluctuations and periodicities in the data values. In addition, the sampling-based approach allows existing analytic algorithms for static data to be applied to dynamic streaming data essentially without change. We provide and analyze both a simple sampling scheme (T-TBS) that probabilistically maintains a target sample size and a novel reservoir-based scheme (R-TBS) that is the first to provide both complete control over the decay rate and a guaranteed upper bound on the sample size, while maximizing both expected sample size and sample-size stability. The latter scheme rests on the notion of a fractional sample and, unlike T-TBS, allows for data arrival rates that are unknown and time varying. R-TBS and T-TBS are of independent interest, extending the known set of unequal-probability sampling schemes. We discuss distributed implementation strategies; experiments in Spark illuminate the performance and scalability of the algorithms, and show that our approach can increase machine learning robustness in the face of evolving data.

1 INTRODUCTION

A key challenge for machine learning (ML) is to keep ML models from becoming stale in the presence of evolving data. In the context of the emerging Internet of Things (IoT), for example, the data comprises dynamically changing sensor streams [26], and a failure to adapt to changing data can lead to a loss of predictive power.
One way to deal with this problem is to re-engineer existing static supervised learning algorithms to become adaptive. Some parametric algorithms such as SVM can indeed be re-engineered so that the parameters are time-varying, but for non-parametric algorithms such as kNN-based classification, it is not at all clear how re-engineering can be accomplished. We therefore consider alternative approaches in which we periodically retrain ML models, allowing static ML algorithms to be used in dynamic settings essentially as-is. There are several possible retraining approaches.

* Work performed at IBM Research Almaden. © 2018 Copyright held by the owner/author(s). Published in Proceedings of the 21st International Conference on Extending Database Technology (EDBT), March 26-29, 2018, ISBN on OpenProceedings.org. Distribution of this paper is permitted under the terms of the Creative Commons license CC-by-nc-nd 4.0.

Retraining on cumulative data: Periodically retraining a model on all of the data that has arrived so far is clearly infeasible because of the huge volume of data involved. Moreover, recent data is swamped by the massive amount of past data, so the retrained model is not sufficiently adaptive.

Sliding windows: A simple sliding-window approach would be to, e.g., periodically retrain on the data from the last two hours. If the data arrival rate is high and there is no bound on memory, then one must deal with long retraining times caused by large amounts of data in the window. The simplest way to bound the window size is to retain the last n items. Alternatively, one could try to subsample within the time-based window [14]. The fundamental problem with all of these bounding approaches is that old data is completely forgotten; the problem is especially severe when the data arrival rate is high. This can undermine the robustness of an ML model in situations where old patterns can reassert themselves. For example, a singular event such as a holiday, stock market drop, or terrorist attack can temporarily disrupt normal data patterns, which will reestablish themselves once the effect of the event dies down.
Periodic data patterns can lead to the same phenomenon. Another example, from [27], concerns influencers on Twitter: a prolific tweeter might temporarily stop tweeting due to travel, illness, or some other reason, and hence be completely forgotten in a sliding-window approach. Indeed, in real-world Twitter data, almost a quarter of top influencers were of this type, and were missed by a sliding-window approach.

Temporally biased sampling: An appealing alternative is a temporally biased sampling-based approach, i.e., maintaining a sample that heavily emphasizes recent data but also contains a small amount of older data, and periodically retraining a model on the sample. By using a time-biased sample, the retraining costs can be held to an acceptable level while not sacrificing robustness in the presence of recurrent patterns. This approach was proposed in [27] in the setting of graph analysis algorithms, and has recently been adopted in the MacroBase system [3]. The orthogonal problem of choosing when to retrain a model is also an important question, and is related to, e.g., the literature on concept drift [13]; in this paper we focus on the problem of how to efficiently maintain a time-biased sample.

In more detail, our time-biased sampling algorithms ensure that the appearance probability for a given data item, i.e., the probability that the item appears in the current sample, decays over time at a controlled exponential rate. Specifically, we assume that items arrive in batches (see the next section for more details), and our goal is to ensure that (i) our sample is representative in that all items in a given batch are equally likely to be in the sample, and (ii) if items i and j belong to batches that have arrived at (wall clock) times t′ and t″ with t′ ≤ t″, then for any time t ≥ t″ our sample S_t is such that

Pr[i ∈ S_t] / Pr[j ∈ S_t] = e^{−λ(t″ − t′)}.   (1)

Thus items with a given timestamp are sampled uniformly, and items with different timestamps are handled in a carefully controlled manner. The criterion in (1) is natural and appealing in applications and, importantly, is interpretable and understandable to users.
As discussed in [27], the value of the decay rate λ can be chosen to meet application-specific criteria. For example, by setting λ = 0.058, around 10% of the data items from 40 batches ago are included in the current analysis. As another example, suppose that, k = 150 batches ago, an entity such as a person or city was represented by n0 = 1000 data items and we want to ensure that, with probability q = 0.01, at least one of these data items remains in the current sample. Then we would set λ = −k^{−1} ln(1 − (1 − q)^{1/n0}) ≈ 0.077. If training data is available, λ can also be chosen to maximize accuracy via cross validation.

The exponential form of the decay function has been adopted by the majority of time-biased-sampling applications in practice because otherwise one would typically need to track the arrival time of every data item, both in and outside of the sample, and decay each item individually at an update, which would make the sampling operation intolerably slow. (A "forward decay" approach that avoids this difficulty, but has its own costs, has been proposed in [9]; we plan to investigate forward decay in future work.) Exponential decay functions make update operations fast and simple.

For the case in which the item-arrival rate is high, the main issue is to keep the sample size from becoming too large. On the other hand, when the incoming batches become very small or widely spaced, the sample sizes for all of the time-biased algorithms that we discuss (as well as for sliding-window schemes based on wall-clock time) can become small. This is a natural consequence of treating recent items as more important, and is characteristic of any sampling scheme that satisfies (1). We emphasize that, as shown in our experiments, a smaller but carefully time-biased sample typically yields greater prediction accuracy than a sample that is larger due to overloading with too much recent data or too much old data. I.e., more sample data is not always better. Indeed, with respect to model management, this decay property can be viewed as a feature: if the data stream dries up and the sample decays to a very small size, then this is a signal that there is not enough new data to reliably retrain the model, and that the current version should be kept for now.

It is surprisingly hard to both enforce (1) and to bound the sample size.
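Returning to the decay-rate examples above: both calculations are easy to reproduce. Here is a quick numeric check in Python (the function names are ours, chosen for illustration):

```python
import math

# Criterion 1: choose lambda so that a fraction f of the items from
# k batches ago are still included, i.e., e^{-lambda * k} = f.
def decay_rate_for_fraction(f, k):
    return -math.log(f) / k

# Criterion 2: choose lambda so that, with probability q, at least one
# of n0 items from k batches ago survives in the sample, i.e.,
# 1 - (1 - e^{-lambda * k})^{n0} = q.
def decay_rate_for_survival(q, n0, k):
    return -math.log(1.0 - (1.0 - q) ** (1.0 / n0)) / k

print(round(decay_rate_for_fraction(0.10, 40), 3))        # ~0.058
print(round(decay_rate_for_survival(0.01, 1000, 150), 3))  # ~0.077
```

Both values match the examples in the text, confirming that the two criteria pin down λ uniquely once the application-level targets are fixed.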
As discussed in detail in Section 7, prior algorithms that bound the sample size either cannot consistently enforce (1) or cannot handle wall-clock time. Examples of the former include algorithms based on the A-Res scheme of Efraimidis and Spirakis [12], and Chao's algorithm [5]. A-Res enforces conditions on the acceptance probabilities of items; this leads to appearance probabilities which, unlike (1), are both hard to compute and not intuitive. A similar example is provided by Chao's algorithm [5]. In Appendix D of [16] we demonstrate how the algorithm can be specialized to the case of exponential decay and modified to handle batch arrivals. We then show that the resulting algorithm fails to enforce (1) either when initially filling up an empty sample or in the presence of data that arrives slowly relative to the decay rate, and hence fails if the data rate fluctuates too much. The second type of algorithm, due to Aggarwal [1], can only control appearance probabilities based on the indices of the data items. For example, after n items arrive, one could require that, with 95% probability, the (n − k)th item should still be in the sample for some specified k < n. If the data arrival rate is constant, then this might correspond to a constraint of the form "with 95% probability, a data item that arrived 10 hours ago is still in the sample", which is often more natural in applications. For varying arrival rates, however, it is impossible to enforce the latter type of constraint, and a large batch of arriving data can prematurely flush out older data. Thus our new sampling schemes are interesting in their own right, significantly expanding the set of unequal-probability sampling techniques.

T-TBS: We first provide and analyze Targeted-Size Time-Biased Sampling (T-TBS), a simple algorithm that generalizes the sampling scheme in [27]. T-TBS allows complete control over the decay rate (expressed in wall-clock time) and probabilistically maintains a target sample size.
That is, the expected and average sample sizes converge to the target, and the probability of large deviations from the target decreases exponentially or faster in both the target size and the deviation size. T-TBS is simple and highly scalable when applicable, but only works under the strong restriction that the mean data arrival rate is known and constant. There are scenarios where T-TBS might be a good choice (see Section 3), but many applications have non-constant, unknown mean arrival rates or cannot tolerate sample overflows.

R-TBS: We then provide a novel algorithm, Reservoir-Based Time-Biased Sampling (R-TBS), that is the first to simultaneously enforce (1) at all times, provide a guaranteed upper bound on the sample size, and allow unknown, varying data arrival rates. Guaranteed bounds are desirable because they avoid memory management issues associated with sample overflows, especially when large numbers of samples are being maintained, so that the probability of some sample overflowing is high, or when sampling is being performed in a limited-memory setting such as at the edge of the IoT. Also, bounded samples reduce variability in retraining times and do not impose upper limits on the incoming data flow. The idea behind R-TBS is to adapt the classic reservoir sampling algorithm, which bounds the sample size but does not allow time biasing. Our approach rests on the notion of a fractional sample whose nonnegative size is real-valued in an appropriate sense. We show that, over all sampling algorithms having exponential decay, R-TBS maximizes the expected sample size whenever the data arrival rate is low and also minimizes the sample-size variability.

Distributed implementation: Both T-TBS and R-TBS can be parallelized. Whereas T-TBS is relatively straightforward to implement, an efficient distributed implementation of R-TBS is nontrivial. We exploit various implementation strategies to reduce I/O relative to other approaches, avoid unnecessary concurrency control, and make decentralized decisions about which items to insert into, or delete from, the reservoir.

Organization: The rest of the paper is organized as follows.
In Section 2 we formally describe our batch-arrival problem setting and discuss two prior simple sampling schemes: a simple Bernoulli scheme as in [27] and the classical reservoir sampling scheme, modified for batch arrivals. These methods either bound the sample size but do not control the decay rate, or control the decay rate but not the sample size. We next present and analyze the T-TBS and R-TBS algorithms in Section 3 and Section 4. We describe the distributed implementation in Section 5, and Section 6 contains experimental results. We review the related literature in Section 7 and conclude in Section 8.

2 SETTING AND PRIOR SCHEMES

After introducing our problem setting, we discuss two prior sampling schemes that provide context for our current work: simple Bernoulli time-biased sampling (B-TBS) with no sample-size control, and the classical reservoir sampling algorithm (with no time biasing), modified for batch arrivals (B-RS).

Setting: Items arrive in batches B_1, B_2, ..., at time points t = 1, 2, ..., where each batch contains 0 or more items. This simple integer batch sequence often arises from the discretization of time [24, 28]. Specifically, the continuous time domain is partitioned into intervals of length Δ, and the items are observed only at times {kΔ : k = 0, 1, 2, ...}. All items that arrive in an interval [kΔ, (k + 1)Δ) are treated as if they arrived at time kΔ, i.e., at the start of the interval, so that all items in batch B_i have time stamp iΔ, or simply time stamp i if time is measured in units of length Δ. As discussed below, our results can straightforwardly be extended to arbitrary real-valued batch-arrival times.

Our goal is to generate a sequence {S_t}_{t ≥ 0}, where S_t is a sample of the items that have arrived at or prior to time t, i.e., a sample of the items in U_t = S_0 ∪ (∪_{i=1}^{t} B_i). Here we allow the initial sample S_0 to start out nonempty. These samples should be biased towards recent items so as to enforce (1) for i ∈ B_{t′} and j ∈ B_{t″} while keeping the sample size as close as possible to (and preferably never exceeding) a specified target n.

Our assumption that batches arrive at integer time points can easily be dropped. In all of our algorithms, inclusion probabilities and, as discussed later, closely related item weights are updated at a batch arrival time t with respect to their values at the previous time t′ = t − 1 via multiplication by e^{−λ}. To extend our algorithms to handle arbitrary successive batch arrival times t′ and t, we simply multiply instead by e^{−λ(t−t′)}. Thus our results can be applied to arbitrary sequences of real-valued batch arrival times, and hence to an arbitrary sequence of item arrivals (since batches can comprise single items).

Bernoulli Time-Biased Sampling (B-TBS): In the simplest sampling scheme, at each time t, we accept each incoming item x ∈ B_t into the sample with probability 1. At each subsequent time t″ > t, we flip a coin independently for each item currently in the sample: an item is retained in the sample with probability p = e^{−λ} and removed with probability 1 − p. It is straightforward to adapt the algorithm to batch arrivals; see Appendix A of [16], where we show that Pr[x ∈ S_t] = e^{−λ(t−t′)} for x ∈ B_{t′}, implying (1).
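As a sanity check on this appearance probability, here is a small simulation of B-TBS (a sketch under our own naming; the per-item coin flips are exactly as described above):

```python
import math
import random

def b_tbs(batches, lam):
    """Bernoulli time-biased sampling: accept every arriving item, then
    retain each sample item with probability e^{-lam} at each later step."""
    p = math.exp(-lam)
    sample = set()
    for batch in batches:
        sample = {x for x in sample if random.random() < p}  # decay step
        sample |= set(batch)          # accept all new items w.p. 1
        yield set(sample)

# Empirically, an item arriving at time t' is present at time t with
# probability about e^{-lam * (t - t')}.
lam, runs, hits = 0.2, 5000, 0
batches = [[("item", t)] for t in range(1, 11)]   # one item per batch
for _ in range(runs):
    final = list(b_tbs(batches, lam))[-1]         # sample at t = 10
    hits += ("item", 5) in final                  # arrived at t' = 5
print(hits / runs, math.exp(-lam * 5))            # both near 0.37
```

The item arriving at t′ = 5 survives five independent decay steps, so its empirical inclusion frequency tracks e^{−λ·5}, and the ratio property (1) follows for any pair of arrival times.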
This is essentially the algorithm used, e.g., in [27] to implement time-biased edge sampling in dynamic graphs. The user, however, cannot independently control the expected sample size, which is completely determined by λ and the sizes of the incoming batches. In particular, if the batch sizes systematically grow over time, then the sample size will grow without bound. Arguments in [27] show that if sup_t |B_t| < ∞, then the sample size can be bounded, but only probabilistically. See Remark 1 below for extensions and refinements of these results.

Batched Reservoir Sampling (B-RS): The classic reservoir sampling algorithm can be modified to handle batch arrivals; see Appendix B of [16]. Although B-RS guarantees an upper bound on the sample size, it does not support time biasing. The R-TBS algorithm (Section 4) maintains a bounded reservoir as in B-RS while simultaneously allowing time-biased sampling.

3 TARGETED-SIZE TBS

As a first step towards time-biased sampling with a controlled sample size, we describe the simple T-TBS scheme, which improves upon the simple Bernoulli sampling scheme B-TBS by ensuring the inclusion property in (1) while providing probabilistic guarantees on the sample size. We require that the mean batch size equals a constant b that is both known in advance and large enough in that b ≥ n(1 − e^{−λ}), where n is the target sample size and λ is the decay rate as before. The requirement on b ensures that, at the target sample size, items arrive on average at least as fast as they decay.

Algorithm 1: Targeted-size TBS (T-TBS)
1   λ: decay factor (≥ 0)
2   n: target sample size
3   b: assumed mean batch size such that b ≥ n(1 − e^{−λ})
4   Initialize: S ← S_0; p ← e^{−λ}; q ← n(1 − e^{−λ})/b
5   for t ← 1, 2, ... do
6       m ← Binomial(|S|, p)        // simulate |S| trials
7       S ← Sample(S, m)            // retain m random elements
8       k ← Binomial(|B_t|, q)
9       B_t ← Sample(B_t, k)        // down-sample new batch
10      S ← S ∪ B_t
11      output S

The pseudocode is given as Algorithm 1. T-TBS is similar to B-TBS in that we downsample by performing a coin flip for each item with retention probability p.
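For concreteness, here is a minimal Python sketch of Algorithm 1 (the naive binomial helper stands in for the fast generators of [17]; all names are our own):

```python
import math
import random

def binomial(n, p):
    # Naive stand-in for a proper binomial sampler: the number of
    # successes in n independent trials with success probability p.
    return sum(random.random() < p for _ in range(n))

def t_tbs(batches, lam, n, b):
    """Sketch of Algorithm 1 (T-TBS); b is the assumed mean batch size,
    which must satisfy b >= n * (1 - e^{-lam})."""
    p = math.exp(-lam)
    q = n * (1.0 - p) / b            # down-sampling rate for new batches
    sample = []
    for batch in batches:
        m = binomial(len(sample), p)
        sample = random.sample(sample, m)        # decay old items
        k = binomial(len(batch), q)
        sample += random.sample(list(batch), k)  # admit k new items
        yield len(sample)

# With constant batches of size b = 100, the sample size drifts to the
# target n = 1000 and then fluctuates around it.
sizes = list(t_tbs([range(100)] * 300, lam=0.1, n=1000, b=100))
print(sum(sizes[100:]) / 200)        # close to 1000
```

The equilibrium behavior, where the expected number of decayed items per step matches the expected number of admitted items, is analyzed precisely below.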
Unlike B-TBS, we downsample the incoming batches at rate q = n(1 − e^{−λ})/b, which ensures that n becomes the equilibrium sample size. Specifically, when the sample size equals n, the expected number n(1 − e^{−λ}) of current items deleted at an update equals the expected number qb of inserted new items, which causes the sample size to drift towards n. Arguing similarly to Appendix A of [16], we have for t ≥ t′ ≥ 1 and x ∈ B_{t′} that Pr[x ∈ S_t] = q e^{−λ(t−t′)}, so that the key relative appearance property in (1) holds.

For efficiency, the algorithm exploits the fact that, for k independent trials, each having success probability r, the total number of successes has a binomial distribution with parameters k and r. Thus, in lines 6 and 8, the algorithm simulates the coin tosses by directly generating the number of successes m or k, which can be done using standard algorithms [17], and then retaining m or k randomly chosen items. So the function Binomial(j, r) returns a random sample from the binomial distribution with j independent trials and success probability r per trial, and the function Sample(A, m) returns a uniform random sample, without replacement, containing min(m, |A|) elements of the set A; note that the function call Sample(A, 0) returns an empty sample for any empty or nonempty A.

Theorem 3.1 below precisely describes the behavior of the sample size; the proof, along with the proofs of most other results in the paper, is given in Appendix C of [16]. Denote by B_t = |B_t| the (possibly random) size of B_t for t ≥ 1 and by C_t = |S_t| the sample size at time t for t ≥ 0; assume that C_0 is a finite deterministic constant. Define the upper-support ratio for a random batch size B as r = b*/b ≥ 1, where b = E[B] and b* is the smallest positive number such that P[B ≤ b*] = 1; set r = ∞ if B can be arbitrarily large. For r ∈ [1, ∞), set

ν⁺_{ϵ,r} = (1 + ϵ) ln((1 + ϵ)/r) − (1 + ϵ − r) for ϵ > 0, and
ν⁻_{ϵ,r} = (1 − ϵ) ln((1 − ϵ)/r) − (1 − ϵ − r) for ϵ ∈ (0, 1).

Note that ν⁺_{ϵ,r} > 0 and is strictly increasing in ϵ for ϵ > r − 1, and that ν⁻_{ϵ,r} increases from r − 1 − ln r to r as ϵ increases from 0 to 1. Write "i.o."
to denote that an event occurs infinitely often, i.e., for infinitely many values of t, and write "w.p.1" for "with probability 1".

Theorem 3.1. Suppose that the batch sizes {B_t}_{t ≥ 1} are i.i.d. with common mean b ≥ n(1 − e^{−λ}), finite variance, and upper-support ratio r. Then, for any p = e^{−λ} < 1,
(i) for all m ≥ 0, we have Pr[C_t = m i.o.] = 1;
(ii) E[C_t] = n + p^t(C_0 − n) for t > 0;

[Figure 1: Targeted TBS: sample size behavior; λ = decay rate and ϕ = batch size multiplier. Panels: (a) growing batch size, (b) stable batch size (deterministic), (c) stable batch size (random), (d) decaying batch size.]

(iii) lim_{t→∞} (1/t) Σ_{i=0}^{t} C_i = n w.p.1;
(iv) if C_0 = n and r < ∞, then
  (a) Pr[C_t ≥ (1 + ϵ)n] ≤ e^{−n ν⁺_{ϵ,r}} (1 + O(nϵp^t)) and
  (b) Pr[C_t ≤ (1 − ϵ)n] ≤ e^{−n ν⁻_{ϵ,r}} (1 + O(n(1 − ϵ)p^t))
for (a) ϵ, t > 0 and (b) ϵ ∈ (0, 1) and t ≥ ln ϵ / ln p.

In Appendix C of [16], we actually prove a stronger version of the theorem in which the assumption in (iv) that r < ∞ is dropped. Thus, from (ii), lim_{t→∞} E[C_t] = n, so that the expected sample size converges to the target size n as t becomes large; indeed, if C_0 = n then the expected sample size equals n for all t > 0. By (iii), an even stronger property holds in that, w.p.1, the average sample size, averaged over the first t batch-arrival times, converges to n as t becomes large. For typical batch-size distributions, the assertions in (iv) imply that, at any given time t, the probability that the sample size deviates from n by more than 100ϵ% decreases exponentially with n and, in the case of a positive deviation as in (iv)(a), super-exponentially in ϵ. However, the assertion in (i) implies that any sample size m, no matter how large, will be exceeded infinitely often w.p.1; indeed, it follows from the proof that the mean times between successive exceedances are not only finite, but are uniformly bounded over time. In summary, the sample size is generally stable and close to n on average, but is subject to infrequent, but unboundedly large, spikes in the sample size, so that sample-size control is incomplete.

Indeed, when batch sizes fluctuate in a non-predictable way, as often happens in practice, T-TBS can break down; see Figure 1, in which we plot sample sizes for T-TBS and, for comparison, R-TBS. The problem is that the value of the mean batch size b must be specified in advance, so that the algorithm cannot handle dynamic changes in b without losing control of either the decay rate or the sample size.
In Figure 1(a), for example, the (deterministic) batch size is initially fixed and the algorithm is tuned to a target sample size of 1,000, with a decay rate of λ = 0.5. At t = 200, the batch size starts to increase (with B_{t+1} = ϕB_t, where ϕ = 1.2), leading to an overflowing sample, whereas R-TBS maintains a constant sample size. Even in a stable batch-size regime with constant batch sizes (or, more generally, small variations in batch size), R-TBS can maintain a constant sample size, whereas the sample size under T-TBS fluctuates in accordance with Theorem 3.1; see Figure 1(b) for the case of a constant batch size B_t ≡ 100 with λ = 0.1. Large variations in the batch size lead to large fluctuations in the sample size for T-TBS; in this case the sample size for R-TBS is bounded above by design, but large drops in the batch size can cause drops in the sample size for both algorithms; see Figure 1(c) for the case of λ = 0.1 and i.i.d. batch sizes uniformly distributed on [0, 200], so that E[B_t] = 100. Similarly, as shown in Figure 1(d), systematically decreasing batch sizes will cause the sample size to shrink for both T-TBS and R-TBS. Here, λ = 0.1 and, as with Figure 1(a), the batch size is initially fixed and then starts to change at time t = 200, with ϕ = 0.8 in this case. This experiment, and others not reported here with varying values of λ and ϕ, indicate that R-TBS is more robust to sample underflows than T-TBS.

Overall, however, T-TBS is of interest because, when the mean batch size is known and constant over time, and when some sample overflows are tolerable, T-TBS is simple to implement and parallelize, and is very fast (see Section 6). For example, if the data comes from periodic polling of a set of robust sensors, the data arrival rate will be known a priori and will be relatively constant, except for the occasional sensor failure, and hence T-TBS might be appropriate. On the other hand, if data is coming from, e.g., a social network, then batch sizes may be hard to predict.

Remark 1. When q = 1, Theorem 3.1 provides a description of sample-size behavior for B-TBS.
Under the conditions of the theorem, the expected sample size converges to n = b/(1 − e^{−λ}), which illustrates that the sample size and decay rate cannot be controlled independently. The actual sample size fluctuates around this value, with large deviations above or below being exponentially or super-exponentially rare. Thus Theorem 3.1 both complements and refines the analysis in [27].

4 RESERVOIR-BASED TBS

Targeted time-biased sampling (T-TBS) controls the decay rate but only partially controls the sample size, whereas batched reservoir sampling (B-RS) bounds the sample size but does not allow time biasing. Our new reservoir-based time-biased sampling algorithm (R-TBS) combines the best features of both, controlling the decay rate while ensuring that the sample never overflows and has optimal sample-size and stability properties. Importantly, unlike T-TBS, the R-TBS algorithm can handle any sequence of batch sizes.

4.1 The R-TBS Algorithm

To maintain a bounded sample, R-TBS combines the use of a reservoir with the notion of item weights. In R-TBS, the weight of an item initially equals 1 but then decays at rate λ, i.e., the weight of an item i ∈ B_{t′} at time t ≥ t′ is w_t(i) = e^{−λ(t−t′)}. All items arriving at the same time have the same weight, so that the total weight of all items seen up through time t is W_t = Σ_{j=1}^{t} B_j e^{−λ(t−j)}, where, as before, B_j = |B_j| is the size of the jth batch.

R-TBS generates a sequence of latent fractional samples {L_t}_{t ≥ 0} such that (i) the size of each L_t equals the sample weight C_t, defined as C_t = min(n, W_t), and (ii) L_t contains ⌊C_t⌋ full items and at most one partial item. For example, a latent sample of size C_t = 3.6 contains three full items that belong to the actual sample S_t with probability 1 and one partial item that belongs to S_t with probability 0.6. Thus S_t is obtained by including each full item and then including the partial item according to its associated probability, so that C_t represents the expected size of S_t. E.g., in our example, the sample S_t will contain either three or four items with respective probabilities 0.4 and 0.6, so that the expected sample size is 3.6; see Figure 2. Note that if C_t = k for some k ∈ {0, 1, ..., n}, then with probability 1 the sample contains precisely k items, and C_t is the actual size of S_t, rather than just the expected size. Since each C_t by definition never exceeds n, no sample S_t ever contains more than n items.

[Figure 2: Latent sample L_t (sample weight C_t = 3.6) and possible realized samples.]

Algorithm 2: Reservoir-based TBS (R-TBS)
1   λ: decay factor (≥ 0)
2   n: maximum sample size
3   Initialize: A ← A_0; W ← C ← |A_0|; π ← ∅   // |A_0| ≤ n
4   for t ← 1, 2, ... do
5       if W < n then                       // sample has been unsaturated
6           W ← e^{−λ}W                     // decay current items
7           if W > 0 then
8               (A, π, C) ← Dsample((A, π, C), W)
9           A ← A ∪ B_t                     // accept all items in B_t
10          W ← W + |B_t|                   // update total weight
11          if W > n then                   // sample is now saturated
                // adjust for overshoot
12              (A, π, C) ← Dsample((A, π, W), n)
13      else                                // sample has been saturated
14          W ← e^{−λ}W + |B_t|             // new total weight
15          if W ≥ n then                   // still saturated
16              m ← StochRound(|B_t| · n/W)
                // replace m A-items with m B_t-items
17              A ← (A \ Sample(A, m)) ∪ Sample(B_t, m)
18          else                            // now unsaturated
                // adjust for undershoot
19              (A, π, C) ← Dsample((A, π, n), W − |B_t|)
20              A ← A ∪ B_t                 // all batch items are full
21      S ← getSample(A, π, C)
22      output S

More precisely, given a set U of items, a latent sample of U with sample weight C is a triple L = (A, π, C), where A ⊆ U is a set of ⌊C⌋ full items and π ⊆ U is a (possibly empty) set containing at most one partial item. At each time t, we randomly generate S_t from L_t = (A_t, π_t, C_t) by sampling such that

S_t = A_t ∪ π_t with probability frac(C_t), and S_t = A_t with probability 1 − frac(C_t),   (2)

where frac(x) = x − ⌊x⌋. That is, each full item is included with probability 1 and the partial item is included with probability frac(C_t).
Thus

E[|S_t|] = ⌈C_t⌉ frac(C_t) + ⌊C_t⌋ (1 − frac(C_t)) = (⌈C_t⌉ − ⌊C_t⌋) frac(C_t) + ⌊C_t⌋ = frac(C_t) + ⌊C_t⌋ = C_t,   (3)

as previously asserted. By allowing at most one partial item, we minimize the latent sample's footprint: |A_t| + |π_t| ≤ ⌊C_t⌋ + 1.

The key goal of R-TBS is to maintain the invariant

Pr[i ∈ S_t] = (C_t/W_t) w_t(i)   (4)

for each t and each item i ∈ U_t, where, as before, U_t denotes the set of all items that arrive up through time t, so that the appearance probability for an item i at time t is proportional to its weight w_t(i). This immediately implies the desired relative-inclusion property (1). Since w_t(i) = 1 for an arriving item i ∈ B_t, the equality in (4) implies that the initial acceptance probability for this item is

Pr[i ∈ S_t] = C_t/W_t.   (5)

The pseudocode for R-TBS is given as Algorithm 2. Suppose the sample is unsaturated at time t − 1 in that W_{t−1} < n and hence C_{t−1} = W_{t−1} (line 5). The decay process first reduces the total weight (and hence the sample weight) to W′_{t−1} = C′_{t−1} = e^{−λ}W_{t−1} (line 6). R-TBS then downsamples L_{t−1} (line 8) to reflect this decay and maintain a minimal sample footprint; the downsampling method, described in Section 4.2, is designed to maintain the invariant in (4). If the weight of the arriving batch does not cause the sample to overflow, i.e., C′_{t−1} + |B_t| ≤ n, then C_t = C′_{t−1} + |B_t| = W′_{t−1} + |B_t| = W_t. The relation in (5) then implies that all newly arrived items are accepted into the sample with probability 1 (line 9); see Figure 3(a) for an example of this scenario.

The situation is more complicated if the weight of the arriving batch would cause the sample to overflow. It turns out that the simplest way to deal with this scenario is to initially accept all incoming items as in line 9, and then run an additional round of downsampling to reduce the sample weight to n (line 12), so that the sample is now saturated; see Figure 3(b). Note that these two steps can be executed without ever causing the sample footprint to exceed n.

Now suppose that the sample is saturated at time t − 1, so that W_{t−1} ≥ n and hence C_{t−1} = |S_{t−1}| = n.
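To make the latent-sample machinery concrete, here is a minimal Python sketch (the names are ours, for illustration) of drawing S_t from a latent sample per (2)-(3), together with the stochastic rounding used in line 16 of Algorithm 2:

```python
import math
import random

def stoch_round(x):
    # StochRound: return ceil(x) with probability frac(x), else floor(x),
    # so that the expected value is exactly x.
    return math.floor(x) + (random.random() < x - math.floor(x))

def realize_sample(full_items, partial_item, c):
    # Draw an actual sample S from a latent sample (A, pi, C) per (2):
    # every full item is included; the partial item is included
    # with probability frac(C).
    s = list(full_items)
    if partial_item is not None and random.random() < c - math.floor(c):
        s.append(partial_item)
    return s

# For C_t = 3.6: the realized sample has 3 or 4 items, expected size 3.6.
sizes = [len(realize_sample(["a", "b", "c"], "d", 3.6)) for _ in range(10000)]
print(sum(sizes) / len(sizes))   # close to 3.6
```

Both helpers share the same rounding idea: the realized quantity is integral, but its expectation equals the real-valued target, which is what makes C_t the expected sample size in (3) and E[M] = m in the saturated case below.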
The new total weight is W_t = e^{−λ}W_{t−1} + |B_t| as before (line 14). If W_t ≥ n, then the weight of the arriving batch exceeds the weight loss due to decay, and the sample remains saturated. Then (5) implies that each item in B_t is accepted into the sample with probability p = n/W_t. Letting I_j = 1 if item j ∈ B_t is accepted and I_j = 0 otherwise, we see that the expected number of accepted items is

m = E[Σ_{j∈B_t} I_j] = Σ_{j∈B_t} E[I_j] = Σ_{j∈B_t} Pr[I_j = 1] = |B_t| n/W_t.

There are a number of possible ways to carry out this acceptance operation, e.g., via independent coin flips. To minimize the variability of the sample size (and hence the likelihood of severely small samples), R-TBS uses stochastic rounding in line 16 and accepts a random number of items M such that M = ⌈m⌉ with probability frac(m) and M = ⌊m⌋ with probability 1 − frac(m), so that E[M] = m by an argument essentially the same as in (3). To maintain the bound on the sample size, the M accepted items replace M randomly selected "victims" in the current sample (line 17).

If W_t < n, then the sample weight decays to e^{−λ}W_{t−1} and the weight of the arriving batch is not enough to fill the sample back up. Moreover, (5) implies that all arriving items are accepted with probability 1. Thus we downsample to the decayed weight e^{−λ}W_{t−1} = W_t − |B_t| in line 19 and then insert the arriving items in line 20.

[Figure 3: R-TBS scenarios for n = 4 and e^{−λ} = 0.5: (a) unsaturated to unsaturated, (b) unsaturated to saturated, (c) saturated to unsaturated, (d) saturated to saturated. For simplicity, we take W_{t−1} = C_{t−1}. DS denotes downsampling.]

4.2 Downsampling

Before describing Algorithm 3, the downsampling algorithm, we intuitively motivate a key property that any such procedure must have. For any item i ∈ L, the relation in (4) implies that we must have Pr[i ∈ S] = (C/W)w_i and Pr[i ∈ S′] = (C′/W′)w′_i, where W and w_i represent the total and item weight before decay and downsampling, and W′ and w′_i represent the weights afterwards. Since decay affects all items equally, we have w_i/W = w′_i/W′, and it follows that

Pr[i ∈ S′] = (C′/C) Pr[i ∈ S].   (6)

That is, the inclusion probabilities for all items must be scaled down by the same fraction, namely C′/C. Theorem 4.1 (later in this section) asserts that Algorithm 3 satisfies this property.

Algorithm 3: Downsampling
1   L = (A, π, C): input latent sample
2   C′: input target weight with 0 < C′ < C
3   L′ = (A′, π′, C′): output latent sample
4   U ← Uniform()
5   if ⌊C′⌋ = 0 then                  // no full items retained
6       if U > frac(C)/C then
7           (A, π) ← Swap1(A, π)
8       A′ ← ∅
9   else if 0 < ⌊C′⌋ = ⌊C⌋ then       // no items deleted
10      if U > (1 − (C′/C) frac(C)) / (1 − frac(C′)) then
11          (A′, π′) ← Swap1(A, π)
12  else                              // items deleted: 0 < ⌊C′⌋ < ⌊C⌋
13      if U ≤ (C′/C) frac(C) then
14          A′ ← Sample(A, ⌊C′⌋)
15          (A′, π′) ← Swap1(A′, π)
16      else
17          A′ ← Sample(A, ⌊C′⌋ + 1)
18          (A′, π′) ← Move1(A′, π)
19  if ⌊C′⌋ = C′ then                 // no fractional item
20      π′ ← ∅

In the pseudocode for Algorithm 3, the function Uniform() generates a random number uniformly distributed on [0, 1]. The subroutine Swap1(A, π) moves a randomly selected item from A to π and moves the current item in π (if any) to A. Similarly, Move1(A, π) moves a randomly selected item from A to π, replacing the current item in π (if any). More precisely, Swap1(A, π) executes the operations I ← Sample(A, 1), A ← (A \ I) ∪ π, and π ← I, and Move1(A, π) executes the operations I ← Sample(A, 1), A ← A \ I, and π ← I.

To gain some intuition for why the algorithm works, consider a simple special case, where the goal is to form a fractional sample L′ = (A′, π′, C′) from a fractional sample L = (A, π, C) of integral size C > C′; that is, L comprises exactly C full items. Assume that C′ is non-integral, so that L′ contains a partial item.
In this case, we simply select an item at random (from A) to be the partial item in L′ and then select ⌊C′⌋ of the remaining C − 1 items at random to be the full items in L′; see Figure 4(a). By symmetry, each item i ∈ L is equally likely to be included in S′, so that the inclusion probabilities for the items in L are all scaled down by the same fraction, as required by (6). For example, taking t = 0 in Figure 4(a), item a appears in S_t with probability 1 since it is a full item. In S_t′, where the weights have been reduced by 50%, item a (either as a full or partial item, depending on the random outcome) appears with probability 2·(1/6) + 2·(1/6)·0.5 = 0.5, as expected. This scenario corresponds to lines 17 and 18 of the algorithm, where we carry out the above selections by randomly sampling ⌊C′⌋ + 1 items from A to form A′ and then choosing a random item in A′ as the partial item by moving it to π′.

In the case where L contains a partial item i that appears in S with probability frc(C), it follows from (6) that i should appear in S′ with probability p = (C′/C) Pr[i ∈ S] = (C′/C) frc(C). Thus, with probability p, lines 13–15 retain i and convert it to a full item so that it appears in S′. Otherwise, in lines 17 and 18, i is removed from the sample when it is overwritten by a random item from A′; see Figure 4(b). Again, a new partial item is chosen from A in a random manner so as to uniformly scale down the inclusion probabilities. For instance, in Figure 4(b), item d appears in S_t with probability 0.2 (because it is a partial item) and in S_t′ appears with probability 3·(0.1/3) = 0.1. Similarly, item a appears in S_t with probability 1 and in S_t′ with probability (1.8)/6 + 0.6·(1.8/6) + 0.6·(0.1/3) = 0.5.

The if-statement in line 5 corresponds to the corner case in which L′ does not contain a full item. The partial item i ∈ L either becomes full or is swapped into A′ and then immediately ejected; see Figure 4(c). The if-statement in line 9 corresponds to the case in which no items are deleted from the latent sample, e.g., when C = 4.7 and C′ = 4.2. In this case, i either becomes full by being swapped into A′ or remains as the partial item for L′.
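To make the two randomized primitives concrete, here is a minimal Python sketch (ours, not the authors' implementation) of stochastic rounding and of Algorithm 3 with its Swap1/Move1 subroutines; a latent sample is represented as a list A of full items plus a partial item pi that is realized with probability frc(C):

```python
import math
import random

def frc(x):
    return x - math.floor(x)

def stochastic_round(m, rng=random):
    """Return floor(m) + 1 with probability frc(m), else floor(m), so the mean is m."""
    base = math.floor(m)
    return base + (1 if rng.random() < m - base else 0)

def swap1(A, pi, rng):
    """Move a random item of A to the partial slot; the old partial item (if any) joins A."""
    item = A.pop(rng.randrange(len(A)))
    if pi is not None:
        A.append(pi)
    return A, item

def move1(A, pi, rng):
    """Move a random item of A to the partial slot, replacing the old partial item."""
    item = A.pop(rng.randrange(len(A)))
    return A, item

def downsample(A, pi, C, C_new, rng=random):
    """Algorithm 3 sketch: shrink latent sample (A, pi, C) to target weight C_new,
    where 0 < C_new < C. A is the list of full items, pi the partial item (or None)."""
    A2, pi2 = list(A), pi
    U = rng.random()
    if math.floor(C_new) == 0:                     # no full items retained
        if U > frc(C) / C:
            A2, pi2 = swap1(A2, pi2, rng)
        A2 = []
    elif math.floor(C_new) == math.floor(C):       # no items deleted
        if U > (1 - (C_new / C) * frc(C)) / (1 - frc(C_new)):
            A2, pi2 = swap1(A2, pi2, rng)
    else:                                          # items deleted
        if U <= (C_new / C) * frc(C):
            A2 = rng.sample(A2, math.floor(C_new))
            A2, pi2 = swap1(A2, pi2, rng)
        else:
            A2 = rng.sample(A2, math.floor(C_new) + 1)
            A2, pi2 = move1(A2, pi2, rng)
    if frc(C_new) == 0:                            # no fractional item
        pi2 = None
    return A2, pi2

def realize(A, pi, C, rng=random):
    """Draw the actual sample S from a latent sample: pi is included w.p. frc(C)."""
    S = set(A)
    if pi is not None and rng.random() < frc(C):
        S.add(pi)
    return S
```

Simulating this sketch on the Figure 4(b) scenario (C = 3.2 down to C′ = 1.6) reproduces the inclusion probabilities derived above: item d survives with probability about 0.1 and a full item such as a with probability about 0.5.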
Denoting by ρ the probability of not swapping, we have P[i ∈ S′] = ρ·frc(C′) + (1 − ρ)·1. On the other hand, (6) implies that P[i ∈ S′] = (C′/C) frc(C). Equating these expressions shows that ρ must equal the expression on the right side of the inequality on line 10; see Figure 4(d). Formally, we have the following result.

Theorem 4.1. For 0 < C′ < C, let L′ = (A′, π′, C′) be the latent sample produced from a latent sample L = (A, π, C) via Algorithm 3, and let S and S′ be samples produced from L and L′ via (2). Then Pr[i ∈ S′] = (C′/C) Pr[i ∈ S] for all i ∈ L.

4.3 Properties of R-TBS

Theorem 4.2 below asserts that R-TBS satisfies (4) and hence (1), thereby maintaining the correct inclusion probabilities; see Appendix C of [16] for the proof. Theorems 4.3 and 4.4 assert that, among all sampling algorithms with exponential time biasing, R-TBS both maximizes the expected sample size in unsaturated scenarios and minimizes sample-size variability. Thus R-TBS tends to yield more accurate results (from more training data) and greater stability in both result quality and retraining costs.

Theorem 4.2. The relation Pr[i ∈ S_t] = (C_t/W_t)·w_t(i) holds for all t ≥ 1 and i ∈ U_t.

Theorem 4.3. Let H be any sampling algorithm that satisfies (1) and denote by S_t and S_t^H the samples produced at time t by R-TBS and H. If the total weight at some time t ≥ 1 satisfies W_t < n, then E[|S_t^H|] ≤ E[|S_t|].

Proof. Since H satisfies (1), it follows that, for each time j ≤ t and i ∈ B_j, the inclusion probability Pr[i ∈ S_t^H] must be of the form r_t·e^{-λ(t-j)} for some function r_t independent of j. Taking j = t, we see that r_t ≤ 1. For R-TBS in an unsaturated state, (4) implies that r_t = C_t/W_t = 1, so that Pr[i ∈ S_t^H] ≤ Pr[i ∈ S_t], and the desired result follows directly.

Theorem 4.4. Let H be any sampling algorithm that satisfies (1) and has maximal expected sample size C_t, and denote by S_t and S_t^H the samples produced at time t by R-TBS and H. Then Var[|S_t^H|] ≥ Var[|S_t|] for any time t ≥ 1.

Proof. Considering all possible distributions over the sample size having mean value equal to C_t, it is straightforward to show that variance is minimized by concentrating all of the probability mass onto ⌈C_t⌉ and ⌊C_t⌋. There is precisely one such distribution, namely the stochastic-rounding distribution, and this is precisely the sample-size distribution attained by R-TBS.

5 DISTRIBUTED TBS ALGORITHMS

In this section, we describe how to implement distributed versions of T-TBS and R-TBS to handle large volumes of data.

5.1 Overview of Distributed Algorithms

The distributed T-TBS and R-TBS algorithms, denoted as D-T-TBS and D-R-TBS respectively, need to distribute large data sets across the cluster and parallelize the computation on them.

Overview of D-T-TBS: The implementation of the D-T-TBS algorithm is very similar to the simple distributed Bernoulli time-biased sampling algorithm in [27]. It is embarrassingly parallel, requiring no coordination. At each time point t, each worker in the cluster subsamples its partition of the sample with probability p, subsamples its partition of B_t with probability q, and then takes a union of the resulting data sets.

Overview of D-R-TBS: This algorithm, unlike D-T-TBS, maintains a bounded sample, and hence cannot be embarrassingly parallel. D-R-TBS first needs to aggregate the local batch sizes to compute the incoming batch size |B_t| and to maintain the total weight W. Then, based on |B_t| and the previous total weight W, D-R-TBS determines whether the reservoir was previously saturated and whether it will be saturated after processing B_t.
For each possible situation, D-R-TBS chooses the items in the reservoir to delete through downsampling and the items in B_t to insert into the reservoir. This process requires the master to coordinate among the workers. In Section 5.3, we introduce two alternative approaches to determining the deleted and inserted items. Finally, the algorithm applies the deletes and inserts to form the new reservoir, and computes the new total weight W. Both D-T-TBS and D-R-TBS periodically checkpoint the sample as well as other system state variables to ensure fault tolerance. The implementation details for D-T-TBS are mostly subsumed by those for D-R-TBS, so we focus on the latter.

Figure 4: Downsampling examples (t = 0). (a) From C_t = 3 to C_t′ = 1.5. (b) From C_t = 3.2 to C_t′ = 1.6. (c) From C_t = 2.4 to C_t′ = 0.4. (d) From C_t = 2.4 to C_t′ = 2.1.

Figure 5: Design choices for implementing the reservoir: (a) key-value store; (b) co-partitioned reservoir.

Figure 6: Retrieving insert items: (a) centralized decisions; (b) distributed decisions.

5.2 Distributed Data Structures

There are two important data structures in the D-R-TBS algorithm: the incoming batch and the reservoir. Conceptually, we view an incoming batch B_t as an array of slots numbered from 1 through |B_t|, and the reservoir as an array of slots numbered from 1 through ⌊C⌋ containing the full items, plus a special slot for the partial item. For both data structures, data items need to be distributed into partitions due to the large data volumes. Therefore, the slot number of an item maps to a specific partition ID and a position inside the partition.
The incoming batch usually comes from a distributed streaming system, such as Spark Streaming; the actual data structure is specific to the streaming system (e.g., an incoming batch is stored as an RDD in Spark Streaming). As a result, the partitioning strategy of the incoming batch is opaque to the D-R-TBS algorithm. Unlike the incoming batch, which is read-only and discarded at the end of each time period, the reservoir data structure must be continually updated. An effective strategy for storing and operating on the reservoir is thus crucial for good performance. We now explore alternative approaches to implementing the reservoir.

Distributed in-memory key-value store: One quite natural approach implements the reservoir using an off-the-shelf distributed in-memory key-value store, such as Redis [25] or Memcached [23]. In this scheme, each item in the reservoir is stored as a key-value pair, with the slot number as the key and the item as the value. Inserts and deletes to the reservoir naturally translate into put and delete operations on the key-value store.

There are two major limitations to this approach. Firstly, the hash-based or range-based data-partitioning scheme used by a distributed key-value store yields reservoir partitions that do not correlate with the partitions of the incoming batch. As illustrated in Figure 5(a), when items from a given partition of an incoming batch are inserted into the reservoir, the inserts touch many (if not all) partitions of the reservoir, incurring heavy network I/O. Secondly, key-value stores incur needless concurrency-control overhead. For each batch, D-R-TBS already carefully coordinates the deletes and inserts so that no two delete or insert operations access the same slots in the reservoir, and there is no danger of write-write or read-write conflicts.

Co-partitioned reservoir: In the alternative approach, we implement a distributed in-memory data structure for the reservoir so as to ensure that the reservoir partitions coincide with the partitions of the incoming batches, as shown in Figure 5(b). This can be achieved in spite of the unknown partitioning scheme of the streaming system. Specifically, the reservoir is initially empty, and all items in the reservoir come from the incoming batches. Therefore, if an item from a given partition of an incoming batch is always inserted into the corresponding local reservoir partition, and deletes are also handled locally, then the co-partitioning and co-location of the reservoir and incoming batch partitions is automatic. For our experiments, we implemented the co-partitioned reservoir in Spark using the in-place updating technique for RDDs in [27]; see Appendix E of [16]. Note that, at any point in time, a given slot number in the reservoir maps to a specific partition ID and position inside the partition. Thus the slot number for a given full item may change over time due to reservoir insertions and deletions. This does not cause any statistical issues, because the functioning of the set-based algorithm is oblivious to specific slot numbers.
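The locality argument can be illustrated with a small sketch (hypothetical class and method names of ours; the paper's Spark implementation instead updates RDDs in place): each reservoir partition applies its inserts and deletes purely locally, so no item ever crosses the network.

```python
import random

class ReservoirPartition:
    """Sketch of one co-partitioned reservoir partition: all inserts and deletes
    touch only this partition's local item list."""
    def __init__(self):
        self.items = []

    def insert_local(self, batch_partition, positions):
        # Append the chosen items from the co-located incoming-batch partition.
        for r in positions:
            self.items.append(batch_partition[r])

    def delete_local(self, num_victims, rng=random):
        # Remove randomly chosen local victims; swap-and-pop keeps slots contiguous.
        for _ in range(num_victims):
            i = rng.randrange(len(self.items))
            self.items[i] = self.items[-1]
            self.items.pop()
```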
5.3 Choosing Items to Delete and Insert

In order to bound the reservoir size, D-R-TBS requires careful coordination when choosing the set of items to delete from, and insert into, the reservoir. At the same time, D-R-TBS must ensure the statistical correctness of the random number generation and random permutation operations in the distributed environment. We consider two possible approaches.

Centralized decisions: In the most straightforward approach, the master makes centralized decisions about which items to delete and insert. For deletes, the driver generates the slot numbers of the items in the reservoir to be deleted, which are then mapped to the actual data locations in a manner that depends on the representation of the reservoir (key-value store or co-partitioned reservoir). For inserts, the driver generates the slot numbers of the incoming items in B_t that need to be inserted into the reservoir. Suppose that B_t comprises k ≥ 1 partitions. Each generated slot number i ∈ {1, 2, ..., |B_t|} is mapped to a partition p_i of B_t (where 0 ≤ p_i ≤ k − 1) and a position r_i inside partition p_i. Denote by Q the set of item locations, i.e., the set of (p_i, r_i) pairs. In order to perform the inserts, we need to first retrieve the actual items based on the item locations. This can be achieved with a join-like operation between Q and B_t, with the (p_i, r_i) pair matching the actual location of an item inside B_t. To optimize this operation, we make Q a distributed data structure and use a customized partitioner to ensure that all pairs (p_i, r_i) with p_i = j are co-located with partition j of B_t for j = 0, 1, ..., k − 1. Then a co-partitioned and co-located join can be carried out between Q and B_t, as illustrated in Figure 6(a) for k = 3. The resulting set of retrieved insert items, denoted by S, is also co-partitioned with B_t as a by-product. After that, the actual deletes and inserts are carried out in a manner that depends on how the reservoir is stored, as discussed below. When the reservoir is implemented as a key-value store, the deletes can be directly applied based on the slot numbers.
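The mapping from a generated global slot number to a (p_i, r_i) pair can be sketched as follows, assuming the master knows the k per-partition sizes of B_t (a hypothetical helper of ours; the paper does not spell out this computation):

```python
import bisect
import itertools

def slot_to_location(slot, partition_sizes):
    """Map a 1-based global slot number in {1, ..., |B_t|} to a (partition_id,
    position) pair, given the sizes of the k partitions of the incoming batch."""
    ends = list(itertools.accumulate(partition_sizes))  # cumulative slot-range ends
    p = bisect.bisect_left(ends, slot)                  # first partition covering slot
    start = ends[p - 1] if p > 0 else 0
    return p, slot - start                              # 1-based position in partition p
```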
For inserts, the master takes each generated slot number of an item in B_t and chooses a companion destination slot number in the reservoir into which the B_t item will be inserted. This destination reservoir slot might currently be empty due to an earlier deletion, or might contain an item that will now be replaced by the newly inserted batch item. After the actual items to insert are retrieved as described previously, the destination slot numbers are used to put the items into the right locations in the key-value store. When the co-partitioned reservoir is used, the delete slot numbers in the reservoir are mapped to (p_i, r_i) pairs of reservoir partitions and positions inside the partitions. As with the inserts, we again use a customized partitioner for the set of pairs R such that deletes are co-located with the corresponding reservoir partitions. Then a join-like operation on R and the reservoir performs the actual delete operations on the reservoir. For inserts, we simply use another join-like operation on the set of retrieved insert items S and the reservoir to add the corresponding insert items to the co-located partitions of the reservoir. In this approach, we don't need the master to generate destination reservoir slot numbers for the insert items, because we view the reservoir as a set when using the co-partitioned reservoir data structure.

Distributed decisions: The above approach requires generating a large number of slot numbers inside the master, so we now explore an alternative approach that offloads the slot-number generation to the workers while still ensuring the statistical correctness of the computation. In this approach, the master chooses only the number of deletes and inserts per worker according to appropriate multivariate hypergeometric distributions. For deletes, each worker chooses random victims from its local partition of the reservoir based on the number of deletes given by the master. For inserts, each worker randomly and uniformly selects items from its local partition of the incoming batch B_t given the number of inserts.
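The per-partition counts are distributed as the partition memberships of a uniform without-replacement sample over the whole data structure. The sketch below (illustrative only; it materializes a pool of size equal to the total item count, so a real implementation would use a sequential hypergeometric sampler instead) makes that definition executable:

```python
import random
from collections import Counter

def multivariate_hypergeometric(partition_sizes, n, rng=random):
    """Draw per-partition counts distributed as the number of items landing in each
    partition when n items are sampled without replacement from the union of all
    partitions. Exact but O(total items) -- for illustration only."""
    pool = [p for p, size in enumerate(partition_sizes) for _ in range(size)]
    drawn = Counter(rng.sample(pool, n))
    return [drawn.get(p, 0) for p in range(len(partition_sizes))]
```

The master would draw one such vector for deletes (over reservoir partition sizes) and one for inserts (over batch partition sizes), then send each worker only its own count.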
Figure 6(b) depicts how the insert items are retrieved under this decentralized approach. We use the technique in [15] for parallel pseudo-random number generation. Note that this distributed decision-making approach works only when the co-partitioned reservoir data structure is used. This is because the key-value store representation of the reservoir requires a target reservoir slot number for each insert item from the incoming batch, and the target slot numbers have to be generated in such a way as to ensure that, after the deletes and inserts, all of the slot numbers are still unique and contiguous in the new reservoir. This requires a lot of coordination among the workers, which inhibits truly distributed decision making.

6 EXPERIMENTS

In this section, we study the empirical performance of D-R-TBS and D-T-TBS, and demonstrate the potential benefit of using them for model retraining in online model management. We implemented D-R-TBS and D-T-TBS on Spark (refer to Appendix E of [16] for implementation details).

Experimental Setup: All performance experiments were conducted on a cluster of 13 IBM System x iDataPlex dx340 servers. Each has two quad-core Intel Xeon processors and 32GB of RAM. Servers are interconnected using a 1Gbit Ethernet, and each server runs Ubuntu Linux, Java 1.7, and Spark.

One server is dedicated to running the Spark coordinator, and each of the remaining 12 servers runs Spark workers. There is one worker per processor on each machine, and each worker is given all 4 cores to use, along with 8 GB of dedicated memory. All other Spark parameters are set to their default values. We used Memcached as the key-value store in our experiments. For all experiments, data was streamed in from HDFS using Spark Streaming as microbatches. We report the run time per round as the average over 10 rounds, discarding the first round from this average because of Spark startup costs. Unless otherwise stated, each batch contains 1 million items, the target reservoir size is 2 million elements, and the decay parameter λ is held fixed.

Figure 7: Per-batch distributed runtime comparison (D-R-TBS with Cent/Dist decisions, KV/CP reservoir, RJ/CJ join; D-T-TBS (Dist,CP))

Figure 8: Scale-out of D-R-TBS

Figure 9: Scale-up of D-R-TBS

6.1 Runtime Performance

Comparison of TBS Implementations: Figure 7 shows the average runtime per batch for five different implementations of distributed TBS algorithms. The first four (colored black) are D-R-TBS implementations with different design choices: whether to use centralized or distributed decisions (abbreviated as "Cent" and "Dist", respectively) for choosing the items to delete and insert, and whether to use a key-value store ("KV") or a co-partitioned reservoir ("CP") for storing the reservoir. The first two implementations both use the key-value store representation for the reservoir together with the centralized decision strategy for determining inserts and deletes. They differ only in how the insert items are actually retrieved when subsampling the incoming batch. The first uses the standard repartition join ("RJ"), whereas the second uses the customized partitioner and co-located join ("CJ") described in Section 5.3 and depicted in Figure 6(a).
This optimization effectively cuts the network cost in half, but the KV representation of the reservoir still requires the insert items to be written across the network to their corresponding reservoir locations. The third implementation employs the co-partitioned reservoir instead, resulting in a significant speedup of over 2.6x. The fourth implementation additionally employs distributed decisions for choosing the items to delete and insert. This yields a further 1.6x speedup. We use this D-R-TBS implementation in the remaining experiments.

The fifth implementation (colored grey) in Figure 7 is D-T-TBS using the co-partitioned reservoir and the distributed strategy for choosing the delete and insert items. Since D-T-TBS is embarrassingly parallelizable, it is much faster than the best D-R-TBS implementation. But, as we discussed in Section 3, T-TBS works only under a very strong restriction on the data arrival rate, and can suffer from occasional memory overflows; see Figure 1. In contrast, D-R-TBS is much more robust and works in realistic scenarios where it is hard to predict the data arrival rate.

Scalability of D-R-TBS: Figure 8 shows how D-R-TBS scales with the number of workers. We increased the batch size to 10 million items for this experiment. Initially, D-R-TBS scales out very nicely with the increasing number of workers. However, beyond 10 workers, the marginal benefit from additional workers is small, because the coordination and communication overheads, as well as the inherent Spark overhead, become prominent. For the same reasons, in the scale-up experiment in Figure 9, the runtime stays roughly constant until the batch size reaches 10 million items and then increases sharply. This is because processing the streaming input and maintaining the sample start to dominate the coordination and communication overhead. With 10 workers, D-R-TBS can handle a data flow comprising 10 million items arriving approximately every 14 seconds.
6.2 Application: Classification using kNN

We now demonstrate the potential benefits of the R-TBS sampling scheme for periodically retraining representative ML models in the presence of evolving data. For each model and data set, we compare the quality of models retrained on the samples generated by R-TBS, a simple sliding window (SW), and uniform reservoir sampling (UNIF). Due to limited space, we do not give quality results for T-TBS; we found that whenever it applies, i.e., when the mean batch size is known and constant, the quality is very similar to that of R-TBS, since they both use time-biased sampling.

Our first model is a kNN classifier, where a class is predicted for each item in an incoming batch by taking a majority vote of the classes of the k nearest neighbors in the current sample, based on Euclidean distance; the sample is then updated using the batch. To generate training data, we first generate 100 class centroids uniformly in a [0, 80] × [0, 80] rectangle. Each data item is then generated from a Gaussian mixture model and falls into one of the 100 classes. Over time, the data generation process operates in one of two "modes". In the "normal" mode, the frequency of items from any of the first 50 classes is five times higher than that of items from any of the second 50 classes. In the "abnormal" mode, the frequencies are five times lower. Thus the frequent and infrequent classes switch roles at a mode change. We generate each data point by randomly choosing a ground-truth class c_i with centroid (x_i, y_i) according to relative frequencies that depend upon the current mode, and then generating the data point's (x, y) coordinates independently as samples from N(x_i, 1) and N(y_i, 1). Here N(µ, σ) denotes the normal distribution with mean µ and standard deviation σ. In this experiment, the batch sizes are deterministic with b = 100 items, and we use k = 7 neighbors for the kNN classifier. The reservoir size for both R-TBS and UNIF is 1,000, and SW contains the last 1,000 items; thus all methods use the same amount of data for retraining. (We choose this value because it achieves near-maximal classification accuracies for all techniques.
In general, we choose sampling and ML parameters to achieve good learning performance while ensuring fair comparisons.) In each run, the sample is warmed up by processing 100 normal-mode batches before the classification task begins. Our experiments focus on two types of temporal patterns in the data, as described below.

Single change: Here we model the occurrence of a singular event. The data is generated in normal mode up to t = 10 (time is measured here in units after warm-up), then switches to abnormal mode
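The two-mode Gaussian mixture generator described above can be sketched as follows; this is our own rendering under the stated parameters (100 classes, a [0, 80] × [0, 80] rectangle, a 5:1 frequency ratio), not the authors' code:

```python
import random

def make_centroids(num_classes=100, rng=random):
    """Class centroids drawn uniformly in the [0, 80] x [0, 80] rectangle."""
    return [(rng.uniform(0, 80), rng.uniform(0, 80)) for _ in range(num_classes)]

def gen_point(centroids, mode, rng=random):
    """Draw one labeled point: pick a ground-truth class c_i by the mode-dependent
    relative frequencies, then sample coordinates from N(x_i, 1) and N(y_i, 1)."""
    half = len(centroids) // 2
    if mode == "normal":   # first half of the classes is five times more frequent
        weights = [5] * half + [1] * (len(centroids) - half)
    else:                  # "abnormal": the frequent and infrequent classes swap
        weights = [1] * half + [5] * (len(centroids) - half)
    c = rng.choices(range(len(centroids)), weights=weights)[0]
    x_i, y_i = centroids[c]
    return (rng.gauss(x_i, 1.0), rng.gauss(y_i, 1.0)), c
```

In normal mode a fraction of about 5/6 of the generated points carries a label from the first 50 classes, which is what makes those classes dominate the training sample until a mode change occurs.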


Discrete Mathematics and Probability Theory Summer 2014 James Cook Note 17 CS 70 Discrete Mthemtics nd Proility Theory Summer 2014 Jmes Cook Note 17 I.I.D. Rndom Vriles Estimting the is of coin Question: We wnt to estimte the proportion p of Democrts in the US popultion, y tking

More information

Credibility Hypothesis Testing of Fuzzy Triangular Distributions

Credibility Hypothesis Testing of Fuzzy Triangular Distributions 666663 Journl of Uncertin Systems Vol.9, No., pp.6-74, 5 Online t: www.jus.org.uk Credibility Hypothesis Testing of Fuzzy Tringulr Distributions S. Smpth, B. Rmy Received April 3; Revised 4 April 4 Abstrct

More information

NUMERICAL INTEGRATION. The inverse process to differentiation in calculus is integration. Mathematically, integration is represented by.

NUMERICAL INTEGRATION. The inverse process to differentiation in calculus is integration. Mathematically, integration is represented by. NUMERICAL INTEGRATION 1 Introduction The inverse process to differentition in clculus is integrtion. Mthemticlly, integrtion is represented by f(x) dx which stnds for the integrl of the function f(x) with

More information

APPROXIMATE INTEGRATION

APPROXIMATE INTEGRATION APPROXIMATE INTEGRATION. Introduction We hve seen tht there re functions whose nti-derivtives cnnot be expressed in closed form. For these resons ny definite integrl involving these integrnds cnnot be

More information

Numerical integration

Numerical integration 2 Numericl integrtion This is pge i Printer: Opque this 2. Introduction Numericl integrtion is problem tht is prt of mny problems in the economics nd econometrics literture. The orgniztion of this chpter

More information

The First Fundamental Theorem of Calculus. If f(x) is continuous on [a, b] and F (x) is any antiderivative. f(x) dx = F (b) F (a).

The First Fundamental Theorem of Calculus. If f(x) is continuous on [a, b] and F (x) is any antiderivative. f(x) dx = F (b) F (a). The Fundmentl Theorems of Clculus Mth 4, Section 0, Spring 009 We now know enough bout definite integrls to give precise formultions of the Fundmentl Theorems of Clculus. We will lso look t some bsic emples

More information

How do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when is it unique?

How do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when is it unique? XII. LINEAR ALGEBRA: SOLVING SYSTEMS OF EQUATIONS Tody we re going to tlk bout solving systems of liner equtions. These re problems tht give couple of equtions with couple of unknowns, like: 6 2 3 7 4

More information

Lecture 3 Gaussian Probability Distribution

Lecture 3 Gaussian Probability Distribution Introduction Lecture 3 Gussin Probbility Distribution Gussin probbility distribution is perhps the most used distribution in ll of science. lso clled bell shped curve or norml distribution Unlike the binomil

More information

Theoretical foundations of Gaussian quadrature

Theoretical foundations of Gaussian quadrature Theoreticl foundtions of Gussin qudrture 1 Inner product vector spce Definition 1. A vector spce (or liner spce) is set V = {u, v, w,...} in which the following two opertions re defined: (A) Addition of

More information

Exam 2, Mathematics 4701, Section ETY6 6:05 pm 7:40 pm, March 31, 2016, IH-1105 Instructor: Attila Máté 1

Exam 2, Mathematics 4701, Section ETY6 6:05 pm 7:40 pm, March 31, 2016, IH-1105 Instructor: Attila Máté 1 Exm, Mthemtics 471, Section ETY6 6:5 pm 7:4 pm, Mrch 1, 16, IH-115 Instructor: Attil Máté 1 17 copies 1. ) Stte the usul sufficient condition for the fixed-point itertion to converge when solving the eqution

More information

Sufficient condition on noise correlations for scalable quantum computing

Sufficient condition on noise correlations for scalable quantum computing Sufficient condition on noise correltions for sclble quntum computing John Presill, 2 Februry 202 Is quntum computing sclble? The ccurcy threshold theorem for quntum computtion estblishes tht sclbility

More information

Chapters 4 & 5 Integrals & Applications

Chapters 4 & 5 Integrals & Applications Contents Chpters 4 & 5 Integrls & Applictions Motivtion to Chpters 4 & 5 2 Chpter 4 3 Ares nd Distnces 3. VIDEO - Ares Under Functions............................................ 3.2 VIDEO - Applictions

More information

Lecture 13 - Linking E, ϕ, and ρ

Lecture 13 - Linking E, ϕ, and ρ Lecture 13 - Linking E, ϕ, nd ρ A Puzzle... Inner-Surfce Chrge Density A positive point chrge q is locted off-center inside neutrl conducting sphericl shell. We know from Guss s lw tht the totl chrge on

More information

1 Probability Density Functions

1 Probability Density Functions Lis Yn CS 9 Continuous Distributions Lecture Notes #9 July 6, 28 Bsed on chpter by Chris Piech So fr, ll rndom vribles we hve seen hve been discrete. In ll the cses we hve seen in CS 9, this ment tht our

More information

MAA 4212 Improper Integrals

MAA 4212 Improper Integrals Notes by Dvid Groisser, Copyright c 1995; revised 2002, 2009, 2014 MAA 4212 Improper Integrls The Riemnn integrl, while perfectly well-defined, is too restrictive for mny purposes; there re functions which

More information

3.4 Numerical integration

3.4 Numerical integration 3.4. Numericl integrtion 63 3.4 Numericl integrtion In mny economic pplictions it is necessry to compute the definite integrl of relvlued function f with respect to "weight" function w over n intervl [,

More information

Duality # Second iteration for HW problem. Recall our LP example problem we have been working on, in equality form, is given below.

Duality # Second iteration for HW problem. Recall our LP example problem we have been working on, in equality form, is given below. Dulity #. Second itertion for HW problem Recll our LP emple problem we hve been working on, in equlity form, is given below.,,,, 8 m F which, when written in slightly different form, is 8 F Recll tht we

More information

Chapter 4 Contravariance, Covariance, and Spacetime Diagrams

Chapter 4 Contravariance, Covariance, and Spacetime Diagrams Chpter 4 Contrvrince, Covrince, nd Spcetime Digrms 4. The Components of Vector in Skewed Coordintes We hve seen in Chpter 3; figure 3.9, tht in order to show inertil motion tht is consistent with the Lorentz

More information

Monte Carlo method in solving numerical integration and differential equation

Monte Carlo method in solving numerical integration and differential equation Monte Crlo method in solving numericl integrtion nd differentil eqution Ye Jin Chemistry Deprtment Duke University yj66@duke.edu Abstrct: Monte Crlo method is commonly used in rel physics problem. The

More information

Student Activity 3: Single Factor ANOVA

Student Activity 3: Single Factor ANOVA MATH 40 Student Activity 3: Single Fctor ANOVA Some Bsic Concepts In designed experiment, two or more tretments, or combintions of tretments, is pplied to experimentl units The number of tretments, whether

More information

Lecture 1: Introduction to integration theory and bounded variation

Lecture 1: Introduction to integration theory and bounded variation Lecture 1: Introduction to integrtion theory nd bounded vrition Wht is this course bout? Integrtion theory. The first question you might hve is why there is nything you need to lern bout integrtion. You

More information

Numerical Analysis: Trapezoidal and Simpson s Rule

Numerical Analysis: Trapezoidal and Simpson s Rule nd Simpson s Mthemticl question we re interested in numericlly nswering How to we evlute I = f (x) dx? Clculus tells us tht if F(x) is the ntiderivtive of function f (x) on the intervl [, b], then I =

More information

ODE: Existence and Uniqueness of a Solution

ODE: Existence and Uniqueness of a Solution Mth 22 Fll 213 Jerry Kzdn ODE: Existence nd Uniqueness of Solution The Fundmentl Theorem of Clculus tells us how to solve the ordinry differentil eqution (ODE) du = f(t) dt with initil condition u() =

More information

We partition C into n small arcs by forming a partition of [a, b] by picking s i as follows: a = s 0 < s 1 < < s n = b.

We partition C into n small arcs by forming a partition of [a, b] by picking s i as follows: a = s 0 < s 1 < < s n = b. Mth 255 - Vector lculus II Notes 4.2 Pth nd Line Integrls We begin with discussion of pth integrls (the book clls them sclr line integrls). We will do this for function of two vribles, but these ides cn

More information

f(x) dx, If one of these two conditions is not met, we call the integral improper. Our usual definition for the value for the definite integral

f(x) dx, If one of these two conditions is not met, we call the integral improper. Our usual definition for the value for the definite integral Improper Integrls Every time tht we hve evluted definite integrl such s f(x) dx, we hve mde two implicit ssumptions bout the integrl:. The intervl [, b] is finite, nd. f(x) is continuous on [, b]. If one

More information

CS 188: Artificial Intelligence Spring 2007

CS 188: Artificial Intelligence Spring 2007 CS 188: Artificil Intelligence Spring 2007 Lecture 3: Queue-Bsed Serch 1/23/2007 Srini Nrynn UC Berkeley Mny slides over the course dpted from Dn Klein, Sturt Russell or Andrew Moore Announcements Assignment

More information

CS 188 Introduction to Artificial Intelligence Fall 2018 Note 7

CS 188 Introduction to Artificial Intelligence Fall 2018 Note 7 CS 188 Introduction to Artificil Intelligence Fll 2018 Note 7 These lecture notes re hevily bsed on notes originlly written by Nikhil Shrm. Decision Networks In the third note, we lerned bout gme trees

More information

A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H. Thomas Shores Department of Mathematics University of Nebraska Spring 2007

A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H. Thomas Shores Department of Mathematics University of Nebraska Spring 2007 A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H Thoms Shores Deprtment of Mthemtics University of Nebrsk Spring 2007 Contents Rtes of Chnge nd Derivtives 1 Dierentils 4 Are nd Integrls 5 Multivrite Clculus

More information

Deteriorating Inventory Model for Waiting. Time Partial Backlogging

Deteriorating Inventory Model for Waiting. Time Partial Backlogging Applied Mthemticl Sciences, Vol. 3, 2009, no. 9, 42-428 Deteriorting Inventory Model for Witing Time Prtil Bcklogging Nit H. Shh nd 2 Kunl T. Shukl Deprtment of Mthemtics, Gujrt university, Ahmedbd. 2

More information

Improper Integrals, and Differential Equations

Improper Integrals, and Differential Equations Improper Integrls, nd Differentil Equtions October 22, 204 5.3 Improper Integrls Previously, we discussed how integrls correspond to res. More specificlly, we sid tht for function f(x), the region creted

More information

Parse trees, ambiguity, and Chomsky normal form

Parse trees, ambiguity, and Chomsky normal form Prse trees, miguity, nd Chomsky norml form In this lecture we will discuss few importnt notions connected with contextfree grmmrs, including prse trees, miguity, nd specil form for context-free grmmrs

More information

CMDA 4604: Intermediate Topics in Mathematical Modeling Lecture 19: Interpolation and Quadrature

CMDA 4604: Intermediate Topics in Mathematical Modeling Lecture 19: Interpolation and Quadrature CMDA 4604: Intermedite Topics in Mthemticl Modeling Lecture 19: Interpoltion nd Qudrture In this lecture we mke brief diversion into the res of interpoltion nd qudrture. Given function f C[, b], we sy

More information

Numerical Integration. 1 Introduction. 2 Midpoint Rule, Trapezoid Rule, Simpson Rule. AMSC/CMSC 460/466 T. von Petersdorff 1

Numerical Integration. 1 Introduction. 2 Midpoint Rule, Trapezoid Rule, Simpson Rule. AMSC/CMSC 460/466 T. von Petersdorff 1 AMSC/CMSC 46/466 T. von Petersdorff 1 umericl Integrtion 1 Introduction We wnt to pproximte the integrl I := f xdx where we re given, b nd the function f s subroutine. We evlute f t points x 1,...,x n

More information

Lecture 19: Continuous Least Squares Approximation

Lecture 19: Continuous Least Squares Approximation Lecture 19: Continuous Lest Squres Approximtion 33 Continuous lest squres pproximtion We begn 31 with the problem of pproximting some f C[, b] with polynomil p P n t the discrete points x, x 1,, x m for

More information

Administrivia CSE 190: Reinforcement Learning: An Introduction

Administrivia CSE 190: Reinforcement Learning: An Introduction Administrivi CSE 190: Reinforcement Lerning: An Introduction Any emil sent to me bout the course should hve CSE 190 in the subject line! Chpter 4: Dynmic Progrmming Acknowledgment: A good number of these

More information

Improper Integrals. Type I Improper Integrals How do we evaluate an integral such as

Improper Integrals. Type I Improper Integrals How do we evaluate an integral such as Improper Integrls Two different types of integrls cn qulify s improper. The first type of improper integrl (which we will refer to s Type I) involves evluting n integrl over n infinite region. In the grph

More information

The steps of the hypothesis test

The steps of the hypothesis test ttisticl Methods I (EXT 7005) Pge 78 Mosquito species Time of dy A B C Mid morning 0.0088 5.4900 5.5000 Mid Afternoon.3400 0.0300 0.8700 Dusk 0.600 5.400 3.000 The Chi squre test sttistic is the sum of

More information

Calculus I-II Review Sheet

Calculus I-II Review Sheet Clculus I-II Review Sheet 1 Definitions 1.1 Functions A function is f is incresing on n intervl if x y implies f(x) f(y), nd decresing if x y implies f(x) f(y). It is clled monotonic if it is either incresing

More information

Math 360: A primitive integral and elementary functions

Math 360: A primitive integral and elementary functions Mth 360: A primitive integrl nd elementry functions D. DeTurck University of Pennsylvni October 16, 2017 D. DeTurck Mth 360 001 2017C: Integrl/functions 1 / 32 Setup for the integrl prtitions Definition:

More information

Riemann is the Mann! (But Lebesgue may besgue to differ.)

Riemann is the Mann! (But Lebesgue may besgue to differ.) Riemnn is the Mnn! (But Lebesgue my besgue to differ.) Leo Livshits My 2, 2008 1 For finite intervls in R We hve seen in clss tht every continuous function f : [, b] R hs the property tht for every ɛ >

More information

Entropy and Ergodic Theory Notes 10: Large Deviations I

Entropy and Ergodic Theory Notes 10: Large Deviations I Entropy nd Ergodic Theory Notes 10: Lrge Devitions I 1 A chnge of convention This is our first lecture on pplictions of entropy in probbility theory. In probbility theory, the convention is tht ll logrithms

More information

Unit #9 : Definite Integral Properties; Fundamental Theorem of Calculus

Unit #9 : Definite Integral Properties; Fundamental Theorem of Calculus Unit #9 : Definite Integrl Properties; Fundmentl Theorem of Clculus Gols: Identify properties of definite integrls Define odd nd even functions, nd reltionship to integrl vlues Introduce the Fundmentl

More information

and that at t = 0 the object is at position 5. Find the position of the object at t = 2.

and that at t = 0 the object is at position 5. Find the position of the object at t = 2. 7.2 The Fundmentl Theorem of Clculus 49 re mny, mny problems tht pper much different on the surfce but tht turn out to be the sme s these problems, in the sense tht when we try to pproimte solutions we

More information

UNIT 1 FUNCTIONS AND THEIR INVERSES Lesson 1.4: Logarithmic Functions as Inverses Instruction

UNIT 1 FUNCTIONS AND THEIR INVERSES Lesson 1.4: Logarithmic Functions as Inverses Instruction Lesson : Logrithmic Functions s Inverses Prerequisite Skills This lesson requires the use of the following skills: determining the dependent nd independent vribles in n exponentil function bsed on dt from

More information

Math 113 Fall Final Exam Review. 2. Applications of Integration Chapter 6 including sections and section 6.8

Math 113 Fall Final Exam Review. 2. Applications of Integration Chapter 6 including sections and section 6.8 Mth 3 Fll 0 The scope of the finl exm will include: Finl Exm Review. Integrls Chpter 5 including sections 5. 5.7, 5.0. Applictions of Integrtion Chpter 6 including sections 6. 6.5 nd section 6.8 3. Infinite

More information

P 3 (x) = f(0) + f (0)x + f (0) 2. x 2 + f (0) . In the problem set, you are asked to show, in general, the n th order term is a n = f (n) (0)

P 3 (x) = f(0) + f (0)x + f (0) 2. x 2 + f (0) . In the problem set, you are asked to show, in general, the n th order term is a n = f (n) (0) 1 Tylor polynomils In Section 3.5, we discussed how to pproximte function f(x) round point in terms of its first derivtive f (x) evluted t, tht is using the liner pproximtion f() + f ()(x ). We clled this

More information

Line and Surface Integrals: An Intuitive Understanding

Line and Surface Integrals: An Intuitive Understanding Line nd Surfce Integrls: An Intuitive Understnding Joseph Breen Introduction Multivrible clculus is ll bout bstrcting the ides of differentition nd integrtion from the fmilir single vrible cse to tht of

More information

Reversals of Signal-Posterior Monotonicity for Any Bounded Prior

Reversals of Signal-Posterior Monotonicity for Any Bounded Prior Reversls of Signl-Posterior Monotonicity for Any Bounded Prior Christopher P. Chmbers Pul J. Hely Abstrct Pul Milgrom (The Bell Journl of Economics, 12(2): 380 391) showed tht if the strict monotone likelihood

More information

Non-Linear & Logistic Regression

Non-Linear & Logistic Regression Non-Liner & Logistic Regression If the sttistics re boring, then you've got the wrong numbers. Edwrd R. Tufte (Sttistics Professor, Yle University) Regression Anlyses When do we use these? PART 1: find

More information

2D1431 Machine Learning Lab 3: Reinforcement Learning

2D1431 Machine Learning Lab 3: Reinforcement Learning 2D1431 Mchine Lerning Lb 3: Reinforcement Lerning Frnk Hoffmnn modified by Örjn Ekeberg December 7, 2004 1 Introduction In this lb you will lern bout dynmic progrmming nd reinforcement lerning. It is ssumed

More information

DISCRETE MATHEMATICS HOMEWORK 3 SOLUTIONS

DISCRETE MATHEMATICS HOMEWORK 3 SOLUTIONS DISCRETE MATHEMATICS 21228 HOMEWORK 3 SOLUTIONS JC Due in clss Wednesdy September 17. You my collborte but must write up your solutions by yourself. Lte homework will not be ccepted. Homework must either

More information

Chapter 14. Matrix Representations of Linear Transformations

Chapter 14. Matrix Representations of Linear Transformations Chpter 4 Mtrix Representtions of Liner Trnsformtions When considering the Het Stte Evolution, we found tht we could describe this process using multipliction by mtrix. This ws nice becuse computers cn

More information

Fig. 1. Open-Loop and Closed-Loop Systems with Plant Variations

Fig. 1. Open-Loop and Closed-Loop Systems with Plant Variations ME 3600 Control ystems Chrcteristics of Open-Loop nd Closed-Loop ystems Importnt Control ystem Chrcteristics o ensitivity of system response to prmetric vritions cn be reduced o rnsient nd stedy-stte responses

More information

Genetic Programming. Outline. Evolutionary Strategies. Evolutionary strategies Genetic programming Summary

Genetic Programming. Outline. Evolutionary Strategies. Evolutionary strategies Genetic programming Summary Outline Genetic Progrmming Evolutionry strtegies Genetic progrmming Summry Bsed on the mteril provided y Professor Michel Negnevitsky Evolutionry Strtegies An pproch simulting nturl evolution ws proposed

More information

MORE FUNCTION GRAPHING; OPTIMIZATION. (Last edited October 28, 2013 at 11:09pm.)

MORE FUNCTION GRAPHING; OPTIMIZATION. (Last edited October 28, 2013 at 11:09pm.) MORE FUNCTION GRAPHING; OPTIMIZATION FRI, OCT 25, 203 (Lst edited October 28, 203 t :09pm.) Exercise. Let n be n rbitrry positive integer. Give n exmple of function with exctly n verticl symptotes. Give

More information

Bernoulli Numbers Jeff Morton

Bernoulli Numbers Jeff Morton Bernoulli Numbers Jeff Morton. We re interested in the opertor e t k d k t k, which is to sy k tk. Applying this to some function f E to get e t f d k k tk d k f f + d k k tk dk f, we note tht since f

More information

Stuff You Need to Know From Calculus

Stuff You Need to Know From Calculus Stuff You Need to Know From Clculus For the first time in the semester, the stuff we re doing is finlly going to look like clculus (with vector slnt, of course). This mens tht in order to succeed, you

More information

Convergence of Fourier Series and Fejer s Theorem. Lee Ricketson

Convergence of Fourier Series and Fejer s Theorem. Lee Ricketson Convergence of Fourier Series nd Fejer s Theorem Lee Ricketson My, 006 Abstrct This pper will ddress the Fourier Series of functions with rbitrry period. We will derive forms of the Dirichlet nd Fejer

More information

ARITHMETIC OPERATIONS. The real numbers have the following properties: a b c ab ac

ARITHMETIC OPERATIONS. The real numbers have the following properties: a b c ab ac REVIEW OF ALGEBRA Here we review the bsic rules nd procedures of lgebr tht you need to know in order to be successful in clculus. ARITHMETIC OPERATIONS The rel numbers hve the following properties: b b

More information

4. GREEDY ALGORITHMS I

4. GREEDY ALGORITHMS I 4. GREEDY ALGORITHMS I coin chnging intervl scheduling scheduling to minimize lteness optiml cching Lecture slides by Kevin Wyne Copyright 2005 Person-Addison Wesley http://www.cs.princeton.edu/~wyne/kleinberg-trdos

More information

Probabilistic Investigation of Sensitivities of Advanced Test- Analysis Model Correlation Methods

Probabilistic Investigation of Sensitivities of Advanced Test- Analysis Model Correlation Methods Probbilistic Investigtion of Sensitivities of Advnced Test- Anlysis Model Correltion Methods Liz Bergmn, Mtthew S. Allen, nd Dniel C. Kmmer Dept. of Engineering Physics University of Wisconsin-Mdison Rndll

More information

Polynomial Approximations for the Natural Logarithm and Arctangent Functions. Math 230

Polynomial Approximations for the Natural Logarithm and Arctangent Functions. Math 230 Polynomil Approimtions for the Nturl Logrithm nd Arctngent Functions Mth 23 You recll from first semester clculus how one cn use the derivtive to find n eqution for the tngent line to function t given

More information

2.4 Linear Inequalities and Interval Notation

2.4 Linear Inequalities and Interval Notation .4 Liner Inequlities nd Intervl Nottion We wnt to solve equtions tht hve n inequlity symol insted of n equl sign. There re four inequlity symols tht we will look t: Less thn , Less thn or

More information