Modeling Residual-Geometric Flow Sampling


Xiaoming Wang, Xiaoyong Li, and Dmitri Loguinov

Abstract

Traffic monitoring and estimation of flow parameters in high-speed routers have recently become challenging as the Internet grew in both scale and complexity. In this paper, we focus on a family of flow-size estimation algorithms we call Residual-Geometric Sampling (RGS), which generates a random point within each flow according to a geometric random variable and records all remaining packets in a flow counter. Our analytical investigation shows that previous estimation algorithms based on this method exhibit certain bias in recovering flow statistics from the sampled measurements. To address this problem, we derive a novel set of unbiased estimators for RGS, validate them using real Internet traces, and show that they provide an accurate and scalable solution to Internet traffic monitoring.

I. INTRODUCTION

Recent growth of the Internet in both scale and complexity has imposed a number of challenges on network management, operation, and traffic monitoring. The main problem in this line of work is to scale measurement algorithms to achieve certain objectives (e.g., accuracy) while satisfying real-time resource constraints (e.g., fixed memory consumption and per-packet processing delay) of high-speed Internet routers. This is commonly accomplished (e.g., [5], [6], [7], [8], [9], [10], [11], [14], [15], [17], [21], [18], [19], [20], [22], [26], [32]) by reducing the amount of information a router has to store in its internal tables, which comes at the expense of deploying special estimation techniques that can recover metrics of interest from the collected samples.

In this paper, we study two problems in the general area of measuring flow sizes: 1) determining the number of packets transmitted by elephant flows [11], [15], [17], [21], [20], [22] and 2) building the distribution of flow sizes seen by the router in some time window [7], [18], [32], coupled in a single measurement technique.
The former problem arises in usage-based accounting and traffic engineering [6], [11], [12], [13], [27], while the latter has many security applications such as anomaly and intrusion detection [1], [23], [16].

Our interest falls within the family of residual sampling, which selects a random point $A$ within each flow and then samples the remainder $R$ of that flow until it ends. Denoting by $L$ the size (in packets) of a random flow, sampled residuals $R$ are simply $L - A$. Stochastically larger $A$ results in fewer flows being sampled and leads to lower overhead in terms of both CPU and RAM consumption. Besides reduced overhead arising from omission of many small-size flows from counter tables, residual sampling guarantees to capture large flows with probability $1 - o(1)$ as their size $L \to \infty$. This allows ISPs to determine heavy-hitters and charge the corresponding customers for generated traffic.

While in P2P networks residual sampling distributes the initial point $A$ uniformly within user lifetimes [31], flow-based estimation [11], [17] usually employs geometric $A$ since it can be easily implemented with a sequence of independent Bernoulli variables. We call the resulting approach Residual-Geometric Sampling (RGS) and note that it has received some limited analytical attention in [11], [17]; however, unbiased estimation of individual flow sizes, analysis of the resulting error, asymptotically accurate recovery of flow-size distribution $P(L = i)$ from sampled residuals $R$, and analysis of space-CPU requirements in steady state have not been explored. We overcome these issues below.

A shorter version of this paper appeared in IEEE INFOCOM. Xiaoming Wang is with Amazon.com, Seattle, WA USA (xmwang@gmail.com). Xiaoyong Li and Dmitri Loguinov are with Texas A&M University, College Station, TX USA ({xiaoyong, dmitri}@cse.tamu.edu).

A. Single-Flow Usage

We start with the problem of obtaining sizes of individual flows for accounting purposes.
Since residual sampling requires an estimator to convert residuals into the metrics of interest, our first task is to define proper notation and desired properties for the estimation algorithm. Assume that for a flow of size $L$ the sampling algorithm produces residual $R_L$, where both $L$ and $R_L$ are random variables. We call an estimator $e(R_L)$ unbiased if its expectation produces the correct flow size, i.e., $E[e(R_L) \mid L = l] = E[e(R_l)] = l$. Unbiased estimation allows one to average the estimated size of several flows of a given size $l$ and accurately estimate their total contribution. We further call an estimator elephant-accurate if ratio $e(R_l)/l$ converges to 1 in mean-square as $l \to \infty$. Elephant-accuracy ensures that the variance of $e(R_l)/l$ tends to zero as $l \to \infty$, which means that the amount of relative error between $e(R_l)$ and $l$ becomes negligible for large flows.

Prior work on RGS [11], [17] has suggested the following estimator:

$$e(R_L) = R_L - 1 + \frac{1}{p}, \tag{1}$$

where $0 < p \le 1$ is the parameter of geometric variable $A$. To understand the performance of (1), we first build a general probabilistic model for residual-geometric sampling and derive the relationship between $L$ and its residual $R_L$. Using this result, we prove that:

$$E[e(R_l)] = \frac{l}{1 - (1-p)^l}, \tag{2}$$

which indicates that (1) is generally biased and on average tends to overestimate the original flow size by a factor of up to $1/p$. To address this problem, we derive a different estimator:

$$\hat{e}(R_L) = R_L - 1 + \frac{1 - (1-p)^{R_L}}{p} \tag{3}$$

and prove that it is both unbiased and elephant-accurate. We also derive in closed form the mean-square error $\delta_l = E[(\hat{e}(R_l)/l - 1)^2]$ for finite $l$, which can be used to determine when (3) approximates the true flow size with accuracy sufficient for billing purposes.

B. Flow-Size Distribution

Our second problem is estimation of the original flow-size probability mass function (PMF), which we assume is given by $f_i = P(L = i)$, $i = 1, 2, \ldots$ We call PMF estimator $q_i$ asymptotically unbiased if it converges in probability to $f_i$ for all $i$ as the number of sampled flows $M \to \infty$. One may be at first tempted to compute this distribution based on the values produced by either (1) or (3) for each observed flow; however, we show that such $q_i$ almost always differ from the original distribution $f_i$ and the bias persists as sample size $M \to \infty$. The reason for this discrepancy is that $e(\cdot)$ and $\hat{e}(\cdot)$ both estimate the sizes of flows that have been sampled by the algorithm, which are not representative of the entire population passing through the router. Since longer flows are more likely to be selected by residual sampling, this approach severely overestimates their fraction and thus skews the PMF towards the tail.

Denote by $M_i$ the number of sampled flows with $R_L = i$ and define a new estimator:

$$\tilde{q}_i = \frac{M_i - (1-p)M_{i+1}}{Mp + (1-p)M_1}. \tag{4}$$

Using the general model of RGS derived later in the paper, we prove that $\tilde{q}_i$ tends to $f_i$ in probability as $M = \sum_i M_i \to \infty$ and obtain the amount of error $|\tilde{q}_i - f_i|$ for finite $M$. We also provide asymptotically unbiased estimators for the total number of flows $n$:

$$\tilde{n} = M + \frac{1-p}{p} M_1 \tag{5}$$

and the number of flows $n_i$ with exactly $i$ packets:

$$\tilde{n}_i = \frac{M_i - (1-p)M_{i+1}}{p}, \tag{6}$$

where $\tilde{n}/n \to 1$ and $\tilde{n}_i/n_i \to 1$, both in probability, as $M \to \infty$. We call the resulting combination (3), (4)-(6) Unbiased Residual-Geometric Estimators (URGE).

C. Implementation and Evaluation

We finish the paper by discussing an efficient implementation of the above algorithms and evaluating their accuracy and performance using several Internet traces.
Prior work has not discussed how residual sampling should be implemented or its overhead in steady state, which prompts a fairly detailed exposition below. We assume URGE uses a chain-linked hash table of size $K$, which keeps individual flow counters. Each linked list is sorted according to the flow ID and is traversed linearly until a match is found or an ID larger than the one being sought is encountered. Keeping the list sorted (as opposed to FIFO) reduces the lookup delay by half for flows not already in the table. To reduce RAM overhead, we remove flows from the table if they have completed (i.e., FIN or RST packets are detected) or if no packets from these flows arrive within some timeout $\tau$. To keep the overhead manageable, the removal process is run over the entire table on the timescale of seconds or even minutes.

As before, assume that the router sees a total of $n$ flows in window $[0, T]$. Then, denote by $N(t)$ the number of active flows at time $t$ and by $M(t)$ the number of them sampled by the router. It then follows that memory consumption $W_R(t)$ and lookup delay $T_R(t)$ are both functions of $M(t)$. Under certain mild assumptions, we obtain a simple result on $E[M(t)]$ and show that even as the total number of flows $n \to \infty$, both RAM usage and CPU overhead of RGS remain constant. We then explore how to satisfy the tradeoff between three design objectives (memory consumption, processing speed, and accuracy) using parameters $K$ and $p$. Given upper bounds on memory usage $W_0$ and per-packet processing delay $T_0$, we propose a technique for deciding $K$ based on the above analysis such that $W_R(t) \le W_0$ and $T_R(t) \le T_0$ are satisfied, while maximizing $p$ at the same time (i.e., achieving the best accuracy within the constraints).

We finish the paper by evaluating URGE with real Internet traces obtained from NLANR [24] and CAIDA [3].
Our experiments reveal that the proposed algorithm produces very accurate estimation of flow metrics and thus allows one to perform more aggressive sampling (i.e., smaller probability $p$) of the monitored traffic. With $p = 0.01$, we find that $E[M(t)]$ is [...] times smaller than $n$ and [...] times smaller than $E[M]$, with most lookups requiring just 1-2 RAM hits. We also discover in the experiments with small traces that URGE does not degrade significantly in terms of accuracy even for small sample sizes, which makes it suitable for monitoring individual customer networks and certain protocols.

The remainder of the paper is organized as follows. We review prior work on traffic monitoring in Section II. We then develop a probabilistic model for residual-geometric sampling in Section III, analyze previous methods in Section IV, and propose the new estimators in Section V. We explore the implementation of the suggested framework in Section VI, evaluate its performance in Section VII, and conclude the paper in Section VIII.

II. RELATED WORK

In this section, we review several sampling algorithms in the area of traffic monitoring. In particular, we classify existing work into two categories: packet sampling and flow sampling, where the former makes per-packet and the latter per-flow decisions to sample incoming traffic.

A. Packet Sampling

Sampled NetFlow (SNF) [26] is a widely used technique in which incoming packets are sampled with a fixed probability $p$. The general goal of SNF is to obtain the PMF of flow sizes; however, [14] shows that it is impossible to accurately recover the original flow-size distribution from sampled SNF data. Estan et al. [10] propose Adaptive NetFlow (ANF), which adjusts the sampling probability according to the size of the flow table; however, ANF's bias in the sampled data is equivalent to that in SNF and is similarly difficult to overcome in practice.

Instead of using one uniform probability $p$ for all flows as in [10], [26], another direction in packet sampling is to compute $p_i(c)$ for each flow $i$ based on its currently observed size $c$. This approach has been studied by two independent papers, Sketch-Guided Sampling (SGS) [20] and Adaptive Non-Linear Sampling (ANLS) [15]. A common feature of these two methods is to sample a new flow with probability 1 and then monotonically decrease $p_i(c)$ as $c$ grows. Both methods must maintain a counter for each flow present in the network and are difficult to scale due to the high RAM/CPU usage.

B. Flow Sampling

In flow thinning [14], each flow is sampled independently with probability $p$ and then all packets in sampled flows are counted. Hohn et al. [14] show that flow thinning is able to accurately estimate the flow-size distribution; however, this method typically misses a fraction $1 - p$ of elephant flows and thus does not support applications such as usage-based accounting and traffic engineering [6], [11], [12], [13], [27]. For highly skewed distributions with a few extremely large flows and many short ones (which is typical for Internet links), this method may also take a long time to converge.

To address these problems of flow thinning, Estan et al. [11] introduce a size-dependent flow sampling algorithm called Sample-and-Hold (S&H), which is proposed to identify elephant flows. For each packet from a new flow, the algorithm creates a flow counter with probability $p$; once a flow is sampled, all of its subsequent packets are then counted. It is easy to verify that S&H samples a flow with size $l$ with probability $1 - (1-p)^l$, which quickly approaches 1 as $l$ grows. Creating a unifying analytical model for this approach and understanding the properties of samples it collects is the main topic of this paper.

Another direction of size-dependent flow sampling has been explored by Duffield et al. in [5], [6], [8], which present another size-dependent flow measurement method called Smart Sampling. Their approach selects each flow of size $L$ with probability $p(L) = \min(1, L/z)$, where $z$ is some constant. Since this method requires $L$ before deciding whether to sample the flow or not, it can only be applied off-line. Kompella et al. [17] examine a method called Flow Slicing (FS), which combines SNF and S&H with a variant of smart sampling. Other non-sampling methods include exact counting [25], [28], [30], [33] and lossy counting [18], [22], which are orthogonal to our work.

III. UNDERLYING MODEL

In this section, we build a general probabilistic model of Sample-and-Hold [11] and establish the necessary analytical foundation for the results that follow.

A. Sample-and-Hold

Consider a sequence of packets traversing a router and assume that its flow-measurement algorithm checks each packet's flow identifier $x$ in some RAM table. If $x$ is found in the table, the corresponding counter is incremented by 1; otherwise, with probability $p$ a new entry for $x$ is created in the table (with counter value 1) and with probability $1 - p$ the packet is ignored.

Fig. 1. Residual-geometric sampling of a flow with size $L$: the packets in a flow split into age $A_L$ (discarded packets) followed by residual $R_L$ (sampled packets).

To model this process, we first need several definitions. Assume that flow sizes are i.i.d. random variables and define geometric age $A_L$ to be the number of packets discarded from the front of a flow with size $L$ before it is sampled (see Fig. 1). Let $G$ be a shifted geometric random variable with success probability $p$, i.e., $P(G = j) = (1-p)^j p$ for $j = 0, 1, \ldots$ It thus follows that $A_L$ is simply:

$$A_L = \min(G, L). \tag{7}$$

Now define geometric residual $R_L$ to be the final counter value of a flow of size $L$ conditioned on the fact that it has been sampled (i.e., $A_L < L$):

$$R_L = L - A_L, \tag{8}$$

which is also illustrated in Fig. 1.
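The per-packet rule and the age/residual definitions (7)-(8) describe the same process, which a short simulation can make concrete. The sketch below is illustrative only: the function names `rgs_residual` and `rgs_residual_via_age` are hypothetical, a single flow is simulated in isolation, and `None` is used to mark a flow that was never sampled ($A_L = L$).

```python
import random


def rgs_residual(flow_size, p, rng):
    """Per-packet view: return the final counter value, or None if never sampled."""
    counter = 0                      # 0 means the flow has no table entry yet
    for _ in range(flow_size):
        if counter > 0:
            counter += 1             # flow already in the table: count the packet
        elif rng.random() < p:
            counter = 1              # create a new entry with counter value 1
        # otherwise the packet is discarded
    return counter if counter > 0 else None


def rgs_residual_via_age(flow_size, p, rng):
    """Age view: draw G, set A_L = min(G, L) as in (7), return L - A_L as in (8)."""
    g = 0
    while rng.random() >= p:         # count discarded packets before the first success
        g += 1
        if g >= flow_size:           # A_L = L: the flow ended before being sampled
            return None
    return flow_size - g


if __name__ == "__main__":
    rng = random.Random(1)
    print([rgs_residual(20, 0.1, rng) for _ in range(5)])
```

Both functions consume one random draw per discarded packet plus one for the sampled packet, so with identical seeds they return identical residuals.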
From the perspective of traffic monitoring in this paper, geometric residual $R_L$ is the only quantity collected during measurement and available to an estimation algorithm. Since this approach belongs to the class of residual-sampling techniques [31] and specifically uses geometric age, this paper calls S&H by a more mathematically specific name, Residual-Geometric Sampling (RGS).

Assume that $L$ has a PMF $f_i = P(L = i)$, where $i = 1, 2, \ldots$, and denote by $p_s = P(A_L < L)$ the probability that a random flow is sampled. Then, we have the following result.

Lemma 1: Probability $p_s$ that a flow is selected by RGS is:

$$p_s = E[1 - (1-p)^L] = 1 - \sum_{i=1}^{\infty} f_i (1-p)^i. \tag{9}$$

Proof: Observe that for a fixed $L = l$, we have $P(A_l < l) = 1 - (1-p)^l$. Unconditioning $L$, we immediately get (9).

Next, let $h_i = P(R_L = i)$ be the PMF of geometric residual $R_L$. The following lemma expresses $h_i$ in terms of $f_i$.

Lemma 2: The PMF of geometric residual $R_L$ is:

$$h_i = \frac{p \sum_{j=i}^{\infty} f_j (1-p)^{j-i}}{p_s}. \tag{10}$$

Proof: Using (8), we have:

$$h_i = P(R_L = i) = P(L - A_L = i \mid A_L < L) = \frac{P(L - A_L = i \cap A_L < L)}{p_s}, \tag{11}$$

where $p_s = P(A_L < L)$. Substituting (7) into (11) and combining the fact that $L - G = i \ge 1$, we establish:

$$h_i = \frac{P(L - G = i)}{p_s} = \frac{\sum_{j=i}^{\infty} P(G = j - i) f_j}{p_s}, \tag{12}$$

which gives the desired result in (10) by substituting the PMF of $G$ into (12).

The result of Lemma 2 is fundamental as most of the results in this paper are conveniently derived from (10).

B. Fixed Flow Size

We next analyze a special case of residual sampling where the original flow size is fixed at $L = l$. Note that residuals are now $R_l$ instead of $R_L$ since the original flow size is no longer a random variable. Recall that the goal of single-flow size estimation is to obtain $l$ from $R_l$ for each sampled flow. The next corollary follows from (10) and gives the distribution and expectation of geometric residual $R_l$.

Corollary 1: Given $L = l$, the PMF of $R_l$ is:

$$P(R_l = i) = \frac{(1-p)^{l-i} p}{1 - (1-p)^l} \tag{13}$$

and its expectation is:

$$E[R_l] = \frac{l}{1 - (1-p)^l} + 1 - \frac{1}{p}. \tag{14}$$

Proof: For $L = l$, we have $f_l = 1$ and $f_i = 0$ for all $i \ne l$. Writing $p_s = 1 - (1-p)^l$, we get from (10):

$$P(R_l = i) = \frac{p \sum_{j=i}^{\infty} f_j (1-p)^{j-i}}{1 - (1-p)^l} = \frac{(1-p)^{l-i} p}{1 - (1-p)^l}, \tag{15}$$

which is exactly (13). We next derive expectation $E[R_l]$, which can be expanded into:

$$E[R_l] = E[l - A_l \mid A_l < l] = l - E[G \mid G < l]. \tag{16}$$

Recall that for any non-negative discrete random variable $Y$ taking values over the integer set $\{0, 1, \ldots\}$, its expectation is given by $E[Y] = \sum_{y=0}^{\infty} P(Y > y)$. It thus follows that (16) reduces to:

$$E[R_l] = l - \sum_{j=0}^{l-1} P(G > j \mid G < l) = \sum_{j=0}^{l-1} P(G \le j \mid G < l) = \sum_{j=0}^{l-1} \frac{P(G \le j)}{P(G < l)}. \tag{17}$$

Substituting $P(G \le j) = 1 - (1-p)^{j+1}$ into (17), we have:

$$E[R_l] = \frac{\sum_{j=0}^{l-1} [1 - (1-p)^{j+1}]}{1 - (1-p)^l} = \frac{l - (1-p)(1 - (1-p)^l)/p}{1 - (1-p)^l}, \tag{18}$$

which can be simplified to (14).

Next, we apply the results obtained in this section to analyze existing estimation methods that have been proposed for RGS.

IV. ANALYSIS OF EXISTING METHODS

In this section, we examine prior approaches [11], [17] to estimating single-flow usage and whether their results can be generalized to recover the PMF of $L$.

A. Single-Flow Usage

To evaluate single-flow estimators, we use the following definition that is commonly used in statistics [2].

Definition 1: Estimator $e(R_l)$ is called unbiased if $E[e(R_l)] = l$ for all $l \ge 1$.

Unbiased estimation is a key property of an estimator as it allows accurate estimation of the total contribution from a sufficiently large pool of flows (e.g., one customer network). However, since large flows are typically rare, one commonly faces an additional requirement to estimate their size with just a single sample $e(R_l)$, which is formalized in the next definition.

Definition 2: Estimator $e(R_l)$ is called elephant-accurate if $e(R_l)/l \to 1$ in mean-square as $l \to \infty$.

Elephant-accuracy guarantees that the amount of relative error between $e(R_l)$ and $l$ decays to zero as $l \to \infty$.

As before, suppose that a flow of size $l$ produces a counter with value $R_l$. Recall that [11], [17] suggest the following estimator:

$$e(R_l) = R_l - 1 + \frac{1}{p}, \tag{19}$$

where $p$ is the probability of residual-geometric sampling. The next result directly follows from (14).

Theorem 1: Expectation $E[e(R_l)]$ is given by:

$$E[e(R_l)] = \frac{l}{1 - (1-p)^l}. \tag{20}$$

Proof: Taking the expectation of (19), we have:

$$E[e(R_l)] = E[R_l] - 1 + \frac{1}{p}, \tag{21}$$

which immediately leads to (20) using (14).

Note that (20) indicates that (19) is generally biased, especially when $l$ is small. Indeed, for $pl \approx 0$, we have $1 - (1-p)^l \approx pl$ and $E[e(R_l)] \approx 1/p$ regardless of $l$, which shows that in such cases $E[e(R_l)]$ carries no information about the original flow size. However, as $l \to \infty$, it is straightforward to verify that the bias in $e(R_l)$ vanishes exponentially, which is consistent with the analysis in [17], which has only considered the case of $l \to \infty$.

To see the extent of bias in (19) and verify (20), we apply residual-geometric sampling to flows of size $l$ ranging from 1 to $10^6$ packets, feed the measured sizes to (19), and average the result after 1000 iterations for each $l$. Fig. 2 plots the obtained $E[e(R_l)]$ along with (20).

Fig. 2. Expectation of estimator (19) in simulations and its model (20), plotted as estimated size against the unbiased reference line: (a) $p = 0.01$; (b) $p =$ [...].

The figure indicates that (20) indeed captures the bias and that (19) tends to overestimate the size of short flows even in expectation, where smaller sampling probability $p$ leads to more error.
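The bias in (20) can also be checked without simulation by summing estimator (19) directly against the residual PMF (13). The following is a minimal sketch under that approach; the function names are hypothetical and chosen only for this illustration.

```python
def residual_pmf(l, p):
    """PMF (13): P(R_l = i) for i = 1..l, returned as a list whose entry 0 is i = 1."""
    q = 1.0 - p
    norm = 1.0 - q ** l
    return [q ** (l - i) * p / norm for i in range(1, l + 1)]


def naive_estimate(r, p):
    """Estimator (19): e(r) = r - 1 + 1/p."""
    return r - 1 + 1.0 / p


def expected_naive(l, p):
    """E[e(R_l)] obtained by weighting (19) with the PMF (13)."""
    pmf = residual_pmf(l, p)
    return sum(naive_estimate(i, p) * w for i, w in enumerate(pmf, start=1))


def model_20(l, p):
    """Closed form (20): l / (1 - (1 - p)^l)."""
    return l / (1.0 - (1.0 - p) ** l)


if __name__ == "__main__":
    for l in (1, 10, 100, 1000):
        print(l, expected_naive(l, 0.01), model_20(l, 0.01))
```

For $l = 1$ the two columns both equal $1/p$, which is the extreme case of overestimation discussed above; as $l$ grows past $1/p$ they approach $l$.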

Fig. 3. RRMSE of (19) in simulations and its model (23): (a) $p = 0.01$; (b) $p =$ [...].

Fig. 4. Distribution $\{q_i\}$ in simulations and its model (24): (a) $p =$ [...]; (b) $p =$ [...].

To quantify the error of individual values $e(R_l)$ in estimating $l$ and to understand elephant-accuracy, denote by $Y_l = e(R_l)/l$ and define the Relative Root Mean Square Error (RRMSE) to be:

$$\delta_l = \sqrt{E[(Y_l - 1)^2]}. \tag{22}$$

Note that $\delta_l \to 0$ indicates that $Y_l \to 1$ in mean-square and thus implies elephant-accurate estimation. The next result derives $\delta_l$ in closed form. We omit the rather tedious derivations for brevity.

Theorem 2: The RRMSE of (19) is given by:

$$\delta_l = \sqrt{\frac{1 - p - l^2 p^2 (1-p)^l - (1-p)^{l+1}}{l^2 p^2 (1 - (1-p)^l)}}. \tag{23}$$

Observe from (23) that for flows with size $l = 1$, the relative error is $(1-p)/p$, but as $l \to \infty$, $\delta_l \to 0$ and the estimator is elephant-accurate. Fig. 3 plots (23) against simulations, indicating a close match. The figure also shows that the RRMSE starts from roughly $1/p$ and decreases towards zero as $\Theta(1/l)$ as $l \to \infty$.

B. Flow-Size Distribution

We now investigate whether $e(R_L)$ defined in (19) can be used to estimate the actual flow-size distribution $\{f_i\}_{i=1}^{\infty}$. Denote by $q_i = P(e(R_L) = i)$ the PMF of estimated sizes among the sampled flows. To understand our objectives with approximating the PMF of $L$, a definition is in order.

Definition 3: An estimator $\{q_i\}_{i=1}^{\infty}$ of PMF $\{f_i\}_{i=1}^{\infty}$ is called asymptotically unbiased if $q_i$ converges in probability to $f_i$ for all $i$ as the number of sampled flows $M \to \infty$.

The next theorem follows directly from (10).

Theorem 3: The PMF of flow sizes estimated from (19) is given by:

$$q_i = \frac{p \sum_{j=y(i)}^{\infty} f_j (1-p)^{j - y(i)}}{p_s}, \tag{24}$$

where $y(i) = i + 1 - 1/p$ and $p_s$ is in (9).

The result in (24) indicates that each $q_i$ is different from $f_i$ regardless of the sampling duration and thus cannot be used to approximate the flow-size distribution. We verify (24) with a simulated packet stream with 5M flows, where flow sizes follow a power-law distribution $P(L \le i) = 1 - i^{-\alpha}$ for $i = 1, 2, \ldots$ and $\alpha = 1.1$. Fig. 4 plots the CCDF of random variable $e(R_L)$ obtained from simulations as well as model (24), both in comparison to the tail of the actual distribution. The figure shows that (24) accurately predicts the values obtained from simulations and that PMF $\{q_i\}$ is indeed quite different from $\{f_i\}$.

So far, our study of existing methods in residual-geometric sampling has shown that they are not only generally biased, but also unable to recover the flow-size distribution from residuals $R_L$. This motivates us to seek better estimation approaches, which we perform next.

V. URGE

This section proposes a family of algorithms called Unbiased Residual-Geometric Estimators (URGE), proves their accuracy, and verifies them in simulations.

A. Single-Flow Usage

For estimating individual flow sizes, we first consider an estimator directly implied by the result in (14). Notice that solving (14) for $l$ and expressing $l$ in terms of $E[R_l]$, we get:

$$l = u - \frac{1}{\log(1-p)} W\big(u (1-p)^u \log(1-p)\big), \tag{25}$$

where $u = E[R_l] + 1/p - 1$ and $W(z)$ is Lambert's function (i.e., a multi-valued solution to $W e^W = z$) [4]. Thus, a possible estimator can be computed from (25) with $E[R_l]$ replaced by the measured value of geometric residual $R_l$. However, there are two reasons that (25) is a bad estimator of flow sizes. First, Lambert's function $W(z)$ has no closed-form solution and has to be numerically solved using tools such as Matlab. Second, it can be verified (not shown here for brevity) that (25) is not an unbiased estimator.

Instead, we define a new estimator:

$$\hat{e}(R_l) = R_l - 1 + \frac{1 - (1-p)^{R_l}}{p} \tag{26}$$

and next show that it is unbiased.

Lemma 3: Estimator $\hat{e}(R_l)$ in (26) is unbiased, i.e.,

$$E[\hat{e}(R_l)] = l. \tag{27}$$

Proof: We prove (27) by deriving a function $\psi(R_l)$ that satisfies $E[\psi(R_l)] = l$. First, it follows from (13) that:

$$E[\psi(R_l)] = \sum_{j=1}^{l} \psi(j) P(R_l = j) = \sum_{j=1}^{l} \frac{\psi(j)(1-p)^{l-j} p}{1 - (1-p)^l}. \tag{28}$$

For $E[\psi(R_l)] = l$ to hold, we must have:

$$\sum_{j=1}^{l} \psi(j)(1-p)^{-j} = \frac{l \left(1 - (1-p)^l\right)}{p (1-p)^l}. \tag{29}$$

Writing (29) twice for $l$ and $l - 1$ and subtracting the two equations from each other, we get:

$$\psi(l)(1-p)^{-l} = \frac{1 + (l-1)p - (1-p)^l}{p (1-p)^l}. \tag{30}$$

Simplifying (30), we obtain (26).

Fig. 5. Expectation of estimator (26) in simulations, plotted as estimated size against the unbiased reference line: (a) $p = 0.01$; (b) $p =$ [...].

Fig. 6. RRMSE of (26) in simulations and model (31): (a) $p = 0.01$; (b) $p =$ [...].

We plot in Fig. 5 simulation results obtained from (26). The figure indicates that $\hat{e}(R_l)$ accurately estimates flow sizes for all flows in both cases of $p$. Next, we derive the RRMSE of URGE.

Theorem 4: The RRMSE of (26) is given by:

$$\hat{\delta}_l = \sqrt{\frac{1 - p + l p (p - 2)(1-p)^l - (1-p)^{2l+1}}{l^2 p^2 (1 - (1-p)^l)}}. \tag{31}$$

It is easy to verify from (31) that URGE has zero RRMSE for $l = 1$ or $l \to \infty$, confirming its elephant-accuracy. We plot $\hat{\delta}_l$ obtained from simulations along with the model in Fig. 6, which shows that (31) accurately tracks the actual relative error. From Figures 5-6, it is clear that $\hat{e}(R_l)$ significantly improves the accuracy of estimating small flow sizes compared to $e(R_l)$. In practice, (31) can be used to determine threshold $l_0$, which leads to desired bounds on error for all $l \ge l_0$ and allows ISPs to use $\hat{e}(R_l)$ instead of $l$.

B. Flow-Size Distribution

It is worth mentioning that while (26) produces unbiased estimation of flow sizes, $\hat{e}(R_L)$ is not suitable for computing the flow-size distribution, as we show below. Denote by $\hat{q}_i = P(\hat{e}(R_L) = i)$ the PMF of $\hat{e}(R_L)$. Then, we have the following result.

Lemma 4: The PMF of $\hat{e}(R_L)$ is given by:

$$\hat{q}_i = \frac{p}{p_s} \sum_{j=y(i)}^{\infty} (1-p)^{j - y(i)} f_j, \tag{32}$$

where $p_s$ is in (9), function $y(i)$ is:

$$y(i) = i + 1 - \frac{1}{p} - \frac{\omega}{\log(1-p)}, \tag{33}$$

and $\omega = W\big({-(1-p)^{i+1-1/p} \log(1-p)/p}\big)$.

Proof: We first solve

$$R_L + \frac{1}{p} - 1 - \frac{(1-p)^{R_L}}{p} = i \tag{34}$$

for $R_L$ and express it in terms of $i$, i.e., $R_L = y(i)$, where $y(i)$ is given by (33), ignoring approximate round-offs to the nearest integer. Combining with (10), we have:

$$\hat{q}_i = P(R_L = y(i)) = h_{y(i)}, \tag{35}$$

where $h_i$ is in (10). This directly leads to (32).
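Returning briefly to the single-flow estimator, both the unbiasedness claim (27) and the RRMSE model (31) can be checked numerically by weighting (26) with the residual PMF (13). The sketch below does this exactly (no random sampling); the function names are hypothetical, and the `max(..., 0.0)` guard is an added safeguard against floating-point round-off when the numerator of (31) is analytically zero.

```python
import math


def urge_size(r, p):
    """Estimator (26): r - 1 + (1 - (1 - p)^r) / p."""
    return r - 1 + (1.0 - (1.0 - p) ** r) / p


def mean_and_rrmse(l, p):
    """Exact E[e_hat(R_l)] and RRMSE of (26), summed over the PMF (13)."""
    q = 1.0 - p
    norm = 1.0 - q ** l
    mean = 0.0
    mse = 0.0
    for i in range(1, l + 1):
        w = q ** (l - i) * p / norm          # P(R_l = i) from (13)
        est = urge_size(i, p)
        mean += est * w
        mse += (est / l - 1.0) ** 2 * w
    return mean, math.sqrt(mse)


def rrmse_model_31(l, p):
    """Closed form (31)."""
    q = 1.0 - p
    num = 1.0 - p + l * p * (p - 2.0) * q ** l - q ** (2 * l + 1)
    den = l * l * p * p * (1.0 - q ** l)
    return math.sqrt(max(num, 0.0) / den)    # guard against tiny negative round-off


if __name__ == "__main__":
    for l in (1, 2, 10, 100, 1000):
        mean, rrmse = mean_and_rrmse(l, 0.01)
        print(l, mean, rrmse, rrmse_model_31(l, 0.01))
```

The same loop can be rerun with estimator (19) in place of `urge_size` to compare the two error curves for a chosen $p$.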
Notice from (32)-(33) that distribution $\hat{q}_i$ does not even remotely approximate the original PMF $f_i$. This problem is fundamental since residual sampling exhibits bias towards larger flows and even if we could recover $L$ from $R_L$ exactly, the distribution of sampled flow sizes would not accurately approximate that of all flows passing through the router.

We thus explore another technique for estimating the flow-size distribution. Before doing that, we need the next lemma.

Lemma 5: The distribution $f_i$ can be expressed using the PMF of geometric residuals $\{h_i\}$ in (10) as:

$$f_i = \frac{h_i - (1-p)h_{i+1}}{p + (1-p)h_1}. \tag{36}$$

Proof: From (10), we obtain that:

$$h_i - (1-p)h_{i+1} = \frac{p f_i}{p_s}. \tag{37}$$

It then immediately follows that $f_i$ is given by:

$$f_i = \frac{p_s (h_i - (1-p)h_{i+1})}{p}. \tag{38}$$

Notice that $p_s$ in (9) is a function of $\{f_i\}$, which are unknown from the measurement perspective. The last step of the proof is to express $p_s$ in terms of known quantities $\{h_i\}$, which can be accomplished by applying the normalization condition $\sum_{i=1}^{\infty} f_i = 1$. It is easy to verify that:

$$\sum_{i=1}^{\infty} h_i = 1 \quad \text{and} \quad \sum_{i=1}^{\infty} h_{i+1} = 1 - h_1. \tag{39}$$

Then, summing up both sides of (38) for $i$ from 1 to infinity gives us:

$$p_s = \frac{p}{\sum_{i=1}^{\infty} (h_i - (1-p)h_{i+1})} = \frac{p}{p + (1-p)h_1}, \tag{40}$$

which together with (38) establishes (36).

This result leads to a new estimator for the flow-size distribution:

$$\tilde{q}_i = \frac{M_i - (1-p)M_{i+1}}{Mp + (1-p)M_1}, \tag{41}$$

where $M$ is the total number of sampled flows and $M_i$ is the number of them with the geometric residual equal to $i$. Since $M_i/M \to h_i$ in probability as $M \to \infty$ (from the weak law of large numbers), we immediately get the following result.

Corollary 2: The estimator in (41) is asymptotically unbiased.

We next verify the accuracy of $\tilde{q}_i$ in simulations with 5M flows in the same setting as in the previous section. We plot in Fig. 7 the CCDF estimated from (41) along with the actual distribution. The figure shows that $\tilde{q}_i$ accurately follows the actual distribution for both cases of $p$.

Fig. 7. Estimator (41) in simulations: (a) $p = 0.01$, $M = 194{,}208$; (b) $p = 0.001$, $M = 26{,}233$.

Fig. 8. Estimator (41) in simulations with very small $p$: (a) $p =$ [...], $M = 3{,}090$; (b) $p = 10^{-5}$, $M = 337$.

C. Convergence Speed

We next examine the effect of sample size $M$ on the convergence of estimator $\tilde{q}_i$. To illustrate the problems arising from small $M$, we study (41) with $p =$ [...] and $10^{-5}$ in simulations with the same 5M flows. The estimator obtained $M = 3{,}090$ flows for the first value of $p$ and just $M = 337$ for $p = 10^{-5}$. Fig. 8 indicates that while the estimated curves under both choices of $p$ still approximate the trend of the original distribution, they exhibit different levels of noise. As the next result indicates, small $p$ leads to a small sample size $M$ and thus more noise in the estimated values.

Corollary 3: Suppose that $M$ flows are selected by residual-geometric sampling from a total of $n$ flows. Then, the expected value of $M$ is given by:

$$E[M] = n p_s = n E[1 - (1-p)^L]. \tag{42}$$

Proof: This result follows from the fact that $E[M] = n P(A_L < L) = n p_s$, where $p_s$ is given by (9).

To shed light on the choice of proper $p$ for RGS, we show how to determine the minimum $M$ that would guarantee a certain level of accuracy in $\tilde{q}_i$. Define $\tilde{h}_i = M_i/M$ to be an estimate of $h_i = P(R_L = i)$. The next lemma follows from Lemma 5 and Corollary 2 and indicates that the accuracy of $\tilde{q}_i$ directly depends on whether $\tilde{h}_i$ approximates $h_i$ accurately.

Lemma 6: Suppose that $|\tilde{h}_j - h_j| \le \eta h_j$ holds with probability $1 - \xi$ for $j \in [1, i+1]$, where $\eta$ and $\xi$ are small constants.
Then, there exists a constant ζ: ζ = η( + 2η(1 )h 1) + (1 )(1 η)h 1 (43) such that ζ 0 as η 0 and P ( q i f i ζf i ) = 1 ξ. Proof: We rove the result by deriving ζ that satisfies q i f i ζf i given that h j h j ηh j. From (36) and (41), we have: where and q i f i = a 1 a 2, (44) a 1 = ( h i h i ) + (1 )( h i+1 h i+1 ) + (1 ) (h 1 hi h 1 h i ) + (1 ) 2 (h 1 hi+1 h 1 h i+1 ) (45) a 2 = ( + (1 )h 1 )( + (1 ) h 1 ). (46) From the condition h j h j ηh j, we bound a 1 and a 2 as follows: and a 1 ηh i + 2η(1 )h 1 h i + η(1 )h i+1 + 2η(1 ) 2 h 1 h i+1 = η(h i + (1 )h i+1 )( + 2η(1 )h 1 ), (47) a 2 ( + (1 )h 1 )( + (1 )(1 η)h 1 ). (48) It thus follows from (36) and (47)-(48) that q i f i ζf i, where constant ζ is given by: ζ = η( + 2η(1 )h 1) + (1 )(1 η)h 1, (49) and that ζ 0 as η 0. Next, we obtain a bound on M from the requirement that h i be bounded in robability within a given range [h i (1 η), h i (1 + η)]. Theorem 5: For small constants η and ξ, h i h i ηh i holds with robability 1 ξ if samle size M is no less than: M (1 h i) h i η 2 ( Φ 1 (1 ξ/2) ) 2, (50) where Φ(x) is the CDF of the standard Gaussian distribution N (0, 1). Proof: Notice that M i is a random variable whose distribution is given by Binomial(M, h i ) and that hi can be aroximated by a Gaussian random variable with mean µ i = h i and variance σi 2 = h i(1 h i )/M. Define Z = h i µ i σ i, (51) which is a standard Gaussian random variable with mean 0 and variance 1. It follows that: P ( Z z) = 2Φ(z) 1, (52)

8 8 where Φ(.) is the CDF function of the standard Gaussian distribution N (0, 1). Therefore, we establish that: P ( h i h i zσ i ) = 2Φ(z) 1. (53) We can guarantee target accuracy by setting zσ i = ηϕ i and 2Φ(z) 1 = 1 ξ, which gives the following equality: ηh i σ i = Φ 1 (1 ξ/2). (54) Substituting σ i = h i (1 h i )/M into the above equation and solving for M, we obtain (50). For examle, to bound h i within 10% ercent of h i (i.e., η = 0.1) with robability 1 ξ = 95% for all h i, the following must hold: M ( ) , (55) which indicates that M = 38K flows must be samled to achieve target accuracy. If we reduce η to 1%, increase 1 ξ to 99%, and require the aroximation to hold for all h i 10 3, then M must be at least 66M flows. Converting η into ζ using (43), one can establish similar bounds on the deviation of q i from f i. D. Estimation of Other Flow Metrics Besides s and the flow-size distribution, URGE also rovides estimators for the total number of flows and the number of them with size i. Before introducing these estimators, we need the next lemma. Lemma 7: The exected number of flows with samled residuals R L = i is: E[M i ] = E[M]h i = nh i s, (56) where h i is the PMF of geometric residuals and s is given by (9). Proof: Writing: E[M i ] = np (A L < L R L = i) = np (R L = i A L < L)P (A L < L), (57) notice that (56) follows from the fact that P (R L = i A L < L) = h i and P (A L < L) = s. Based on this, we next develo two estimators and rove their accuracy. Let ñ be an estimator of the total number of flows n observed in the measurement window [0, T ]: ñ = M + 1 M 1 (58) and ñ i be an estimator of the number of flows n i with size i: ñ i = M i (1 )M i+1. (59) Then, the next result shows that both of these estimators are asymtotically unbiased. Lemma 8: Ratios ñ/n and ñ i /n i, for all i such that f i > 0, converge to 1 in robability as M. Proof: To rove convergence in robability, it suffices to show that E[ñ/n] = 1 and V ar[ñ/n] 0 as n. 
From (58), we have:

$$E[\tilde n] = E[M] + \frac{1-p}{p} E[M_1]. \tag{60}$$

Applying (42) and (56), we get:

$$E[\tilde n] = n p_s\left(1 + \frac{(1-p)h_1}{p}\right), \tag{61}$$

which simplifies to $E[\tilde n] = n$ using (40). To tackle the variance of $\tilde n/n$, first notice that $M$ can be represented as a sum of $n$ i.i.d. Bernoulli variables (i.e., $M = \sum_{j=1}^{n} A_j$), each with fixed probability $p_s$. Therefore:

$$\mathrm{Var}\left[\frac{M}{n}\right] = \frac{1}{n^2}\sum_{j=1}^{n} \mathrm{Var}[A_j] = \frac{p_s(1 - p_s)}{n}, \tag{62}$$

where the last term is bounded by $1/n$. Applying similar reasoning to $M_1$, we obtain that $\mathrm{Var}[\tilde n/n] \sim 1/n$. Since we assumed that the number of sampled flows $M \to \infty$, this implies that $n p_s \to \infty$ and thus from (40) that $n \to \infty$, which establishes that $\mathrm{Var}[\tilde n/n] \to 0$. Convergence in probability immediately follows (in fact, an even stronger convergence in mean-square holds, but this distinction is not essential in our context).

For the second part of the lemma, define $X_n = \tilde n_i/n$ and $Y_n = n_i/n$. We first prove that both $X_n$ and $Y_n$ converge in probability to $f_i$. We then argue that their ratio $X_n/Y_n$ converges to 1, also in probability. Using (56), (40), and finally (36), we have:

$$E[X_n] = \frac{E[M_i] - (1-p)E[M_{i+1}]}{pn} = \frac{p_s\bigl(h_i - (1-p)h_{i+1}\bigr)}{p} = \frac{h_i - (1-p)h_{i+1}}{p + (1-p)h_1} = f_i. \tag{63}$$

Since $n_i$ is the number of flows with size $i$, its expectation is $E[n_i] = nP(L = i) = n f_i$ and thus $E[Y_n] = f_i$. Using reasoning similar to that in the first half of this proof, we obtain that $\mathrm{Var}[X_n] \to 0$ and $\mathrm{Var}[Y_n] \to 0$, which shows convergence of these variables to $f_i$ in probability. For the final step, consider two sequences $\{X_n\}$ and $\{Y_n\}$ that converge to the same positive constant $f_i > 0$. Then, simple manipulation shows that their ratio converges to 1 in probability. We leave details to the reader.

Note that [17] provided an estimator similar to (58) and proved $E[\tilde n] = n$ using a different approach from ours; however, our results are stronger as they show convergence in probability and additionally address estimation of $n_i$. Simulations verifying (58)-(59) are omitted for brevity.

VI. IMPLEMENTATION

In this section, we implement URGE and examine its memory consumption and processing speed.
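As a starting point for the estimation side, the sample-size bound (50) and the estimators (58)-(59) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and `M`, `M1`, `M_i`, and `M_next` stand for the counts $M$, $M_1$, $M_i$, and $M_{i+1}$ of sampled flows by residual size.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(h_i: float, eta: float, xi: float) -> int:
    """Smallest integer M that satisfies bound (50)."""
    # z = Phi^{-1}(1 - xi/2), the two-sided Gaussian quantile
    z = NormalDist().inv_cdf(1.0 - xi / 2.0)
    return ceil((1.0 - h_i) / (h_i * eta ** 2) * z ** 2)


def estimate_total_flows(M: int, M1: int, p: float) -> float:
    """Estimator (58): n~ = M + (1 - p) / p * M_1."""
    return M + (1.0 - p) / p * M1


def estimate_flows_of_size(M_i: int, M_next: int, p: float) -> float:
    """Estimator (59): n~_i = (M_i - (1 - p) * M_{i+1}) / p."""
    return (M_i - (1.0 - p) * M_next) / p
```

Under the assumption $h_i = 0.01$ (a value inferred here, not stated in the text), `required_sample_size(0.01, 0.1, 0.05)` gives roughly 38K, and `required_sample_size(1e-3, 0.01, 0.01)` gives roughly 66M, in line with the two numerical examples following (55).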

Fig. 9. The URGE framework.

Fig. 10. Illustration of a chained hash table for maintaining flow counters.

A. General Structure

Fig. 9 illustrates a framework that implements the various URGE algorithms. This framework contains three processes (flow classification, residual-geometric sampling, and estimation) as well as one data structure containing the flow counter table. Flow classification processes each incoming packet to obtain its flow ID and then forwards it to residual-geometric sampling. For each flow ID $x$ arriving from flow classification, residual-geometric sampling first checks if the flow table has an existing entry for $x$ and, if so, increments the counter by 1; if an entry does not exist, it is created with probability $p$ and its counter is initialized to 1. The geometric estimation process collects counter values from the flow table and then uses URGE to estimate flow statistics.

The flow table keeps a mapping between flow IDs and associated counters. The table supports three operations: 1) lookup(x) to retrieve the record of flow $x$; 2) add(x) to insert a new entry for flow $x$ in the table with the initial counter value 1; and 3) increment(x) to add 1 to the counter of flow $x$. We display in Fig. 10 an implementation of the counter table, which is based on a chained hash table. Assume a hash function hash(x) that produces an integer value in $\{0, 1, \ldots, K-1\}$. We assume that the generated hash values are uniformly distributed within interval $[0, K-1]$ and that the implementation of function hash(.) is fast enough. Efficient hardware hash functions can be found in [29]. We maintain an array $A$ of size $K$, and each entry $A[k]$ points to a linked list that keeps the set of flows whose IDs have the same hash value $k$.
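The per-packet logic and the three table operations described above can be sketched as follows. This is an illustrative sketch under simplifying assumptions, not the authors' code: Python lists stand in for the linked lists, the built-in `hash` stands in for the hardware hash function, records hold only a flow ID and a counter, and the names `FlowTable` and `rgs_process_packet` are hypothetical.

```python
import random


class FlowTable:
    """Chained hash table: K buckets, each a list of [flow_id, counter] records."""

    def __init__(self, K: int):
        self.K = K
        self.buckets = [[] for _ in range(K)]

    def _chain(self, x):
        # Built-in hash() is an assumption standing in for hash(x) in [0, K-1]
        return self.buckets[hash(x) % self.K]

    def lookup(self, x):
        """Return the record of flow x, or None if x is not in the table."""
        for record in self._chain(x):
            if record[0] == x:
                return record
        return None

    def add(self, x):
        """Insert a new entry for flow x with initial counter value 1."""
        self._chain(x).append([x, 1])

    def increment(self, x):
        """Add 1 to the counter of flow x."""
        record = self.lookup(x)
        if record is None:
            raise KeyError(x)
        record[1] += 1


def rgs_process_packet(table: FlowTable, x, p: float, rng=random) -> None:
    """Residual-geometric sampling step for one packet of flow x."""
    if table.lookup(x) is not None:
        table.increment(x)      # flow already sampled: count the packet
    elif rng.random() < p:
        table.add(x)            # start tracking the flow with probability p
```

With `p = 1.0` every flow is tracked from its first packet and each counter equals the flow size; smaller `p` leaves only the residual of each sampled flow in the table.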
Each node in the list contains two fields: 1) flow data that keep the flow ID, the packet counter, and the timestamp of the last packet; and 2) a pointer to the next node.

An important element of our algorithm is to ensure that the table keeps only active flows, which is accomplished by periodic sweeps through the table and removal of all flows that have completed using FIN/RST packets or have been idle for longer than $\tau$ time units. Upon removal, flow information is saved to disk (single-flow usage) or aggregated into a RAM-based PMF table (flow-distribution usage). Operations add(x) and increment(x) automatically modify the timestamps associated with each flow and allow timeout-based expulsion of dead flows.

Notice that the flow table is accessed by residual-geometric sampling upon each packet arrival. Therefore, the scalability of the measurement algorithm essentially depends on the access speed to the table. In what follows, we analyze the design of the flow table and quantify its two important properties: memory consumption and processing speed.

B. Active Flows

To understand how much benefit removal of dead flows provides to memory consumption, we next derive the expected number of active flows at any time $t$ and their fraction sampled by the algorithm. Assume a measurement window $[0, T]$, where $T$ is given in packets seen by the router. For each flow $j$, let inter-packet delays within the flow be given by a random variable $\Delta_j$, which counts the number of packet arrivals from other flows between adjacent packets of $j$. Denoting by $\Delta = E[\Delta_j]$, we have the following result.

Lemma 9: Assuming stationary flow arrivals in $[0, T]$ and $T \to \infty$, the expected number of active flows $N(t)$ at time $t$ is given by:

$$E[N(t)] = \Delta + 1. \tag{64}$$

Proof: Represent $N(t) = \sum_{j=1}^{n} A_j(t)$ as the sum of $n$ indicator variables, where $A_j(t)$ is 1 if flow $j$ is alive at $t$ and 0 otherwise. Observe that:

$$E[N(t)] = nE[A_j(t)] = nP(A_j(t) = 1) \tag{65}$$

and notice that each flow exists at the router for $\sum_{k=1}^{L}(\Delta_j^k + 1)$ packet units, where $\Delta_j^1, \Delta_j^2, \ldots$ are i.i.d. instances of variable $\Delta_j$. Then, the probability that $t \in [0, T]$ lands within a given flow is simply the flow's expected footprint (in packets seen by the router) normalized by the window size:

$$E[A_j(t)] = \frac{1}{T} E\left[\sum_{k=1}^{L}(\Delta_j^k + 1)\right]. \tag{66}$$

Using Wald's equation, this simplifies to $E[A_j(t)] = (\Delta + 1)E[L]/T$. Finally, since $nE[L]/T = 1$, we immediately obtain (64) using (65).

Our baseline reduction in flow volume comes from geometric sampling in previous sections and reduces the number of flows by a factor of $r_1 = n/E[M]$. Now additionally define ratio $r_2 = n/E[N(t)] = T/((\Delta + 1)E[L])$ and observe that longer observation windows (i.e., larger $T$), smaller flow sizes (i.e., smaller $E[L]$), and denser arrivals (i.e., smaller $\Delta$) imply more savings of memory. In fact, $T \to \infty$ results in $r_2 \to \infty$ if the other parameters are fixed. However, even more reduction is possible by discarding dead flows in RGS. Denote by $M(t)$ the number of sampled flows that are still alive at $t$ and consider the next result.

TABLE I. COMPARISON OF (64) AND (67) TO SIMULATION RESULTS. Columns: time $t$, $E[N(t)]$, $E[M(t)]$, (64), (67).

Fig. 11. Verifying (64) and (67): (a) alive flows, $N(t)$ versus time $t$; (b) sampled flows, $M(t)$ versus time $t$.

Lemma 10: Assuming the flow arrival process is stationary in $[0, T]$ and $T \to \infty$, the expected number of active sampled flows at time $t$ is given by:

$$E[M(t)] = (\Delta + 1)\left(1 - \frac{1-p}{p\,E[L]}\, p_s\right), \tag{67}$$

where $p_s$ in (9) is the fraction of all flows sampled by RGS.

Proof: Following Lemma 9, it suffices to derive the average packet footprint of flow $j$ within window $[0, T]$. Dividing this footprint by $T$ gives us the probability that current time $t$ falls within the residual of the flow, and multiplying the result by $n$ produces the expected number of flows stored in RAM. Condition on $L = l$ and define $P_l$ as the number of packets counted by RGS from flow $j$:

$$P_l = \begin{cases} R_l & \text{flow sampled} \\ 0 & \text{otherwise} \end{cases}. \tag{68}$$

Then, the flow's footprint $F_l$ is:

$$F_l = \sum_{k=1}^{P_l} (\Delta_j^k + 1), \tag{69}$$

where as before $\Delta_j^k$ are i.i.d. inter-packet delays induced by cross-traffic that do not depend on the size of flow $j$. Next, taking the expectation of $F_l$, we have:

$$E[F_l] = E[P_l](\Delta + 1) = E[R_l \mid \text{sampled}]\,P(\text{sampled})(\Delta + 1). \tag{70}$$

Using (14) and recalling that $P(\text{sampled}) = 1 - (1-p)^l$, we have:

$$E[F_l] = (\Delta + 1)\left[l - \frac{1-p}{p}\bigl(1 - (1-p)^l\bigr)\right]. \tag{71}$$

Unconditioning $L = l$, we have the expected footprint as:

$$E[F] = (\Delta + 1)\left[E[L] - \frac{1-p}{p}\, p_s\right], \tag{72}$$

where $p_s = E[1 - (1-p)^L]$ is the probability that a flow is sampled by RGS. Multiplying $E[F]$ by $n$, dividing by $T$, and taking $E[L]$ outside, we get (67).

Define $r_3 = n/E[M(t)]$ as the expected reduction of space when tracking only active RGS flows compared to all flows seen at the router and notice that this ratio increases not only as $T$ grows, but also when $p$ decreases. Performing a self-check using Jensen's inequality, observe that $0 \le p_s/(pE[L]) \le 1$ and therefore $E[M(t)] \le E[N(t)]$, which means that the former indeed always results in more reduction in table size.

We discuss numerical values of $r_1$-$r_3$ in the next section and now focus on the accuracy of the obtained results.
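To make the two results concrete, (64) and (67) can be evaluated numerically for any flow-size PMF. The sketch below is only an illustration: the function names are hypothetical, `delta` stands for the mean inter-packet gap $\Delta$, and the PMF is passed as a plain dictionary mapping flow size to probability.

```python
def sampling_probability(pmf: dict, p: float) -> float:
    """p_s = E[1 - (1 - p)^L] for a flow-size PMF given as {size: probability}."""
    return sum(f * (1.0 - (1.0 - p) ** l) for l, f in pmf.items())


def mean_flow_size(pmf: dict) -> float:
    """E[L] for a flow-size PMF given as {size: probability}."""
    return sum(l * f for l, f in pmf.items())


def expected_active_flows(delta: float) -> float:
    """Formula (64): E[N(t)] = delta + 1."""
    return delta + 1.0


def expected_active_sampled_flows(pmf: dict, p: float, delta: float) -> float:
    """Formula (67): E[M(t)] = (delta + 1) * (1 - (1 - p) * p_s / (p * E[L]))."""
    p_s = sampling_probability(pmf, p)
    return (delta + 1.0) * (1.0 - (1.0 - p) * p_s / (p * mean_flow_size(pmf)))
```

Two sanity checks follow directly from the formulas: with `p = 1.0` the two expectations coincide, and for a PMF concentrated on size-one flows (67) reduces to $(\Delta + 1)p$.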
We evaluate (64) and (67) in simulations with 1,000 iterations through window $[0, T]$ with randomly generated flows from a distribution with flow-size CDF $F_i = 1 - i^{-\alpha}$, where $\alpha = 1.1$ and $p =$ […]. Fig. 11 plots the evolution of $N(t)$ and $M(t)$ along with the expected values computed from (64) and (67). Table I compares (64) and (67) with $E[N(t)]$ and $E[M(t)]$ computed in simulations, where each value is averaged using the same 1,000 iterations of the traffic stream. Both indicate a very close match.

C. Memory Consumption

The memory used by the flow table can be divided into two parts: one for the hash table, which contains an array of pointers, and the other for flow records, which are organized in a set of linked lists. Define $w_p$ to be the number of bytes used by each memory pointer and $w_f$ to be that needed for flow counter, timestamp, and flow ID. Then, the following theorem gives the memory required for the measurement algorithm.

Theorem 6: The average number of bytes required by URGE in steady-state is:

$$E[W_R(t)] = K w_p + E[M(t)](w_c + w_f), \tag{73}$$

where $E[M(t)]$ is the average number of sampled active flows at time $t$ given by (67).

From (73), observe that for $n$ original flows with a given distribution of $L$, memory consumption $E[W_R(t)]$ can be reduced by lowering either $M(t)$ or $K$. As discussed in the previous section, $M(t)$ cannot be arbitrarily small as it would lead to lower accuracy. At the same time, small $K$ leads to more conflicts in the hash table and longer linked lists, and thus may slow down the sampling process, which are the issues we study next.

D. Processing Time

The time spent in processing each packet depends on how linked lists are built. We examine an approach that sorts flow entries of each linked list based on flow IDs. In this approach, function lookup(x) returns a pointer to the entry of flow $x$ if it exists in the table; otherwise, the function returns a pointer to where the new entry should be inserted.
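The sorted-list variant of lookup(x) described here can be sketched as below. This is a hedged illustration only: a Python list of `[flow_id, counter]` records kept in increasing flow-ID order stands in for one linked list, an integer position stands in for the returned pointer, and the names `lookup_sorted` and `insert_at` are hypothetical.

```python
def lookup_sorted(chain: list, x) -> tuple:
    """Scan a chain sorted by flow ID.

    Returns (position, True) if flow x is found at that position, or
    (position, False) where position is the index at which a new entry
    for x should be inserted to keep the chain sorted.
    """
    for pos, record in enumerate(chain):
        if record[0] == x:
            return pos, True
        if record[0] > x:
            # First ID larger than x: x is absent, and this is its slot
            return pos, False
    return len(chain), False


def insert_at(chain: list, pos: int, x) -> None:
    """Insert a new record for flow x (counter 1) at the position from lookup."""
    chain.insert(pos, [x, 1])
```

Because the scan stops at the first ID larger than `x`, an unsuccessful search does not have to walk the whole list, which is what makes the average traversal length about half of the chain.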
For each packet with flow ID $x$, we perform the following steps in sequential order:

1) compute $k = \text{hash}(x)$;
2) retrieve the linked-list head pointer $A[k]$ from the hash table;
3) iterate through the linked list until a flow record is matched or a flow with ID larger than $x$ is reached;
4) if $x$ is not found, a new entry for $x$ is created with probability $p$ and inserted at the location returned by lookup(x).

Notice that the fourth step is executed only when a new flow arrives and is sampled, which is much less frequent compared to the case of an existing flow. Thus, we consider its contribution to the overall overhead negligible and omit it from the analysis. Denote by $t_h$ the time spent in computing a hash, by $t_p$ that of memory access, and by $t_c$ that of each comparison of flow IDs. Define $T_R(t)$ to be the processing delay/latency of each incoming packet at time $t$. Then, noticing that the expected list length is $E[M(t)]/K$ entries and that on average traversal stops in the middle of a list, we have the next result.

Theorem 7: The expected per-packet processing time is:

$$E[T_R(t)] = t_h + t_p + (t_c + t_p)\frac{E[M(t)]}{2K}. \tag{74}$$

The result in (74) indicates that both large hash table size $K$ and small sample size $M(t)$ can contribute to a faster sampling process. Since larger $K$ reduces (74), but increases (73), we next examine how to properly select $K$ and $p$ to simultaneously satisfy certain target constraints on $E[W_R(t)]$ and $E[T_R(t)]$ given their conflicting dependency on $K$.

TABLE II. CONSTANTS USED IN (73) AND (74)

| RAM constant | value | CPU constant | value |
|---|---|---|---|
| $w_p$ | 4 B | $t_h$ | 12 ns |
| $w_f$ | 17 B | $t_p$ | 9 ns |
| $W_0$ | 1.65 MB | $t_c$ | 3 ns |
| | | $T_0$ | 24 ns |

Fig. 12. Tradeoff: (a) memory consumption, expected $E[W_R]$ (MB) versus table size $K$; (b) processing time, expected $E[T_R]$ (ns) versus table size $K$, with $E[M(t)] =$ […]. Gray areas display the acceptable ranges of $K$.

E. Tradeoff Analysis

Now, we are ready to explore the design space of constants $(K, p)$ to strike a balance between accuracy and scalability. Suppose that a router requires that $E[W_R(t)] \le W_0$ and $E[T_R(t)] \le T_0$. Further assume that the number of sampled flows $E[M(t)]$ is known and fixed (i.e., fixed $p$, window $T$, and flow-size distribution).
Define two constants:

$$K_l = \frac{(t_c + t_p)E[M(t)]}{2\bigl(T_0 - (t_h + t_p)\bigr)}, \tag{75}$$

and

$$K_u = \frac{W_0 - E[M(t)](w_c + w_f)}{w_p}. \tag{76}$$

Assuming $K_l \le K_u$, it then follows from (73) and (74) that one can choose any value $K \in [K_l, K_u]$ to satisfy the two constraints on memory and speed. We show below how to vary $p$ in order to maximize accuracy while ensuring $K_l \le K_u$.

Fig. 13. Lower and upper bounds on table size $K$ with varying sampling probability $p$. Gray areas display the acceptable range of $K$ and $p$.

To understand this better, consider the following example. Assume that the original traffic contains $n = 10^6$ flows with a power-law distribution $P(L \le i) = 1 - i^{-1.1}$. With $p = 0.01$, residual-geometric sampling obtains $E[M(t)] =$ […] sampled flows. Table II gives the constants we use to compute the expected memory consumption and processing time in (73) and (74). We also impose the following constraints on memory and delay: $W_0 = 1.65$ MB and $T_0 = 24$ ns.¹ Fig. 12 illustrates the acceptable ranges of table size $K$ derived from the models. The figure indicates that table size $K$ can be any value between $K_l =$ […] and $K_u =$ […] to simultaneously satisfy both requirements $W_0$ and $T_0$.

Note that for some values of $E[M(t)]$ it is possible that $K_l$ is larger than $K_u$ and thus the constraints cannot be met. Therefore, we next vary $p$ to show how the choice of $K$ will be affected. Fig. 13 plots $K_u$ and $K_l$ as functions of $p$, where both curves are obtained from the corresponding models. Notice from the figure that $K_l$ monotonically increases and $K_u$ monotonically decreases in $p$. This implies that interval $[K_l, K_u]$ eventually shrinks to a single point $K_0$, after which no feasible assignment of table size $K$ exists. Since larger $p$ implies more accurate estimation (i.e., the router sees more flows $M$ in the interval $[0, T]$ and thus estimates distribution $\{h_i\}$ more accurately), it is desirable to select the maximum $p$ that allows the router to satisfy the space and speed constraints. This occurs at a single optimum point $p_0$ that corresponds to $K_l = K_u = K_0$.
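The memory and delay formulas (73)-(74) and the resulting bounds (75)-(76) are simple enough to check numerically. The following sketch is illustrative only: function names are hypothetical, `record_bytes` stands for the per-record cost $w_c + w_f$, and the number of sampled active flows `EM` must be supplied by the caller.

```python
def memory_bytes(K: float, EM: float, w_p: float, record_bytes: float) -> float:
    """Formula (73): E[W_R] = K * w_p + E[M(t)] * (w_c + w_f)."""
    return K * w_p + EM * record_bytes


def processing_time(K: float, EM: float, t_h: float, t_p: float, t_c: float) -> float:
    """Formula (74): E[T_R] = t_h + t_p + (t_c + t_p) * E[M(t)] / (2K)."""
    return t_h + t_p + (t_c + t_p) * EM / (2.0 * K)


def table_size_bounds(EM, W0, T0, w_p, record_bytes, t_h, t_p, t_c):
    """Bounds (75) and (76): returns (K_l, K_u); feasible only if K_l <= K_u."""
    K_l = (t_c + t_p) * EM / (2.0 * (T0 - (t_h + t_p)))
    K_u = (W0 - EM * record_bytes) / w_p
    return K_l, K_u
```

As a usage example with the constants of Table II ($w_p = 4$ B, $t_h = 12$ ns, $t_p = 9$ ns, $t_c = 3$ ns, $T_0 = 24$ ns, $W_0 = 1.65 \times 10^6$ B) and two purely hypothetical inputs, `EM = 50_000` and `record_bytes = 21` (17 B for $w_f$ plus an assumed 4 B for $w_c$), the call `table_size_bounds(50_000, 1.65e6, 24, 4, 21, 12, 9, 3)` yields $K_l = 100{,}000$ and $K_u = 150{,}000$. Whether 1 MB means $10^6$ or $2^{20}$ bytes is also an assumption here.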
In our example, we get $p_0 =$ […] and $K_0 =$ […].

¹ These values allow the router to hold about $10^5$ flow records (each with a flow ID and a counter) and process 1-Kbit packets at OC-768 rates (i.e., 40 Gb/s).

VII. PERFORMANCE EVALUATION

In this section, we evaluate our models using several Internet traces in Table III from NLANR [24] and CAIDA [3]. Trace FRG was collected from a gigabit link between UCSD and Abilene in […]. We extracted from it additional traces with only Web, DNS, and NTP flows (also seen in the table). Additionally, we use three traces from CAIDA: LARGE, a one-hour trace from an OC48 link; MEDIUM, a one-minute trace from an OC192 link; and SMALL, a 7-minute trace from a gigabit link.

TABLE III. REDUCTION IN THE NUMBER OF FLOWS USING RESIDUAL SAMPLING WITH $p = 0.01$ AND DIFFERENT TYPES OF PERIODIC REMOVAL OF DEAD FLOWS

| source | trace | total flows $n$ | total pkts $nE[L]$ | sampling only: $E[M]$ | $r_1$ | removal only: $E[N(t)]$ | $r_2$ | both: $E[M(t)]$ | $r_3$ |
|---|---|---|---|---|---|---|---|---|---|
| NLANR | FRG | 1,756,… | …,821,… | … | … | … | … | … | … |
| NLANR | Web | 239,174 | 6,497,… | … | … | … | … | … | … |
| NLANR | DNS | 120,… | …,977 | 2,… | … | … | … | … | … |
| NLANR | NTP | 382,… | …,447 | 4,… | … | … | … | … | … |
| CAIDA | LARGE | 9,653,… | …,250,… | … | … | … | … | … | … |
| CAIDA | MEDIUM | 2,317,… | …,837,… | … | … | … | … | … | … |
| CAIDA | SMALL | 200,… | 9,179,… | … | … | … | … | … | … |

TABLE IV. PERFORMANCE OF URGE WITH $p =$ […] AND HASH TABLE SIZE $K = E[M(t)]$

| source | trace | $E[W_R(t)]$ | $E[T_R(t)]$ | flows: actual ($n$) | estimated ($\tilde n$) | error | size-one flows: actual ($n_1$) | estimated ($\tilde n_1$) | error |
|---|---|---|---|---|---|---|---|---|---|
| NLANR | FRG | 31 KB | 24.1 ns | 1,756,702 | 1,736,… | …% | 768,… | … | …% |
| NLANR | Web | 10 KB | 21.4 ns | 239,… | … | …% | 13,… | … | …% |
| NLANR | DNS | 257 B | 21 ns | 120,… | … | …% | 76,… | … | …% |
| NLANR | NTP | 752 B | 21.1 ns | 382,… | … | …% | 281,… | … | …% |
| CAIDA | LARGE | 132 KB | 28.1 ns | 9,653,609 | 9,717,… | …% | 4,535,449 | 4,630,… | …% |
| CAIDA | MEDIUM | 341 KB | 23.7 ns | 2,317,369 | 2,278,… | …% | 1,299,343 | 1,273,… | …% |
| CAIDA | SMALL | 23 KB | 21.2 ns | 200,… | …,902 | …% | 93,… | … | …% |

Fig. 14. Estimating single-flow usage in the FRG trace with $p =$ […] (estimated size, actual versus estimated): (a) $E[e(R_l)]$; (b) $E[\hat e(R_l)]$.

Fig. 15. RRMSE of single-flow usage in the FRG trace with $p =$ […] (relative RMSE): (a) $\delta_l$; (b) $\hat\delta_l$.

As Table III shows, URGE typically sees a reasonably large number of flows $M$ over the entire interval $[0, T]$; however, the number of active flows $N(t)$ and the number of those constantly kept in memory $M(t)$ are much smaller. For the FRG trace, for example, $E[M]$ is 15 times smaller than $n$, while $E[N(t)]$ is 81 and $E[M(t)]$ is 658 times smaller. In general, NLANR traces benefit more from the removal of dead flows than CAIDA data, because the former were collected over two consecutive days and thus had a larger observation window $T$, which led to larger ratios $r_2$ and $r_3$. The same reasoning also explains the fact that the LARGE trace exhibits much higher benefit from removing dead flows than the MEDIUM or SMALL traces.

A. Memory and Speed

We use the settings of Table II to compute the amount of memory consumed by URGE according to (73).
As shown in the third column of Table IV for $p =$ […] and $K = E[M(t)]$, the required memory size is small and rarely exceeds 40 KB. Even for the LARGE trace that has the most flows in this comparison, URGE only needs 132 KB of RAM, much smaller than the roughly 120 MB required for keeping all flow counters. We also compute per-packet processing time from (74) based on Table II and show in the fourth column of Table IV that $E[T_R(t)] \le 25$ ns in the majority of the studied cases.

B. Estimation Accuracy

First, we examine the problem of estimating the total number of flows $n$ in $[0, T]$ and size-one flows $n_1$ in this interval. The seventh and tenth columns of Table IV list the absolute error of (58) and (59), respectively. With the exception of the Web NLANR trace, these estimates are within approximately 3% of the correct value.

We next evaluate the performance of URGE in estimating single-flow usage. Fig. 14 plots the expectation of estimated flow sizes (averaged over 100 iterations) along with the actual values obtained from the FRG trace using $p =$ […]. The figure shows that the estimator $e(R_l)$ from previous work tends to overestimate the sizes of small flows, while URGE's estimator $\hat e(R_l)$ accurately follows the actual values. We also compare the relative errors of the two studied methods in Fig. 15, which indicates that URGE has RRMSE bounded by 1 for all flows, while $e(R_l)$ exhibits very large $\delta_l$ for small and medium flows, which is an increasing function of $1/p$.

For the flow-size distribution, we first examine three values of $p$ to compare its effect on the accuracy of URGE in the FRG trace. Fig. 16 indicates that the estimates for all three values of $p$ are very consistent and that all of them are accurate. In our experiments with $p =$ […], URGE recovered the original PMF $\{f_i\}$ using only $M = 7{,}616$ total flows out of $n = 1.75$M.
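For reference, recovering the flow-size PMF from the sampled residual counts can be sketched as follows. The sketch assumes the estimator form $q_i = (\tilde h_i - (1-p)\tilde h_{i+1})/(p + (1-p)\tilde h_1)$ with $\tilde h_i = M_i/M$, which is inferred here from (36) and (44)-(46) rather than quoted; the function name `estimate_pmf` and the dictionary input are hypothetical.

```python
def estimate_pmf(counts: dict, p: float) -> dict:
    """Estimate the flow-size PMF {f_i} from residual counts {i: M_i}.

    Assumes q_i = (h_i - (1 - p) * h_{i+1}) / (p + (1 - p) * h_1),
    where h_i = M_i / M is the empirical PMF of sampled residuals.
    """
    M = sum(counts.values())
    h = {i: c / M for i, c in counts.items()}
    denom = p + (1.0 - p) * h.get(1, 0.0)
    # Residual sizes that never occurred are treated as h_{i+1} = 0
    return {i: (h[i] - (1.0 - p) * h.get(i + 1, 0.0)) / denom for i in sorted(h)}
```

A quick consistency check: for a true PMF $f = (0.5, 0.3, 0.2)$ on sizes 1 to 3 and $p = 0.5$, the expected residual counts are proportional to 3500, 2000, and 1000, and feeding these values to `estimate_pmf` returns the original PMF.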


More information

Information collection on a graph

Information collection on a graph Information collection on a grah Ilya O. Ryzhov Warren Powell February 10, 2010 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements

More information

On Wald-Type Optimal Stopping for Brownian Motion

On Wald-Type Optimal Stopping for Brownian Motion J Al Probab Vol 34, No 1, 1997, (66-73) Prerint Ser No 1, 1994, Math Inst Aarhus On Wald-Tye Otimal Stoing for Brownian Motion S RAVRSN and PSKIR The solution is resented to all otimal stoing roblems of

More information

CERIAS Tech Report The period of the Bell numbers modulo a prime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education

CERIAS Tech Report The period of the Bell numbers modulo a prime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education CERIAS Tech Reort 2010-01 The eriod of the Bell numbers modulo a rime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education and Research Information Assurance and Security Purdue University,

More information

John Weatherwax. Analysis of Parallel Depth First Search Algorithms

John Weatherwax. Analysis of Parallel Depth First Search Algorithms Sulementary Discussions and Solutions to Selected Problems in: Introduction to Parallel Comuting by Viin Kumar, Ananth Grama, Anshul Guta, & George Karyis John Weatherwax Chater 8 Analysis of Parallel

More information

An Analysis of Reliable Classifiers through ROC Isometrics

An Analysis of Reliable Classifiers through ROC Isometrics An Analysis of Reliable Classifiers through ROC Isometrics Stijn Vanderlooy s.vanderlooy@cs.unimaas.nl Ida G. Srinkhuizen-Kuyer kuyer@cs.unimaas.nl Evgueni N. Smirnov smirnov@cs.unimaas.nl MICC-IKAT, Universiteit

More information

State Estimation with ARMarkov Models

State Estimation with ARMarkov Models Deartment of Mechanical and Aerosace Engineering Technical Reort No. 3046, October 1998. Princeton University, Princeton, NJ. State Estimation with ARMarkov Models Ryoung K. Lim 1 Columbia University,

More information

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley Elements of Asymtotic Theory James L. Powell Deartment of Economics University of California, Berkeley Objectives of Asymtotic Theory While exact results are available for, say, the distribution of the

More information

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests 009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract

More information

Age of Information: Whittle Index for Scheduling Stochastic Arrivals

Age of Information: Whittle Index for Scheduling Stochastic Arrivals Age of Information: Whittle Index for Scheduling Stochastic Arrivals Yu-Pin Hsu Deartment of Communication Engineering National Taiei University yuinhsu@mail.ntu.edu.tw arxiv:80.03422v2 [math.oc] 7 Ar

More information

1 Gambler s Ruin Problem

1 Gambler s Ruin Problem Coyright c 2017 by Karl Sigman 1 Gambler s Ruin Problem Let N 2 be an integer and let 1 i N 1. Consider a gambler who starts with an initial fortune of $i and then on each successive gamble either wins

More information

Asymptotically Optimal Simulation Allocation under Dependent Sampling

Asymptotically Optimal Simulation Allocation under Dependent Sampling Asymtotically Otimal Simulation Allocation under Deendent Samling Xiaoing Xiong The Robert H. Smith School of Business, University of Maryland, College Park, MD 20742-1815, USA, xiaoingx@yahoo.com Sandee

More information

Cryptanalysis of Pseudorandom Generators

Cryptanalysis of Pseudorandom Generators CSE 206A: Lattice Algorithms and Alications Fall 2017 Crytanalysis of Pseudorandom Generators Instructor: Daniele Micciancio UCSD CSE As a motivating alication for the study of lattice in crytograhy we

More information

ON THE LEAST SIGNIFICANT p ADIC DIGITS OF CERTAIN LUCAS NUMBERS

ON THE LEAST SIGNIFICANT p ADIC DIGITS OF CERTAIN LUCAS NUMBERS #A13 INTEGERS 14 (014) ON THE LEAST SIGNIFICANT ADIC DIGITS OF CERTAIN LUCAS NUMBERS Tamás Lengyel Deartment of Mathematics, Occidental College, Los Angeles, California lengyel@oxy.edu Received: 6/13/13,

More information

Information collection on a graph

Information collection on a graph Information collection on a grah Ilya O. Ryzhov Warren Powell October 25, 2009 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements

More information

Universal Finite Memory Coding of Binary Sequences

Universal Finite Memory Coding of Binary Sequences Deartment of Electrical Engineering Systems Universal Finite Memory Coding of Binary Sequences Thesis submitted towards the degree of Master of Science in Electrical and Electronic Engineering in Tel-Aviv

More information

arxiv: v1 [physics.data-an] 26 Oct 2012

arxiv: v1 [physics.data-an] 26 Oct 2012 Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch

More information

Probability Estimates for Multi-class Classification by Pairwise Coupling

Probability Estimates for Multi-class Classification by Pairwise Coupling Probability Estimates for Multi-class Classification by Pairwise Couling Ting-Fan Wu Chih-Jen Lin Deartment of Comuter Science National Taiwan University Taiei 06, Taiwan Ruby C. Weng Deartment of Statistics

More information

New Schedulability Test Conditions for Non-preemptive Scheduling on Multiprocessor Platforms

New Schedulability Test Conditions for Non-preemptive Scheduling on Multiprocessor Platforms New Schedulability Test Conditions for Non-reemtive Scheduling on Multirocessor Platforms Technical Reort May 2008 Nan Guan 1, Wang Yi 2, Zonghua Gu 3 and Ge Yu 1 1 Northeastern University, Shenyang, China

More information

HENSEL S LEMMA KEITH CONRAD

HENSEL S LEMMA KEITH CONRAD HENSEL S LEMMA KEITH CONRAD 1. Introduction In the -adic integers, congruences are aroximations: for a and b in Z, a b mod n is the same as a b 1/ n. Turning information modulo one ower of into similar

More information

Unit 1 - Computer Arithmetic

Unit 1 - Computer Arithmetic FIXD-POINT (FX) ARITHMTIC Unit 1 - Comuter Arithmetic INTGR NUMBRS n bit number: b n 1 b n 2 b 0 Decimal Value Range of values UNSIGND n 1 SIGND D = b i 2 i D = 2 n 1 b n 1 + b i 2 i n 2 i=0 i=0 [0, 2

More information

A Study of Active Queue Management for Congestion Control

A Study of Active Queue Management for Congestion Control A Study of Active Queue Management for Congestion Control Victor Firoiu vfiroiu@nortelnetworks.com Nortel Networks 3 Federal St. illerica, MA 1821 USA Marty orden mborden@tollbridgetech.com Tollridge Technologies

More information

A MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST BASED ON THE WEIBULL DISTRIBUTION

A MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST BASED ON THE WEIBULL DISTRIBUTION O P E R A T I O N S R E S E A R C H A N D D E C I S I O N S No. 27 DOI:.5277/ord73 Nasrullah KHAN Muhammad ASLAM 2 Kyung-Jun KIM 3 Chi-Hyuck JUN 4 A MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST

More information

Deriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V.

Deriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V. Deriving ndicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V. Deutsch Centre for Comutational Geostatistics Deartment of Civil &

More information

Named Entity Recognition using Maximum Entropy Model SEEM5680

Named Entity Recognition using Maximum Entropy Model SEEM5680 Named Entity Recognition using Maximum Entroy Model SEEM5680 Named Entity Recognition System Named Entity Recognition (NER): Identifying certain hrases/word sequences in a free text. Generally it involves

More information

Topic 7: Using identity types

Topic 7: Using identity types Toic 7: Using identity tyes June 10, 2014 Now we would like to learn how to use identity tyes and how to do some actual mathematics with them. By now we have essentially introduced all inference rules

More information

Tests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test)

Tests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test) Chater 225 Tests for Two Proortions in a Stratified Design (Cochran/Mantel-Haenszel Test) Introduction In a stratified design, the subects are selected from two or more strata which are formed from imortant

More information

Robust Lifetime Measurement in Large- Scale P2P Systems with Non-Stationary Arrivals

Robust Lifetime Measurement in Large- Scale P2P Systems with Non-Stationary Arrivals Robust Lifetime Measurement in Large- Scale P2P Systems with Non-Stationary Arrivals Xiaoming Wang Joint work with Zhongmei Yao, Yueping Zhang, and Dmitri Loguinov Internet Research Lab Computer Science

More information

For q 0; 1; : : : ; `? 1, we have m 0; 1; : : : ; q? 1. The set fh j(x) : j 0; 1; ; : : : ; `? 1g forms a basis for the tness functions dened on the i

For q 0; 1; : : : ; `? 1, we have m 0; 1; : : : ; q? 1. The set fh j(x) : j 0; 1; ; : : : ; `? 1g forms a basis for the tness functions dened on the i Comuting with Haar Functions Sami Khuri Deartment of Mathematics and Comuter Science San Jose State University One Washington Square San Jose, CA 9519-0103, USA khuri@juiter.sjsu.edu Fax: (40)94-500 Keywords:

More information

Topic: Lower Bounds on Randomized Algorithms Date: September 22, 2004 Scribe: Srinath Sridhar

Topic: Lower Bounds on Randomized Algorithms Date: September 22, 2004 Scribe: Srinath Sridhar 15-859(M): Randomized Algorithms Lecturer: Anuam Guta Toic: Lower Bounds on Randomized Algorithms Date: Setember 22, 2004 Scribe: Srinath Sridhar 4.1 Introduction In this lecture, we will first consider

More information

Paper C Exact Volume Balance Versus Exact Mass Balance in Compositional Reservoir Simulation

Paper C Exact Volume Balance Versus Exact Mass Balance in Compositional Reservoir Simulation Paer C Exact Volume Balance Versus Exact Mass Balance in Comositional Reservoir Simulation Submitted to Comutational Geosciences, December 2005. Exact Volume Balance Versus Exact Mass Balance in Comositional

More information

Uncorrelated Multilinear Principal Component Analysis for Unsupervised Multilinear Subspace Learning

Uncorrelated Multilinear Principal Component Analysis for Unsupervised Multilinear Subspace Learning TNN-2009-P-1186.R2 1 Uncorrelated Multilinear Princial Comonent Analysis for Unsuervised Multilinear Subsace Learning Haiing Lu, K. N. Plataniotis and A. N. Venetsanooulos The Edward S. Rogers Sr. Deartment

More information

How to Estimate Expected Shortfall When Probabilities Are Known with Interval or Fuzzy Uncertainty

How to Estimate Expected Shortfall When Probabilities Are Known with Interval or Fuzzy Uncertainty How to Estimate Exected Shortfall When Probabilities Are Known with Interval or Fuzzy Uncertainty Christian Servin Information Technology Deartment El Paso Community College El Paso, TX 7995, USA cservin@gmail.com

More information

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Technical Sciences and Alied Mathematics MODELING THE RELIABILITY OF CISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Cezar VASILESCU Regional Deartment of Defense Resources Management

More information

Developing A Deterioration Probabilistic Model for Rail Wear

Developing A Deterioration Probabilistic Model for Rail Wear International Journal of Traffic and Transortation Engineering 2012, 1(2): 13-18 DOI: 10.5923/j.ijtte.20120102.02 Develoing A Deterioration Probabilistic Model for Rail Wear Jabbar-Ali Zakeri *, Shahrbanoo

More information

Adaptive estimation with change detection for streaming data

Adaptive estimation with change detection for streaming data Adative estimation with change detection for streaming data A thesis resented for the degree of Doctor of Philosohy of the University of London and the Diloma of Imerial College by Dean Adam Bodenham Deartment

More information

ECE 534 Information Theory - Midterm 2

ECE 534 Information Theory - Midterm 2 ECE 534 Information Theory - Midterm Nov.4, 009. 3:30-4:45 in LH03. You will be given the full class time: 75 minutes. Use it wisely! Many of the roblems have short answers; try to find shortcuts. You

More information

1 Probability Spaces and Random Variables

1 Probability Spaces and Random Variables 1 Probability Saces and Random Variables 1.1 Probability saces Ω: samle sace consisting of elementary events (or samle oints). F : the set of events P: robability 1.2 Kolmogorov s axioms Definition 1.2.1

More information

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Journal of Modern Alied Statistical Methods Volume Issue Article 7 --03 A Comarison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Ghadban Khalaf King Khalid University, Saudi

More information

Online Appendix to Accompany AComparisonof Traditional and Open-Access Appointment Scheduling Policies

Online Appendix to Accompany AComparisonof Traditional and Open-Access Appointment Scheduling Policies Online Aendix to Accomany AComarisonof Traditional and Oen-Access Aointment Scheduling Policies Lawrence W. Robinson Johnson Graduate School of Management Cornell University Ithaca, NY 14853-6201 lwr2@cornell.edu

More information

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables Imroved Bounds on Bell Numbers and on Moments of Sums of Random Variables Daniel Berend Tamir Tassa Abstract We rovide bounds for moments of sums of sequences of indeendent random variables. Concentrating

More information

1-way quantum finite automata: strengths, weaknesses and generalizations

1-way quantum finite automata: strengths, weaknesses and generalizations 1-way quantum finite automata: strengths, weaknesses and generalizations arxiv:quant-h/9802062v3 30 Se 1998 Andris Ambainis UC Berkeley Abstract Rūsiņš Freivalds University of Latvia We study 1-way quantum

More information

Modeling and Estimation of Full-Chip Leakage Current Considering Within-Die Correlation

Modeling and Estimation of Full-Chip Leakage Current Considering Within-Die Correlation 6.3 Modeling and Estimation of Full-Chi Leaage Current Considering Within-Die Correlation Khaled R. eloue, Navid Azizi, Farid N. Najm Deartment of ECE, University of Toronto,Toronto, Ontario, Canada {haled,nazizi,najm}@eecg.utoronto.ca

More information

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek Use of Transformations and the Reeated Statement in PROC GLM in SAS Ed Stanek Introduction We describe how the Reeated Statement in PROC GLM in SAS transforms the data to rovide tests of hyotheses of interest.

More information

Estimating Time-Series Models

Estimating Time-Series Models Estimating ime-series Models he Box-Jenkins methodology for tting a model to a scalar time series fx t g consists of ve stes:. Decide on the order of di erencing d that is needed to roduce a stationary

More information

1. INTRODUCTION. Fn 2 = F j F j+1 (1.1)

1. INTRODUCTION. Fn 2 = F j F j+1 (1.1) CERTAIN CLASSES OF FINITE SUMS THAT INVOLVE GENERALIZED FIBONACCI AND LUCAS NUMBERS The beautiful identity R.S. Melham Deartment of Mathematical Sciences, University of Technology, Sydney PO Box 23, Broadway,

More information

Stochastic integration II: the Itô integral

Stochastic integration II: the Itô integral 13 Stochastic integration II: the Itô integral We have seen in Lecture 6 how to integrate functions Φ : (, ) L (H, E) with resect to an H-cylindrical Brownian motion W H. In this lecture we address the

More information

Scaling Multiple Point Statistics for Non-Stationary Geostatistical Modeling

Scaling Multiple Point Statistics for Non-Stationary Geostatistical Modeling Scaling Multile Point Statistics or Non-Stationary Geostatistical Modeling Julián M. Ortiz, Steven Lyster and Clayton V. Deutsch Centre or Comutational Geostatistics Deartment o Civil & Environmental Engineering

More information

Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process

Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process P. Mantalos a1, K. Mattheou b, A. Karagrigoriou b a.deartment of Statistics University of Lund

More information

Session 5: Review of Classical Astrodynamics

Session 5: Review of Classical Astrodynamics Session 5: Review of Classical Astrodynamics In revious lectures we described in detail the rocess to find the otimal secific imulse for a articular situation. Among the mission requirements that serve

More information

rate~ If no additional source of holes were present, the excess

rate~ If no additional source of holes were present, the excess DIFFUSION OF CARRIERS Diffusion currents are resent in semiconductor devices which generate a satially non-uniform distribution of carriers. The most imortant examles are the -n junction and the biolar

More information

Signaled Queueing. Laura Brink, Robert Shorten, Jia Yuan Yu ABSTRACT. Categories and Subject Descriptors. General Terms. Keywords

Signaled Queueing. Laura Brink, Robert Shorten, Jia Yuan Yu ABSTRACT. Categories and Subject Descriptors. General Terms. Keywords Signaled Queueing Laura Brink, Robert Shorten, Jia Yuan Yu ABSTRACT Burstiness in queues where customers arrive indeendently leads to rush eriods when wait times are long. We roose a simle signaling scheme

More information

Elliptic Curves and Cryptography

Elliptic Curves and Cryptography Ellitic Curves and Crytograhy Background in Ellitic Curves We'll now turn to the fascinating theory of ellitic curves. For simlicity, we'll restrict our discussion to ellitic curves over Z, where is a

More information

substantial literature on emirical likelihood indicating that it is widely viewed as a desirable and natural aroach to statistical inference in a vari

substantial literature on emirical likelihood indicating that it is widely viewed as a desirable and natural aroach to statistical inference in a vari Condence tubes for multile quantile lots via emirical likelihood John H.J. Einmahl Eindhoven University of Technology Ian W. McKeague Florida State University May 7, 998 Abstract The nonarametric emirical

More information

A Simple Throughput Model for TCP Veno

A Simple Throughput Model for TCP Veno A Simle Throughut Model for TCP Veno Bin Zhou, Cheng Peng Fu, Dah-Ming Chiu, Chiew Tong Lau, and Lek Heng Ngoh School of Comuter Engineering, Nanyang Technological University, Singaore 639798 Email: {zhou00,

More information

Metrics Performance Evaluation: Application to Face Recognition

Metrics Performance Evaluation: Application to Face Recognition Metrics Performance Evaluation: Alication to Face Recognition Naser Zaeri, Abeer AlSadeq, and Abdallah Cherri Electrical Engineering Det., Kuwait University, P.O. Box 5969, Safat 6, Kuwait {zaery, abeer,

More information

arxiv:cond-mat/ v2 25 Sep 2002

arxiv:cond-mat/ v2 25 Sep 2002 Energy fluctuations at the multicritical oint in two-dimensional sin glasses arxiv:cond-mat/0207694 v2 25 Se 2002 1. Introduction Hidetoshi Nishimori, Cyril Falvo and Yukiyasu Ozeki Deartment of Physics,

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Introduction to Neural Networks : Lecture 13 John A. Bullinaria, 2004 1. The RBF Maing 2. The RBF Network Architecture 3. Comutational Power of RBF Networks 4.

More information

Feedback-error control

Feedback-error control Chater 4 Feedback-error control 4.1 Introduction This chater exlains the feedback-error (FBE) control scheme originally described by Kawato [, 87, 8]. FBE is a widely used neural network based controller

More information

t 0 Xt sup X t p c p inf t 0

t 0 Xt sup X t p c p inf t 0 SHARP MAXIMAL L -ESTIMATES FOR MARTINGALES RODRIGO BAÑUELOS AND ADAM OSȨKOWSKI ABSTRACT. Let X be a suermartingale starting from 0 which has only nonnegative jums. For each 0 < < we determine the best

More information

COMMUNICATION BETWEEN SHAREHOLDERS 1

COMMUNICATION BETWEEN SHAREHOLDERS 1 COMMUNICATION BTWN SHARHOLDRS 1 A B. O A : A D Lemma B.1. U to µ Z r 2 σ2 Z + σ2 X 2r ω 2 an additive constant that does not deend on a or θ, the agents ayoffs can be written as: 2r rθa ω2 + θ µ Y rcov

More information