Modeling Residual-Geometric Flow Sampling
|
|
- Tyrone Wilcox
- 6 years ago
- Views:
Transcription
1 Modeling Residual-Geometric Flow Samling Xiaoming Wang Amazon.com Seattle, WA USA Xiaoyong Li Texas A&M University College Station, TX USA Dmitri Loguinov Texas A&M University College Station, TX USA Abstract Traffic monitoring and estimation of flow arameters in high seed routers have recently become challenging as the Internet grew in both scale and comlexity. In this aer, we focus on a family of flow-size estimation algorithms we call Residual- Geometric Samling (RGS), which generates a random oint within each flow according to a geometric random variable and records all remaining ackets in a flow counter. Our analytical investigation shows that revious estimation algorithms based on this method exhibit certain bias in recovering flow statistics from the samled measurements. To address this roblem, we derive a novel set of unbiased estimators for RGS, validate them using real Internet traces, and show that they rovide an accurate and scalable solution to Internet traffic monitoring. I. INTRODUCTION Recent growth of the Internet in both scale and comlexity has imosed a number of challenges on network management, oeration, and traffic monitoring. The main roblem in this line of work is to scale measurement algorithms to achieve certain objectives (e.g., accuracy) while satisfying real-time resource constraints (e.g., fixed memory consumtion and er-acket rocessing delay) of high-seed Internet routers. This is commonly accomlished (e.g., [5], [6], [7], [8], [9], [10], [11], [14], [15], [17], [21], [18], [19], [20], [22], [25], [30]) by reducing the amount of information a router has to store in its internal tables, which comes at the exense of deloying secial estimation techniques that can recover metrics of interest from the collected samles. In this aer, we study two roblems in the general area of measuring s 1) determining the number of ackets transmitted by elehant flows [11], [15], [17], [21], [20], [22] and 2) building the distribution of s seen by the router in some time window [7], [18], [30] couled in a single measurement technique. The former roblem arises in usage-based accounting and traffic engineering [6], [11], [12], [13], [26], while the latter has many security alications such as anomaly and intrusion detection [1], [23], [16]. Our interest falls within the family of residual samling, which selects a random oint A within each flow and then samles the remainder R of that flow until it ends. Denoting by L the size (in ackets) of a random flow, samled residuals R are simly L A. Stochastically larger A results in fewer flows being samled and leads to lower overhead in terms of both CPU and RAM consumtion. Besides reduced overhead arising from omission of many small-size flows from counter Suorted by NSF grants CNS and CNS tables, residual samling guarantees to cature large flows with robability 1 o(1) as their size L. This allows ISPs to determine heavy-hitters and charge the corresonding customers for generated traffic. While in P2P networks residual samling distributes the initial oint A uniformly within user lifetimes [29], flow-based estimation [11], [17] usually emloys geometric A since it can be easily imlemented with a sequence of indeendent Bernoulli variables. We call the resulting aroach Residual- Geometric Samling (RGS) and note that it has received some limited analytical attention in [11], [17]; however, unbiased estimation of individual s, analysis of the resulting error, asymtotically accurate recovery of flow-size distribution P (L = i) from samled residuals R, and analysis of sace- CPU requirements in steady state have not been exlored. We overcome these issues below. A. Single-Flow Usage We start with the roblem of obtaining sizes of individual flows for accounting uroses. Since residual samling requires an estimator to convert residuals into the metrics of interest, our first task is to define roer notation and desired roerties for the estimation algorithm. Assume that for a flow of size L the samling algorithm roduces residual R L, where both L and R L are random variables. We call an estimator e(r L ) unbiased if its exectation roduces the correct flow size, i.e., E[e(R L ) L = l] = E[e(R l )] = l. Unbiased estimation allows one to average the of several flows of a given size l and accurately estimate their total contribution. We further call an estimator elehant-accurate if ratio e(r l )/l converges to 1 in mean-square as l. Elehant-accuracy ensures that the variance of e(r l )/l tends to zero as l, which means that the amount of relative error between e(r l ) and l becomes negligible for large flows. Prior work on RGS [11], [17] has suggested the following estimator: e(r L ) = R L 1 + 1/, (1) where 0 < 1 is the arameter of geometric variable A. To understand the erformance of (1), we first build a general robabilistic for residual-geometric samling and derive the relationshi between L and its residual R L. Using this result, we rove that: E[e(R l )] = l 1 (1 ) l, (2)
2 which indicates that (2) is generally biased and on average tends to overestimate the original by a factor of u to 1/. To address this roblem, we derive a different estimator: ê(r L ) = R L 1 + 1/ (1 )R L and rove that it is both unbiased and elehant-accurate. We also derive in closed-form the mean-square error δ l = E[(ê(R l )/l 1) 2 ] for finite l, which can be used to determine when (3) aroximates the true with accuracy sufficient for billing uroses. B. Flow-Size Distribution Our second roblem is estimation of the original flow-size robability mass function (PMF), which we assume is given by f i = P (L = i), i = 1, 2,... We call PMF estimator q i asymtotically unbiased if it converges in robability to f i for all i as the number of samled flows M. One may be at first temted to comute this distribution based on the values roduced by either (1) or (3) for each observed flow; however, we show that such q i almost always differ from the original distribution f i and the bias ersists as samle size M. The reason for this discreancy is that e(.) and ê(.) both estimate the sizes of flows that have been samled by the algorithm, which are not reresentative of the entire oulation assing through the router. As longer flows are more likely to be selected by residual samling, this aroach overestimates their fraction and skews the PMF towards the tail. Denote by M i the number of samled flows with R L = i and define a new estimator: (3) q i = M i (1 )M i+1 M + (1 )M 1. (4) Using the general of RGS derived later in the aer, we rove that q i tends to f i in robability as M = i M i and obtain the amount of error q i f i for finite M. We also rovide asymtotically unbiased estimators for the total number of flows n: ñ = M + 1 M 1 (5) and the number of flows n i with exactly i ackets: ñ i = M i (1 )M i+1, (6) where ñ/n 1 and ñ i /n i 1, both in robability, as M. We call the resulting combination (3)-(6) Unbiased Residual-Geometric Estimators (URGE). C. Performance Evaluation To reduce RAM overhead, our imlementation eriodically discards flow records if the corresonding flows have comleted (i.e., FIN, RST ackets detected) or if no ackets from these flows arrive within some timeout τ. Unfortunately, no analytical results are available on the number of flows M(t) that a router needs to track in steady state or the amount of RAM needed to kee the counters. We overcome this roblem by deriving E[M(t)] in equilibrium and showing that it can be orders of magnitude smaller than both the total number of flows n and the number of samled flows M. We finish the aer by evaluating URGE with real Internet traces obtained from NLANR [24] and CAIDA [3]. Our exeriments reveal that the roosed algorithm roduces very accurate estimation of flow metrics and thus allows one to erform more aggressive samling (i.e., smaller robability ) of the monitored traffic. We also discover in exeriments that URGE works very well even on short traces, which makes it suitable for monitoring small customer networks and individual rotocols. II. RELATED WORK In this section, we review several samling algorithms in the area of traffic monitoring. In articular, we classify existing work into two categories: acket samling and flow samling, where the former makes er-acket and the latter er-flow decisions to samle incoming traffic. A. Packet Samling Samled NetFlow (SNF) [25] is a widely used technique in which incoming ackets are samled with a fixed robability. The general goal of SNF is to obtain the PMF of flow sizes; however, [14] shows that it is imossible to accurately recover the original flow-size distribution from samled SNF data. Estan et al. [10] roose Adative NetFlow (ANF), which adjusts the samling robability according to the size of the flow table; however, ANF s bias in the samled data is equivalent to that in SNF and is similarly difficult to overcome in ractice. Instead of using one uniform robability for all flows as in [10], [25], another direction in acket samling is to comute i (c) for each flow i based on its currently observed size c. This aroach has been studied by two indeendent aers, Sketch-Guided Samling (SGS) [20] and Adative Non-Linear Samling (ANLS) [15]. A common feature of these two methods is to samle a new flow with robability 1 and then monotonically decrease i (c) as c grows. Both methods must maintain a counter for each flow resent in the network and are difficult to scale due to the high RAM/CPU usage. B. Flow Samling In flow thinning [14], each flow is samled indeendently with robability and then all ackets in samled flows are counted. Hohn et al. [14] show that flow thinning is able to accurately estimate the distribution; however, this method tyically misses 1 ercent of elehant flows and thus does not suort alications such as usage-based accounting and traffic engineering [6], [11], [12], [13], [26]. For highly skewed distributions with a few extremely large flows and many short ones (which is tyical for Internet links), this method may also take a long time to converge. To address these roblems of flow thinning, Estan et. al. [11] introduce a size-deendent flow samling algorithm called Samle-and-Hold (S&H), which is roosed to identify 2
3 elehant flows. For each acket from a new flow, the algorithm creates a flow counter with robability ; once a flow is samled, all of its subsequent ackets are then counted. It is easy to verify that S&H samles a flow with size l with robability 1 (1 ) l, which quickly aroaches 1 as l grows. Creating a unifying analytical for this aroach and understanding the roerties of samles it collects is the main toic of this aer. Another direction of size-deendent flow samling has been exlored by Duffield et al. in [5], [6], [8], which resent another size-deendent flow measurement method called Smart Samling. Their aroach selects each flow of size L with robability (L) = min(1, L/z), where z is some constant. Since this method requires L before deciding whether to samle it or not, it can only be alied off-line. Komella et. al. [17] examine a method called Flow Slicing (FS), which combines SNF and S&H with a variant of smart samling. Other non-samling methods include exact counting [27], [31] and lossy counting [18], [22], which are orthogonal to our work. III. UNDERLYING MODEL In this section, we build a general robabilistic of Samle-and-Hold [11] and establish the necessary analytical foundation for the results that follow. Due to limited sace, omitted roofs, s, and various imlementation details can be found in the extended technical reort [28]. A. Samle-and-Hold Consider a sequence of ackets traversing a router and assume that its flow-measurement algorithm checks each acket s flow identifier x in some RAM table. If x is found in the table, the corresonding counter is incremented by 1; otherwise, with robability a new entry for x is created in the table (with counter value 1) and with robability 1 the acket is ignored. To this rocess, we first need several definitions. Assume that s are i.i.d. random variables and define geometric age A L to be the number of ackets discarded from the front of a flow with size L before it is samled (see Fig. 1). Let G be a shifted geometric random variable with success robability, i.e., P (G = j) = (1 ) j. It thus follows that A L is simly: A L = min(g, L). (7) Now define geometric residual R L to be the final counter value of a flow of size L conditioned on the fact that it has been samled (i.e., A L < L): R L = L A L, (8) which is also illustrated in Fig. 1. From the ersective of traffic monitoring in this aer, geometric residual R L is the only quantity collected during measurement and available to an estimation algorithm. Since this aroach belongs to the class of residual-samling techniques [29] and secifically uses geometric age, this aer calls S&H by a more mathematicallysecific name Residual-Geometric Samling (RGS). ackets in a flow Age A L L Residual R L Fig. 1. Residual-geometric samling of a flow with size L. discarded acket samled acket Assume that L has a PMF f i = P (L = i), where i = 1, 2,..., and denote by s = P (A L < L) the robability that a random flow is samled. Then, we have the following result. Lemma 1: Probability s that a flow is selected by RGS is: s = E[1 (1 ) L ] = 1 f i (1 ) i. (9) i=1 Next, let h i = P (R L = i) be the PMF of geometric residual R L. The following lemma exresses h i in terms of f i. Lemma 2: The PMF of geometric residual R L is: h i = j=i f j(1 ) j i s. (10) The result of Lemma 2 is fundamental as most of the results in this aer are conveniently derived from (10). B. Fixed Flow Size We next analyze a secial case of residual samling where the original is fixed at L = l. Note that residuals are now R l instead of R L since the original is no longer a random variable. Recall that the goal of single-flow size estimation is to obtain l from R l for each samled flow. The next corollary follows from (10) and gives the distribution and exectation of geometric residual R l. Corollary 1: Given L = l, the PMF of R l is: and its exectation is: P (R l = i) = (1 )l i 1 (1 ) l (11) E[R l ] = l + 1 1/. (12) 1 (1 ) l Next, we aly the results obtained in this section to analyze existing estimation methods that have been roosed for RGS. IV. ANALYSIS OF EXISTING METHODS In this section, we examine rior aroaches [11], [17] to estimating single-flow usage and whether their results can be generalized to recover the PMF of L. A. Single-Flow Usage To evaluate single-flow estimators, we use the following definition that is commonly used in statistics [2]. Definition 1: Estimator e(r l ) is called unbiased if E[e(R l )] = l for all l 1. Unbiased estimation is a key roerty of an estimator as it allows accurate estimation of the total contribution from a sufficiently large ool of flows (e.g., one customer network). However, since large flows are tyically rare, one commonly 3
4 unbiased unbiased Fig. 2. Exectation of estimator (13) in s and its (14). Fig. 3. RRMSE of (13) in s and its (16). faces an additional requirement to estimate their size with just a single samle e(r l ), which is formalized in the next definition. Definition 2: Estimator e(r l ) is called elehant-accurate if e(r l )/l 1 in mean-square as l. Elehant-accuracy guarantees that the amount of relative error between e(r l ) and l decays to zero as l. As before, suose that a flow of size l roduces a counter with value R l. Recall that [11], [17] suggest the following estimator: e(r l ) = R l 1 + 1/, (13) where is the robability of residual-geometric samling. The next result directly follows from (12). Theorem 1: Exectation E[e(R l )] is given by: E[e(R l )] = l 1 (1 ) l. (14) Note that (14) indicates that (13) is generally biased, esecially when l is small. Indeed, for l 0, we have 1 (1 ) l l and E[e(R l )] 1/ regardless of l, which shows that in such cases E[e(R l )] carries no information about the original. However, as l, it is straightforward to verify that the bias in e(r l ) vanishes exonentially, which is consistent with the analysis in [17], which has only considered the case of l. To see the extent of bias in (13) and verify (14), we aly residual-geometric samling to flows of size l ranging from 1 to ackets, feed the measured sizes to (13), and average the result after 1000 iterations for each l. Fig. 2 lots the obtained E[e(R l )] along with (14). The figure indicates that (14) indeed catures the bias and that (13) tends to overestimate the size of short flows even in exectation, where smaller samling robability leads to more error. To quantify the error of individual values e(r l ) in estimating l and to understand elehant-accuracy, denote by Y l = e(r l )/l and define the Relative Root Mean Square Error (RRMSE) to be: δ l = E[(Y l 1) 2 ]. (15) Note that δ l 0 indicates that Y l 1 in mean-square and thus imlies elehant-accurate estimation. The next result derives δ l in closed form. Theorem 2: The RRMSE of (13) is given by: 1 l(l 1) δ l = 2 (1 ) l (1 ) l+1 l 2 2 (1 (1 ) l. (16) ) Observe from (16) that for flows with size l = 1, the relative error is 1 /, but as l, δ l 0 and the estimator is elehant-accurate. Fig. 3 lots (16) against s, indicating a close match. The figure also shows that the RRMSE starts from 1/ and decreases towards zero as Θ(1/l) as l. B. Flow-Size Distribution We now investigate whether e(r L ) defined in (13) can be used to estimate the actual flow-size distribution {f i } i=1. Denote by q i = P (e(r L ) = i) the PMF of s among the samled flows. To understand our objectives with aroximating the PMF of L, the following definition is in order. Definition 3: An estimator {q i } i=1 of PMF {f i} i=1 is called asymtotically unbiased if q i converges in robability to f i for all i as the number of samled flows M. The next theorem follows directly from (10). Theorem 3: The PMF of s estimated from (13) is given by: j=y(i) q i = f j(1 ) j y(i), (17) s where y(i) = i + 1 1/ and s is in (9). The result in (17) indicates that each q i is different from f i regardless of the samling duration and thus cannot be used to aroximate the flow-size distribution. We verify (17) with a simulated acket stream with 5M flows, where flow sizes follow a ower-law distribution P (L i) = 1 i α for i = 1, 2,... and α = 1.1. Fig. 4 lots the CCDF of random variable e(r L ) obtained from s as well as (17), both in comarison to the tail of the actual distribution. The figure shows that (17) accurately redicts the values obtained from s and that PMF {q i } is indeed quite different from {f i }. So far, our study of existing methods in residual-geometric samling has shown that they are not only generally biased, but also unable to recover the flow-size distribution from residuals R L. This motivates us to seek better estimation aroaches, which we erform next. 4
5 Fig. 4. Distribution {q i } in s and its (17). Fig. 6. RRMSE of (19) in s and (21). unbiased Fig. 5. unbiased Exectation of estimator (19) in s. V. URGE This section rooses a family of algorithms called Unbiased Residual-Geometric Estimators (URGE), roves their accuracy, and verifies them in s. A. Single-Flow Usage For estimating individual s, we first consider an estimator directly imlied by the result in (12). Notice that solving (12) for l and exressing l in terms of E[R l ], we get: 1 ( ) l = u log(1 ) W u(1 ) u log(1 ), (18) where u = E[R l ] + 1/ 1 and W (z) is Lambert s function (i.e., a multi-valued solution to W e W = z) [4]. Thus, a ossible estimator can be comuted from (18) with E[R l ] relaced by the measured value of geometric residual R l. However, there are two reasons that (18) is a bad estimator of s. First, Lambert s function W (z) has no closed form solution and has to be numerically solved using tools such as Matlab. Second, it can be verified (not shown here for brevity) that (18) is not an unbiased estimator. Instead, we define a new estimator: ê(r l ) = R l 1 + 1/ (1 )R l. (19) and next show that it is unbiased. Lemma 3: Estimator ê(r l ) in (19) is unbiased, i.e., E[ê(R l )] = l. (20) We lot in Fig. 5 results obtained from (19). The figure indicates that ê(r l ) accurately estimates s for all flows in both cases of. Next, we derive the RRMSE of URGE. Theorem 4: The RRMSE of (19) is given by: 1 + l( 2)(1 ) ˆδ l = l (1 ) 2l+1 l 2 2 (1 (1 ) l. (21) ) It is easy to verify from (21) that URGE has zero RRMSE for l = 1 or l, confirming its elehant-accuracy. We lot ˆδ l obtained from s along with the in Fig. 6, which shows that (21) accurately tracks the actual relative error. From Figures 5-6, it is clear that ê(r l ) significantly imroves the accuracy of estimating small s comared to e(r l ). In ractice, (21) can be used to determine threshold l 0, which leads to desired bounds on error for all l l 0 and allows ISPs to use e(r l ) instead of l. B. Flow-Size Distribution It is worth mentioning that while (19) roduces unbiased estimation of s, ê(r L ) is not suitable for comuting the flow-size distribution, as we show below. Denote by ˆq i = P (ê(r L ) = i) the PMF of ê(r L ). Then, we have the following result. Lemma 4: PMF of ê(r L ) is given by: ˆq i = 1 s j=y(i) where s is in (9), function y(i) is: (1 ) j y(i) f j, (22) y(i) = i + 1 1/ ω, (23) and ω = W ( (1 ) i+1 1/ log(1 ) ). Notice from (22)-(23) that distribution ˆq i does not even remotely aroximate the original PMF f i. This roblem is fundamental since residual samling exhibits bias towards larger flows and even if we could recover L from R L exactly, the distribution of samled s would not accurately aroximate that of all flows assing through the router. We thus exlore another technique for estimating the flowsize distribution. Before doing that, we need the next lemma. Lemma 5: The distribution f i can be exressed using the PMF of geometric residuals {h i } in (10) as: f i = h i (1 )h i+1 + (1 )h 1. (24) 5
6 , M = 194, 208, M = 26, 233 (a) =, M = 3, 090 (b) = 10 5, M = 337 Fig. 7. Estimator (25) in s. Fig. 8. Estimator (25) in s with very small. This result leads to a new estimator for the flow-size distribution: q i = M i (1 )M i+1 M + (1 )M 1, (25) where M is the total number of samled flows and M i is the number of them with the geometric residual equal to i. Since M i /M h i in robability as M (from the weak law of large numbers), we immediately get the following result. Corollary 2: The estimator in (25) is asymtotically unbiased. We next verify the accuracy of q i in s with 5M flows in the same setting as in the revious section. We lot in Fig. 7 the CCDF estimated from (25) along with the actual distribution. The figure shows that q i accurately follows the for both cases of. C. Convergence Seed We next examine the effect of samle size M on the convergence of estimator q i. To illustrate the roblems arising from small M, we study (25) with = and 10 5 in s with the same 5M flows. The estimator obtained M = 3, 090 flows for = and just M = 337 for = Fig. 8 indicates that while the estimated curves under both choices of still aroximate the trend of the original distribution, they exhibit different levels of noise. As the next result indicates, small leads to a small samle size M and thus more noise in the estimated values. Corollary 3: Suose that M flows are selected by residual-geometric samling from a total of n flows. Then, the exected value of M is given by: E[M] = n s = ne[1 (1 ) L ]. (26) To shed light on the choice of roer for RGS, we show how to determine the minimum M that would guarantee a certain level of accuracy in q i. Define h i = M i /M to be an estimate of h i = P (R L = i). The next lemma follows from Lemma 5 and Corollary 2 and indicates that the accuracy of q i directly deends on whether h i aroximates h i accurately. Lemma 6: Suose that h j h j ηh j holds with robability 1 ξ for j [1, i + 1] and small constants η and ξ. Then, there exists a constant ζ: ζ = η( + 2η(1 )h 1) + (1 )(1 η)h 1 (27) such that ζ 0 as η 0 and P ( q i f i ζf i ) = 1 ξ. Next, we obtain a bound on M from the requirement that h i be bounded in robability within a given range [h i (1 η), h i (1 + η)]. Theorem 5: For small constants η and ξ, h i h i ηh i holds with robability 1 ξ if samle size M is no less than: M (1 h i) h i η 2 ( Φ 1 (1 ξ/2) ) 2, (28) where Φ(x) is the CDF of the standard Gaussian distribution N (0, 1). For examle, to bound h i within 10% ercent of h i (i.e., η = 0.1) with robability 1 ξ = 95% for all h i, the following must hold: M ( ) , (29) which indicates that M = 38K flows must be samled to achieve target accuracy. If we reduce η to 1%, increase 1 ξ to 99%, and require the aroximation to hold for all h i, then M must be at least 66M flows. Converting η into ζ using (27), one can establish similar bounds on the deviation of q i from f i. D. Estimation of Other Flow Metrics Besides s and the flow-size distribution, URGE also rovides estimators for the total number of flows and the number of them with size i. Before introducing these estimators, we need the next lemma. Lemma 7: The exected number of flows with samled residuals R L = i is: E[M i ] = E[M]h i = nh i s, (30) where h i is the PMF of geometric residuals R L and s is given by (9). Based on this, we next develo two estimators and rove their accuracy. Let ñ be an estimator of the total number of flows n observed in the measurement window [0, T ]: ñ = M + (1 ) M 1 (31) and ñ i be an estimator of the number of flows n i with size i: ñ i = M i (1 )M i+1. (32) 6
7 Then, the next result shows that both of these estimators are asymtotically unbiased. Lemma 8: Ratios ñ/n and ñ i /n i converge to 1 in robability as M. Note that [17] rovided a similar estimator as (31) and roved E[ñ] = n using a different aroach from ours; however, our results are stronger as they show convergence in robability and additionally address estimation of n i. Simulations verifying (31)-(32) are omitted for brevity. E. Active Flows In a tyical imlementation of URGE, one needs a flow table to kee a maing between flow identifiers and associated counters. An imortant element of any samling algorithm is to ensure that the table kees only active flows, which can be accomlished by eriodic swees through RAM and removal of all flows that have comleted. Such a strategy together with RGS can achieve a significant reduction in the table size. To understand how much benefit removal of dead flows rovides to memory consumtion, we next derive the exected number of active flows at any time t and their fraction samled by the algorithm. Assume a measurement window [0, T ], where T is given in ackets seen by the router. For each flow j, let inter-acket delays within the flow be given by a random variable j, which counts the number of acket arrivals from other flows between adjacent ackets of j. Denoting by = E[ j ], we have the following result. Lemma 9: Assuming stationary flow arrivals in [0, T ] and T, the exected number of active flows N(t) at time t is given by: E[N(t)] = + 1. (33) Our baseline reduction in flow volume comes from geometric samling in revious sections and reduces the number of flows by a factor of r 1 = n/e[m]. Now additionally define ratio r 2 = n/e[n(t)] = T/( + 1)E[L] and observe that longer observation windows (i.e., larger T ), smaller flow sizes (i.e., smaller E[L]), and denser arrivals (i.e., smaller ) imly more savings of memory. In fact, T results in r 2 if the other arameters are fixed. However, even more reduction is ossible by discarding dead flows in RGS. Denote by M(t) the number of samled flows that are still alive at t and consider the next result. Lemma 10: Assuming the flow arrival rocess is stationary in [0, T ] and T, the exected number of active samled flows at time t is given by: ( E[M(t)] = ( + 1) 1 1 ) E[L] s, (34) where s in (9) is the fraction of all flows samled by RGS. Define r 3 = n/e[m(t)] and notice that it increases not only as T grows, but also when decreases. Performing a selfcheck using Jensen s inequality, observe that 0 s /E[L] 1 and therefore E[M(t)] E[N(t)], which means that the former indeed always results in more reduction in table size. Simulations with heavy-tailed flows (omitted due to limited sace) show that the is very accurate. actual estimated (a) E[e(R l )] actual estimated (b) E[ê(R l )] Fig. 9. Estimating single-flow usage in the FRG trace with = (a) δ l (b) ˆδ l Fig. 10. RRMSE of single-flow usage in the FRG trace with = VI. PERFORMANCE EVALUATION In this section, we evaluate our s using several Internet traces in Table I from NLANR [24] and CAIDA [3]. Trace FRG was collected from a gigabit link between UCSD and Abilene in We extracted from it additional traces with only Web, DNS, and NTP flows (also seen in the table). Additionally, we use three traces from CAIDA: LARGE a one-hour trace from an OC48 link, MEDIUM a one-minute trace from a OC192 link, and SMALL a 7-minute trace from a gigabit link. As the table shows, URGE tyically sees a reasonably large number of flows M over the entire interval [0, T ]; however, the number of active flows N(t) and those constantly ket in memory M(t) is much smaller. For the FRG trace, for examle, E[M] is 15 times smaller than n, while E[N(t)] is 81 and E[M(t)] is 658 times smaller. In general, NLANR traces benefit more from the removal of dead flows than CAIDA data, because former was collected over two consecutive days and thus had a larger observation window T, which led to larger ratios r 2 and r 3. The same reasoning also exlains the fact that the LARGE trace exhibits much higher benefit from removing dead flows than MEDIUM or SMALL traces. A. Estimation Accuracy First, we examine the roblem of estimating the total number of flows n in [0, T ] and size-one flows n 1 in this interval. The fifth and eighth columns of Table II list the absolute error of s (31) and (32), resectively. The table indicates that these estimates are commonly within 2.5% of the correct value. We next evaluate the erformance of URGE in estimating single-flow usage. Fig. 9 lots the exectation of estimated 7
8 TABLE I REDUCTION IN THE NUMBER OF FLOWS USING RESIDUAL SAMPLING WITH = 0.01 AND DIFFERENT TYPES OF PERIODIC REMOVAL OF DEAD FLOWS source trace total flows n total kts ne[l] samling only removal only both E[M] r 1 E[N(t)] r 2 E[M(t)] r 3 FRG 1, 756, , 821, , , , NLANR Web 239, 174 6, 497, , , DNS 120, , 977 2, , 797 NTP 382, , 447 4, , , 887 LARGE 9, 653, , 250, , , , CAIDA MEDIUM 2, 317, , 837, , , , SMALL 200, 9, 179, , , , source NLANR CAIDA TABLE II PERFORMANCE OF URGE WITH = trace # of flows # of size-one flows actual (n) estimated (ñ) error actual (n 1 ) estimated (ñ 1 ) error FRG 1, 756, 702 1, 736, % 768, , % Web 239, , % 13, , % DNS 120, , % 76, , % NTP 382, , % 281, , % LARGE 9, 653, 609 9, 717, % 4, 535, 449 4, 630, % MEDIUM 2, 317, 369 2, 278, % 1, 299, 343 1, 273, % SMALL 200, 902, % 93, , % (c) = Fig. 11. Estimating the distribution using URGE in the FRG trace. s (averaged over 100 iterations) along with the actual values obtained from the FRG trace using = The figure shows that the estimator e(r l ) from revious work tends to overestimate the sizes of small flows, while URGE s estimator ê(r l ) accurately follows the actual values. We also comare the relative errors of the two studied methods in Fig. 10, which indicates that URGE has RRMSE bounded by 1 for all flows, while e(r l ) exhibits very large δ l for small and medium flows, which is an increasing function of 1/. For the flow-size distribution, we first examine three values of to comare its effect on the accuracy of URGE in the FRG trace. Fig. 11 indicates that estimation for all three values of are very consistent and all of them follow the accurately. In our exeriments with = , URGE recovered the original PMF {f i } using only M = 7, 616 total flows out of n = 1.75M. Finally, we aly URGE with = to NLANR traces of different traffic tyes and lot in Fig. 12 the estimated distributions along with the actual ones. As the figure shows, the flow statistics of different alications can be accurately estimated by URGE. We observe a similar match in our exeriments with three CAIDA traces as shown in Fig. 13. VII. CONCLUSION In this aer, we roved that revious methods based on residual-geometric samling had certain bias in estimating single-flow usage and were unable to recover the flow-size distribution from the samled residuals. To overcome this limitation, we roosed a novel ing framework for analyzing residual samling and develoed a set of algorithms that were able to erform accurate estimation of flow statistics, even under the constraints of small router RAM size, short trace duration, and low CPU samling overhead. REFERENCES [1] D. Brauckhoff, B. Tellenbach, A. Wagner, A. Lakhina, and M. May, Imact of Traffic Samling on Anomaly Detection Metrics, in Proc. ACM IMC, Oct. 2006, [2] G. Casella and R. L. Berger, Statistical Inference, 2nd ed. Duxbury/Thomson Learning, [3] Cooerative Association for Internet Data Analysis (CAIDA). [Online]. Available: htt:// [4] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, and D. E. Knuth, On the Lambert W Function, Advances in Comutational Mathematics, vol. 5, , [5] N. Duffield and C. Lund, Predicting Resource Usage and Estimation Accuracy in an IP Flow Measurement Collection Infrastructure, in Proc. ACM IMC, Oct. 2003,
9 10 5 (a) Web (b) NTP (c) DNS Fig. 12. Estimating the distribution using URGE in NLANR traces with = (a) LARGE 10 5 (b) MEDIUM (c) SMALL Fig. 13. Estimating the distribution using URGE in CAIDA traces with = [6] N. Duffield, C. Lund, and M. Thoru, Charging from Samled Network Usage, in Proc. ACM IMW, Nov. 2001, [7] N. Duffield, C. Lund, and M. Thoru, Estimating Flow Distributions from Samled Flow Statistics, in Proc. ACM SIGCOMM, Aug. 2003, [8] N. Duffield, C. Lund, and M. Thoru, Flow Samling under Hard Resource Constraints, in Proc. ACM SIGMETRICS, Jun. 2004, [9] N. Duffield, C. Lund, and M. Thoru, Learn More, Samle Less: Control of Volume and Variance in Network Measurement, IEEE Trans. Inform. Theory, vol. 51, no. 5, , May [10] C. Estan, K. Keys, D. Moore, and G. Varghese, Building a Better Netflow, in Proc. ACM SIGCOMM, Aug. 2004, [11] C. Estan and G. Varghese, New Directions in Traffic Measurement and Accounting, in Proc. ACM SIGCOMM, Aug. 2002, [12] W. Fang and L. Peterson, Inter-AS Traffic Patterns and their Imlications, in Proc. IEEE GLOBECOM, Dec. 1999, [13] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, and F. True, Deriving Traffic Demands for Oerational IP Networks: Methodology and Exerience, IEEE/ACM Trans. Netw., vol. 9, no. 3, , Jun [14] N. Hohn and D. Veitch, Inverting Samled Traffic, in Proc. ACM IMC, Oct. 2003, [15] C. Hu, S. Wang, J. Tian, B. Liu, Y. Cheng, and Y. Chen, Accurate and Efficient Traffic Monitoring Using Adative Non-linear Samling Method, in Proc. IEEE INFOCOM, Ar. 2008, [16] K. Ishibashi, R. Kawahara, T. Mori, T. Kondoh, and S. Asano, Effect of Samling Rate and Monitoring Granularity on Anomaly Detectability, in Proc. IEEE Global Internet Symosium, May 2007, [17] R. R. Komella and C. Estan, The Power of Slicing in Internet Flow Measurement, in Proc. USENIX/ACM IMC, Oct. 2005, [18] A. Kumar, M. Sung, J. Xu, and J. Wang, Data Streaming Algorithms for Efficient and Accurate Estimation of Flow Size Distribution, in Proc. ACM SIGMETRICS, Jun. 2004, [19] A. Kumar, M. Sung, J. Xu, and E. Zegura, A Data Streaming Algorithm for Estimating Suboulation Flow Size Distribution, in Proc. ACM SIGMETRICS, Jun. 2005, [20] A. Kumar and J. Xu, Sketch Guided Samling Using On-Line Estimates of Flow Size for Adative Data Collection, in Proc. IEEE INFOCOM, Ar. 2006, [21] A. Kumar, J. Xu, J. Wang, O. Satschek, and L. Li, Sace-Code Bloom Filter for Efficient Per-Flow Traffic Measurement, in Proc. IEEE INFOCOM, Mar. 2004, [22] Y. Lu, A. Montanari, B.Prabhakar, S. Dharmaurikar, and A. Kabbani, Counter Braids: A Novel Counter Architecture for Per-Flow Measurement, in Proc. ACM SIGMETRICS, Jun. 2008, [23] J. Mai, C.-N. Chuah, A. Sridharan, T. Ye, and H. Zang, Is Samled Data Sufficient for Anomaly Detection? in Proc. ACM IMC, Oct. 2006, [24] National Laboratory for Alied Network Research (NLANR). [Online]. Available: htt://moat.nlanr.net/. [25] Cisco Samled NetFlow. [Online]. Available: htt:// US/docs/ios/12 0s/feature/guide/12s sanf.html. [26] R. Pan, L. Breslau, B. Prabhakar, and S. Shenker, Aroximate Fairness through Differential Droing, ACM SIGCOMM Com. Comm. Rev., vol. 33, no. 2, , Ar [27] S. Ramabhadran and G. Varghese, Efficient Imlementation of a Statistics Counter Architecture, in Proc. ACM SIGMETRICS, Jun. 2003, [28] X. Wang, X. Li, and D. Loguinov, Modeling Residual-Geometric Flow Samling (extended version), Texas A&M University, Tech. Re , Dec [Online]. Available: htt://irl.cs.tamu.edu/ublications/. [29] X. Wang, Z. Yao, and D. Loguinov, Residual-Based Measurement of Peer and Link Lifetimes in Gnutella Networks, in Proc. IEEE INFOCOM, May 2007, [30] L. Yang and M. Michailidis, Samled Based Estimation of Network Traffic Flow Characteristics, in Proc. IEEE INFOCOM, May 2007, [31] Q. Zhao, J. Xu, and Z. Liu, Design of a Novel Statistics Counter Architecture with Otimal Sace and Time Efficiency, in Proc. ACM SIGMETRICS, Jun. 2006,
Modeling Residual-Geometric Flow Sampling
1 Modeling Residual-Geometric Flow Samling Xiaoming Wang, Xiaoyong Li, and Dmitri Loguinov Abstract Traffic monitoring and estimation of flow arameters in high seed routers have recently become challenging
More informationModeling Residual-Geometric Flow Sampling
Modeling Residual-Geometric Flow Sampling Xiaoming Wang Joint work with Xiaoyong Li and Dmitri Loguinov Amazon.com Inc., Seattle, WA April 13 th, 2011 1 Agenda Introduction Underlying model of residual
More informationAn Analysis of TCP over Random Access Satellite Links
An Analysis of over Random Access Satellite Links Chunmei Liu and Eytan Modiano Massachusetts Institute of Technology Cambridge, MA 0239 Email: mayliu, modiano@mit.edu Abstract This aer analyzes the erformance
More informationCombining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO)
Combining Logistic Regression with Kriging for Maing the Risk of Occurrence of Unexloded Ordnance (UXO) H. Saito (), P. Goovaerts (), S. A. McKenna (2) Environmental and Water Resources Engineering, Deartment
More informationOn the Relationship Between Packet Size and Router Performance for Heavy-Tailed Traffic 1
On the Relationshi Between Packet Size and Router Performance for Heavy-Tailed Traffic 1 Imad Antonios antoniosi1@southernct.edu CS Deartment MO117 Southern Connecticut State University 501 Crescent St.
More information4. Score normalization technical details We now discuss the technical details of the score normalization method.
SMT SCORING SYSTEM This document describes the scoring system for the Stanford Math Tournament We begin by giving an overview of the changes to scoring and a non-technical descrition of the scoring rules
More informationEstimation of the large covariance matrix with two-step monotone missing data
Estimation of the large covariance matrix with two-ste monotone missing data Masashi Hyodo, Nobumichi Shutoh 2, Takashi Seo, and Tatjana Pavlenko 3 Deartment of Mathematical Information Science, Tokyo
More informationDistributed Rule-Based Inference in the Presence of Redundant Information
istribution Statement : roved for ublic release; distribution is unlimited. istributed Rule-ased Inference in the Presence of Redundant Information June 8, 004 William J. Farrell III Lockheed Martin dvanced
More informationShadow Computing: An Energy-Aware Fault Tolerant Computing Model
Shadow Comuting: An Energy-Aware Fault Tolerant Comuting Model Bryan Mills, Taieb Znati, Rami Melhem Deartment of Comuter Science University of Pittsburgh (bmills, znati, melhem)@cs.itt.edu Index Terms
More informationMATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK
Comuter Modelling and ew Technologies, 5, Vol.9, o., 3-39 Transort and Telecommunication Institute, Lomonosov, LV-9, Riga, Latvia MATHEMATICAL MODELLIG OF THE WIRELESS COMMUICATIO ETWORK M. KOPEETSK Deartment
More informationCharacterizing the Behavior of a Probabilistic CMOS Switch Through Analytical Models and Its Verification Through Simulations
Characterizing the Behavior of a Probabilistic CMOS Switch Through Analytical Models and Its Verification Through Simulations PINAR KORKMAZ, BILGE E. S. AKGUL and KRISHNA V. PALEM Georgia Institute of
More informationRobust Lifetime Measurement in Large- Scale P2P Systems with Non-Stationary Arrivals
Robust Lifetime Measurement in Large- Scale P2P Systems with Non-Stationary Arrivals Xiaoming Wang Joint work with Zhongmei Yao, Yueping Zhang, and Dmitri Loguinov Internet Research Lab Computer Science
More informationAn Analysis of Reliable Classifiers through ROC Isometrics
An Analysis of Reliable Classifiers through ROC Isometrics Stijn Vanderlooy s.vanderlooy@cs.unimaas.nl Ida G. Srinkhuizen-Kuyer kuyer@cs.unimaas.nl Evgueni N. Smirnov smirnov@cs.unimaas.nl MICC-IKAT, Universiteit
More informationOn split sample and randomized confidence intervals for binomial proportions
On slit samle and randomized confidence intervals for binomial roortions Måns Thulin Deartment of Mathematics, Usala University arxiv:1402.6536v1 [stat.me] 26 Feb 2014 Abstract Slit samle methods have
More informationCHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules
CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules. Introduction: The is widely used in industry to monitor the number of fraction nonconforming units. A nonconforming unit is
More informationAI*IA 2003 Fusion of Multiple Pattern Classifiers PART III
AI*IA 23 Fusion of Multile Pattern Classifiers PART III AI*IA 23 Tutorial on Fusion of Multile Pattern Classifiers by F. Roli 49 Methods for fusing multile classifiers Methods for fusing multile classifiers
More informationInformation collection on a graph
Information collection on a grah Ilya O. Ryzhov Warren Powell February 10, 2010 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements
More informationApproximating min-max k-clustering
Aroximating min-max k-clustering Asaf Levin July 24, 2007 Abstract We consider the roblems of set artitioning into k clusters with minimum total cost and minimum of the maximum cost of a cluster. The cost
More informationImproved Capacity Bounds for the Binary Energy Harvesting Channel
Imroved Caacity Bounds for the Binary Energy Harvesting Channel Kaya Tutuncuoglu 1, Omur Ozel 2, Aylin Yener 1, and Sennur Ulukus 2 1 Deartment of Electrical Engineering, The Pennsylvania State University,
More informationState Estimation with ARMarkov Models
Deartment of Mechanical and Aerosace Engineering Technical Reort No. 3046, October 1998. Princeton University, Princeton, NJ. State Estimation with ARMarkov Models Ryoung K. Lim 1 Columbia University,
More informationThe non-stochastic multi-armed bandit problem
Submitted for journal ublication. The non-stochastic multi-armed bandit roblem Peter Auer Institute for Theoretical Comuter Science Graz University of Technology A-8010 Graz (Austria) auer@igi.tu-graz.ac.at
More informationarxiv: v1 [physics.data-an] 26 Oct 2012
Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch
More informationInformation collection on a graph
Information collection on a grah Ilya O. Ryzhov Warren Powell October 25, 2009 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements
More informationA MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST BASED ON THE WEIBULL DISTRIBUTION
O P E R A T I O N S R E S E A R C H A N D D E C I S I O N S No. 27 DOI:.5277/ord73 Nasrullah KHAN Muhammad ASLAM 2 Kyung-Jun KIM 3 Chi-Hyuck JUN 4 A MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST
More informationUsing the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process
Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process P. Mantalos a1, K. Mattheou b, A. Karagrigoriou b a.deartment of Statistics University of Lund
More informationFor q 0; 1; : : : ; `? 1, we have m 0; 1; : : : ; q? 1. The set fh j(x) : j 0; 1; ; : : : ; `? 1g forms a basis for the tness functions dened on the i
Comuting with Haar Functions Sami Khuri Deartment of Mathematics and Comuter Science San Jose State University One Washington Square San Jose, CA 9519-0103, USA khuri@juiter.sjsu.edu Fax: (40)94-500 Keywords:
More informationGeneralized Coiflets: A New Family of Orthonormal Wavelets
Generalized Coiflets A New Family of Orthonormal Wavelets Dong Wei, Alan C Bovik, and Brian L Evans Laboratory for Image and Video Engineering Deartment of Electrical and Comuter Engineering The University
More informationCONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP
Submitted to the Annals of Statistics arxiv: arxiv:1706.07237 CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP By Johannes Tewes, Dimitris N. Politis and Daniel J. Nordman Ruhr-Universität
More informationRobustness of classifiers to uniform l p and Gaussian noise Supplementary material
Robustness of classifiers to uniform l and Gaussian noise Sulementary material Jean-Yves Franceschi Ecole Normale Suérieure de Lyon LIP UMR 5668 Omar Fawzi Ecole Normale Suérieure de Lyon LIP UMR 5668
More informationA Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression
Journal of Modern Alied Statistical Methods Volume Issue Article 7 --03 A Comarison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Ghadban Khalaf King Khalid University, Saudi
More informationAge of Information: Whittle Index for Scheduling Stochastic Arrivals
Age of Information: Whittle Index for Scheduling Stochastic Arrivals Yu-Pin Hsu Deartment of Communication Engineering National Taiei University yuinhsu@mail.ntu.edu.tw arxiv:80.03422v2 [math.oc] 7 Ar
More informationFisher Information in Flow Size Distribution Estimation
1 Fisher Information in Flow Size Distribution Estimation Paul Tune, Member, IEEE, and Darryl Veitch, Fellow, IEEE Abstract The flow size distribution is a useful metric for traffic modeling and management
More informationSupplementary Materials for Robust Estimation of the False Discovery Rate
Sulementary Materials for Robust Estimation of the False Discovery Rate Stan Pounds and Cheng Cheng This sulemental contains roofs regarding theoretical roerties of the roosed method (Section S1), rovides
More informationAnalysis of Multi-Hop Emergency Message Propagation in Vehicular Ad Hoc Networks
Analysis of Multi-Ho Emergency Message Proagation in Vehicular Ad Hoc Networks ABSTRACT Vehicular Ad Hoc Networks (VANETs) are attracting the attention of researchers, industry, and governments for their
More informationRobust Predictive Control of Input Constraints and Interference Suppression for Semi-Trailer System
Vol.7, No.7 (4),.37-38 htt://dx.doi.org/.457/ica.4.7.7.3 Robust Predictive Control of Inut Constraints and Interference Suression for Semi-Trailer System Zhao, Yang Electronic and Information Technology
More informationA Qualitative Event-based Approach to Multiple Fault Diagnosis in Continuous Systems using Structural Model Decomposition
A Qualitative Event-based Aroach to Multile Fault Diagnosis in Continuous Systems using Structural Model Decomosition Matthew J. Daigle a,,, Anibal Bregon b,, Xenofon Koutsoukos c, Gautam Biswas c, Belarmino
More informationGeneral Linear Model Introduction, Classes of Linear models and Estimation
Stat 740 General Linear Model Introduction, Classes of Linear models and Estimation An aim of scientific enquiry: To describe or to discover relationshis among events (variables) in the controlled (laboratory)
More informationMulti-Operation Multi-Machine Scheduling
Multi-Oeration Multi-Machine Scheduling Weizhen Mao he College of William and Mary, Williamsburg VA 3185, USA Abstract. In the multi-oeration scheduling that arises in industrial engineering, each job
More informationAdaptive estimation with change detection for streaming data
Adative estimation with change detection for streaming data A thesis resented for the degree of Doctor of Philosohy of the University of London and the Diloma of Imerial College by Dean Adam Bodenham Deartment
More informationNotes on Instrumental Variables Methods
Notes on Instrumental Variables Methods Michele Pellizzari IGIER-Bocconi, IZA and frdb 1 The Instrumental Variable Estimator Instrumental variable estimation is the classical solution to the roblem of
More informationAnalysis of some entrance probabilities for killed birth-death processes
Analysis of some entrance robabilities for killed birth-death rocesses Master s Thesis O.J.G. van der Velde Suervisor: Dr. F.M. Sieksma July 5, 207 Mathematical Institute, Leiden University Contents Introduction
More informationAn Investigation on the Numerical Ill-conditioning of Hybrid State Estimators
An Investigation on the Numerical Ill-conditioning of Hybrid State Estimators S. K. Mallik, Student Member, IEEE, S. Chakrabarti, Senior Member, IEEE, S. N. Singh, Senior Member, IEEE Deartment of Electrical
More informationLower Confidence Bound for Process-Yield Index S pk with Autocorrelated Process Data
Quality Technology & Quantitative Management Vol. 1, No.,. 51-65, 15 QTQM IAQM 15 Lower onfidence Bound for Process-Yield Index with Autocorrelated Process Data Fu-Kwun Wang * and Yeneneh Tamirat Deartment
More informationProbability Estimates for Multi-class Classification by Pairwise Coupling
Probability Estimates for Multi-class Classification by Pairwise Couling Ting-Fan Wu Chih-Jen Lin Deartment of Comuter Science National Taiwan University Taiei 06, Taiwan Ruby C. Weng Deartment of Statistics
More informationDeveloping A Deterioration Probabilistic Model for Rail Wear
International Journal of Traffic and Transortation Engineering 2012, 1(2): 13-18 DOI: 10.5923/j.ijtte.20120102.02 Develoing A Deterioration Probabilistic Model for Rail Wear Jabbar-Ali Zakeri *, Shahrbanoo
More informationMODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL
Technical Sciences and Alied Mathematics MODELING THE RELIABILITY OF CISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Cezar VASILESCU Regional Deartment of Defense Resources Management
More informationCERIAS Tech Report The period of the Bell numbers modulo a prime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education
CERIAS Tech Reort 2010-01 The eriod of the Bell numbers modulo a rime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education and Research Information Assurance and Security Purdue University,
More informationLinear diophantine equations for discrete tomography
Journal of X-Ray Science and Technology 10 001 59 66 59 IOS Press Linear diohantine euations for discrete tomograhy Yangbo Ye a,gewang b and Jiehua Zhu a a Deartment of Mathematics, The University of Iowa,
More informationAsymptotically Optimal Simulation Allocation under Dependent Sampling
Asymtotically Otimal Simulation Allocation under Deendent Samling Xiaoing Xiong The Robert H. Smith School of Business, University of Maryland, College Park, MD 20742-1815, USA, xiaoingx@yahoo.com Sandee
More informationA PEAK FACTOR FOR PREDICTING NON-GAUSSIAN PEAK RESULTANT RESPONSE OF WIND-EXCITED TALL BUILDINGS
The Seventh Asia-Pacific Conference on Wind Engineering, November 8-1, 009, Taiei, Taiwan A PEAK FACTOR FOR PREDICTING NON-GAUSSIAN PEAK RESULTANT RESPONSE OF WIND-EXCITED TALL BUILDINGS M.F. Huang 1,
More informationUncorrelated Multilinear Principal Component Analysis for Unsupervised Multilinear Subspace Learning
TNN-2009-P-1186.R2 1 Uncorrelated Multilinear Princial Comonent Analysis for Unsuervised Multilinear Subsace Learning Haiing Lu, K. N. Plataniotis and A. N. Venetsanooulos The Edward S. Rogers Sr. Deartment
More informationSystem Reliability Estimation and Confidence Regions from Subsystem and Full System Tests
009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract
More informationSignaled Queueing. Laura Brink, Robert Shorten, Jia Yuan Yu ABSTRACT. Categories and Subject Descriptors. General Terms. Keywords
Signaled Queueing Laura Brink, Robert Shorten, Jia Yuan Yu ABSTRACT Burstiness in queues where customers arrive indeendently leads to rush eriods when wait times are long. We roose a simle signaling scheme
More information2-D Analysis for Iterative Learning Controller for Discrete-Time Systems With Variable Initial Conditions Yong FANG 1, and Tommy W. S.
-D Analysis for Iterative Learning Controller for Discrete-ime Systems With Variable Initial Conditions Yong FANG, and ommy W. S. Chow Abstract In this aer, an iterative learning controller alying to linear
More informationRadial Basis Function Networks: Algorithms
Radial Basis Function Networks: Algorithms Introduction to Neural Networks : Lecture 13 John A. Bullinaria, 2004 1. The RBF Maing 2. The RBF Network Architecture 3. Comutational Power of RBF Networks 4.
More informationA Simple Throughput Model for TCP Veno
A Simle Throughut Model for TCP Veno Bin Zhou, Cheng Peng Fu, Dah-Ming Chiu, Chiew Tong Lau, and Lek Heng Ngoh School of Comuter Engineering, Nanyang Technological University, Singaore 639798 Email: {zhou00,
More informationPlotting the Wilson distribution
, Survey of English Usage, University College London Setember 018 1 1. Introduction We have discussed the Wilson score interval at length elsewhere (Wallis 013a, b). Given an observed Binomial roortion
More informationA Study of Active Queue Management for Congestion Control
A Study of Active Queue Management for Congestion Control Victor Firoiu vfiroiu@nortelnetworks.com Nortel Networks 3 Federal St. illerica, MA 1821 USA Marty orden mborden@tollbridgetech.com Tollridge Technologies
More informationModeling and Estimation of Full-Chip Leakage Current Considering Within-Die Correlation
6.3 Modeling and Estimation of Full-Chi Leaage Current Considering Within-Die Correlation Khaled R. eloue, Navid Azizi, Farid N. Najm Deartment of ECE, University of Toronto,Toronto, Ontario, Canada {haled,nazizi,najm}@eecg.utoronto.ca
More informationNew Schedulability Test Conditions for Non-preemptive Scheduling on Multiprocessor Platforms
New Schedulability Test Conditions for Non-reemtive Scheduling on Multirocessor Platforms Technical Reort May 2008 Nan Guan 1, Wang Yi 2, Zonghua Gu 3 and Ge Yu 1 1 Northeastern University, Shenyang, China
More informationJohn Weatherwax. Analysis of Parallel Depth First Search Algorithms
Sulementary Discussions and Solutions to Selected Problems in: Introduction to Parallel Comuting by Viin Kumar, Ananth Grama, Anshul Guta, & George Karyis John Weatherwax Chater 8 Analysis of Parallel
More informationCOMMUNICATION BETWEEN SHAREHOLDERS 1
COMMUNICATION BTWN SHARHOLDRS 1 A B. O A : A D Lemma B.1. U to µ Z r 2 σ2 Z + σ2 X 2r ω 2 an additive constant that does not deend on a or θ, the agents ayoffs can be written as: 2r rθa ω2 + θ µ Y rcov
More informationA SIMPLE PLASTICITY MODEL FOR PREDICTING TRANSVERSE COMPOSITE RESPONSE AND FAILURE
THE 19 TH INTERNATIONAL CONFERENCE ON COMPOSITE MATERIALS A SIMPLE PLASTICITY MODEL FOR PREDICTING TRANSVERSE COMPOSITE RESPONSE AND FAILURE K.W. Gan*, M.R. Wisnom, S.R. Hallett, G. Allegri Advanced Comosites
More informationEstimating function analysis for a class of Tweedie regression models
Title Estimating function analysis for a class of Tweedie regression models Author Wagner Hugo Bonat Deartamento de Estatística - DEST, Laboratório de Estatística e Geoinformação - LEG, Universidade Federal
More informations v 0 q 0 v 1 q 1 v 2 (q 2) v 3 q 3 v 4
Discrete Adative Transmission for Fading Channels Lang Lin Λ, Roy D. Yates, Predrag Sasojevic WINLAB, Rutgers University 7 Brett Rd., NJ- fllin, ryates, sasojevg@winlab.rutgers.edu Abstract In this work
More informationPaper C Exact Volume Balance Versus Exact Mass Balance in Compositional Reservoir Simulation
Paer C Exact Volume Balance Versus Exact Mass Balance in Comositional Reservoir Simulation Submitted to Comutational Geosciences, December 2005. Exact Volume Balance Versus Exact Mass Balance in Comositional
More informationLECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)]
LECTURE 7 NOTES 1. Convergence of random variables. Before delving into the large samle roerties of the MLE, we review some concets from large samle theory. 1. Convergence in robability: x n x if, for
More informationNamed Entity Recognition using Maximum Entropy Model SEEM5680
Named Entity Recognition using Maximum Entroy Model SEEM5680 Named Entity Recognition System Named Entity Recognition (NER): Identifying certain hrases/word sequences in a free text. Generally it involves
More informationAn Ant Colony Optimization Approach to the Probabilistic Traveling Salesman Problem
An Ant Colony Otimization Aroach to the Probabilistic Traveling Salesman Problem Leonora Bianchi 1, Luca Maria Gambardella 1, and Marco Dorigo 2 1 IDSIA, Strada Cantonale Galleria 2, CH-6928 Manno, Switzerland
More informationCovariance Matrix Estimation for Reinforcement Learning
Covariance Matrix Estimation for Reinforcement Learning Tomer Lancewicki Deartment of Electrical Engineering and Comuter Science University of Tennessee Knoxville, TN 37996 tlancewi@utk.edu Itamar Arel
More information8 STOCHASTIC PROCESSES
8 STOCHASTIC PROCESSES The word stochastic is derived from the Greek στoχαστικoς, meaning to aim at a target. Stochastic rocesses involve state which changes in a random way. A Markov rocess is a articular
More informationSampling and Distortion Tradeoffs for Bandlimited Periodic Signals
Samling and Distortion radeoffs for Bandlimited Periodic Signals Elaheh ohammadi and Farokh arvasti Advanced Communications Research Institute ACRI Deartment of Electrical Engineering Sharif University
More informationElementary Analysis in Q p
Elementary Analysis in Q Hannah Hutter, May Szedlák, Phili Wirth November 17, 2011 This reort follows very closely the book of Svetlana Katok 1. 1 Sequences and Series In this section we will see some
More informationA Special Case Solution to the Perspective 3-Point Problem William J. Wolfe California State University Channel Islands
A Secial Case Solution to the Persective -Point Problem William J. Wolfe California State University Channel Islands william.wolfe@csuci.edu Abstract In this aer we address a secial case of the ersective
More informationarxiv: v3 [physics.data-an] 23 May 2011
Date: October, 8 arxiv:.7v [hysics.data-an] May -values for Model Evaluation F. Beaujean, A. Caldwell, D. Kollár, K. Kröninger Max-Planck-Institut für Physik, München, Germany CERN, Geneva, Switzerland
More informationWeakly Short Memory Stochastic Processes: Signal Processing Perspectives
Weakly Short emory Stochastic Processes: Signal Processing Persectives by Garimella Ramamurthy Reort No: IIIT/TR/9/85 Centre for Security, Theory and Algorithms International Institute of Information Technology
More informationFinite Mixture EFA in Mplus
Finite Mixture EFA in Mlus November 16, 2007 In this document we describe the Mixture EFA model estimated in Mlus. Four tyes of deendent variables are ossible in this model: normally distributed, ordered
More informationUniversal Finite Memory Coding of Binary Sequences
Deartment of Electrical Engineering Systems Universal Finite Memory Coding of Binary Sequences Thesis submitted towards the degree of Master of Science in Electrical and Electronic Engineering in Tel-Aviv
More informationSTABILITY ANALYSIS AND CONTROL OF STOCHASTIC DYNAMIC SYSTEMS USING POLYNOMIAL CHAOS. A Dissertation JAMES ROBERT FISHER
STABILITY ANALYSIS AND CONTROL OF STOCHASTIC DYNAMIC SYSTEMS USING POLYNOMIAL CHAOS A Dissertation by JAMES ROBERT FISHER Submitted to the Office of Graduate Studies of Texas A&M University in artial fulfillment
More informationTests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test)
Chater 225 Tests for Two Proortions in a Stratified Design (Cochran/Mantel-Haenszel Test) Introduction In a stratified design, the subects are selected from two or more strata which are formed from imortant
More informationRe-entry Protocols for Seismically Active Mines Using Statistical Analysis of Aftershock Sequences
Re-entry Protocols for Seismically Active Mines Using Statistical Analysis of Aftershock Sequences J.A. Vallejos & S.M. McKinnon Queen s University, Kingston, ON, Canada ABSTRACT: Re-entry rotocols are
More informationAnalysis of execution time for parallel algorithm to dertmine if it is worth the effort to code and debug in parallel
Performance Analysis Introduction Analysis of execution time for arallel algorithm to dertmine if it is worth the effort to code and debug in arallel Understanding barriers to high erformance and redict
More informationOne-way ANOVA Inference for one-way ANOVA
One-way ANOVA Inference for one-way ANOVA IPS Chater 12.1 2009 W.H. Freeman and Comany Objectives (IPS Chater 12.1) Inference for one-way ANOVA Comaring means The two-samle t statistic An overview of ANOVA
More informationChapter 7 Sampling and Sampling Distributions. Introduction. Selecting a Sample. Introduction. Sampling from a Finite Population
Chater 7 and s Selecting a Samle Point Estimation Introduction to s of Proerties of Point Estimators Other Methods Introduction An element is the entity on which data are collected. A oulation is a collection
More informationComputer arithmetic. Intensive Computation. Annalisa Massini 2017/2018
Comuter arithmetic Intensive Comutation Annalisa Massini 7/8 Intensive Comutation - 7/8 References Comuter Architecture - A Quantitative Aroach Hennessy Patterson Aendix J Intensive Comutation - 7/8 3
More informationAn Outdoor Recreation Use Model with Applications to Evaluating Survey Estimators
United States Deartment of Agriculture Forest Service Southern Research Station An Outdoor Recreation Use Model with Alications to Evaluating Survey Estimators Stanley J. Zarnoch, Donald B.K. English,
More informationBayesian Model Averaging Kriging Jize Zhang and Alexandros Taflanidis
HIPAD LAB: HIGH PERFORMANCE SYSTEMS LABORATORY DEPARTMENT OF CIVIL AND ENVIRONMENTAL ENGINEERING AND EARTH SCIENCES Bayesian Model Averaging Kriging Jize Zhang and Alexandros Taflanidis Why use metamodeling
More informationStatics and dynamics: some elementary concepts
1 Statics and dynamics: some elementary concets Dynamics is the study of the movement through time of variables such as heartbeat, temerature, secies oulation, voltage, roduction, emloyment, rices and
More informationVision Graph Construction in Wireless Multimedia Sensor Networks
University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln CSE Conference and Worksho Paers Comuter Science and Engineering, Deartment of 21 Vision Grah Construction in Wireless Multimedia
More informationDETC2003/DAC AN EFFICIENT ALGORITHM FOR CONSTRUCTING OPTIMAL DESIGN OF COMPUTER EXPERIMENTS
Proceedings of DETC 03 ASME 003 Design Engineering Technical Conferences and Comuters and Information in Engineering Conference Chicago, Illinois USA, Setember -6, 003 DETC003/DAC-48760 AN EFFICIENT ALGORITHM
More informationMulti-Channel MAC Protocol for Full-Duplex Cognitive Radio Networks with Optimized Access Control and Load Balancing
Multi-Channel MAC Protocol for Full-Dulex Cognitive Radio etworks with Otimized Access Control and Load Balancing Le Thanh Tan and Long Bao Le arxiv:.3v [cs.it] Feb Abstract In this aer, we roose a multi-channel
More informationGenetic Algorithms, Selection Schemes, and the Varying Eects of Noise. IlliGAL Report No November Department of General Engineering
Genetic Algorithms, Selection Schemes, and the Varying Eects of Noise Brad L. Miller Det. of Comuter Science University of Illinois at Urbana-Chamaign David E. Goldberg Det. of General Engineering University
More informationPositive decomposition of transfer functions with multiple poles
Positive decomosition of transfer functions with multile oles Béla Nagy 1, Máté Matolcsi 2, and Márta Szilvási 1 Deartment of Analysis, Technical University of Budaest (BME), H-1111, Budaest, Egry J. u.
More informationObjectives. 6.1, 7.1 Estimating with confidence (CIS: Chapter 10) CI)
Objectives 6.1, 7.1 Estimating with confidence (CIS: Chater 10) Statistical confidence (CIS gives a good exlanation of a 95% CI) Confidence intervals. Further reading htt://onlinestatbook.com/2/estimation/confidence.html
More informationResearch Article An iterative Algorithm for Hemicontractive Mappings in Banach Spaces
Abstract and Alied Analysis Volume 2012, Article ID 264103, 11 ages doi:10.1155/2012/264103 Research Article An iterative Algorithm for Hemicontractive Maings in Banach Saces Youli Yu, 1 Zhitao Wu, 2 and
More informationImproved Bounds on Bell Numbers and on Moments of Sums of Random Variables
Imroved Bounds on Bell Numbers and on Moments of Sums of Random Variables Daniel Berend Tamir Tassa Abstract We rovide bounds for moments of sums of sequences of indeendent random variables. Concentrating
More informationMobility-Induced Service Migration in Mobile. Micro-Clouds
arxiv:503054v [csdc] 7 Mar 205 Mobility-Induced Service Migration in Mobile Micro-Clouds Shiiang Wang, Rahul Urgaonkar, Ting He, Murtaza Zafer, Kevin Chan, and Kin K LeungTime Oerating after ossible Deartment
More informationDeriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V.
Deriving ndicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V. Deutsch Centre for Comutational Geostatistics Deartment of Civil &
More informationarxiv: v1 [cs.ro] 24 May 2017
A Near-Otimal Searation Princile for Nonlinear Stochastic Systems Arising in Robotic Path Planning and Control Mohammadhussein Rafieisakhaei 1, Suman Chakravorty 2 and P. R. Kumar 1 arxiv:1705.08566v1
More informationMetrics Performance Evaluation: Application to Face Recognition
Metrics Performance Evaluation: Alication to Face Recognition Naser Zaeri, Abeer AlSadeq, and Abdallah Cherri Electrical Engineering Det., Kuwait University, P.O. Box 5969, Safat 6, Kuwait {zaery, abeer,
More information