Replication Strategies in Unstructured Peer-to-Peer Networks


Edith Cohen
AT&T Labs Research
180 Park Avenue, Florham Park, NJ 07932, USA

Scott Shenker
ICSI
Berkeley, CA, USA

ABSTRACT

The Peer-to-Peer (P2P) architectures that are most prevalent in today's Internet are decentralized and unstructured. Search is blind in that it is independent of the query and is thus no more effective than probing randomly chosen peers. One technique to improve the effectiveness of blind search is to proactively replicate data. We evaluate and compare different replication strategies and reveal interesting structure: two very common but very different replication strategies, uniform and proportional, yield the same average performance on successful queries, and are in fact worse than any replication strategy which lies between them. The optimal strategy lies between the two and can be achieved by simple distributed algorithms. These fundamental results offer a new understanding of replication and show that currently deployed replication strategies are far from optimal and that optimal replication is attainable by protocols that resemble existing ones in simplicity and operation.

Categories and Subject Descriptors
C.2 [Communication Networks]; H.3 [Information Storage and Retrieval]; F.2 [Analysis of Algorithms]

General Terms
Algorithms

Keywords
replication; peer-to-peer; random search

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGCOMM'02, August 19-23, 2002, Pittsburgh, Pennsylvania, USA. Copyright 2002 ACM X/02/ $5.00.

1. INTRODUCTION

Peer-to-peer (P2P) systems, almost unheard of three years ago, are now one of the most popular Internet applications and a very significant source of Internet traffic. While Napster's recent legal troubles may lead to its demise, there are many other P2P systems that are continuing their meteoric growth. Despite their growing importance, the performance of P2P systems is not yet well understood.

P2P systems were classified by [9] into three different categories. Some P2P systems, such as Napster [6], are centralized in that they have a central directory server to which users can submit queries (or searches). Other P2P systems are decentralized and have no central server; the hosts form an ad hoc network among themselves and send their queries to their peers. Of these decentralized designs, some are structured in that they have close coupling between the P2P network topology and the location of data; see [12, 10, 11, 15, 2] for a sampling of these designs. Other decentralized P2P systems, such as Gnutella [3] and FastTrack [13]-based Morpheus [5] and KaZaA [4], are unstructured with no coupling between topology and data location. Each of these design styles (centralized, decentralized structured, and decentralized unstructured) has its advantages and disadvantages, and it is not our intent to advocate for a particular choice among them. However, the decentralized unstructured systems are the most commonly used in today's Internet. They also raise an important performance issue, namely how to replicate data in such systems, and that performance issue is the subject of this paper.

In these decentralized unstructured P2P systems, the hosts form a P2P overlay network; each host has a set of neighbors that are chosen when it joins the network. A host sends its query (e.g., searching for a particular file) to other hosts in the network; Gnutella uses a flooding algorithm to propagate the query, but many other query-propagation approaches are possible.
The fundamental point is that in these unstructured systems, because the P2P network topology is unrelated to the location of data, the set of nodes receiving a particular query is unrelated to the content of the query. A host doesn't have any information about which other hosts may best be able to resolve the query. Thus, these blind search probes cannot be on average more effective than probing random nodes. Indeed, simulations in [9] suggest that random probing is a reasonable model for search performance in these decentralized unstructured P2P systems.

To improve system performance, one wants to minimize the number of hosts that have to be probed before the query is resolved. One way to do this is to replicate the data on several hosts.^1 That is, either when the data is originally stored or when it is the subject of a later search, the data can be proactively replicated on other hosts. Gnutella does not support proactive replication, but at least part of the current success of FastTrack-based P2P networks can be attributed to replication: FastTrack designates high-bandwidth nodes as search-hubs (super-nodes). Each supernode replicates the index of several other peers. As a result, each FastTrack search probe emulates several Gnutella probes and thus is much more effective.

^1 For the basic questions we address, it does not matter if the actual data is replicated or if only pointers to the data are replicated. The actual use depends on the architecture and is orthogonal to the scope of this paper. Thus, in the sequel, "copies" refers to actual copies or pointers.

It is clear that blind search is more effective when a larger index can be viewed per probe, but even though FastTrack increased per-probe capacity, it basically uses the same replication strategy as Gnutella: the relative index capacity dedicated to each item is proportional to the number of peers that have copies.

Thus, current approaches to replication are largely implicit. We consider the case where one designs an explicit replication strategy. The fundamental question we address in our paper is: given fixed constraints on per-probe capacity, what is the optimal way to replicate data?^2

^2 Our analysis assumes an adaptive termination mechanism, where the search is stopped once the query is resolved. See [7] for a discussion of a different question related to replication in P2P networks without adaptive termination.

We define replication strategies that, given a query frequency distribution, specify for each item the number of copies made. The main metric we consider is the performance on successful queries, which we measure by the resulting expected (or average) search size A. We also consider performance on insoluble queries, which is captured by the maximum search size allowed by the system.

Our analysis reveals several surprising fundamental results. We first consider two natural but very different replication strategies: Uniform and Proportional. The Uniform strategy, replicating everything equally, appears naive, whereas the Proportional strategy, where more popular items are more replicated, is designed to perform better. However, we show that the two replication strategies have the same expected search size on successful queries. Furthermore, we show that the Uniform and Proportional strategies constitute two extreme points of a large family of strategies which lie "between" the two and that any other strategy in this family has better expected search size. We then show that one of the strategies in this family, Square-root replication, minimizes the expected search size on successful queries.

Proportional and Uniform replications, however, are not equivalent. Proportional makes popular items easier to find and less popular items harder to find. In particular, Proportional replication requires a much higher limit on the maximum search size while Uniform minimizes this limit and, thus, minimizes resources consumed on processing insoluble queries. We show that the maximum search size with the Square-root strategy is in between the two (though closer to Uniform), and define a range of optimal strategies that balance the processing of soluble and insoluble queries. Interestingly, these optimal strategies lie "between" Uniform and Square-root and are "far" from Proportional.

Last, we address how to implement these replication strategies in the distributed setting of an unstructured P2P network. It is easy to implement the Uniform and Proportional replication strategies: for Uniform the system creates a fixed number of copies when the item first enters the system; for Proportional the system creates a fixed number of copies every time the item is queried. What isn't clear is how to implement the Square-root replication policy in a distributed fashion. We present several simple distributed algorithms which produce optimal replication.
Surprisingly, one of these algorithms, path replication, is implemented, in a somewhat different form, in the Freenet P2P system.

These results are particularly intriguing given that Proportional replication is close to what is used by current P2P networks: with Gnutella there is no proactive replication; if we assume that all copies are the result of previous queries, then the number of copies is proportional to the query rate for that item. Even FastTrack, which uses proactive replication, replicates the complete index of each node at a search hub, so the relative representation of each item remains the same. Our results suggest that Proportional, although intuitively appealing, is far from optimal, and that a simple distributed algorithm which is consistent in spirit with FastTrack replication is able to obtain optimal replication.

In the next section we introduce our model and metrics and present a precise statement of the problem. In Section 3 we examine the Uniform and Proportional replication strategies. In Section 4 we show that Square-root replication minimizes the expected search size on soluble queries and provide some comparisons to Proportional and Uniform replication. Section 5 defines the optimal policy parameterized by varying cost of insoluble queries. In Section 6 we present distributed algorithms that yield Square-root replication and present simulation results of their performance.

2. MODEL AND PROBLEM STATEMENT

The network consists of $n$ nodes, each with capacity $\rho$, which is the number of copies/keys that the node can hold.^3 Let $R = n\rho$ denote the total capacity of the system. There are $m$ distinct data items in the system. The normalized vector of query rates takes the form $q = (q_1 \ge q_2 \ge \cdots \ge q_m)$ with $\sum_i q_i = 1$. The query rate $q_i$ is the fraction of all queries that are issued for the $i$th item.

An allocation is a mapping of items to the number of copies of that item (where we assume there is no more than one copy per node).
We let $r_i$ denote the number of copies of the $i$th item ($r_i$ counts all copies, including the original one), and let $p_i \equiv r_i/R$ be the fraction of the total system capacity allotted to item $i$, so that $\sum_{i=1}^{m} r_i = R$. The allocation is represented by the vector $p = (r_1/R, r_2/R, \ldots, r_m/R)$. A replication or allocation strategy is a mapping from the query rate distribution $q$ to the allocation $p$.

We assume $R \ge m \ge \rho$ because outside of this region the problem is either trivial (if $m \le \rho$ the optimal allocation is to have copies of all items on all nodes) or insoluble (if $m > R$ there is no allocation with all items having at least one copy).

Our analysis is geared for large values of $n$ (and $R = \rho n$), with our results generally stated in terms of $p$ and $\rho$ with $n$ factored out. Thus, we do not concern ourselves with integrality restrictions on the $r_i$'s. However, we do care about the bounds on the quantities $p_i$. Since $r_i \ge 1$, we have $p_i \ge l$ where $l = 1/R$. Since there is no reason to have more than one copy of an item on a single node, we have $r_i \le n$ and so $p_i \le u$ where $u = n/R = \rho^{-1}$. Later in this section we will discuss reasons why these bounds may be made more strict.

^3 Later on in this section we explain how to extend the definitions, results, and metrics to the case where nodes have heterogeneous capacities.

We argued in the introduction that performance of blind search is captured well by random probes. This abstraction allows us to evaluate performance without having to consider the specifics of the overlay structure. Simulations in [9] show that random probes are a reasonable model for several conceivable designs including Gnutella-like overlays.

Specifically, the search mechanism we consider is random search: the search repeatedly draws a node uniformly at random and asks for a copy of the item; the search is stopped when the item is found. The search size is the number of nodes drawn until an answer is found. Each probe succeeds with probability $r_i/n = \rho p_i$, so with the random search mechanism, search sizes are random variables drawn from a Geometric distribution with expectation $1/(\rho p_i)$. Thus, performance is determined by how many nodes have copies of any particular item.

For a query distribution $q$ and an allocation $p$, we define the expected search size (ESS) $A_q(p)$ to be the expected number of nodes one needs to visit until an answer to the query is found, averaged over all items. It is not hard to see that

$$A_q(p) = \frac{1}{\rho} \sum_i \frac{q_i}{p_i}. \qquad (1)$$

The set of legal allocations $P$ is a polyhedron defined by the $(m-1)$-dimensional simplex of distributions on $m$ items, intersected with an $m$-dimensional hypercube, which defines upper and lower bounds on the number of copies of each item:

$$\sum_{i=1}^{m} p_i = 1 \qquad (2)$$

$$l \le p_i \le u. \qquad (3)$$

The legal allocation that minimizes the expected search size is the solution to the optimization problem

$$\text{Minimize } \sum_{i=1}^{m} \frac{q_i}{p_i} \text{ such that } p \in P.$$

One obvious property of the optimal solution is monotonicity:

$$u \ge p_1 \ge p_2 \ge \cdots \ge p_m \ge l. \qquad (4)$$

(If $p$ is not monotone then consider two items $i, j$ with $q_i > q_j$ and $p_i < p_j$. The allocation with $p_i$ and $p_j$ swapped is legal, if $p$ was legal, and has a lower ESS.)
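As a small illustration of Equation (1) and of the swap argument for monotonicity, the following Python sketch evaluates the ESS of two allocations for the same query distribution. The function name `expected_search_size` and the three-item example values are assumptions made for illustration; they do not come from the paper.

```python
def expected_search_size(q, p, rho=1.0):
    """Equation (1): A_q(p) = (1/rho) * sum_i q_i / p_i.

    q: query rates (sum to 1); p: allocation fractions (sum to 1);
    rho: per-node capacity. All example values below are illustrative.
    """
    if abs(sum(q) - 1.0) > 1e-9 or abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("q and p must each sum to 1")
    return sum(qi / pi for qi, pi in zip(q, p)) / rho


# Query rates sorted by decreasing popularity.
q = [0.6, 0.3, 0.1]

# A monotone allocation, and the same allocation with the first two
# entries swapped (so the more popular item gets the smaller share).
monotone = [0.5, 0.3, 0.2]
swapped = [0.3, 0.5, 0.2]

ess_monotone = expected_search_size(q, monotone)
ess_swapped = expected_search_size(q, swapped)
print(ess_monotone, ess_swapped)
```

In this example the monotone allocation has the smaller ESS, as the swap argument predicts: undoing the swap moves capacity toward the item that is queried more often.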
In the next section we explore the performance of two common replication strategies, Uniform and Proportional. However, we first discuss refinements of our basic model.

2.1 Bounded search size and insoluble queries

So far we have assumed that searches continue until the item is found, but we now discuss applying these results in a more realistic setting where searches are truncated when some maximal search size $L$ is reached. We say an item is locatable if a search for it is successful with high probability. Evidently, an item is locatable if the fraction of nodes with a copy, $\rho p_i$, is sufficiently large with respect to $L$.

Clearly, truncated searches have some likelihood of failing altogether. The probability of failure is at most $2^{-C}$ if $p_i \ge C/(\rho L)$. Thus, if $C = O(\log L)$ then the failure probability would be polynomially small in $L$. In order to compare two allocations under truncated search, we set the relation between $L$ and $l$ such that the same base set of items is locatable.

We now argue that when all items are locatable, Expression (1) constitutes a close approximation of the expected truncated search size. The distribution of search sizes under truncated random searches is a Geometric distribution with parameter $\rho p_i$, where values greater than $L$ are truncated to $L$. Thus, Expression (1), which models untruncated search, constitutes an upper bound on the truncated expected search size. The error, however, is at most

$$\max_i \sum_{j \ge 0} (1 - \rho p_i)^{L+j} = \max_i \frac{(1 - \rho p_i)^{L}}{\rho p_i} \le \frac{\exp(-C)}{\rho l} = \frac{L}{C} \exp(-C),$$

where $l = C/(\rho L)$. In the sequel, we assume that $l$ is such that $l > \ln L/(\rho L)$ and use the untruncated approximation (1) for the expected search size (it is within an additive term of $\exp(-l \rho L)/(\rho l) < 1$). Also note that the likelihood of an unsuccessful search for a locatable item is at most $\exp(-l \rho L) < 1/L$. Thus, even though each locatable item can have some failed searches, these searches constitute a very small fraction of total searches for the item.
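The relationship between truncated and untruncated search can be checked numerically. The sketch below is an illustration under assumed values ($L = 1000$ and a per-probe hit probability $x = \rho p_i = 0.01$, so $C = xL = 10$); the function names are hypothetical. It uses the closed form $E[\min(X, L)] = (1 - (1-x)^L)/x$ for a Geometric search size $X$.

```python
import math


def failure_probability(x, L):
    """Probability that L independent probes all miss; x = rho * p_i."""
    return (1.0 - x) ** L


def truncated_ess(x, L):
    """E[min(X, L)] for X ~ Geometric(x): sum_{k=1..L} (1 - x)**(k - 1)."""
    return (1.0 - (1.0 - x) ** L) / x


def untruncated_ess(x):
    """Mean of the untruncated Geometric(x) search size."""
    return 1.0 / x


# Illustrative values (assumptions, not taken from the paper).
L = 1000           # maximum search size
x = 0.01           # fraction of nodes holding the item, rho * p_i
C = x * L          # so that p_i = C / (rho * L)

error = untruncated_ess(x) - truncated_ess(x, L)
bound = (L / C) * math.exp(-C)

print("failure probability:", failure_probability(x, L))
print("approximation error:", error, "bound:", bound)
```

With these values the untruncated formula overestimates the truncated expected search size by a small fraction of a probe, and the overestimate stays below the $(L/C)\exp(-C)$ bound.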
The maximum search size parameter $L$ is also important for analyzing the cost of insoluble queries; that is, queries made to items that are not locatable. In actual systems, some fraction of queries are insoluble, and search performed on such queries would continue until the maximum search size is exceeded. The cost of these queries is not captured by the ESS metric, but is proportional to $L$ and to the fraction of queries that are insoluble.

When comparing replication strategies on the same set of locatable items, we need to consider both the ESS, which captures performance on soluble queries, and $L$, which captures performance on insoluble queries. In this situation we assume that the respective $L(p)$ is the solution of $\min_i p_i = \ln L/(\rho L)$. Thus, if $f_s$ is the fraction of queries which are soluble and $(1 - f_s)$ is the fraction of insoluble queries, the performance of the allocation $p$ is

$$f_s A_q(p) + (1 - f_s) L(p). \qquad (5)$$

2.2 Heterogeneous capacities and bandwidth

So far we have considered a homogeneous setting, where all copies have the same size and all nodes have the same storage and the same likelihood of getting probed. In reality, hosts have different capacities and bandwidth and, in fact, the most recent wave of unstructured P2P networks [5, 4] exploits this asymmetry. For simplicity of presentation we will keep using the homogeneous setting in the sequel, but we note that all our results can be generalized in a fairly straightforward way to heterogeneous systems.

Suppose nodes have capacities $\rho_i$ and visitation weights $v_i$ (in the homogeneous case $v_i = 1$; in general $v_i$ is the factor by which the visitation rate differs from the average). It is not hard to see that the average capacity seen per probe is $\bar{\rho} = \frac{1}{n} \sum_i v_i \rho_i$. The quantity $\bar{\rho}$ simply replaces $\rho$ in Equation (1) and, in particular, the ESS of all allocations is affected by the same factor; thus all our findings on the relative performance of different allocations also apply for heterogeneous capacities and visitation rates.
An issue that arises when the replication is of copies (rather than pointers) is that items often have very different sizes. Our subsequent analysis can be extended to this case by treating the $i$th item as a group of $c_i$ items with the same query rate, where $c_i$ is the size of item $i$. As a result we obtain altered definitions of the three basic allocations we consider:

Proportional has $p_i = c_i q_i / \sum_j c_j q_j$ (proportional to query rate and size).

Uniform has $p_i = c_i / \sum_j c_j$ (proportional to the item's size).

Square-root has $p_i = c_i \sqrt{q_i} / \sum_j c_j \sqrt{q_j}$ (proportional to size and to the square root of the query rate).

3. ALLOCATION STRATEGIES

3.1 Uniform and Proportional

We now address the performance of two replication strategies. The Uniform replication strategy is where all items are equally replicated:

Definition 3.1. Uniform allocation is defined when $l \le 1/m \le u$ and has $p_i = 1/m$ for all $i = 1, \ldots, m$.

This is a very primitive replication strategy, where all items are treated identically even though some items are more popular than others. One wouldn't, initially, think that such a strategy would produce good results.

When there are restrictions on the search size (and thus on $l$), Uniform allocation has the property that it is defined for all $q$ for which some legal allocation exists: any allocation $p$ other than Uniform must have some $i$ where $p_i > 1/m$ and some $j$ where $p_j < 1/m$; if Uniform results in allocations outside the interval $[l, u]$, then either $1/m > u$ (and thus $p_i > u$) or $1/m < l$ (and thus $p_j < l$).

An important appeal of the Uniform allocation is that it minimizes the required maximum search size and, thus, minimizes system resources spent on insoluble queries. It follows from Equation (5) that when a large fraction of queries are insoluble, Uniform is (close to) optimal.

Another natural replication strategy is to have $r_i$ be proportional to the query rate.

Definition 3.2. Proportional allocation is defined when $l \le q_i \le u$. The Proportional solution has $p_i = q_i$ for all $i$.

Some of the intuitive appeal of Proportional allocation is that it minimizes the maximum utilization rate [9]. This metric is relevant when the replication is of copies rather than of pointers; that is, when a successful probe is much more expensive to process than an unsuccessful probe.
The utilization rate of a copy is the average rate of requests it serves. Under random search, all copies of the same item $i$ have the same utilization rate $q_i/p_i$; when there are more copies, each individual copy has a lower utilization. To avoid hot-spots, it is desirable to have low values for the maximum utilization rate of a copy, $\max_i q_i/p_i$. The average utilization over all copies, $\sum_i p_i q_i/p_i = 1$, is independent of the allocation $p$. The Proportional allocation has $q_i/p_i = 1$ for all $i$, and so it clearly minimizes the maximal value $\max_i q_i/p_i$.

When compared to Uniform, Proportional improves the most common searches at the expense of the rare ones, which presumably would improve overall performance. We now analyze some of the properties of these two strategies. A straightforward calculation reveals the following surprising result:

Lemma 3.1. Proportional and Uniform allocations have the same expected search size $A = m/\rho$, which is independent of the query distribution.

We note that Lemma 3.1 easily generalizes to all allocations that are a mix of Proportional and Uniform, in which each item $i$ has $p_i \in \{1/m, q_i\}$. We next characterize the space of allocations.

3.2 Characterizing allocations

As a warmup, we consider the space of allocations for two items ($m = 2$). We shall see that Proportional and Uniform constitute two points in this space, with anything between them achieving better performance and anything outside having worse performance.

Consider a pair of items with $q_i \ge q_j$. The range of allocations is defined by a single parameter $0 < x < 1$, with $p_i/(p_i + p_j) = x$ and $p_j/(p_i + p_j) = 1 - x$. Proportional corresponds to $x = q_i/(q_i + q_j)$, and Uniform to $x = 0.5$. The range $0.5 \le x \le q_i/(q_i + q_j)$ captures allocations "between" Uniform and Proportional. The "outside" range $0 < x < 0.5$ contains non-monotone allocations where the less popular item obtains a larger allocation.
The "outside" range $1 > x > q_i/(q_i + q_j)$ contains allocations where the relative allocation of the more popular item is larger than its relative query rate. These different allocations are visualized in Figure 1(A), which plots $p_i/p_j$ as a function of $q_i/q_j$.

The ESS for these two items is proportional to $q_i/x + q_j/(1 - x)$. This function has equal value at $x = 0.5$ and $x = q_i/(q_i + q_j)$ and is convex. The minimum is obtained at some middle point in the "between" range. By taking the first derivative, equating it to zero, and solving the resulting quadratic equation, we obtain that the minimum is obtained at $x = \sqrt{q_i}/(\sqrt{q_i} + \sqrt{q_j})$. Figure 1(B) shows the expected search size when using these allocations on two items ($m = 2$ and $\rho = 1$). In this case, the maximum gain factor from using the optimal allocation over Uniform or Proportional is 2.

In the sequel we extend the observations made here to an arbitrary number of items ($m \ge 2$). In particular, we develop a notion of an allocation being "between" Uniform and Proportional and show that these allocations have better ESS, and that "outside" allocations have worse ESS. We also define the policy that minimizes the ESS, and we bound the maximum gain as a function of $m$, $u$, and $l$.

3.3 Between Uniform and Proportional

For $m = 2$, we noticed that all allocations that lie "between" Uniform and Proportional have smaller ESS. For general $m$ we first define a precise notion of being between Uniform and Proportional:

Definition 3.3. An allocation $p$ lies between Uniform and Proportional if for any pair of items $i < j$ we have $q_i/q_j \ge p_i/p_j \ge 1$; that is, the ratio of allocations $p_i/p_j$ is between 1 ("Uniform") and the ratio of their query rates $q_i/q_j$ ("Proportional").

Note that this family includes Uniform and Proportional and that all allocations in this family are monotone. We now establish that all allocations in this family other than Uniform and Proportional have a strictly better expected search size:

Theorem 3.1. Consider an allocation $p$ between Uniform and Proportional.
Then $p$ has an expected search size of at most $m/\rho$. Moreover, if $p$ is different from Uniform or Proportional then its expected search size is strictly less than $m/\rho$.

[Figure 1: (A) The space of allocations on two items: relative allocation $p_2/p_1$ as a function of $q_2/q_1$, showing the Uniform, Square-root, and Proportional allocations and the regions with better or worse ESS than Proportional/Uniform. (B) The average query cost for two items: expected search size for $m = 2$, $\rho = 1$, $q_2 = 1 - q_1$ as a function of $q_1$, for Uniform/Proportional and Square-root.]

Proof. The limits $q_i/q_j \ge p_i/p_j \ge 1$ hold if and only if they hold for any consecutive pair of items (that is, for $j = i + 1$). Thus, the set of "between" allocations is characterized by the $(m-1)$-dimensional polyhedron obtained by intersecting the $(m-1)$-dimensional simplex (the constraints $\sum_i p_i = 1$, $p_i > 0$ defining the space of all allocations) with the additional constraints $p_i \ge p_{i+1}$ and $p_{i+1} \ge p_i q_{i+1}/q_i$. Observe that the expected search size function $F(p_1, \ldots, p_m) = \sum_{i=1}^{m} q_i/p_i$ is convex. Thus, its maximum value(s) must be obtained on vertices of this polyhedron. The vertices are allocations where for any $1 \le i < m$, with $j = i + 1$, either $p_i = p_j$ or $p_i = p_j q_i/q_j$. We refer to these allocations as vertex allocations.

It remains to show that the maximum of the expected search size function over all vertex allocations is obtained on the Uniform or Proportional allocations. We show that if we are at a vertex other than Proportional or Uniform, we can get to Uniform or Proportional by a series of moves, where each move is to a vertex with a larger ESS than the one we are currently at.

Consider a vertex allocation $p$ which is different from Proportional and Uniform. Let $1 < k < m$ be the minimum such that the ratios $p_2/p_1, \ldots, p_{k+1}/p_k$ are neither all 1 nor all equal to the respective ratios $q_{i+1}/q_i$. Note that by definition we must have that $q_{i+1} < q_i$ for at least one $i = 1, \ldots, k-1$; and $q_{k+1} < q_k$.
(Otherwise, if $q_1 = \cdots = q_k$ then any vertex allocation is such that $p_1, \ldots, p_{k+1}$ are consistent with Uniform or with Proportional; if $q_k = q_{k+1}$ then minimality of $k$ is contradicted.) There are now two possibilities for the structure of the $(k+1)$-prefix of $p$:

1. We have $p_{i+1}/p_i = 1$ for $i = 1, \ldots, k-1$ and $p_{k+1}/p_k = q_{k+1}/q_k$.

2. We have $p_{i+1}/p_i = q_{i+1}/q_i$ for $i = 1, \ldots, k-1$ and $p_{k+1}/p_k = 1$.

Claim: at least one of the two candidate vertices characterized by

$p'$: $p'_{i+1}/p'_i = 1$ for $i = 1, \ldots, k$ and $p'_{i+1}/p'_i = p_{i+1}/p_i$ for $i > k$, or

$p''$: $p''_{i+1}/p''_i = q_{i+1}/q_i$ for $i = 1, \ldots, k$ and $p''_{i+1}/p''_i = p_{i+1}/p_i$ for $i > k$

has strictly worse ESS than $p$.

Note that this claim concludes the proof: we can iteratively apply this move, each time selecting a worse candidate. After each move, the prefix of the allocation which is consistent with either Uniform or Proportional is longer. Thus, after at most $m - 1$ such moves, the algorithm reaches the Uniform or the Proportional allocation. The proof of the claim is fairly technical and is deferred to the Appendix.

We next consider the two families of "outside" allocations and show that they perform worse than Uniform and Proportional.

Lemma 3.2. Allocations such that one of the following holds:

$\forall j$, $p_j < p_{j+1}$ (the less popular item gets a larger allocation), or

$\forall j$, $p_j/p_{j+1} > q_j/q_{j+1}$ (between any two items, the more popular item gets more than its share according to query rates)

perform strictly worse than Uniform and Proportional.

Proof. The arguments here are simpler, but follow the lines of the proof of Theorem 3.1. We start with the first family of "outside" allocations. Consider the intersection of the allocation simplex with the halfspace constraints $p_i \le p_{i+1}$. These constraints constitute a cone with one vertex, which is the Uniform allocation. The resulting polyhedron obtained from the intersection with the simplex has additional vertices, but note that all these vertices lie on the boundary of the simplex (and thus, the ESS function is infinite on them).
We now need to show that the minimum of the ESS function over this polyhedron is obtained on the Uniform allocation. Recall that the function is convex, and its global minimum is obtained outside this polyhedron. Thus, the minimum over the polyhedron must be obtained on a vertex, but the only vertex with bounded ESS value is the Uniform allocation. Thus, the minimum must be obtained on the Uniform allocation.

Similar arguments apply for the second family. We consider the halfspace constraints $p_{i+1} \le p_i q_{i+1}/q_i$. The resulting polyhedron includes the Proportional allocation as the only vertex with finite ESS value.

The following observation will become useful when bounds are imposed on allocation sizes of items.

Lemma 3.3. All allocations in the "between" family result in a narrower range of possible item allocation values than Proportional; that is, we have $p_1 \le q_1$ and $p_m \ge q_m$.

Proof. Consider such an allocation $p$. We have that $p_i \ge (q_i/q_1) p_1$. Thus $1 = \sum_i p_i \ge (\sum_i q_i) p_1/q_1 = p_1/q_1$. Hence, $p_1 \le q_1$. Similarly $p_i \le (q_i/q_m) p_m$, and we obtain that $p_m \ge q_m$.

4. THE SQUARE-ROOT ALLOCATION

We consider an allocation where for any two items, the ratio of allocations is the square root of the ratio of query rates. Note that this allocation lies "between" Uniform and Proportional in the sense of Definition 3.3. We show that this allocation minimizes the ESS.

Definition 4.1. Square-root allocation is defined when $l \le \sqrt{q_i}/\sum_j \sqrt{q_j} \le u$ and has $p_i = \sqrt{q_i}/\sum_j \sqrt{q_j}$ for all $i$.

Lemma 4.1. Square-root allocation, when defined, minimizes the expected search size.

Proof. The goal is to minimize $\sum_{i=1}^{m} q_i/p_i$. This natural simple optimization problem has arisen in different contexts, e.g., the capacity assignment problem [8] and scheduling data broadcast [14]. We include a proof for the sake of completeness. Substituting $p_m = 1 - \sum_{i=1}^{m-1} p_i$ we have

$$F(p_1, \ldots, p_{m-1}) = \sum_{i=1}^{m-1} \frac{q_i}{p_i} + \frac{q_m}{1 - \sum_{i=1}^{m-1} p_i}.$$

We are looking for the minimum of $F$ when $\sum_{i=1}^{m-1} p_i < 1$ and $p_i > 0$. The value of $F$ approaches $\infty$ when we get closer to the boundary of the simplex. Thus, the minimum must be obtained at an interior point. By solving $dF/dp_i = 0$ we obtain that

$$p_i = \Big(1 - \sum_{j=1}^{m-1} p_j\Big) \sqrt{q_i/q_m} = p_m \sqrt{q_i/q_m}.$$

4.1 How much can we gain?
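To make the comparison concrete, the sketch below evaluates $\sum_i q_i/p_i$ (the ESS with $\rho = 1$) under the Uniform, Proportional, and Square-root allocations defined in Definitions 3.1, 3.2, and 4.1. The query distribution used here, rates proportional to $1/i$ over $m = 1000$ items, is only an assumed test input, and the helper names are hypothetical.

```python
import math


def zipf(m, w):
    """Test distribution (assumed): q_i proportional to i**(-w), i = 1..m."""
    raw = [i ** (-w) for i in range(1, m + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def ess(q, p, rho=1.0):
    """Expected search size, Equation (1)."""
    return sum(qi / pi for qi, pi in zip(q, p)) / rho


def uniform(q):
    """Definition 3.1: p_i = 1/m."""
    return [1.0 / len(q)] * len(q)


def proportional(q):
    """Definition 3.2: p_i = q_i."""
    return list(q)


def square_root(q):
    """Definition 4.1: p_i = sqrt(q_i) / sum_j sqrt(q_j)."""
    roots = [math.sqrt(qi) for qi in q]
    total = sum(roots)
    return [r / total for r in roots]


q = zipf(1000, 1.0)
a_uniform = ess(q, uniform(q))
a_prop = ess(q, proportional(q))
a_sqrt = ess(q, square_root(q))
ratio = a_uniform / a_sqrt  # how much larger the Uniform/Proportional ESS is

print(a_uniform, a_prop, a_sqrt, ratio)
```

Uniform and Proportional both evaluate to $m$ (up to rounding), in line with Lemma 3.1, while the Square-root value equals $(\sum_i \sqrt{q_i})^2$ and is smaller.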
Recall that for all $q$, the expected search size under both Uniform and Proportional allocations is $m/\rho$. The expected search size under Square-root allocation is $(\sum_i q_i^{1/2})^2/\rho$, and depends on the query distribution. An interesting question is the potential gain of applying Square-root rather than Uniform or Proportional allocations. We refer to the ratio of the ESS under Uniform/Proportional to the ESS under Square-root as the gain factor. We first bound the gain factor by $m$, $u$, and $l$ (the proof is in the Appendix):

Lemma 4.2. Let $A_{SR}$ be the expected search size using Square-root allocation. Let $A_{uniform}$ be the expected search size using Proportional or Uniform allocation. Then

$$A_{uniform}/A_{SR} \le m(u + l - m l u).$$

Moreover, this is tight for some distributions.

Note that if $l = 1/m$ or $u = 1/m$ then the only legal allocation is $1/m$ on all items, and indeed the gain factor is 1. If $l \ll 1/m$, the gain factor is roughly $mu$.

[Figure 2: The ratio of the ESS under Uniform/Proportional to the ESS under Square-root allocation (the ESS gain factor), as a function of the number of items $m$, for truncated Zipf-like query distributions on $m$ items, with curves for several values of $w$ including $w = 0.5$, $1$, and $1.5$. The query rate of the $i$th most popular item is proportional to $i^{-w}$.]

In [9] we considered truncated Zipf-like query distributions, where there are $m$ items and the query rate of the $i$th most popular item is proportional to $i^{-w}$. A simple calculation (see Table 1) shows that the gain factor dramatically grows with the skew. In particular, the gain factor is a constant fraction for $w < 1$, logarithmic in $m$ for $w = 1$, and (fractional) polynomial in $m$ for $w > 1$; for $w > 2$ the ESS under Square-root is constant and the gain factor is thus linear in $m$. Figure 2 plots the gain factor as a function of the number of items for representative values of the power parameter $w$.

| $w$ | $\rho A_{SR}$ | Gain factor |
| $w < 1$ | $\frac{1-w}{(1-w/2)^2}\, m$ | $\frac{(1-w/2)^2}{1-w}$ |
| $w = 1$ | $4m/\ln m$ | $\frac{1}{4} \ln m$ |
| $1 < w < 2$ | $\frac{w-1}{(1-w/2)^2}\, m^{2-w}$ | $\frac{(1-w/2)^2}{w-1}\, m^{w-1}$ |
| $w = 2$ | $\ln^2 m$ | $m/\ln^2 m$ |
| $w > 2$ | $\frac{w-1}{(w/2-1)^2}$ | $\frac{(w/2-1)^2}{w-1}\, m$ |

Table 1: Asymptotic dependence on $m$ of the gain factor and ESS under Square-root allocation for truncated $m$-item Zipf distribution with power $w$.

We next consider two natural query distributions obtained from Web proxy logs (the Boeing logs described in [1]). We looked at the top $m$ URLs, with query rates proportional to the number of requests, and the top $m$ hostnames (Web sites), with query rates proportional to the number of users that issued a request to the site. Figure 3 shows the gain factor and the ESS under Square-root allocation and Proportional allocation as a function of the number of items for these two distributions. For larger values of $m$, the ESS of Square-root allocation is 30%-50% of that of Proportional or Uniform allocations. Although these gaps are not as dramatic as for highly skewed Zipf distributions, they are substantial.

The Uniform, Proportional, and Square-root allocations result in different values of $p_m$ (the allocation assigned to the locatable item with the smallest query rate); in turn, this corresponds to different required minimum values of the maximum search size, which determines the cost of insoluble queries. Figure 4 shows that for the hostname distribution, the maximum search size under Square-root is within a factor of 2 of the smallest possible (Uniform allocation), whereas Proportional requires a much larger maximum search size. In the next section we develop the optimal policy, which minimizes the combined resource consumption on soluble and insoluble queries.

5. SQUARE-ROOT* AND PROPORTIONAL* ALLOCATIONS

Suppose now that we fix the set of locatable items and the bound on the maximum search size (that is, we fix the lower bound $l$ on the allocation of any one item). Phrased differently, this is like fixing the resources consumed on insoluble queries. With $l$ (or $u$) fixed, Square-root allocation may not be defined, since the smallest (largest) allocation may be below (above) the bound.
We now ask: what is the replication strategy that minimizes the ESS under these constraints? We define a natural extension of Square-root, Square-root*, which always results in a legal allocation (when one exists, that is, when $l \le 1/m \le u$). Square-root* allocation lies between Square-root and Uniform, and we show that Square-root* minimizes the ESS. Since Square-root* minimizes the ESS while fixing the maximum search size, by sweeping $l$ we obtain a range of optimal strategies for any given ratio of soluble and insoluble queries. The extremes of this range are Uniform allocation, which is optimal if insoluble queries completely dominate, and Square-root allocation, which is optimal if there are relatively few insoluble queries.

5.1 Square-root* allocation

Lemma 5.1. Consider a query distribution $q$ where $q_1 \ge \cdots \ge q_m$ and $l \le 1/m \le u$. There is a unique monotone allocation $p \in P$ for which the following conditions apply.
1. If $u > p_i, p_j > l$ then $p_i/p_j = \sqrt{q_i/q_j}$.
2. If $p_j = l \le p_i$ or if $p_j \le u = p_i$, then $p_i/p_j \le \sqrt{q_i/q_j}$.
Furthermore, this allocation minimizes the ESS.

The proof is deferred to the Appendix, and provides a simple iterative procedure to compute the Square-root* allocation for arbitrary $q_i$'s. The Square-root* allocation as a function of $l$ varies between $l \le \sqrt{q_m}/\sum_i \sqrt{q_i}$ (where Square-root* coincides with the Square-root allocation) and $l = 1/m$ (where Square-root* coincides with the Uniform allocation). On intermediate values of $l$, a suffix of the items is assigned the minimum allocation value $l$, but the remaining items still have allocations proportional to the square root of their query rate. In the range $[\sqrt{q_m}/\sum_i \sqrt{q_i},\ 1/m]$, the ESS as a function of $l$ is a piecewise hyperbolic increasing function that is minimized at $l = \sqrt{q_m}/\sum_i \sqrt{q_i}$. The breakpoints of this function correspond to a suffix of the items that have minimum allocation. Basic algebraic manipulations show that the breakpoints are $l_n = s_n / \left(1 - \sum_{i=n}^{m} s_i + s_n(m - n + 1)\right)$, where $s_i = \sqrt{q_i}/\sum_j \sqrt{q_j}$ is the allocation of the $i$th item under Square-root allocation.
Note that the extremes of this range are indeed $l_m = s_m$ (Square-root allocation) and $l_1 = 1/m$ (Uniform allocation). The ESS for $l \in [l_n, l_{n-1})$ is given by

$$\sum_{i=n}^{m} \frac{q_i}{l} + \left(\sum_{i=1}^{n-1} \frac{q_i}{s_i}\right) \frac{\sum_{i=1}^{n-1} s_i}{1 - l(m - n + 1)}$$

and is increasing with $l$. The maximum search size needed to support the allocation is approximately $1/l$ and is decreasing with $l$. The overall search size (as defined in Equation 5) is a convex combination of the ESS and the MSS and is minimized inside the interval $[s_m, 1/m]$. Figure 5 illustrates the combined expected search size for different mixes of soluble and insoluble queries as a function of $l$. When all queries are soluble ($f_s = 1$), the minimum is obtained at $l = s_m$, where Square-root* coincides with Square-root (which minimizes the ESS). At the other extreme, when all queries are insoluble ($f_s = 0$), the minimum is obtained at $l = 1/m$, where Square-root* coincides with Uniform (which minimizes the MSS).

5.2 Proportional* allocation

Similarly, with $l$ (or $u$) fixed, the Proportional allocation may not be defined. We similarly define Proportional* allocation to be the legal allocation that minimizes the maximum utilization rate. Proportional* allocation is defined whenever there exists a legal allocation, that is, when $l \le 1/m \le u$. We define Proportional* allocation as the unique allocation defined by the following conditions:
1. If $u > p_i, p_j > l$ then $p_i/p_j = q_i/q_j$.
2. If $p_j = l \le p_i$ or if $p_j \le u = p_i$, then $p_i/p_j \le q_i/q_j$.

When we have $l \le q_i \le u$ for all $i$, Proportional* is the same as Proportional. It follows from Theorem 3.1 that Proportional* allocation has ESS no higher than Uniform allocation, and when Proportional* is different from Proportional (that is, there are $q_i < l$ or $q_i > u$) then Proportional* has a lower ESS than Uniform. Proportional* allocation lies between Proportional and Uniform (in the sense of Theorem 3.3).
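The bounded extension of Square-root described by Lemma 5.1 (written Square-root* here) can be sketched for the case where only the lower bound $l$ is active. This is a generic water-filling loop written for illustration; it is not necessarily the iterative procedure given in the paper's Appendix, it assumes the upper bound $u$ is never binding, and the function name and tolerance are hypothetical.

```python
def square_root_star(q, l):
    """Allocation with p_i >= l whose unclamped entries are proportional
    to sqrt(q_i). Sketch only: assumes the upper bound u never binds."""
    m = len(q)
    if l * m > 1.0 + 1e-12:
        raise ValueError("no legal allocation exists: need l <= 1/m")
    clamped = [False] * m  # True for items pinned at the minimum l
    while True:
        # Mass left for the items that are not pinned at l.
        free_mass = 1.0 - l * sum(clamped)
        root_sum = sum(qi ** 0.5 for qi, c in zip(q, clamped) if not c)
        p = [l if c else free_mass * qi ** 0.5 / root_sum
             for qi, c in zip(q, clamped)]
        # Pin every free item that fell below the bound, then recompute.
        newly = [i for i in range(m) if not clamped[i] and p[i] < l - 1e-12]
        if not newly:
            return p
        for i in newly:
            clamped[i] = True
```

With `l = 0` the function returns the plain Square-root allocation, and with `l = 1/m` it returns the Uniform allocation, matching the two extremes described in Section 5.1.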

Figure 3: (A) The ESS of the different strategies versus the number of locatable items (the x axis), for the hostname and Url distributions. (B) The gain factor as a function of $m$ for the two distributions.

Figure 4: Maximum search size as a function of the number of locatable items, and ratio of maximum search size to the minimum possible (Uniform allocation), for the hostname distribution.

6. DISTRIBUTED REPLICATION

Our previous results identified the optimal replication strategy. In this section we show that it can be achieved, via simple distributed protocols, in a decentralized unstructured P2P network. At least conceptually, the Uniform allocation can be obtained via a simple scheme which replicates each item in a fixed number of locations when it first enters the system. We have seen that the optimal allocation is a hybrid of Square-root and Uniform allocations. We first develop algorithms that obtain Square-root allocation, and then discuss how they can be slightly modified to optimally balance the cost of soluble and insoluble queries.

In order to consider replication algorithms (footnote 4) aimed at query-rate-dependent allocations (like Proportional and Square-root), we model a dynamic setting where copies are created and deleted:

Creation: New copies can be created after each query. After a successful random search, the requesting node creates some number of copies, call it $C$, at randomly selected nodes. This number $C$ can only depend on quantities locally observable to the requesting node.

Deletion: Copies do not remain in the system forever; they can be deleted through many different mechanisms. All we assume here of copies is that their lifetimes are independent of the identity of the item and that the survival probability of a copy is non-increasing with its age: if two copies were generated at times $t_1$ and $t_2 > t_1$ then the second copy is at least as likely to still exist at time $t_3 > t_2$.

Footnote 4: A note on terminology: a replication strategy is a mapping between $q$ and $p$, whereas a replication algorithm is a distributed algorithm that realizes the desired replication strategy.

These creation and deletion processes are consistent with our search and replication model and with the operation of unstructured networks. In our model, and perhaps in reality, the network would perform copy creation by visiting nodes in a similar fashion to how it performs search (footnote 5). Our copy deletion process is consistent with the expectation that in deployed networks, copy deletion occurs either by a node going offline or by some replacement procedure internal to a node. We expect node crashes to be unrelated to the content (part of the index) they possess. The replacement procedure within each node should not discriminate among copies based on access patterns. Two common policies, Least Recently Used (LRU) and Least Frequently Used (LFU), are inconsistent with our assumption, but other natural policies such as First In First Out (FIFO), fixed lifetime durations, or random deletions (when a new copy is created, remove a cached copy selected at random) are consistent with it.

Figure 5: The search size over all queries (soluble and insoluble), as a function of $l$, when Square-root* allocation is used (hostname distribution and $m = 1000$). The search size is shown for different mixes of soluble and insoluble queries, from all insoluble (MSS, $f_s = 0\%$) through $0.25 \cdot ESS + 0.75 \cdot MSS$ ($f_s = 25\%$), $0.5 \cdot ESS + 0.5 \cdot MSS$ ($f_s = 50\%$), and $0.75 \cdot ESS + 0.25 \cdot MSS$ ($f_s = 75\%$), to all soluble (ESS, $f_s = 100\%$). The point that minimizes the search size depends on the mix.

Let $\bar{C}_i$ be the average value of $C$ that is used for creating copies of item $i$ ($\bar{C}_i$ may change over time). We say that the system is in steady state when the lifetime distribution of copies does not change over time. We use the following property of this deletion process in our distributed replication algorithm:

Claim 6.1. If the ratio $\bar{C}_i/\bar{C}_j$ remains fixed over time and $\bar{C}_i$, $\bar{C}_j$ are bounded from below by some constant, then $p_i/p_j \to q_i \bar{C}_i/(q_j \bar{C}_j)$.

Thus, Proportional allocation is obtained if we use the same value of $C$ for all items. A more challenging task is designing algorithms that result in Square-root allocation.
The challenge in achieving this is that no individual node issues or sees enough queries to estimate the query rate $q_i$, and we would like to determine $C$ without using additional inter-host communication on top of what is required for the search and replication. We can use the above claim to obtain the following condition for Square-root allocation:

Corollary 6.1. If $\bar{C}_i \propto 1/\sqrt{q_i}$ then $p_i/p_j \to \sqrt{q_i/q_j}$ (the proportion factor may vary with time but should be the same for all items).

Footnote 5: This process automatically adjusts to designs where search is performed on a fraction of nodes, or when nodes are not evenly utilized, such as with FastTrack, since hosts receive copies at the same rate that they receive search probes.

We propose three algorithms that achieve the above property while obeying our general guidelines. The algorithms differ in the amount and nature of bookkeeping, and so which algorithm is more desirable in practice will depend on the details of the deployment setting. The first algorithm, path replication, uses no additional bookkeeping; the second, replication with sibling-number memory, records with each copy the value $C$ (number of sibling copies) used in its creation; the third, probe memory, has every node record a summary on each item it sees a probe for. We show that under some reasonable conditions, sibling-number memory and probe memory have $\bar{C}_i$ close to its desired value, and path replication has $\bar{C}_i$ converge over time to its desired value.

6.1 Path replication

At any given time, the expected search size for item $i$, $A_i$, is inversely proportional to the allocation $p_i$. If $\bar{C}_i$ is steady then $p_i \propto q_i \bar{C}_i$ and thus $A_i \propto 1/(q_i \bar{C}_i)$. The Path replication algorithm sets the number of new copies $C$ to be the size of the search (the number of nodes probed). At the fixed point, when $A_i$ and $\bar{C}_i$ are steady and equal, they are proportional to $1/\sqrt{q_i}$, and $p_i$ is proportional to $\sqrt{q_i}$. We show that in steady state, under reasonable conditions, $A_i$ and $\bar{C}_i$ converge to this fixed point.
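The fixed point of path replication can be illustrated with a small deterministic (mean-field) iteration. This is only a sketch under assumptions that are not in the paper: copies are deleted at a constant rate per copy (one deletion process consistent with the stated assumptions), the search size is taken to be exactly the number of nodes divided by the number of copies, and all parameter names and default values are hypothetical.

```python
def path_replication_mean_field(q, nodes=10_000, query_rate=1.0,
                                death_rate=0.01, dt=0.1, steps=20_000):
    """Euler iteration of d(copies_i)/dt = creation - deletion, where each
    query for item i creates as many copies as the search probed nodes."""
    copies = [50.0] * len(q)  # arbitrary starting point below the fixed point
    for _ in range(steps):
        for i, qi in enumerate(q):
            search_size = nodes / copies[i]           # A_i is inversely proportional to copies
            created = query_rate * qi * search_size * dt  # C equals the search size
            deleted = death_rate * copies[i] * dt
            copies[i] += created - deleted
    return copies
```

In this model creation balances deletion when the number of copies of item $i$ is proportional to $\sqrt{q_i}$, which is the Square-root allocation. The sketch has no delay between search and copy creation and no bursts, so it does not reproduce the overshoot behaviour that can arise with such delays.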
Let $A_i^*$ be the value of $A_i$ at the fixed point. At a given point in time, the rate of generating new copies is $A_i/A_i^*$ times the optimal. This means that the rate is higher (respectively, lower) than at the fixed point when the number of copies is lower (respectively, higher) than at the fixed point. If the time between queries is at least of the order of the time between a search and subsequent copy generation, then path replication will converge to its fixed point. A possible disadvantage of path replication is that the current $C_i$ overshoots or undershoots the fixed point by a large factor ($A_i/A_i^*$). Thus, if queries arrive in large bursts, or if the time between search and subsequent copy generation is large compared to the query rate, then the number of copies can fluctuate from too few to too many and never reach the fixed point. Note that this convergence issue may occur even for a large number of nodes. We next consider different algorithms that circumvent this issue by using a $C_i$ that is close to this fixed-point value.

6.2 Replication with sibling-number memory

Observe that the search size alone is not sufficient for estimating the query rate $q_i$ at any point in time (path replication only reaches the appropriate estimates at the fixed point). In order to have $\bar{C}_i \propto 1/\sqrt{q_i}$ we must use some additional bookkeeping. Under replication with sibling-number memory (SNM), with each new copy we record the number of sibling copies that were generated when it was generated, and its generation time. The algorithm assumes that each node knows the historic lifetime distribution as a function of the age of the copy (thus, according to our assumptions, the age of the copy provides the sibling survival rate). Consider some past query for the item, $a_j$; let $d_j > 0$ be the number of sibling copies generated after $a_j$ and let $\lambda_j$ be the expected fraction of surviving copies at the current time.

Suppose that every copy stores $(d, \lambda)$ as above. Let $T$ be such that copies generated $T$ time units ago have a positive probability of survival. Let $P_T$ be the set of copies with age at most $T$. Then $\sum_{c \in P_T} 1/(\lambda_c d_c)$ is an unbiased estimator for the number of requests for the item in the past $T$ time units. Denote by $\langle 1/(\lambda_c d_c) \rangle$ the expectation of $1/(d\lambda)$ over copies. We thus obtain that

$$q_i \propto \langle 1/(\lambda_c d_c) \rangle\, p_i \propto \langle 1/(\lambda_c d_c) \rangle\, (1/A_i).$$

Hence, it suffices to choose $C_i$ with expected value $\bar{C}_i \propto \left(1/\langle 1/(\lambda_c d_c) \rangle\right)^{0.5} A_i^{0.5}$.

6.3 Replication with probe memory

We now consider a scheme where each node records a summary of all recent (footnote 6) probes it has received: for each item it has seen at least one probe for, the node records the total number of probes it received and the combined size of the searches which these probes were part of (in fact, it is sufficient to use the search size up to the point that the recording node is probed). Interpret $q_i$ as the rate per node at which queries for item $i$ are generated. The probe rate of item $i$, which is the rate at which a host in the network receives search probes for item $i$, is then $r_i = q_i A_i$, where $A_i = 1/(\rho p_i)$ is the expected search size for $i$. Thus, intuitively, the query rate $q_i$ can be obtained (estimated) from (estimates of) $A_i$ and $r_i$. In order to estimate $r_i$, which could be very low, with reasonable confidence, we can aggregate it across multiple nodes. A natural choice for the size of this group, one that allows the replication strategy to be integrated with the search process, is to aggregate along nodes on the search path. Consider now the aggregated rate at which nodes on a search path for $i$ received search probes for $i$. The aggregated rate for a group of nodes of size $A_i$ is $q_i A_i^2 = q_i/(\rho p_i)^2$. Observe that for Square-root allocation, this aggregated rate is fixed for all items and is equal to $\left(\sum_i \sqrt{q_i}\right)^2/\rho^2$ times the rate at which nodes generate queries.
This suggests a simple replication algorithm: when searching for item $i$, count the total number of recent probes for $i$ seen on nodes in the search path. The count being lower than the threshold suggests that the item is over-replicated (with respect to Square-root allocation). Otherwise, it is under-replicated. More generally, by carefully combining information from nodes on the search path, we can obtain high-confidence estimates of $A_i$ and $r_i$, which can lead to better estimates of $q_i$. Let $v$ be a node and let $k_v$ be the number of recent probes it has received for item $i$. Let $s_j$ ($j = 1, \ldots, k_v$) be the search size of the $j$th probe seen by $v$ for item $i$. Let $S_v = \sum_{j=1}^{k_v} s_j$ (footnote 7). Suppose that each node stores $(k_v, S_v)$ for each item. Let $V$ be a set of nodes (say, the nodes encountered on a search for item $i$).

Footnote 6: We use the informal term recent for a time duration in which the allocation $p_i$ did not significantly change. Given this constraint, we would typically want the duration to be as long as possible.

Footnote 7: Note that the expected value of $s_j$ is $A_i/2$; since it is a geometrically distributed random variable, we expect $S_v/k_v$ to rapidly converge to $A_i/2$ as $k_v$ grows.

We can estimate the probe rate by $\hat{r}_i = \sum_{v \in V} k_v / |V|$, and the expected search size by $\hat{A}_i = 2 \sum_{v \in V} S_v / \sum_{v \in V} k_v$. Since $q_i = r_i/A_i$, we can use $\left(\sum_{v \in V} k_v\right)^2 / \left(2 |V| \sum_{v \in V} S_v\right)$ to estimate $q_i$. We thus can use

$$\sqrt{2 |V| \sum_{v \in V} S_v} \Big/ \sum_{v \in V} k_v$$

as a (biased) estimator for $1/\sqrt{q_i}$. The accuracy of these estimates grows with $\sum_{v \in V} k_v$. We earlier argued that when the allocation is Square-root, the expectation of $\sum_{v \in V} k_v$ (when $V$ is the set of nodes on the search path for $i$) is fixed across items. This implies that it is desirable to aggregate statistics on probes for an item over a number of nodes that is proportional to the search size of the item. The confidence level in the estimates can be increased by increasing the proportion factor.
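The probe-memory estimators above reduce to a few sums over the per-node records. The following is a minimal sketch of that arithmetic; the function name, the record layout as `(k_v, S_v)` pairs, and the returned tuple are assumptions made for illustration.

```python
from math import sqrt


def probe_memory_estimates(records):
    """records: one (k_v, S_v) pair per node v in V, for a single item.
    Returns estimates of (probe rate r_i, search size A_i, query rate q_i,
    1/sqrt(q_i)) using the formulas in the text."""
    num_nodes = len(records)
    total_k = sum(k for k, _ in records)  # probes seen across V
    total_s = sum(s for _, s in records)  # combined search sizes across V
    if num_nodes == 0 or total_k == 0:
        raise ValueError("need at least one node and one recorded probe")
    probe_rate = total_k / num_nodes          # r_i estimate
    search_size = 2 * total_s / total_k       # A_i estimate
    query_rate = probe_rate / search_size     # q_i = r_i / A_i
    inv_sqrt_q = sqrt(2 * num_nodes * total_s) / total_k  # (biased) 1/sqrt(q_i)
    return probe_rate, search_size, query_rate, inv_sqrt_q
```

A requesting node could then pick the number of new copies in proportion to the last value, in line with Corollary 6.1; how the proportion factor is chosen is left open here.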
If an item is over-allocated with respect to Square-root, then the probe rate over the search path is low, in which case the estimates are not as accurate, but this only implies that the item should be replicated at a higher rate. If an item is under-allocated, then the statistics collect more probes and the estimates obtained have even larger confidence.

6.4 Obtaining the optimal allocation

The algorithms presented above converge to Square-root allocation; we now discuss how they can be modified so that we obtain the optimal allocation. For the purposes of the discussion, assume that we fix the set of locatable items, that is, items for which we want queries to be soluble. The optimal algorithm is a hybrid of a Uniform and a Square-root algorithm: for each locatable item, the Uniform component of the replication algorithm assigns a number of permanent copies (e.g., by nominating nodes that never delete it and are replaced when they go offline). The Square-root component, which can be based on any of the specific algorithms proposed above, generates transient copies as a followup on searches. The maximum search size is set together with the number of permanent copies so that items with this minimum number of copies would be locatable. The value of these two dependent parameters is then tuned according to the mix of soluble and insoluble queries to obtain the optimal balance between the cost of soluble and insoluble queries.

6.5 Simulations

We argued that all the distributed replication algorithms discussed here do indeed achieve the Square-root allocation, though with different rates of convergence and degrees of stability. We illustrate these convergence issues by simulating two of the proposed algorithms: path replication, and replication with sibling-number memory. The simulations track the fraction of nodes containing copies of a single item in a network with 10K nodes. In our simulations, copies had a fixed lifetime duration and queries were issued at fixed intervals.
Each search process continues until $k$ copies are found ($k \in \{1, 5\}$). The number of new copies generated following each search is the search size for path replication, and the estimators discussed above for sibling-number memory. Some of the simulations included delay between the time a search is performed and
