CORRECTNESS OF A GOSSIP BASED MEMBERSHIP PROTOCOL
by André Allavena, Alan Demers, John E. Hopcroft
Pratik Timalsena, University of Oslo

OUTLINE
- Contribution of the paper
- Gossip algorithm
- The corrected gossip algorithm
- Maintenance of the local view
- Model and definition
- Simulation and results
- Summary and discussion

Contribution of the paper
- A scalable gossip-based algorithm for local view maintenance
- Can be combined with any application-level gossip protocol that relies on randomly selected gossip partners
- Preserves connectivity and load balance between nodes

GOSSIP ALGORITHM
- A practical example of when a gossip algorithm is needed
  - Catering example
  - The caterers are stuck in a snowstorm
  - A party without food is boring
  - How do we pass the message to everyone to bring food to the party?
  - One option: a calling tree


GOSSIP ALGORITHM
- Gossip-based membership
- Motivation
  - The importance of scalability and fault tolerance in distributed systems has led to considerable research on multicast protocols using gossip
- Each node forwards a message to a set of gossip targets
- Probabilistic guarantees of delivery
- Reliability
  - Achieved by making the number of gossip targets large

GOSSIP ALGORITHM
- Generic protocol
- At every round, each node u:
  - selects f nodes at random from its current view;
  - contacts these f nodes and requests their views;
  - concatenates these views into a list L;
  - adds to L the nodes that requested u's view during the round, if any;
  - adds to L the views of the nodes that requested u's view during the round, if any;
  - creates its new view by selecting k elements from L, not allowing duplicates.
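The round described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `generic_round`, the dictionary representation of views, the synchronous execution of all nodes, and the exclusion of a node from its own view are assumptions made for the example.

```python
import random

def generic_round(views, f, k, rng):
    """One synchronous round of the generic protocol; returns the new views.

    views maps each node to its current view (a duplicate-free list of nodes).
    Assumption: every node named in a view is alive and present in `views`.
    """
    # Each node u selects f nodes at random from its current view.
    targets = {u: rng.sample(view, min(f, len(view))) for u, view in views.items()}
    # Record which nodes requested each node's view during this round.
    requesters = {u: [] for u in views}
    for u, chosen in targets.items():
        for v in chosen:
            requesters[v].append(u)
    new_views = {}
    for u in views:
        pool = []                              # the list L
        for v in targets[u]:
            pool.extend(views[v])              # views pulled from the f partners
        pool.extend(requesters[u])             # nodes that requested u's view
        for r in requesters[u]:
            pool.extend(views[r])              # views of those requesters
        # No duplicates; dropping u itself is an assumption of this sketch.
        candidates = sorted(set(pool) - {u})
        new_views[u] = rng.sample(candidates, min(k, len(candidates)))
    return new_views
```

A call such as `generic_round(views, f=2, k=4, rng=random.Random(0))` returns a fresh dictionary and leaves the old views untouched, so repeated rounds can be chained.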


GOSSIP ALGORITHM
- Generic protocol
- Potential issues
  - The network could break into two or more disconnected components, with no links between components.
  - The network could become unbalanced, swamping a few nodes with a very high load instead of spreading the load more or less evenly.
  - The views might not mix well, so that the system at time t = O(ln n) would not be independent of the system at time t = 0.

GOSSIP ALGORITHM
- Variant gossip algorithm (the corrected one)
- Desirable properties
  - Load balancing (reinforcement, probabilistic bound on the degree of a node)
  - Scalability (local views)
  - Avoiding partitioning (even distribution of the pointers to the nodes)

GOSSIP ALGORITHM CORRECTED
- How it works
- The protocol is based on each node having a local view
  - A fixed-size random subset of the group membership
- n: the number of nodes
- k: the size of a local view
- f: the fanout parameter
- ω: the reinforcement weight
- Local view of a random node


GOSSIP ALGORITHM CORRECTION
- Construct a list L1 comprising the local views of f nodes chosen at random from the local view of s
- Construct a list L2 of the other nodes that requested its view during the round
- Create a new local view by choosing k distinct elements at random from L1 and L2
- The reinforcement weight ω determines how much more likely nodes are to be selected from L2 than from L1
  - If ω is 0, nodes in L2 are ignored
  - If ω is 1, no distinction is made between L1 and L2
  - In the limit as ω goes to infinity, all nodes are taken from L2 if possible
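One possible reading of the weighted choice between L1 and L2 is sketched below. The exact weighting scheme is an assumption: here every occurrence of a node in L1 counts with weight 1 and every occurrence in L2 with weight ω, and nodes are drawn one at a time without replacement. The function name `select_view` is hypothetical.

```python
import math
import random
from collections import Counter

def select_view(l1, l2, k, omega, rng):
    """Pick up to k distinct nodes from L1 and L2, favoring L2 by a factor omega."""
    if math.isinf(omega):
        # Limit case: take everything possible from L2, then fill up from L1.
        first = sorted(set(l2))
        rest = sorted(set(l1) - set(l2))
        rng.shuffle(first)
        rng.shuffle(rest)
        return (first + rest)[:k]
    weights = Counter()
    for node in l1:
        weights[node] += 1.0        # pulled entries: weight 1 per occurrence
    for node in l2:
        weights[node] += omega      # reinforcement entries: weight omega
    # With omega = 0, nodes that appear only in L2 get weight 0 and are dropped.
    weights = {n: w for n, w in weights.items() if w > 0}
    view = []
    while weights and len(view) < k:
        nodes = list(weights)
        pick = rng.choices(nodes, weights=[weights[n] for n in nodes])[0]
        view.append(pick)
        del weights[pick]           # keep the view duplicate-free
    return view
```

The three special cases on the slide map directly onto this sketch: `omega=0.0` never returns a node found only in L2, `omega=1.0` treats both lists alike, and `omega=math.inf` exhausts L2 before touching L1.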

GOSSIP ALGORITHM CORRECTION
- Join and leave
  - A node joins by copying the local view of a random node
  - A node leaves by simply ceasing to participate, leaving dangling edges in the graph
  - No keep-alive or heartbeat messages are used
  - There is no difference between stopping and failing
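The join and leave rules are simple enough to sketch in a few lines. The helper names `join`, `leave`, and `dangling_edges`, and the dictionary of views, are assumptions made for illustration.

```python
import random

def join(views, new_node, rng):
    """A joining node copies the local view of a random existing node."""
    contact = rng.choice(sorted(views))
    views[new_node] = list(views[contact])   # copy, so the two views evolve separately
    return contact

def leave(views, node):
    """A leaving or failing node just stops; nobody is notified."""
    views.pop(node, None)

def dangling_edges(views):
    """Edges (u, w) whose target w no longer participates."""
    return [(u, w) for u, view in views.items() for w in view if w not in views]
```

Because `leave` sends no message, stopping and failing look identical to the rest of the system, and the stale pointers reported by `dangling_edges` are only cleaned up by later rounds of the protocol.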

Explanation of the functioning of the protocol
- The views requested by u
- The nodes that requested u's view
- Mixing: pulling the views from several nodes
- Node u puts the nodes that requested its view into the list from which it creates its new view
  - It adds their names to the list (the pool of names) rather than the contents of their views
  - This provides the balance
- Without reinforcement, the network collapses into a star-like graph

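PARTITIONING AND ESTIMATE SIZE
- Consider a partition of the nodes into sets A and B
- x: the fraction of edges from A nodes that point to A
- y: the fraction of edges from B nodes that point to A
- γ = |A|/n: the fraction of nodes that are in A
- If edges are drawn uniformly at random, the fraction of edges from A or from B into A is proportional to the size of A
- Example: x = 30%, y = 70% (set S)
- x = y: both sides agree on the estimate
- x - y = 1 (that is, x = 1 and y = 0): no edge from B points into A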
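The quantities x, y, and γ can be computed directly from a snapshot of the views. The sketch below assumes views are stored in a dictionary and that both A and B have at least one outgoing edge; the function name `partition_estimates` is hypothetical.

```python
def partition_estimates(views, a_nodes):
    """Return (x, y, gamma) for the partition (A, B) of the nodes in `views`."""
    a = set(a_nodes)
    edges_from_a = [w for u, view in views.items() if u in a for w in view]
    edges_from_b = [w for u, view in views.items() if u not in a for w in view]
    x = sum(w in a for w in edges_from_a) / len(edges_from_a)  # A -> A fraction
    y = sum(w in a for w in edges_from_b) / len(edges_from_b)  # B -> A fraction
    gamma = len(a) / len(views)                                # |A| / n
    return x, y, gamma

# Two isolated pairs: A = {0, 1} and B = {2, 3} never point at each other.
split = {0: [1], 1: [0], 2: [3], 3: [2]}
x, y, gamma = partition_estimates(split, {0, 1})
```

For the fully split example, x is 1, y is 0, and γ is 0.5, which is the x - y = 1 case.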

EVOLUTION OF ESTIMATE OF SIZE OF A
- Mixing
  - The act of pulling and merging views corresponds to asking nodes from both sides for their estimates and averaging them
  - Converges quickly to x = y
  - x = 0, y = 1
  - Provided the graph is not partitioned
  - Provides the necessary condition for reinforcement to drag the estimate to the correct value of the size

EVOLUTION OF ESTIMATE OF SIZE OF A
- Reinforcement
  - Both sides agree on their estimate of the size of A
  - Some nodes from A and some from B are going to pull A nodes
  - In what proportion?
    - The same proportion as the sizes of A and B
  - Example: x = y = 30%
    - A nodes pull 30% from A, and so do B nodes
    - A sees 30% A nodes, and so does B
  - Removes old or dead edges in approximately k/f rounds
  - Adds fresh edges
  - Thus brings the estimate of the size of A to the correct value
  - Each iteration adds a small correction to the estimate, which leads to the correct value

MODEL AND DEFINITION
- Non-partitioning
  - The expected time before a fraction γ of the nodes partitions away from the rest of the nodes is exponential in γkn
    - k is the size of the view
    - n is the number of nodes in the network
  - This means that only a small component can become disconnected while the rest stays connected
  - A disconnected component can easily be detected by observing the diversity of the view contents over time
  - It can then attempt to rejoin
  - The view size needs to be larger than the churn rate for the expected time until partition to be exponential

MODEL AND DEFINITION
- Model intuition
  - Each iteration, node v
  - Each node has some number of edges pointing to nodes in A, and some number of edges pointing to nodes in B
  - A node's edge distribution is taken to be independent of that of its neighbors
  - In u v A is like x A

MODEL AND DEFINITION
- Model
  - If the element being updated is a 0, the new value is the value of a position chosen uniformly at random from A.
  - If it is a 1, the new value is the value of a position chosen uniformly at random from B.
  - For reinforcement, if the node doing the tagging is in A, put a 0; otherwise, put a 1.


MODEL AND DEFINITION
- Limit probability distribution
  - The limit probability distribution is 1 for the partitioned state (a = γkn, b = 0) and 0 for the rest of the grid
  - Add a transition out of the partitioned state to its two neighbor states, (a = γkn - 1, b = 0) and (a = γkn, b = 1)
- Claim
  - The fraction of time spent in the partitioned state of this finite (2-dimensional) Markov chain is exponentially small in γkn when µ γkn and p 1/ ln γkn

Simplified Model
- Array Υ of length kn (the concatenation of the views)
- Assumes that no node leaves or joins
- At each iteration, a randomly chosen element is replaced:
  - with probability p, by a node picked uniformly at random;
  - with probability 1 - p, by a copy of the value of an element chosen uniformly at random
- When reinforcing, a node replaces an element in a view by a node chosen at random from the list of nodes in the system
- When pulling, the element is replaced by one from an arbitrary view

Simplified Model
- No difference between picking from a view from A or from B (the distribution governing the edge distribution is the same for all nodes)
- Justified once both sides agree on the partition size estimate
- The conditioning on neighbors is taken to be negligible

Simplified Model
- Without reinforcement
  - p = 0 in the simplified model
  - Nodes disappear quickly from the union of the views each round
  - They cannot be brought back without reinforcement
  - The diversity of the content of Υ decreases, and the network collapses into a star-like graph
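The simplified model, including the p = 0 case, is easy to simulate. The sketch below is an illustration under stated assumptions: the function name `simulate`, the initial array in which every node appears exactly k times, and the use of the number of distinct names in Υ as the diversity measure are choices made for the example.

```python
import random

def simulate(n, k, p, steps, rng):
    """Run the simplified model; return the final array and the diversity trace."""
    upsilon = [i % n for i in range(k * n)]      # start: every node appears k times
    diversity = [len(set(upsilon))]
    for _ in range(steps):
        pos = rng.randrange(len(upsilon))        # element to be replaced
        if rng.random() < p:
            upsilon[pos] = rng.randrange(n)      # reinforcement: uniformly random node
        else:
            upsilon[pos] = rng.choice(upsilon)   # mixing: copy a random element
        diversity.append(len(set(upsilon)))
    return upsilon, diversity

# Without reinforcement (p = 0), a name that drops out of the array never returns.
_, trace = simulate(n=20, k=3, p=0.0, steps=2000, rng=random.Random(0))
```

With p = 0 every update copies a value that is already in Υ, so the diversity trace can only stay flat or fall. Any p > 0 allows lost names to re-enter.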

Simplified Model
- With reinforcement
  - The limit distribution can be solved for
  - This gives the correct value for the estimate of the size of the partition
  - The analysis shows that once both sides agree on the estimate of the size of the partition, the estimate converges to the correct value and stays there (network balance)

Simulation Results
- Simulations show that the protocol performs well
- Large-scale tests with up to 100 000 nodes, view size 17, fanout 3
- Almost as good as a random graph
- Maximum in-degree always below 4.5 times that of a random graph
- A high number of rounds passes before any partitioning happens
- Scales well with an increasing number of nodes


Alternative protocol
- Several choices were made in the design of the algorithm: push, pull, randomize, etc.
- Explores some alternatives and cites alternative work
- Doesn't distinguish between pull and reinforcement
- Reinforcement
  - Uses the push method
  - Actively adds one's name to the (multi)set defined by the concatenation of the views (Υ)
  - Compensates for the drop due to randomness in the pulling part or to node failure
- Alternative pulling method
  - Adding node u to one's view is useful if it is not already present
  - It may be the case that some nodes cannot contact u

Alternative protocol
- Mixing
  - Uses the pull mechanism to mix the views
  - Push, pull, and push-pull are the alternatives
  - Pull is better according to the authors
  - A random walk on the directed graph leads to the view of the node (2^t)
  - With push, the random walk is not directed and can backtrack, which increases the risk of partitioning
- Randomize
  - Instead of making all choices random, use a timestamp for every view entry
  - Select view entries based on the timestamp
  - Old edges are replaced by fresh edges
  - CYCLON is similar but uses timestamps
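The timestamp idea can be illustrated with a small merge step that keeps the freshest entries. This is a hypothetical sketch of timestamp-based selection in general. It is not CYCLON's actual procedure, and the function name `merge_by_timestamp` and the (node, timestamp) entry format are assumptions.

```python
def merge_by_timestamp(own, pulled, k):
    """Merge two lists of (node, timestamp) entries and keep the k freshest."""
    freshest = {}
    for node, ts in list(own) + list(pulled):
        # For a node seen several times, remember only its newest timestamp.
        if node not in freshest or ts > freshest[node]:
            freshest[node] = ts
    # Newest first; ties are broken by node name to keep the result deterministic.
    ranked = sorted(freshest.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]

own = [("a", 1), ("b", 5)]
pulled = [("a", 7), ("c", 3)]
view = merge_by_timestamp(own, pulled, k=2)
```

Here the stale entry for "a" is superseded by the fresher one, and "c" is dropped because it is the oldest of the three candidates.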

SUMMARY AND DISCUSSION
- A satisfactory protocol for local view maintenance
- Provides a way to preserve connectivity
- Provides network balance
- Only simulation has been done, not a real implementation