Abstract

Gossip, or epidemic, protocols have emerged as a powerful strategy to implement highly scalable and resilient reliable broadcast primitives. For scalability reasons, each participant in a gossip protocol maintains a partial view of the system. The reliability of the gossip protocol depends upon some critical properties of these views, such as degree distribution and clustering coefficient. Several algorithms have been proposed to maintain partial views for gossip protocols. In this paper, we show that under a high number of faults, these algorithms take a long time to restore the desirable view properties. To address this problem, we present HyParView, a new membership protocol to support gossip-based broadcast that ensures high levels of reliability even in the presence of high rates of node failure. The HyParView protocol is based on a novel approach that relies on the use of two distinct partial views, which are maintained with different goals by different strategies.

Citations

...d virus that may take down all machines running a specific OS version (that may represent a significant portion of the system). For instance, a worm could affect 10,000,000 nodes in the space of days [13]; also, these worms can spread in a first phase and take down nodes simultaneously at a predetermined time. The rest of the paper is structured as follows. Section 2 offers an overview of related work...

...ith different goals by different strategies. 1 Introduction Gossip, or epidemic, protocols have emerged as a powerful strategy to implement highly scalable and resilient reliable broadcast primitives [8, 3, 6, 1]. In a gossip protocol, when a node wants to broadcast a message, it selects t nodes from the system at random (this is a configuration parameter called fanout) and sends the message to them; upon rec...
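The gossip step described in this excerpt can be illustrated with a minimal Python sketch. The excerpt is cut off at "upon rec...", so the forwarding rule used here (a node forwards a message only the first time it receives it) is an assumption, as are the function name `gossip_broadcast`, the integer node identifiers, and the use of full membership knowledge. It is a toy simulation, not the paper's implementation.

```python
import random
from collections import deque


def gossip_broadcast(num_nodes, fanout, source=0, seed=None):
    """Simulate one gossip broadcast over full membership.

    Returns (delivered, messages_sent). Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    delivered = {source}          # nodes that have received the message
    pending = deque([source])     # nodes that still have to forward it
    messages_sent = 0

    while pending:
        node = pending.popleft()
        # Select `fanout` distinct targets at random from the whole system,
        # excluding the sender itself.
        candidates = [n for n in range(num_nodes) if n != node]
        targets = rng.sample(candidates, min(fanout, len(candidates)))
        for target in targets:
            messages_sent += 1
            # Assumption: only the first receipt triggers forwarding;
            # later copies are redundant and dropped.
            if target not in delivered:
                delivered.add(target)
                pending.append(target)

    return delivered, messages_sent


if __name__ == "__main__":
    delivered, sent = gossip_broadcast(num_nodes=1000, fanout=4, seed=42)
    print(f"reached {len(delivered)} of 1000 nodes using {sent} messages")
```

Because every reached node forwards exactly once, the total message count in this sketch is the fanout multiplied by the number of nodes reached, which is why a larger fanout raises the message cost.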

...type of approach, a partial view only changes in response to some external event that affects the overlay (e.g. a node joining or leaving). In stable conditions, partial view remains unaltered. Scamp [5, 4] is an example of such an algorithm. Cyclic strategy: In this type of approach, a partial view is updated every ∆T time units, as a result of some periodic process that usually involves the exchang...

...lly involves the exchange of information with one or more neighbors. Therefore, a partial view may be updated even if the global system membership is stable. Cyclon is an example of such an algorithm [15, 14]. Reactive strategies rely on some failure detection mechanism to trigger the update of partial views when a node leaves the system. If the failure detection mechanism is fast and accurate, reactive m...

...all subset of the entire system membership. When a node performs a gossip step it selects t nodes at random from its partial view. The aim of a membership service (also called a peer sampling service [7]) is to maintain these partial views satisfying a number of good properties. Intuitively, selecting gossip peers from the partial view should provide the same resiliency as selecting them at random fr...
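A short Python sketch of target selection from a partial view follows. The names `build_partial_views` and `select_gossip_targets` are hypothetical, and filling each view with a uniform random sample is an assumption made only to keep the example self-contained; a real membership service builds and maintains views in a distributed way.

```python
import random


def build_partial_views(members, view_size, rng):
    """Give every node a small random subset of the other members.

    Assumption: views are drawn uniformly at random for illustration.
    """
    views = {}
    for node in members:
        others = [m for m in members if m != node]
        views[node] = rng.sample(others, min(view_size, len(others)))
    return views


def select_gossip_targets(partial_view, fanout, rng):
    """Pick up to `fanout` distinct gossip targets from a node's partial view."""
    return rng.sample(partial_view, min(fanout, len(partial_view)))


if __name__ == "__main__":
    rng = random.Random(7)
    views = build_partial_views(list(range(50)), view_size=8, rng=rng)
    print(select_gossip_targets(views[0], fanout=4, rng=rng))
```

The only change from gossiping over full membership is the population that targets are drawn from, so each node stores `view_size` entries instead of the whole membership.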

...ach values of reliability above 99%. In this run, with a fanout of 6, there are potentially 20,000 more messages exchanged than in a scenario that uses a fanout of 4 (by the results presented in [4], this fanout should ensure a reliability between 98% and 99%). More than 99% of these 20,000 extra messages are redundant, which means that fewer than 200 of these messages will, in fact, contribute t...
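The bound of 200 useful messages quoted in this excerpt is plain arithmetic on the two figures it gives, as the following Python check shows. The variable names are illustrative, and integer percentages are used to avoid floating-point rounding.

```python
# Figures taken from the excerpt above.
extra_messages = 20_000        # extra messages with fanout 6 versus fanout 4
redundant_percent_min = 99     # "more than 99%" of them are redundant

# If at least 99% are redundant, at most 1% can be useful.
useful_upper_bound = extra_messages * (100 - redundant_percent_min) // 100

print(useful_upper_bound)
```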

... view, node q increases the probability of shuffling q with other nodes and, subsequently, having p be target of Neighbor requests. 5 Evaluation We conducted simulations using the PeerSim Simulator [9]. We have implemented HyParView, Cyclon, and Scamp in this simulator in order to get comparative figures. In order to validate our implementation of Cyclon and Scamp, we have compared the results ...

...arge number of nodes that may constitute the view, but also due to the cost of maintaining the complete membership up-to-date. To overcome this problem, several gossip protocols rely on partial views [13, 2, 3] instead of the complete membership information. A partial view is a small subset of the entire system membership. When a node performs a gossip step it selects t nodes at random from its partial view...

...large number of nodes that may constitute the view but also due to the cost of maintaining the complete membership up-to-date. To overcome this problem, several gossip protocols rely on partial views [11, 2, 3] instead of the complete membership information. A partial view is a small subset of the entire system membership. When a node performs a gossip step it selects t nodes at random from its partial view...

...sing a variation of the techniques proposed in [12], which simply considers slow nodes as having failed, and expels them from all active views. A detailed description of the mechanism can be found in [10]. 6. Conclusions and Future Work Gossip protocols are appealing because they work on overlays that have very small maintenance cost. Therefore, they seem obvious candidates to support applications tha...