Tools

"... This paper addresses the problem of churn---the continuous process of node arrival and departure---in distributed hash tables (DHTs). We argue that DHTs should perform lookups quickly and consistently under churn rates at least as high as those observed in deployed P2P systems such as Kazaa. We then ..."

This paper addresses the problem of churn---the continuous process of node arrival and departure---in distributed hash tables (DHTs). We argue that DHTs should perform lookups quickly and consistently under churn rates at least as high as those observed in deployed P2P systems such as Kazaa. We then show through experiments on an emulated network that current DHT implementations cannot handle such churn rates. Next, we identify and explore three factors affecting DHT performance under churn: reactive versus periodic failure recovery, message timeout calculation, and proximity neighbor selection. We work in the context of a mature DHT implementation called Bamboo, using the ModelNet network emulator, which models in-network queuing, cross-traffic, and packet loss. These factors are typically missing in earlier simulationbased DHT studies, and we show that careful attention to them in Bamboo&apos;s design allows it to function effectively at churn rates at or higher than that observed in P2P file-sharing applications, while using lower maintenance bandwidth than other DHT implementations.

... vary. DHTs are structured graphs, and we use the term geometry to mean the pattern of neighbor links in the overlay network, independent of the routing algorithms or state management algorithms used =-=[12]-=-. Each node in Pastry is assigned a numeric identifier in [0,2 160 ), derived either from the SHA-1 hash of the IP address and port on which the node receives packets or from the SHA-1 hash of a publi...

This paper presents the design of Mercury, a scalable protocol for supporting multi-attribute rangebased searches. Mercury differs from previous range-based query systems in that it supports multiple attributes as well as performs explicit load balancing. Efficient routing and load balancing are implemented using novel light-weight sampling mechanisms for uniformly sampling random nodes in a highly dynamic overlay network. Our evaluation shows that Mercury is able to achieve its goals of logarithmic-hop routing and near-uniform load balancing. We also show that a publish-subscribe system based on the Mercury protocol can be used to construct a distributed object repository providing efficient and scalable object lookups and updates. By providing applications a range-based query language to express their subscriptions to object updates, Mercury considerably simplifies distributed state management. Our experience with the design and implementation of a simple distributed multiplayer game built on top of this object management framework shows that indicates that this indeed is a useful building block for distributed applications. Keywords: Range queries, Peer-to-peer systems, Distributed applications, Multiplayer games 1

...table designs [11, 23, 24, 26] have been proposed over the past few years. They provide a hash table interface to the application, viz., insert(key, value) and lookup(key) primitives. Recent research =-=[4, 10]-=- has shown that, in addition to the basic scalable routing mechanism, DHTs offer much promise in terms of load balancing, proximity-based routing, static resilience, etc. Hence, it is a natural questi...

"... In the span of only a few years, the Internet has experienced an astronomical increase in the use of specialized content delivery systems, such as content delivery networks and peer-to-peer file sharing systems. Therefore, an understanding of content delivery on the Internet now requires a detailed ..."

In the span of only a few years, the Internet has experienced an astronomical increase in the use of specialized content delivery systems, such as content delivery networks and peer-to-peer file sharing systems. Therefore, an understanding of content delivery on the Internet now requires a detailed understanding of how these systems are used in practice. This paper examines content delivery from the point of view of four content delivery systems: HTTP web traffic, the Akamai content delivery network, and Kazaa and Gnutella peer-to-peer file sharing traffic. We collected a trace of all incoming and outgoing network traffic at the University of Washington, a large university with over 60,000 students, faculty, and staff. From this trace, we isolated and characterized traffic belonging to each of these four delivery classes. Our results (1) quantify the rapidly increasing importance of new content delivery systems, particularly peerto-peer networks, (2) characterize the behavior of these systems from the perspectives of clients, objects, and servers, and (3) derive implications for caching in these systems. 1

"... Peer-to-peer (P2P) systems, which provide a variety of popular services, such as file sharing, video streaming and voice-over-IP, contribute a significant portion of today’s Internet traffic. By building overlay networks that are oblivious to the underlying Internet topology and routing, these syste ..."

Peer-to-peer (P2P) systems, which provide a variety of popular services, such as file sharing, video streaming and voice-over-IP, contribute a significant portion of today’s Internet traffic. By building overlay networks that are oblivious to the underlying Internet topology and routing, these systems have become one of the greatest traffic-engineering challenges for Internet Service Providers (ISPs) and the source of costly data traffic flows. In an attempt to reduce these operational costs, ISPs have tried to shape, block or otherwise limit P2P traffic, much to the chagrin of their subscribers, who consistently finds ways to eschew these controls or simply switch providers. In this paper, we present the design, deployment and evaluation of an approach to reducing this costly cross-ISP traffic without sacrificing system performance. Our approach recycles network views gathered at low cost from content distribution networks to drive biased neighbor selection without any path monitoring or probing. Using results collected from a deployment in BitTorrent with over 120,000 users in nearly 3,000 networks, we show that our lightweight approach significantly reduces cross-ISP traffic and, over 33 % of the time, it selects peers along paths that are within a single autonomous system (AS). Further, we find that our system locates peers along paths that have two orders of magnitude lower latency and 30 % lower loss rates than those picked at random, and that these high-quality paths can lead to significant improvements in transfer rates. In challenged settings where peers are overloaded in terms of available bandwidth, our approach provides 31% average download-rate improvement; in environments with large available bandwidth, it increases download rates by 207 % on average (and improves median rates by 883%).

"... We present a Scalable Distributed Information Management System (SDIMS) that aggregates information about large-scale networked systems and that can serve as a basic building block for a broad range of large-scale distributed applications by providing detailed views of nearby information and summary ..."

We present a Scalable Distributed Information Management System (SDIMS) that aggregates information about large-scale networked systems and that can serve as a basic building block for a broad range of large-scale distributed applications by providing detailed views of nearby information and summary views of global information. To serve as a basic building block, a SDIMS should have four properties: scalability to many nodes and attributes, flexibility to accommodate a broad range of applications, administrative isolation for security and availability, and robustness to node and network failures. We design, implement and evaluate a SDIMS that (1) leverages Distributed Hash Tables (DHT) to create scalable aggregation trees, (2) provides flexibility through a simple API that lets applications control propagation of reads and writes, (3) provides administrative isolation through simple extensions to current DHT algorithms, and (4) achieves robustness to node and network reconfigurations through lazy reaggregation, on-demand reaggregation, and tunable spatial replication. Through extensive simulations and micro-benchmark experiments, we observe that our system is an order of magnitude more scalable than existing approaches, achieves isolation properties at the cost of modestly increased read latency in comparison to flat DHTs, and gracefully handles failures.

"... mechanism to estimate Internet network distance (i.e., round-trip delay or network hops). Network distance estimation is important in many applications, for example, network-aware overlay construction and server selection. There are several proposals for distance estimation in the Internet but they ..."

mechanism to estimate Internet network distance (i.e., round-trip delay or network hops). Network distance estimation is important in many applications, for example, network-aware overlay construction and server selection. There are several proposals for distance estimation in the Internet but they all suffer from problems that limit their benefit. Most rely on a small set of infrastructure nodes that are a single point of failure and limit scalability. Others use sets of peers to compute coordinates but these coordinates can be arbitrarily wrong if one of these peers is malicious. While it may be reasonable to secure a small set of infrastructure nodes, it is unreasonable to secure all peers. PIC addresses these problems: it does not rely on infrastructure nodes and it can compute accurate coordinates even when some peers are malicious. We present PIC&apos;s design, experimental evaluation, and an application to network-aware overlay construction and maintenance.

...ber of overlay hops and increases the stress in the underlying network links. There are several techniques for proximity-aware routing proposed in the literature [20, 30, 21, 24, 12, 23]. Recent work =-=[3, 7, 11]-=- identifies proximity neighbor selection (PNS) as the most promising technique. Tapestry, Pastry [24], and a recent version of Chord [11] implement PNS. PNS can be used to achieve low delay routes and...

"... The performance of peer-to-peer file replication comes from its piece and peer selection strategies. Two such strategies have been introduced by the BitTorrent protocol: the rarest first and choke algorithms. Whereas it is commonly admitted that BitTorrent performs well, recent studies have propose ..."

The performance of peer-to-peer file replication comes from its piece and peer selection strategies. Two such strategies have been introduced by the BitTorrent protocol: the rarest first and choke algorithms. Whereas it is commonly admitted that BitTorrent performs well, recent studies have proposed the replacement of the rarest first and choke algorithms in order to improve efficiency and fairness. In this paper, we use results from real experiments to advocate that the replacement of the rarest first and choke algorithms cannot be justified in the context of peer-to-peer file replication in the Internet. We instrumented a BitTorrent client and ran experiments on real torrents with different characteristics. Our experimental evaluation is peer oriented, instead of tracker oriented, which allows us to get detailed information on all exchanged messages and protocol events. We go beyond the mere observation of the good efficiency of both algorithms. We show that the rarest first algorithm guarantees close to ideal diversity of the pieces among peers. In particular, on our experiments, replacing the rarest first algorithm with source or network coding solutions cannot be justified. We also show that the choke algorithm in its latest version fosters reciprocation and is robust to free riders. In particular, the choke algorithm is fair and its replacement with a bit level tit-for-tat solution is not appropriate. Finally, we identify new areas of improvements for efficient peer-to-peer file replication protocols.

...ve version was published in Proc. of IMC’06, October 25–27, 2006, Rio de Janeiro, Brazil. for this success. Whereas content localization has attracted considerable research interest in the last years =-=[7,12,23,25]-=-, content replication has started to be the subject of active research only recently. As an example, the most popular peer-to-peer file sharing networks [1] eDonkey2K, FastTrack, Gnutella, Overnet foc...

"... This paper examines graph-theoretic properties of existing peer-to-peer architectures and proposes a new infrastructure based on optimal-diameter de Bruijn graphs. Since generalized de Bruijn graphs possess very short average routing distances and high resilience to node failure, they are well suite ..."

This paper examines graph-theoretic properties of existing peer-to-peer architectures and proposes a new infrastructure based on optimal-diameter de Bruijn graphs. Since generalized de Bruijn graphs possess very short average routing distances and high resilience to node failure, they are well suited for structured peer-to-peer networks. Using the example of Chord, CAN, and de Bruijn, we first study routing performance, graph expansion, and clustering properties of each graph. We then examine bisection width, path overlap, and several other properties that affect routing and resilience of peer-to-peer networks. Having confirmed that de Bruijn graphs offer the best diameter and highest connectivity among the existing peer-to-peer structures, we offer a very simple incremental building process that preserves optimal properties of de Bruijn graphs under uniform user joins/departures. We call the combined peer-to-peer architecture