Abstract

Peer-to-Peer (P2P) computing systems offer many of the advantages of decentralized distributed systems but suffer from limited availability and reliability. To increase both, data replication techniques are commonly used in P2P computing systems. Replication can be seen as a family of techniques: full documents or just chunks can be replicated. Because the same data can be found at multiple peers, the data remains available when a peer fails. Consistency, however, is a challenge in replication systems that allow dynamic updates of replicas. Fundamental to any replication technique is the degree of replication (full vs. partial), as well as the source of the updates and the way updates are propagated in the system. Owing to the varied characteristics of distributed systems and to differing system and application requirements, a variety of data replication techniques have been proposed in the distributed computing field. One important distributed computing paradigm is that of P2P systems, which are distinguished by their large scale and unreliable nature. In this chapter we study some data replication techniques and requirements for different P2P applications. We identify several contexts and use cases where data replication can greatly support collaboration. We also discuss existing optimistic replication solutions and P2P replication strategies and analyze their advantages and disadvantages. Finally, we propose a fuzzy-based system for finding the best replication factor in a P2P network and evaluate its performance.