Techniques for peers to discover and use each other's functions are perhaps the greatest distinction between P2P technology and client/server Web technology. P2P technology expects peers to live at the edge of a network, and to require a variety of techniques to interoperate. On the other hand, client/server Web technology requires the network to know where to find resources before the request is made. P2P uses a group of methods known collectively as discovery.

Discovery

Discovery answers the big questions about a network:

What peers exist on the network?

How are the peers organized around their capabilities?

What uniquely identifies a peer?

How does a peer exchange data with another peer?

P2P is forced to identify answers to these questions. Unfortunately for the Java developer, not all P2P technologies are successful. Worse yet, some P2P technologies are closed and proprietary, or they hard-code implementations into one solution that would otherwise use open technology.

Although many P2P techniques exist to build peers, three types of peers have emerged as popular designs:

Simple peers

Rendezvous peers

Router peers

A simple peer is designed to be an endpoint that offers functions and data to peers making requests. Simple peers have the least responsibility of all three peer types. They usually reside outside a general network, and possibly behind a firewall or Network Address Translation (NAT) router. Simple peers are not expected to handle communication on behalf of other peers, or to serve information that they don't directly consume themselves.

Rendezvous peers provide a dating service in which peers discover other peers and peer resources like data and functions. All three types of peers issue discovery queries to rendezvous peers, but the rendezvous peer is also usually a cache of previous requests. When a rendezvous peer lives behind a firewall, it must have the ability to communicate through the firewall to other peers.

Router peers provide a mechanism for peers to communicate through firewalls and NAT routers. A router peer tunnels peer requests across a network. The information needed to use a router peer is enough to replace the need for the Domain Name System (DNS), and it supports dynamic IP addressing.

Let's look at a simple example of the three peers in action. Imagine using a P2P client that looks for magazine articles on human genomics. The user initiates a search for the articles with a simple peer. The peer sends a discovery query to all its known simple peers and rendezvous peers. The rendezvous peers that receive the query look to see whether they have data the simple peer is looking for. If so, the rendezvous peer might return a discovery response message containing advertisements from other peers that are stored in its cache. The rendezvous peer will also likely send along the same query to its list of known peers.
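
The query flow in this example can be sketched in a few lines of Java. This is an illustration only: the class `RendezvousSketch`, its methods, and the advertisement strings are hypothetical and do not come from any P2P library, and real peers would exchange messages over a network rather than call each other directly. The hop budget is an added assumption that keeps a forwarded query from circulating forever.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a rendezvous peer that caches advertisements by keyword.
class RendezvousSketch {
    // keyword -> advertisements previously published by other peers
    private final Map<String, List<String>> cache = new HashMap<>();
    private final List<RendezvousSketch> knownPeers = new ArrayList<>();

    void addKnownPeer(RendezvousSketch peer) {
        knownPeers.add(peer);
    }

    void publish(String keyword, String advertisement) {
        cache.computeIfAbsent(keyword, k -> new ArrayList<>()).add(advertisement);
    }

    // Answer from the local cache, then pass the same query along to known peers.
    // hopsLeft is an assumed safeguard so the query cannot loop indefinitely.
    List<String> query(String keyword, int hopsLeft) {
        List<String> results =
                new ArrayList<>(cache.getOrDefault(keyword, Collections.emptyList()));
        if (hopsLeft > 0) {
            for (RendezvousSketch peer : knownPeers) {
                for (String ad : peer.query(keyword, hopsLeft - 1)) {
                    if (!results.contains(ad)) {
                        results.add(ad); // skip duplicates returned by several peers
                    }
                }
            }
        }
        return results;
    }

    public static void main(String[] args) {
        RendezvousSketch first = new RendezvousSketch();
        RendezvousSketch second = new RendezvousSketch();
        first.addKnownPeer(second);
        second.addKnownPeer(first);
        second.publish("genomics", "peer-7:article-42");
        System.out.println(first.query("genomics", 1));
    }
}
```

In this sketch the first rendezvous peer has nothing cached, so the advertisement it returns comes from the second peer it forwarded the query to.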

Although we have described three different types of peers, in real-world P2P applications each peer might include a combination of the functions described in simple, rendezvous, and router peers. Let's look at how peers discover data, functions, and services using a variety of P2P techniques.

Router Peers and Dynamic Networks

P2P technology expects to find a network filled with firewalls, dynamic addresses, and changing peer locations. P2P provides a loose coupling of peers, so the P2P network remains functional even when parts of the real network break. Three P2P discovery techniques have become popular in this environment:

BroadcastSends a discovery request to every network node that is reachable

Selective broadcastSends a discovery request to every network node based on established heuristics

Adaptive broadcastSends a discovery request to every network node based on heuristics and rules

These techniques will be joined, modified, and abandoned over time as new ways to dynamically form a network are identified. The following are some of the areas of study from which P2P technology innovations might spring:

TransportHow do transport services such as broadcast, multicast, and unicast messaging relate to discovery?

RadiusHow is the discovery horizon established and maintained?

Frequency of broadcastHow often should discovery messages be broadcast to populate the network?

Discovery protocolWhat information should be defined in a discovery protocol?

Discovery rolesDo all peers participate equally in the discovery process? Do all peers have the same broadcast role?

Broadcasts

Traditionally, broadcast messages have been sent by devices that deal with network routing or data packet exchange at a low level, such as routers. Broadcast messages on IP networks contain a special address reserved for broadcasting. Both the network and host parts of the address are set to all ones (hex FFFFFFFF). This indicates to the network layer that the packet is addressed to every device on the subnet, as seen in Figure 6.1.

Figure 6.1
Broadcasts try to reach all nodes on the subnet.
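
As a rough illustration of the all-ones address, the following Java sketch builds the broadcast address described above (255.255.255.255) and, as a closely related case not covered in the text, the broadcast address of a single subnet, where only the host bits are set to ones. The class and method names are hypothetical; the networking calls are from the standard `java.net` package. The `send` method is shown for completeness and assumes the local network permits UDP broadcasts.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.UnknownHostException;

class BroadcastAddresses {
    // All 32 bits set to one (hex FFFFFFFF), as described in the text.
    static InetAddress limited() {
        return fromInt(0xFFFFFFFF);
    }

    // Broadcast address of one subnet: keep the network bits, set the host bits to one.
    static InetAddress directed(String ipv4, int prefixLength) {
        if (prefixLength < 0 || prefixLength > 32) {
            throw new IllegalArgumentException("prefix length must be 0..32");
        }
        byte[] a;
        try {
            a = InetAddress.getByName(ipv4).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4, e);
        }
        if (a.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
        }
        int ip = ((a[0] & 0xFF) << 24) | ((a[1] & 0xFF) << 16) | ((a[2] & 0xFF) << 8) | (a[3] & 0xFF);
        // Java masks int shift counts to 5 bits, so a /32 prefix needs its own case.
        int hostMask = (prefixLength == 32) ? 0 : (0xFFFFFFFF >>> prefixLength);
        return fromInt(ip | hostMask);
    }

    private static InetAddress fromInt(int value) {
        byte[] b = { (byte) (value >>> 24), (byte) (value >>> 16), (byte) (value >>> 8), (byte) value };
        try {
            return InetAddress.getByAddress(b);
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e); // not expected for a 4-byte array
        }
    }

    // Sends one UDP datagram to the all-ones broadcast address.
    static void send(byte[] payload, int port) throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setBroadcast(true);
            socket.send(new DatagramPacket(payload, payload.length, limited(), port));
        }
    }

    public static void main(String[] args) {
        System.out.println(limited().getHostAddress());
        System.out.println(directed("192.168.1.37", 24).getHostAddress());
    }
}
```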

In a P2P context, broadcasting might sound like TCP/IP multicasting, but it isn't. P2P technology plays mostly in the application layer of a software application. The actual method for moving a broadcast message across the Internet might use multicasting or a number of other techniques that we will explore next.

TransportMulticast Versus Unicast Messaging

Multicast messaging is often compared to radio or TV broadcasts, in the sense that only those who have tuned their receivers to a particular frequency receive the information. Only the channels selected are heard. The sender sends the information without knowledge of the number of receivers.

In contrast, when you send a packet and there is only one sender and one recipient, this is referred to as unicast. A unicast transmission is by definition point-to-point. Unicast can be used to send identical information to many different destinations; however, this involves replicating data, and is not the most efficient transport.

Multicast addresses are in the Class D range, whose first octet runs from 224 to 239 (224.0.0.0 through 239.255.255.255). Multicast messaging uses this range of addresses to define multicast groups, as shown in Table 6.1.
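
A quick way to check whether an IPv4 address falls in the Class D range is to test its leading four bits, which are always 1110 for first octets 224 through 239. The sketch below does this by hand and compares the result with the standard `InetAddress.isMulticastAddress()` method; the class name `MulticastRange` and the sample addresses are chosen only for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class MulticastRange {
    // True when the address is IPv4 and its first octet is in 224..239.
    static boolean isClassD(String ipv4) {
        try {
            byte[] a = InetAddress.getByName(ipv4).getAddress();
            // The top four bits of a Class D address are 1110 (hex E).
            return a.length == 4 && (a[0] & 0xF0) == 0xE0;
        } catch (UnknownHostException e) {
            return false; // not a usable address
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        String[] samples = { "224.0.0.1", "239.255.255.255", "192.168.1.10" };
        for (String s : samples) {
            boolean byHand = isClassD(s);
            boolean byLibrary = InetAddress.getByName(s).isMulticastAddress();
            System.out.println(s + " classD=" + byHand + " library=" + byLibrary);
        }
    }
}
```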

Multicasting has produced mixed results in applications that require a number of machines in a distributed group to receive the same data, such as conferencing, group mail, news distribution, and network management. Multicasting suffers from the lack of a control protocol, which makes it unsuitable for large, reliable, and sustained transmissions. Even so, multicasting appears to be well suited to P2P: although it often fails to deliver 100% of its data to everyone listening, peers on a P2P network do not require data to be synchronized among them. Figure 6.2 shows multicasting being used in P2P networks for discovery.

Multicasting offers P2P networks several advantages:

Decreased network utilization: Reduces the number of messages required by eliminating redundant packets and decreasing the number of point-to-point connections that must be established.

Resource discovery: Discovery and multicasting assume a sender is transmitting to an unknown number of peers without knowledge of their location.

Dynamic participation: Multicasting provides flexibility in joining and leaving a group. This membership flexibility supports the transient behavior of peers.

Multimedia support: Multimedia transmission continues to increase in popularity and consumes a significant amount of bandwidth. This is one area where network optimization is of paramount importance. Multicasting can be used to transmit multimedia data to receiving stations that compress the transmission and then deliver it to destination nodes, rather than using point-to-point connections for all destinations.

Unfortunately, multicasting is not implemented everywhere. Hardware, specifically routers, often blocks multicast traffic from penetrating corporate networks or traversing ISPs. Firewalls and NAT devices often not only block multicast traffic but also constrain traffic in general to well-controlled choke points (ports). As a result, additional means of discovery are generally required in scalable P2P networks.