AT&T Researchers — Inventing the Science Behind the Service

To demonstrate the ability of network science to predict the behavior of large-scale complex networks, network scientists are applying their techniques to the Internet.

Network science is a brand-new discipline that relies on large datasets and computational power to study large-scale networks and to find properties that are common to them. These networks might be biological, technological (power grids, communication networks including the Internet), or social, but the premise of network science is that they share essential features that can be used to predict their behavior.

The immediately apparent similarity among networks, regardless of their individual complexity, is their high-level topology: nodes connected to other nodes via links over which information or data passes. This distilled topology is easily represented as a graph, and from these graphs network scientists, using tools from mathematics and techniques from statistical physics, build models to predict the networks’ behavior.

Network science applied to the Internet

In the case of the physical Internet, nodes are devices such as routers and switches. The physical connectivity of these nodes has to be inferred from measurements, since the links between them cannot in general be directly inspected. The data that network science has relied on comes from an early study in which traceroute, a widely used tool for determining the path of a packet through a network, was used to “get some experimental data on the shape of multicast trees.” Although traceroute was not originally intended for such a purpose, and the original data collectors were well aware of their data’s limitations, their study represented an improvement over traditional approaches because it used actual data, incomplete as it was.

When the traceroute data was used to infer the physical Internet, the resulting graphs stood out: a few nodes had many connections, while most nodes had only a few. This pattern is a hallmark of power-law node degree distributions, in which the frequency of an event decreases very slowly as the size of the event increases. For example, very large earthquakes occur only rarely, but small earthquakes occur often. Power-law distributions are especially popular with physicists, for whom they often confer the ability to make predictions.
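
As a quick numerical sketch (the exponent 2 here is illustrative, not a measured Internet value), the following compares how slowly a power-law tail decays relative to an exponential tail:

```python
import math

# Tail probabilities: power law P(X >= x) ~ x**(-a) versus
# exponential P(X >= x) ~ exp(-x). The exponent a = 2 is illustrative.
def power_tail(x, a=2.0):
    return x ** (-a)

def exp_tail(x):
    return math.exp(-x)

for x in (1, 10, 100):
    print(f"x={x:>3}  power-law tail={power_tail(x):.1e}  "
          f"exponential tail={exp_tail(x):.1e}")

# Moving from x = 10 to x = 100 shrinks the power-law tail only
# 100-fold, while the exponential tail collapses by a factor of
# roughly e**90: under a power law, very large events stay
# comparatively common.
```

This slow tail decay is what allows a handful of extremely large observations (huge earthquakes, apparent mega-hubs) to coexist with a mass of small ones.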

The finding of power-law node degree distributions for the Internet came as a big surprise and didn’t fit the classical random graph models that mathematicians have studied for the past 50 years; those models cannot capture the observed high variability in node degrees. To account for this power-law phenomenon in graphs representing real-world complex networks, network scientists developed novel graph models capable of reproducing the observed power-law node degree distributions.
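
To see why the classical models fall short, here is a minimal simulation (sizes and seed chosen arbitrarily) of an Erdős–Rényi random graph, the standard classical model; its node degrees cluster tightly around the mean, with no hubs:

```python
import random

# Degrees in a classical Erdos-Renyi random graph G(n, p) concentrate
# around the mean n*p, so such models cannot reproduce the highly
# variable (power-law) degrees reported for the Internet.
random.seed(42)
n, p = 1000, 0.01            # expected degree ~ (n - 1) * p ~ 10
deg = [0] * n
for i in range(n):           # flip a p-coin for each possible edge
    for j in range(i + 1, n):
        if random.random() < p:
            deg[i] += 1
            deg[j] += 1

mean = sum(deg) / n
print(f"mean degree {mean:.1f}, max degree {max(deg)}")
# The maximum degree stays within a small factor of the mean:
# no node remotely resembles a power-law hub.
```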

Scale-free models, the term applied to these new models, gained further legitimacy when mathematicians put the physicists’ largely empirical results on solid ground by providing rigorous proofs. The new models also carried new implications for the robustness of the Internet: because they predict highly connected nodes (or hubs) in the core of the network, they imply that the Internet is vulnerable to attacks concentrated on these critical hubs. This is the much-publicized Achilles’ heel of the Internet.

Examining the data

But do the measurements support the claims made by network science regarding the Internet?

Since the network science approach and its conclusions are almost entirely data-driven, this question cannot be answered unless the underlying data—the traceroute data—is rigorously examined, something that was not done at the outset.

The problems start with the use of traceroute to describe the Internet’s physical structure.

Firstly, traceroute operates at the IP layer, not the physical layer, and therefore cannot accurately describe the Internet’s physical topology.

The Internet from an engineering standpoint is not a single topology but many. Where users and traceroute “see” a simple layout of nodes (routers) linked together, an engineer sees a stack of layers, each of which performs a particular function using its own specific protocols. This stacked, or layered, architecture is one reason for the great success of the Internet. Applications running at one layer don’t need to know anything about other layers. And changes at one layer have no impact on other layers.

But there are other, equally substantial problems with traceroute, and thus with the data collected with it:

traceroute can’t distinguish between a single device (router) and an entire network made up of hundreds of devices.

ISPs are making increasing use of MPLS (multiprotocol label switching) since it often makes network management easier. MPLS operates at layer 2, but traceroute is strictly limited to the IP layer (layer 3), with the result that traceroute cannot trace through opaque layer-2 clouds; upon encountering a layer-2 cloud, traceroute reports a single high-degree node.

Thus a single node detected by traceroute might in fact represent hundreds of devices. Reports of high-degree hubs in the center of the Internet turn out to be artifacts of traceroute’s inability to distinguish between a single device and an entity composed of hundreds of devices.
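
The artifact can be sketched with a toy model (all sizes invented): a layer-2 cloud built from hundreds of internal switches is invisible at the IP layer, so traceroute collapses it into a single pseudo-node whose apparent degree equals the number of attached routers:

```python
# Toy model of the traceroute artifact: IP-layer paths through an
# opaque layer-2 cloud are reported as router -> cloud -> router,
# so the whole cloud looks like one node. Sizes are illustrative.
attached_routers = [f"r{i}" for i in range(200)]
num_switches_in_cloud = 500          # real devices hidden inside

# IP-layer adjacency inferred from traceroute: one "node" for the cloud.
inferred_neighbors = {"cloud": set(attached_routers)}

apparent_degree = len(inferred_neighbors["cloud"])
print(apparent_degree)               # one huge apparent "hub" ...
print(num_switches_in_cloud)         # ... standing in for 500 devices
```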

A further known problem with traceroute is the IP alias resolution problem. Routers typically support multiple interfaces, each with a different IP address. Packets traversing the same physical link between two nodes appear to take different paths if they enter via different interfaces. Although techniques are being developed to overcome this problem, IP aliasing remains an impediment to accurately mapping the true physical infrastructure.
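
A toy example (hypothetical addresses and alias table) shows how interface-level measurements inflate the inferred topology until aliases are resolved:

```python
# Each router interface has its own IP address, so traceroute paths
# entering a router via different interfaces look like visits to
# different nodes. The alias table below is hypothetical.
aliases = {                     # interface IP -> owning router
    "10.0.0.1": "A", "10.0.1.1": "A", "10.0.2.1": "A",
    "10.0.0.2": "B", "10.0.3.1": "B",
}

# Two traceroute paths over the SAME physical link A--B:
paths = [["10.0.0.1", "10.0.0.2"],
         ["10.0.1.1", "10.0.3.1"]]

naive_nodes = {ip for path in paths for ip in path}       # IPs as nodes
resolved_nodes = {aliases[ip] for path in paths for ip in path}

print(len(naive_nodes), len(resolved_nodes))              # 4 vs 2
```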

traceroute cannot detect high-degree nodes at the edge of the network, which is exactly where high-degree nodes are found, if they exist at all.

For both technological and economic reasons, high-degree routers are located at the edge of the network, where the technology exists to multiplex a large number of relatively low-bandwidth links. (There’s an inherent tradeoff in router configuration: a router can support either a few high-throughput connections or many low-throughput connections.)

But high-degree nodes at the edge of the network (e.g., a DSLAM, or DSL access multiplexer) are not detectable by generic large-scale traceroute experiments, because such experiments lack the participation of a sufficient number of local end systems.

Thus the irony is that the high-degree nodes detected by traceroute in the network center are not real (they are entire layer-2 clouds), while the real high-degree nodes are at the edge of the network, where traceroute cannot detect them.

A reverse-engineering approach

Given that the traceroute data is largely inadequate for inferring the physical Internet, a different approach is needed to analyze it. A more grounded approach would be to reverse-engineer the decisions made by ISPs and network planners when designing the actual physical infrastructure.

Engineers don’t design networks with power laws or any other mathematical construct in mind, but rather by figuring out the most efficient way to get the anticipated traffic from one place to another within the parameters of what is feasible and cost-effective. In the language of mathematical modeling, they try to solve, at least heuristically, a constrained optimization problem. In particular, the selection of links between nodes is anything but random, and therein lies the main difference between the network scientists’ scale-free models and an engineer’s approach.
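
As a hedged sketch of that constrained-optimization view (the capacity numbers, port counts, and the feasibility rule degree × per-port bandwidth ≤ switching capacity are all invented for illustration):

```python
# Sketch of router configuration as a constrained choice: a router's
# switching capacity roughly bounds degree * per-port bandwidth, so
# one picks the feasible configuration that best fits the job.
# All numbers below are invented for illustration.
def best_config(capacity_gbps, configs):
    """Pick the feasible (degree, per-port Gbps) config with most ports."""
    feasible = [(d, bw) for d, bw in configs if d * bw <= capacity_gbps]
    return max(feasible, key=lambda c: c[0]) if feasible else None

configs = [(4, 10.0), (16, 2.5), (128, 0.25), (1024, 0.05)]

# Core role: only very fast links will do -> few ports.
print(best_config(40.0, [(d, bw) for d, bw in configs if bw >= 10.0]))
# Edge aggregation: many slow customer links under the same capacity.
print(best_config(40.0, [(d, bw) for d, bw in configs if bw <= 1.0]))
```

Under the same capacity budget, the core choice comes out low-degree/high-bandwidth and the edge choice high-degree/low-bandwidth, matching the engineering picture of a few fast core links feeding many slow edge links.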

When feasibility and economic constraints are considered, it makes sense to place high-degree nodes at the edge of the network where ISPs multiplex their customers’ traffic before sending it towards the backbone. The view that emerges through reverse-engineering—high-degree nodes at the network edge with low-degree (though high-capacity) nodes at the core—directly conflicts with the scale-free models, which place the highly connected nodes at the network core.

In short, the scale-free modeling approach for the physical Internet collapses under scrutiny of the data and when viewed from an engineering perspective. Physical devices in the Internet can and do fail; for this reason, engineers built in redundancy and designed routing protocols that can route around failed devices. This system has worked very well, and the robustness of the Internet to router or link failures has exceeded anyone’s expectations as the network has grown from a handful of nodes to millions over a 40-year span.

Toward a more realistic model

The current problems with network science, at least as it applies to the Internet, are that it has depended on inaccurate, incomplete data, has produced models that conflict with reality, and has paid little or no attention to model validation.

Still, network science may have a role to play if it learns from having been carelessly applied to the Internet. A relevant mathematical theory is needed for correcting the Internet’s real shortcoming: the trust model on which the system was originally designed. This model has been broken for some time; viruses, worms, and spam are the evidence, but the more serious threat is that the critical protocols (e.g., BGP) that ensure the Internet’s viability could be hijacked to do real damage.

Currently network researchers are working with engineering insights, but as the Internet scales ever larger, there is more urgency for a relevant mathematical theory to aid and ultimately replace engineering intuition.

But any new model would have to build on rigorously vetted data and incorporate domain knowledge. If these steps are taken, a more nuanced and true-to-life framework could be developed and used for predicting the behavior of tomorrow’s Internet.

Author's Biography:

Walter Willinger, a member of the Information and Software Systems Research Center at AT&T Labs Research in Florham Park, NJ, has been a leading researcher into the self-similar ("fractal") nature of Internet traffic. His paper "On the Self-Similar Nature of Ethernet Traffic" is featured in "The Best of the Best - Fifty Years of Communications and Networking Research," a 2007 IEEE Communications Society book compiling the most outstanding papers published in the communications and networking field in the last half century. More recently, he has focused on investigating the topological structure of the Internet and on developing a theoretical foundation for the study of large-scale communication networks such as the Internet.