A Single Company Oversees 30% of the Global Web Traffic, and It’s Not the One You Thought

There's an ongoing conversation about how the growing power of Internet giants like Facebook, Google, and Amazon in their respective niches affects people and societies at large. What if we look at the more basic layers of the Web instead?

There’s been much debate lately around the so-called Internet giants and how their growing power in their respective niches affects people and societies at large. Facebook has been claimed to be able to sway presidential elections, Google has been accused of “adjusting” search results to meet its goals – and at the same time failing to do so in order to protect facts from lies – while Amazon has been munching away at the price margins of its competitors (read: any other retailer) to the degree that entire market niches break a sweat at the first mention of a new competing offer from the e-commerce giant.

These companies have accumulated so much market power in the new networked economy that a new word has been coined to refer to this creepily potent group: the Trinet – hinting at the ongoing reduction of the Internet into an oligopolistic combination of those three players. However, we’re still talking about high-level services here: mainly social media for Facebook, search for Google, and retail for Amazon. What will we see if we look at the more basic layers of the Web?

The Internet: Under the Hood

While the Trinet is increasingly becoming a one-stop shop for providing end-user utility, there’s an important step in between someone coming up with an online service – be it knowledge, entertainment, shopping or socialization – and you actually receiving it: content delivery. Moving data around the globe in a fast and secure fashion can be hard work, especially if you have growing heaps of it. This is why most of the top companies don’t actually do it themselves – instead, they elect to outsource this task to specialized providers called content delivery networks, or CDNs.

The name is quite self-explanatory: simply speaking, a CDN is a network of servers that is geographically spread out to ensure the shortest possible route to each end user, depending on their physical location. In practice, this means that content from the origin is first copied to several independent data centers across the world (also called points of presence, or PoPs) and then delivered to the final consumers from these PoPs without having to bother the origin server every time someone requests that data. Apart from higher page speed, additional benefits of such a setup typically include better load balancing as well as higher resilience to traffic spikes and malicious attacks.
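To make the idea concrete, here is a minimal Python sketch of the two mechanisms described above: picking the PoP closest to the user, and serving repeat requests from that PoP's copy instead of the origin. Everything in it is an assumption made for illustration: the three PoP locations, the `ToyCDN` class, and the choice to copy content to a PoP on first request rather than in advance. Real CDNs typically steer users through DNS or anycast routing rather than raw geographic distance.

```python
import math

# Hypothetical PoP locations as (latitude, longitude); real networks have far more.
POPS = {
    "frankfurt": (50.11, 8.68),
    "new_york": (40.71, -74.01),
    "singapore": (1.35, 103.82),
}


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


class ToyCDN:
    def __init__(self, origin):
        self.origin = origin                        # path -> content on the origin server
        self.caches = {name: {} for name in POPS}   # one cache per PoP
        self.origin_hits = 0                        # how often the origin was contacted

    def nearest_pop(self, user_location):
        return min(POPS, key=lambda name: distance_km(POPS[name], user_location))

    def get(self, path, user_location):
        pop = self.nearest_pop(user_location)
        cache = self.caches[pop]
        if path not in cache:                       # cache miss: fetch from the origin once
            cache[path] = self.origin[path]
            self.origin_hits += 1
        return pop, cache[path]


cdn = ToyCDN(origin={"/index.html": "<h1>hello</h1>"})
paris = (48.86, 2.35)
print(cdn.get("/index.html", paris))   # served via the Frankfurt PoP
print(cdn.get("/index.html", paris))   # served again from the PoP's cache
print(cdn.origin_hits)                 # the origin was contacted only once
```

The second request never reaches the origin, which is the property that gives a CDN both its speed and its resilience to traffic spikes.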

This might not sound too exciting until you realize that the vast majority of web content you view and interact with every day comes to you via CDNs. If major online services are the heart of the Internet, content delivery networks are the vast circulatory system that delivers blood to the oxygen-hungry individual cells across the body.

As Big as It Gets

If the last analogy wasn’t intense enough, here’s a fun fact: at any point in time, nearly a third of the world’s web traffic, that Internet “lifeblood”, passes through a single content delivery network. It’s operated by a corporation called Akamai Technologies, which, according to self-reported figures, consistently delivers about 30% of the entire world’s web traffic. Take a moment to wrap your mind around this percentage. This is not about being a big player in the social media sector or the search engine industry; it’s about a sizeable chunk of all of humanity’s online interactions – in fact, more than 2 trillion of them. Every day. To put this into perspective, Google processes roughly the same number of searches… per year.
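For a sense of scale, the comparison can be worked through in a few lines of Python. The only inputs are the two round figures quoted above (2 trillion interactions per day for Akamai, roughly 2 trillion searches per year for Google); the variable names are made up for this sketch, and the results are only as precise as those round figures.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365

akamai_per_day = 2_000_000_000_000     # "more than 2 trillion" interactions a day
google_per_year = 2_000_000_000_000    # "roughly the same number" of searches a year

akamai_per_second = akamai_per_day / SECONDS_PER_DAY
google_per_day = google_per_year / DAYS_PER_YEAR
ratio = akamai_per_day / google_per_day

print(f"Akamai interactions per second: {akamai_per_second:,.0f}")
print(f"Google searches per day:        {google_per_day:,.0f}")
print(f"Akamai daily volume vs. Google: {ratio:.0f}x")
```

In other words, the quoted figures put Akamai at roughly 23 million interactions per second, and at about 365 times Google's daily search volume.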

Apart from handling content delivery for half of the Global 500 companies, Akamai oversees at least some of the traffic for 91% of the top U.S. online retailers, over 80% of the global online media owners, and, as a bonus, all branches of the U.S. military.

All this, in itself, is of course not something sensational or worrisome, and Akamai is not an evil empire just because its content delivery network is so ubiquitous. And, of course, there are other players in the CDN market, such as Limelight and CloudFront (an Amazon service, by the by), which also account for significant shares of the global web traffic. One could also argue that since those companies are in the content delivery business, they don’t actually affect the data that passes through them in any way; in other words, they simply act as efficient pipelines through which information flows faster and more securely. So, all’s fine, right?

The Centralization Dilemma

On the one hand, it seems logical, even quite natural, for industries like content delivery to tend towards aggregation: the more data centers you have around the world, the better your service gets; this leads to more profits and therefore additional resources for deploying further PoPs (and further improving your competitiveness). It’s a feedback loop that favours the largest players and helps them grow even larger. In an ideal world, this process leads to continuous quality improvements for the end users and content owners alike; in practice, there are more perspectives we might want to consider.

Firstly, there’s the scalability issue: while network bandwidth might not seem like a great concern for any but the largest content distributors like Netflix, it soon will: the continuing growth of video and the wide adoption of 4K-quality media (and beyond), not to mention the looming VR revolution, all put increasing strain on the existing infrastructure.

Scientists and engineers are getting worried about the future of the Internet in a world where bandwidth is not growing nearly as fast as storage capacity and, therefore, data utilization. Some CDNs are responding by spreading their networks out as much as possible, but there are clear limits to how far they can take this process – adding a PoP for every city block or every building would imply ridiculous costs and energy outlays. More users wanting more (and larger) content on more devices just means more stress on networks that can’t quite catch up with growing demand.

Ironically, while a growing CDN is de facto getting more and more spread out across multiple points of presence, centralization is occurring at the same time, as market power (and, potentially, control over data flows) gets concentrated in the hands of a single business entity. And business entities are known to be okay with pursuing their self-interest at the expense of others, if they can get away with it.

This does not imply that companies like Akamai will one day decide to censor their data flows or spy on your personal data (of which they handle a LOT, every single day) – but even the slightest risk of these things materializing makes one think. A real-world analogy comes to mind: an autocratic ruler of a country might be benevolent and even intend to stay that way indefinitely, but it’s still not exactly a democracy; despite many people being tempted to use higher stability as an argument for concentrated political power, international migration flows clearly show that the vast majority still prefer to move to a democratic country when they get a chance.

In addition to the above, there’s also the question of security. Going back to the earlier argument about CDNs just being third-party channels for the traffic, consider the story of a security breach at a major provider, Cloudflare, which potentially exposed sensitive data for hundreds of thousands of clients.

However, it’s not only about the vulnerability of the entire network or individual points of presence – on a subtler level, a CDN might decide, at its own discretion, not to protect a website from an attack even if it has all the resources in place to do so. The story of the unhindered DDoS attack on security journalist Brian Krebs has set a curious precedent in that respect, and the CDN involved was, you guessed it, Akamai – remember, the one with the resources to move a third of the world’s traffic around, every single day.

We Can Do Better

One way to improve the current situation is the evolution of the existing content delivery models to offer incremental but lasting improvements in various aspects of the service. One of the biggest bets for many newer CDNs trying to challenge the status quo has been security: contenders such as Distil Networks and Fastly are making sure that higher attack resistance is as synonymous with the term “content delivery” as loading speed. Others, like Section.io, are focusing their efforts on offering developer-friendly solutions, in contrast to the cumbersome legacy systems of the CDN veterans.

Another approach is to rethink how the entire system works, instead of deploying more data centers. There might be alternative ways of organizing the Web itself that, in some sense, have the idea of CDNs built in right from the start. One of the most promising developments in this respect is IPFS (the InterPlanetary File System), which combines existing and proven concepts like peer-to-peer file exchange (think BitTorrent) and versioning (Git) with self-certified filesystems to arrive at a construct that allows creating inherently distributed networks.

The fact that each unique piece of content on such a network has its own unique identifier (address), derived from the content itself, supports information permanence, while peer exchange protocols allow moving the files around without requiring trust in a central intermediary. As a bonus, the entire thing is compatible with the current design of the Web, i.e. it can be used alongside it in a seamless fashion; in fact, anyone can try it out by using the open code provided by the authors.
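The core trick, addressing content by what it is rather than where it lives, can be sketched in a few lines of Python. This is a simplified illustration and not IPFS's actual implementation: the `ContentStore` class and `fetch_from_peers` function are hypothetical, a plain SHA-256 hex digest stands in for the address, and real IPFS uses its own identifier format and splits files into linked blocks.

```python
import hashlib


class ContentStore:
    """Toy peer: stores blobs under the SHA-256 hash of their bytes."""

    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self.blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self.blocks[address]              # KeyError if this peer lacks the block
        # Anyone can re-hash the bytes, so the peer itself need not be trusted.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


def fetch_from_peers(address: str, peers) -> bytes:
    """Return the first copy that verifies, skipping missing or tampered ones."""
    for peer in peers:
        try:
            return peer.get(address)
        except (KeyError, ValueError):
            continue
    raise LookupError(f"no peer holds valid content for {address}")


honest, tampered, empty = ContentStore(), ContentStore(), ContentStore()
address = honest.put(b"hello, distributed web")
tampered.blocks[address] = b"something else entirely"   # simulate a bad copy

print(address)
print(fetch_from_peers(address, [empty, tampered, honest]))
```

Because the address is computed from the bytes, the same content gets the same address on every peer, and a tampered copy is detected without asking any central authority. Note that the content only stays available as long as at least one peer keeps a copy.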

One of the first potentially useful applications built on top of IPFS is Filecoin – essentially a series of cryptographic protocols that make it possible to pay ordinary people for storing your data remotely on their devices instead of in the usual cloud. While it generated significant commotion in the middle of 2017 with its record-breaking $257 million crowdsale, Filecoin focuses solely on the storage problem, almost brushing aside data delivery as a secondary matter.

Attempts are being made to close this gap. Based on the idea that distributed networks like IPFS are potentially much more helpful in actually delivering web content than in storing large private files, upstarts like Nexusless are developing full-fledged CDN solutions that would allow replacing traditional PoPs with consumer devices, essentially taking the idea of edge servers as far as physically possible.

The Future Is Exciting

A vision of a world where anyone can pledge the currently unused resources of their personal computer or even smartphone to handle content delivery for all kinds of web services – and get paid for that – might be an idealistic one. However, as digital pioneer Jaron Lanier puts it in his book, “Who Owns the Future?”, a more human-centric approach to networking is not impossible; rather, it is a question of network design. And new designs will surely keep popping up, testing novel solutions to the looming issues of scalability, centralization, and security.

It’s not yet clear who exactly owns the future of web content delivery – Akamai-type corporate giants or people at large – but it looks like it won’t be a boring ride.