What is the centralization that decentralized Web advocates are reacting
against? Clearly, it is the domination of the Web by the FANG companies
(Facebook, Amazon, Netflix, Google) and a few other large companies,
together with the cable oligopoly.

These companies came to dominate the Web for economic, not technological, reasons.

Yet the decentralized Web advocates persist in believing that the answer is new technologies, which suffer from the same economic problems as the existing decentralized technologies underlying the "centralized" Web we have. A decentralized technology infrastructure is necessary for a decentralized Web but it isn't sufficient. Absent an understanding of how the rest of the solution is going to work, designing the infrastructure is an academic exercise.

I agree with Herbert about the desirability of his vision, but I also agree that it is unlikely. Below the fold I summarize Herbert's vision, then go through a long explanation of why I think he's right about the low likelihood of its coming into existence.
Herbert identifies three classes of decentralized Web technology and explains that he decided not to deal with these two:

Distributed file systems. Herbert is right about this. Internet-scale distributed file systems were first prototyped in the late 90s with Intermemory and Oceanstore, and many successors have followed in their footsteps. None have achieved sustainability or Internet platform scale. The reasons are many; the economic one I wrote about in Is Distributed Storage Sustainable? Betteridge's Law applies, so the answer is "no".

Trying by technical means to remove the need to have viable economics and governance is doomed to fail in the medium- let alone the long-term. What is needed is a solution to the economic and governance problems. Then a technology can be designed to work in that framework.

In the case of blockchain protocols, the mathematical and economic reasoning behind the safety of the consensus often relies crucially on the uncoordinated choice model, or the assumption that the game consists of many small actors that make decisions independently.

Herbert's reason for disregarding distributed file systems and blockchains is that they both involve entirely new protocols. He favors the approach being pursued at MIT in Sir Tim Berners-Lee's Solid project, which builds on existing Web protocols. Herbert's long experience convinces him (and me) that this is a less risky approach. My reason is different; they both reduce to previously unsolved problems.

The basic idea of Solid is that each person would own a Web domain, the "host" part of a set of URLs that they control. These URLs would be served by a "pod", a Web server controlled by the user that implemented a whole set of Web API standards, including authentication and authorization. Browser-side apps would interact with these pods, allowing the user to:

Export a machine-readable profile describing the pod and its capabilities.

Write content for the pod.

Control others' access to the content of the pod.

Pods would have inboxes to receive notifications from other pods, so that, for example, if Alice writes a document and Bob writes a comment in his pod linking to that document in Alice's pod, a notification announcing the event appears in the inbox of Alice's pod. Alice can then link from the document in her pod to Bob's comment in his pod. In this way, users are in control of their content which, if access is allowed, can be used by Web apps elsewhere.
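As a rough sketch of how that notification flow could work over plain HTTP (the W3C Linked Data Notifications approach that Solid builds on works this way; the vocabulary terms and pod URLs below are illustrative assumptions, not normative Solid details), Bob's app would build a JSON-LD notification and POST it to the inbox Alice's pod advertises:

```python
import json

def build_notification(actor, comment_url, target_doc):
    """Build a JSON-LD notification announcing that `comment_url`
    (in Bob's pod) comments on `target_doc` (in Alice's pod).
    The ActivityStreams-style terms here are illustrative."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",
        "actor": actor,          # who is notifying
        "object": comment_url,   # Bob's comment, served from his pod
        "target": target_doc,    # Alice's document, served from her pod
    }

# Hypothetical pod URLs, for illustration only
note = build_notification(
    actor="https://bob.example/profile#me",
    comment_url="https://bob.example/comments/42",
    target_doc="https://alice.example/docs/paper",
)
body = json.dumps(note)

# Delivery would be an HTTP POST to Alice's advertised inbox URL,
# e.g. with the requests library (not executed here):
#   requests.post("https://alice.example/inbox/", data=body,
#                 headers={"Content-Type": "application/ld+json"})
print(body)
```

Alice's pod would store the notification as a new resource in her inbox, where her apps could read it and offer to link back to Bob's comment.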

In Herbert's vision, institutions would host their researchers' "research pods", which would be part of their personal domains but would have extensions specific to scholarly communication, such as automatic archiving upon publication.

Herbert demonstrates that the standards and technology needed to implement his pod-based vision for scholarly communication exist, if the implementation is currently a bit fragile. But he concludes by saying:

By understanding why it is not feasible we may get new insights into what is feasible.

I'll take up his challenge, but in regard to the decentralized Web that underlies and is in some respects a precondition for his vision. I hope in a future post to apply the arguments that follow to his scholarly communication vision in particular.

The long explanation for why I agree with Herbert that the Solid future "will most likely never exist" starts here. Note that much of what I link to from now on is a must-read, flagged (MR). Most of them are long and cover many issues that are less related to the reason I agree with Herbert than the parts I cite, but still relevant.

Stross is very interested in what it means that today's tech
billionaires are terrified of being slaughtered by psychotic runaway
AIs. Like Ted Chiang and me,
Stross thinks that corporations are "slow AIs" that show what happens
when we build "machines" designed to optimize for one kind of growth
above all moral or ethical considerations, and that these captains of
industry are projecting their fears of the businesses they nominally
command onto the computers around them.

Stross uses the Paperclip Maximizer thought experiment to discuss how the goal of these "slow AIs", which is to maximize profit growth, makes them a threat to humanity. The myth is that these genius tech billionaire CEOs are "in charge", decision makers. But in reality, their decisions are tightly constrained by the logic embedded in their profit growth maximizing "slow AIs".

Here's an example of a "slow AI" responding to its Prime Directive and constraining the "decision makers". Dave Farber's IP list discussed Hiroko Tabuchi's New York Times article How Climate Change Deniers Rise to the Top in Google Searches, which described how well-funded climate deniers were buying ads on Google that appeared at the top of search results for climate change. Chuck McManis (Chuck & I worked together at Sun Microsystems. He worked at Google then built Blekko, another search engine.) contributed a typically informative response. As previously, I have Chuck's permission to quote him extensively:

publications, as recently as the early 21st century,
had a very strict wall between editorial and advertising. It compromises
the integrity of journalism if the editorial staff can be driven by the
advertisers. And Google exploited that tension and turned it into a
business model.

How did they do that?

When people started using Google as an 'answer this question'
machine, and then Google created a mechanism to show your [paid] answer first,
the stage was set for what has become a gross perversion of 'reference'
information.

Why would they do that? Their margins were under pressure:

The average price per click (CPC) of advertisements on Google sites has
gone down for every year, and nearly every quarter, since 2009. At the
same time Microsoft's Bing search engine CPCs have gone up. As the
advantage of Google's search index is eroded by time and investment,
primarily by Microsoft, advertisers have been shifting budget to be more
of a blend between the two companies. The trend suggests that at some
point in the not too distant future advertising margins for both engines
will be equivalent.

And their other businesses weren't profitable:

Google has scrambled to find an adjacent market, one
that could not only generate enough revenue to pay for the
infrastructure but also to generate a net income. Youtube, its biggest
success outside of search, and the closest thing they have, has yet to
do that after literally a decade of investment and effort.

So what did they do?

As a result
Google has turned to the only tools it has that work, it has reduced
payments to its 'affiliate' sites (AdSense for content payments), then
boosted the number of ad 'slots' on Google sites, and finally paid
third parties to send search traffic preferentially to Google (this too
hurts Google's overall search margin).

And the effect on users is:

On the search page, Google's bread and butter so to speak,
for a 'highly contested' search (that is what search engine marketeers
call a search query that can generate lucrative ad clicks) such as 'best
credit card' or 'lowest home mortgage', there are many web browser
window configurations that show few, if any organic search engine
results at all!

In other words, for searches that are profitable, Google has moved all the results it thinks are relevant off the first page and replaced them with results that people have paid to put there. Which is pretty much the definition of "evil" in the famous "don't be evil" slogan notoriously dropped in 2015. I'm pretty sure that no-one at executive level in Google thought that building a paid-search engine was a good idea, but the internal logic of the "slow AI" they built forced them into doing just that.

You cannot fix Facebook without completely gutting its advertising-driven business model.

And because he is required by Wall Street to put his shareholders above all else, there’s no way in hell Zuckerberg will do that.

Put another way, Facebook has gotten too big to pivot to a new, more “sustainable” business model.
...
If you’ve read “Lost Context,”
you’ve already been exposed to my thinking on why the only way to “fix”
Facebook is to utterly rethink its advertising model. It’s this model
which has created nearly all the toxic externalities Zuckerberg is
worried about: It’s the honeypot which drives the economics of spambots
and fake news, it’s the at-scale algorithmic enabler which attracts
information warriors from competing nation states, and it’s the reason
the platform has become a dopamine-driven engagement trap where time is
often not well spent.

I have personal experience of this problem. In the late 80s I foresaw a bleak future for Sun Microsystems. Its profits were based on two key pieces of intellectual property, the SPARC architecture and the Solaris operating system. In each case they had a competitor (Intel and Microsoft) whose strategy was to make owning that kind of IP too expensive for Sun to compete. I came up with a strategy for Sun to undergo a radical transformation into something analogous to a combination of Canonical and an App Store. I spent years promoting and prototyping this idea within Sun.

One of the reasons I have great respect for Scott McNealy is that he gave me, an engineer talking about business, a very fair hearing before rejecting the idea, saying "It's too risky to do with a Fortune 100 company". Another way of saying this is "too big to pivot to a new, more “sustainable” business model". In the terms set by Sun's "slow AI" Scott was right and I was wrong. Sun was taken over by Oracle in 2009; their "slow AI" had no answer for the problems I identified two decades earlier. But in those two decades Sun made its shareholders unbelievable amounts of money.

with the Web technology available today, publishing can potentially
happen independently of publishers. If authors started depositing their
papers directly into a central repository, they could bypass publishers
and make them freely available.

He started the first commercial open-access publisher, BioMed Central, in 2000 (the Springer "slow AI" bought it in 2008). In 2002 came the Budapest Open Access Initiative:

By "open access" to this literature, we mean its free availability on
the public internet, permitting any users to read, download, copy,
distribute, print, search, or link to the full texts of these articles,
crawl them for indexing, pass them as data to software, or use them for
any other lawful purpose, without financial, legal, or technical
barriers other than those inseparable from gaining access to the
internet itself. The only constraint on reproduction and distribution,
and the only role for copyright in this domain, should be to give
authors control over the integrity of their work and the right to be
properly acknowledged and cited.

Sixteen years later, the "slow AIs" which dominate scholarly publishing have succeeded in growing profits so much that Roger Schonfeld can tweet:

I want to know how anyone can possibly suggest that Elsevier is an
enemy of open access. I doubt any company today profits more from OA and
its growth!

What Elsevier means by "open access" is a long, long way from the Budapest definition. The Open Access advocates, none of them business people, set goals which implied the demise of Elsevier and the other "slow AIs" without thinking through how the "slow AIs" would react to this existential threat. The result was that the "slow AIs" perverted the course of "open access" in ways that increased their extraction of monopoly rents, and provided them with even more resources to buy up nascent and established competitors.

Now the "slow AIs" dominate not just publishing, but the entire infrastructure of science. If I were Elsevier's "slow AI" I would immediately understand that Herbert's "research pods" needed to run on Elsevier's infrastructure. Given university IT departments' current mania for outsourcing everything to "the cloud" this would be trivial to arrange. They've already done it to institutional repositories. Elsevier would then be able to, for example, use a Microsoft-like "embrace, extend and extinguish" strategy to exploit its control over researchers' pods.

What people mean by saying "the Web is centralized" is that it is dominated by a small number of extremely powerful "slow AIs", the FAANGs (Facebook, Apple, Amazon, Netflix, Google) and the big telcos. None of the discussion of the decentralized Web I've seen is about how to displace them; it's all about building a mousetrap network infrastructure so much better along some favored axes that, magically, the world will beat a path to their door.

This is so not going to happen.

For example, you could build a decentralized, open source social network system. In fact, people did. It is called Diaspora and it launched in a blaze of geeky enthusiasm in 2011. Diaspora is one of the eight decentralization initiatives studied by the MIT Media Lab's Defending Internet Freedom through Decentralization (MR) report:

The alpha release of the Diaspora software was deeply problematic, riddled with basic security errors in the code. At the same time, the founders of the project received a lot of pressure from Silicon Valley venture capitalists to “pivot” the project to a more profitable business model. Eventually the core team fell apart and the Diaspora platform was handed over to the open source community, who has done a nice job of building out a support website to facilitate new users in signing up for the service. Today it supports just under 60,000 active participants, but the platform remains very niche and turnover of new users is high.

Facebook has 1.37×10⁹ daily users, so it is about 22,800 times bigger than Diaspora. Even assuming Diaspora was as good as Facebook, an impossible goal for a small group of Eben Moglen's students, no-one had any idea how to motivate the other 99.996% of Facebook users to abandon the network where all their friends were and restart building their social graph from scratch. The fact that after 6 years Diaspora has 60K active users is impressive for an open source project, but it is orders of magnitude away from the scale needed to be a threat to Facebook. We can see this because Facebook hasn't bothered to react to it.
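The scale gap is easy to check from the two user counts cited above (both approximate):

```python
# User counts as cited in the text; both are approximate.
facebook_dau = 1.37e9    # Facebook daily active users
diaspora_users = 60_000  # active Diaspora participants

ratio = facebook_dau / diaspora_users              # how many times bigger
missing_share = 1 - diaspora_users / facebook_dau  # users Diaspora never reached

print(round(ratio))            # ~22,833, i.e. "about 22,800 times bigger"
print(f"{missing_share:.3%}")  # ~99.996% of Facebook's users
```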

Suppose the team of students had been inspired, and built something so much better than Facebook along axes that the mass of Facebook users cared about (which don't include federation, censorship resistance, open source, etc.) that they started to migrate. Facebook's "slow AI" would have reacted in one of two ways. Either the team would have been made a financial offer they couldn't refuse, which wouldn't have made a dent in the almost $40B in cash and short-term investments on Facebook's balance sheet. Or Facebook would have tasked a few of their more than 1000 engineers to replicate the better system. They'd have had an easy job because (a) they'd be adding to an existing system rather than building from scratch, and (b) their system would be centralized, so it wouldn't have to deal with the additional costs of decentralization.

Almost certainly Facebook would have done both. Replicating an open source project in-house is very easy and very fast. Doing so would reduce the price they needed to pay to buy the startup. Hiring people good enough to build something better than the existing product is a big problem for the FAANGs. The easiest way to do it is to spot their startup early and buy it. The FAANGs have been doing this so effectively that it no longer makes sense to do a startup in the Valley with the goal of IPO-ing it; the goal is to get bought by a FAANG.

Let's see what happens when one of the FAANGs actually does see something as a threat. Last January Lina M. Khan of the Open Markets team at the New America Foundation published Amazon's Antitrust Paradox (MR) in the Yale Law Journal. Her 24,000-word piece got a lot of well-deserved attention for describing how platforms evade antitrust scrutiny. In August, Barry Lynn, Khan's boss, and the entire Open Markets team were ejected from the New America Foundation. Apparently, the reason was this press release commenting favorably on Google's €2.5 billion loss in an antitrust case in the EU. Lynn claims that:

hours after his press release went online, [New America CEO] Slaughter called him up and said: “I just got off the phone with Eric Schmidt and he is pulling all of his money,”

The FAANGs' "slow AIs" understand that antitrust is a serious threat. €2.5 billion checks get their attention, even if they are small compared to their cash hoards. The PR blowback from defenestrating the Open Markets team was a small price to pay for getting the message out that advocating for effective antitrust enforcement carried serious career risks.

This was a FAANG reacting to a law journal article and a press release. "All of his money" had averaged about $1M/yr over two decades. Imagine how FAANGs would react to losing significant numbers of users to a decentralized alternative!

the current framework in antitrust—specifically its pegging competition to “consumer welfare,” defined as
short-term price effects—is unequipped to capture the architecture of market
power in the modern economy. We cannot cognize the potential harms to
competition posed by Amazon’s dominance if we measure competition primarily
through price and output. Specifically, current doctrine underappreciates the
risk of predatory pricing and how integration across distinct business lines
may prove anticompetitive. These concerns are heightened in the context of
online platforms for two reasons. First, the economics of platform markets create
incentives for a company to pursue growth over profits, a strategy that
investors have rewarded. Under these conditions, predatory pricing becomes
highly rational—even as existing doctrine treats it as irrational and therefore
implausible. Second, because online platforms serve as critical intermediaries,
integrating across business lines positions these platforms to control the
essential infrastructure on which their rivals depend. This dual role also
enables a platform to exploit information collected on companies using its
services to undermine them as competitors.

In the 30s antitrust was aimed at preserving a healthy market by eliminating excessive concentration of market power. But:

Due to a change
in legal thinking and practice in the 1970s and 1980s, antitrust law now
assesses competition largely with an eye to the short-term interests of
consumers, not producers or the health of the market as a whole; antitrust doctrine
views low consumer prices, alone, to be evidence of sound competition. By this
measure, Amazon has excelled; it has evaded government scrutiny in part through
fervently devoting its business strategy and rhetoric to reducing prices for
consumers.


The focus on low prices for "consumers" rather than "customers" is especially relevant for Google and Facebook; it is impossible to get monetary prices lower than those they charge "consumers". The prices they charge the "customers" who buy ad space from them are another matter, but they don't appear to be a consideration for current antitrust law. Nor is the non-monetary price "consumers" pay for the services of Google and Facebook in terms of the loss of privacy, the spam, the fake news, the malvertising and the waste of time.

Perhaps the reason for Google's dramatic reaction to the Open Markets team was that they were part of a swelling chorus of calls for antitrust action against the FAANGs from both the right and the left. Roger McNamee (previously) was an early investor in Facebook and friend of Zuckerberg's, but in How to Fix Facebook — Before It Fixes Us (MR) even he voices deep concern about Facebook's effects on society. He and ethicist Tristan Harris provide an eight-point prescription for mitigating them:

Ban bots.

Block further acquisitions.

"be transparent about who is behind political and issues-based communication"

"be more transparent about their algorithms"

"have a more equitable contractual relationship with users"

Impose "a limit on the commercial exploitation of consumer data by internet platforms"

"consumers, not the platforms, should own their own data"

Why would the Facebook "slow AI" do any of these things when they're guaranteed to decrease its stock price? The eighth is straight out of Lina Khan:

we should consider that the time has come to revive the country’s traditional approach to monopoly. Since the Reagan era, antitrust law has operated under the principle that monopoly is not a problem so long as it doesn’t result in higher prices for consumers. Under that framework, Facebook and Google have been allowed to dominate several industries—not just search and social media but also email, video, photos, and digital ad sales, among others—increasing their monopolies by buying potential rivals like YouTube and Instagram. While superficially appealing, this approach ignores costs that don’t show up in a price tag. Addiction to Facebook, YouTube, and other platforms has a cost. Election manipulation has a cost. Reduced innovation and shrinkage of the entrepreneurial economy has a cost. All of these costs are evident today. We can quantify them well enough to appreciate that the costs to consumers of concentration on the internet are unacceptably high.

McNamee understands that the only way to get Facebook to change its ways is the force of antitrust law.

Ultimately, the goal of this project is to render platforms like Facebook and Twitter as merely “front-end” services that present a user’s data, rather than silos for millions of people’s personal data. To this end, Solid aims to support users in controlling their own personal online datastore, or “pod,” where their personal information resides. Applications would generally run on the client-side (browser or mobile phone) and access data in pods via APIs based on HTTP.

In other words, to implement McNamee's #7 prescription.

Why do you think McNamee's #8 talks about the need to "revive the country’s traditional approach to monopoly"? He understands that having people's personal data under their control, not Facebook's, would be viewed by Facebook's "slow AI" as an existential threat. Exclusive control over the biggest and best personal data of everyone on the planet, whether or not they have ever created an account, is the basis on which Facebook's valuation rests.

The approach of Solid towards promoting interoperability and platform-switching is admirable, but it begs the question: why would the incumbent “winners” of our current system, the Facebooks and Twitters of the world, ever opt to switch to this model of interacting with their users? Doing so threatens the business model of these companies, which rely on uniquely collecting and monetizing user data. As such, this open, interoperable model is unlikely to gain traction with already successful large platforms. While a site like Facebook might share content a user has created–especially if required to do so by legislation that mandates interoperability–it is harder to imagine them sharing data they have collected on a user, her tastes and online behaviors. Without this data, likely useful for ad targeting, the large platforms may be at an insurmountable advantage in the contemporary advertising ecosystem.

The report completely fails to understand the violence of the reaction Solid will face from the FAANGs' "slow AIs" if it ever gets big enough for them to notice.

Note that the report fails to understand that you don't have to be a Facebook user to have been extensively profiled. Facebook's "slow AI" is definitely not going to let go of the proprietary data it has collected (and in many cases paid other data sources for) about a person. Attempts to legislate this sharing in isolation would meet ferocious lobbying, and might well be unconstitutional. Nor is it clear that, even if legislation passed, the data would be in a form usable by the person, or by other services. History tends to show that attempts to force interoperability upon unwilling partners are easily sabotaged by them.

consumers, not the platforms, should own their own data. In the case of Facebook, this includes posts, friends, and events—in short, the entire social graph. Users created this data, so they should have the right to export it to other social networks. Given inertia and the convenience of Facebook, I wouldn’t expect this reform to trigger a mass flight of users. Instead, the likely outcome would be an explosion of innovation and entrepreneurship. Facebook is so powerful that most new entrants would avoid head-on competition in favor of creating sustainable differentiation. Start-ups and established players would build new products that incorporate people’s existing social graphs, forcing Facebook to compete again.

After all, allowing users to export their data from Facebook doesn't prevent Facebook maintaining a copy. And you don't need to be a Facebook user for them to make money from data they acquire about you. Note that, commendably, Google has for many years allowed users to download the data they create in the various Google systems (but not the data Google collects about them) via the Data Liberation Front, now Google TakeOut. It hasn't caused their users to leave.

No alternate social network can succeed without access to the data Facebook currently holds. Realistically, if this is to change, there will be some kind of negotiation. Facebook's going-in position will be "no access". Thus the going-in position for the other side needs to be something that Facebook's "slow AI" will think is much worse than sharing the data.

We may be starting to see what the something much worse might be. In contrast to the laissez-faire approach of US antitrust authorities, the EU has staked out a more aggressive position. It fined Google the €2.5 billion that got the Open Markets team fired. And, as Cory Doctorow reports (MR):

Back in 2016, the EU passed the General Data Protection Regulation, a far-reaching set of rules to protect the personal information and privacy of Europeans that takes effect this coming May.

Under the new directive, every time a European's personal data is
captured or shared, they have to give meaningful consent, after being
informed about the purpose of the use with enough clarity that they can
predict what will happen to it. Every time your data is shared with
someone, you should be given the name and contact details for an
"information controller" at that entity. That's the baseline: when a
company is collecting or sharing information about (or that could
reveal!) your "racial or ethnic origin, political opinions, religious or
philosophical beliefs, or trade union membership, … [and] data
concerning health or data concerning a natural person’s sex life or
sexual orientation," there's an even higher bar to hurdle.

Pagefair has a detailed explanation of what this granting of granular meaningful consent would have to look like. It is not a viable user interface to the current web advertising ecosystem of real-time auctions based on personal information.

All of these companies need to get consent; Pagefair gives an example of what obtaining consent from each of them would look like.

There is no obvious way the adtech industry in its current form can
comply with these rules, and in the nearly two years they've had to
adapt, they've done virtually nothing about it, seemingly betting that
the EU will just blink and back away, rather than exercise its new
statutory powers to hit companies for titanic fines, making high profile
examples out of a few sacrificial companies until the rest come into
line.

But this is the same institution that just hit Google with a $2.73 billion fine.
They're spoiling for this kind of fight, and I wouldn't bet on them
backing down. There's no consumer appetite for being spied on online ... and the companies
involved are either tech giants that everyone hates (Google, Facebook),
or creepy data-brokers no one's ever heard of and everyone hates on
principle (Acxiom). These companies have money, but not constituencies.

Meanwhile, publishers are generally at the mercy of the platforms, and I
assume most of them are just crossing their fingers and hoping the
platforms flick some kind of "comply with the rules without turning off
the money-spigot" switch this May.

Websites, apps, and adtech vendors should switch from using personal
data to monetize direct and RTB advertising to “non-personal data”.
Using non-personal, rather than personal, data neutralizes the risks of
the GDPR for advertisers, publishers, and adtech vendors. And it
enables them to address the majority (80%-97%) of the audience that will
not give consent for 3rd party tracking across the web.

The EU is saying "it is impractical to monetize personal information". Since Facebook's and Google's business models depend on monetizing personal information, this certainly looks like "something worse" than making it portable.

I remember at Esther Dyson's 2001 conference listening to the CEO of
American Express explain how they used sophisticated marketing
techniques to get almost all their customers to opt-in to information
sharing. If I were Facebook's or Google's "slow AI" I'd be wondering if I
could react to the GDPR by getting my users to opt-in to my data
collection, and structuring things so they wouldn't opt-in to everyone
else's. I would be able to use their personal information, but I
wouldn't be able to share it with anyone else. That is a problem for
everyone else, but for me it's a competitive advantage.

China, meanwhile, is tying its dominant platform directly to the state:

WeChat, the popular mobile application from Tencent Holdings, is set to
become more indispensable in the daily lives of many Chinese consumers
under a project that turns it into an official electronic personal
identification system.

The US has enabled personal information to be monetized, but seems to be facing a backlash from both right and left.

The EU seems determined to eliminate, or at least place strict limits on, monetizing of personal information.

Balkanization of the Web seems more likely than decentralization.

If a decentralized Web doesn't achieve mass participation, nothing has really changed. If it does, someone will have figured out how to leverage antitrust to enable it. And someone will have designed a technical infrastructure that fits with and builds on that discovery, not a technical infrastructure designed to scratch the itches of technologists.

From the comments:

"The company recognises that science publishing will become a service that scientists will largely run themselves. In a sense, it always has been with scientists producing the science, editing the journals, peer reviewing the studies, and then reading the journals. ... Elsevier have recognised the importance of this trend and are creating their own software platforms to speed up and make cheaper the process of publishing science.

But how, I wondered, can Elsevier continue to make such big profits from science publishing? Now, I think I see. The company thinks that there will be one company supplying publishing services to scientists—just as there is one Amazon, one Google, and one Facebook; and Elsevier aims to be that company. But how can it make big profits from providing a cheap service?

The answer lies in big data. ... Elsevier will come to know more about the world’s scientists—their needs, demands, aspirations, weaknesses, and buying patterns—than any other organisation. The profits will come from those data and that knowledge. The users of Facebook are both the customers and the product, and scientists will be both the customers and the product of Elsevier."

"In 2008 I was on the jury for the Elsevier Grand Challenge, a competition Elsevier ran with a generous prize for the best idea of what could be done with access to their entire database of articles. This got a remarkable amount of attention from some senior managers. Why did they sponsor the competition? They understood that, over time, their ability to charge simply for access to the raw text of scholarly articles will go away. Their idea was to evolve to charging for services based on their database instead."

I don't think decentralization or Balkanization are the only options. As a strong believer in and creator of decentralized technology, I still intend to maintain my Facebook account for the years to come - while simultaneously also using decentralized storage and applications. There is room for both options to exist in parallel, and they would just have different usages.

I see a strong potential for Solid-like solutions in several business sectors, such as the legal and medical domains, where special data requirements make the decoupling of data and apps a strong advantage.

We should stop seeing decentralized solutions as competitors to Facebook, but rather as useful platforms in their own right, which already have relevant use cases.

Hi David, it would be good to catch up sometime and have a chat about this. I'm now at AWS and though in some ways Cloud is centralizing, it's also decentralizing in other ways. A lot of companies are shutting down entire data centers and moving core backend systems to a more distributed cloud-based architecture.

- Architectural (de)centralization — how many physical computers is a system made up of? How many of those computers can it tolerate breaking down at any single time?

- Political (de)centralization — how many individuals or organizations ultimately control the computers that the system is made up of?

- Logical (de)centralization — does the interface and data structures that the system presents and maintains look more like a single monolithic object, or an amorphous swarm? One simple heuristic is: if you cut the system in half, including both providers and users, will both halves continue to fully operate as independent units?

As far as I can see, systems built on AWS may be architecturally decentralized (but are more likely just distributed), but are politically and logically centralized so, even if decentralization delivered its promised advantages, they would get few of them.

I believe that AWS is better at running data centers than most companies are. Whether they are enough better to outweigh the added risk that comes from correlations between the failure of my system and failures of other systems at AWS that my system depends upon (say my supply chain partners) is an interesting question.

As far as I can see, the biggest selling point of AWS is that it provides the in-house IT organization some place to point fingers when things go wrong. It's the modern equivalent of "no-one ever got fired for buying IBM".

Just to clarify (a minor point given everything that you've mentioned): the "Solid approach" doesn't mandate that we must all have our domains and run our personal online datastores (pod) there. It is certainly one of the ways of going at it. The bigger story there is that our assets are access controlled working alongside a global authentication mechanism. The bottom line there is about where one places their trust, eg. if a researcher trusts their institution to take care of their pod, that's all good - as exemplified in Herbert's talk.

To support your argument that the "big players" are not particularly concerned - at least in public at this time - about decentralised initiatives, we can simply look at their involvement in standardisation efforts. Namely: the W3C Social Web Working Group, which went on a long journey in coming up with recommendations towards interoperable protocols and data exchange on the Web, with emphasis on "social" stuff. The big players, while having W3C Member status, did not participate in this WG. I think that speaks volumes.

My concern is, as I briefly mentioned in the post, that research institutions are in a desperate rush to outsource everything to do with IT to "The Cloud". There are a number of reasons, including the fact that they can't hire enough skilled people, and that outsourcing means that the CIO has some place to point the finger when things go wrong. "The Cloud" generally means AWS, but in the case of researchers' "pods" it would definitely mean Elsevier (who would probably run on AWS).

So the "pods" would all end up controlled by Elsevier. Which in the big picture would not be much of a change, and would probably end up being even more expensive than the system we have now.

I agree with you on the likelihood of Elsevier (or some other company) seamlessly offering pod services to institutions. While there are alarm bells all around that for never ending vendor lock-in - eg "embrace, extend and extinguish" - I suppose it might settle down on which core features of the system are non-proprietary. For instance, could institutions or researchers decide to pack up their bags and go to the next "hosting" provider, ie. continuous and ubiquitous offering of the research objects that are part of the commons? Would the generated data be reusable by any application that conform to some open standards? If I was shopping around for a service or a tool, I'd check to see if it passes an acid test along the lines of https://linkedresearch.org/rfc#acid-test. From that angle, the possibility of "controlled by Elsevier" could mean anything. Is the web hosting provider for my domain controlling my pod? They may be mining the data (including all the interactions that go with it) but I suppose I can't be certain. To escape that, I'll have to run my own hosting box.

I assume that most people agree that the major academic publishers will continue to milk the system. In the meantime, the best I think we (researchers and institutions) can do is embrace the feasibility of this let's-get-all-decentralised bandwagon, because the current path is not quite working in our favour. If we are lucky, the effect of this may be that the playing field will even out for new service/tool providers to participate, and maybe bring diversity in how one can participate - wishful thinking?

Just to contrast: ORCID is a great service for many, but what's the end result or some of the consequences of such effort? New scholarly systems are being built that solely recognise identifiers that virtually include ORCID's domain name. One is literally forbidden to participate / make their contributions to humanity - and be acknowledged by it - unless they create an ORCID account. This is not ORCID's fault but precisely what happens when we go all in on centralised systems, no matter how shiny or useful it may seem at first. Its design has consequences. DOI is precisely the same story. Your "scholarly" article doesn't have a DOI? It is considered to be equivalent to a random blogpost on the Web regardless of what it says, or dare I say, a "preprint". Again, this is not because ORCID or DOI are bad or wrong designs. The bigger question in my opinion is can we have different systems cooperate and be welcome to participate in the big Scholarly Bazaar?

For our little corner in Web Science - but not exclusive to it - I've tried to capture some of our problems and steps towards a paradigm shift, which you might be interested in having a look at: http://csarven.ca/web-science-from-404-to-200. Same old, same old :)

A little correction regarding the section about Solid in this insightful article:

A WebID (an HTTP URI that identifies an Agent) and its associated WebID-Profile doc (a collection of RDF sentences using various notations) only require users to possess Read-Write privileges over a Folder (a/k/a an LDPContainer instance these days). Fundamentally, the domain ownership requirement is one of the problems with current storage models (e.g., systems where identification depends on the ".well-known" pattern) with regard to a viable Read-Write Web.
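For readers who haven't met the moving parts: a WebID is just an HTTP URI identifying an agent, and dereferencing it yields a profile document of RDF statements (subject-predicate-object triples) about that agent. A minimal sketch of the idea, with toy data - the URI, names, and tiny in-memory triple store below are invented for illustration, not Solid's actual API:

```python
# A hypothetical WebID: an HTTP URI identifying an agent.
WEBID = "https://alice.example/profile#me"

# A toy "profile document": a set of (subject, predicate, object) triples,
# the shape RDF statements take regardless of notation (Turtle, JSON-LD, ...).
# The predicates are written as compact prefixed names for readability.
profile_triples = {
    (WEBID, "rdf:type", "foaf:Person"),
    (WEBID, "foaf:name", "Alice"),
    # Points at the agent's personal online datastore (pod).
    (WEBID, "space:storage", "https://alice.example/data/"),
}

def value(triples, subject, predicate):
    """Return the object of the first triple matching (subject, predicate)."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None

print(value(profile_triples, WEBID, "foaf:name"))       # prints "Alice"
print(value(profile_triples, WEBID, "space:storage"))   # the pod's location
```

The point of the design is that the profile and the pod are looked up from the WebID at runtime, so an application only needs the URI plus Read-Write access rights - it doesn't need to know in advance who hosts the data.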

Hi David - fantastic article (I've an even bigger MR backlog now!). But I didn't understand your response to Ruben. Surely both centralised and decentralised systems have their place (e.g. Facebook 'cos all my mates are there already, or Facebook-decentralised 'cos they pay me per view, or allow non-Facebookers to comment on or annotate my posts, or whatever). No one is suggesting *everything* must be decentralised for us to start 'decentralising the web', incrementally, as deemed appropriate by individuals, are they?

PMcB, what people mean by "the Web is centralized" is that it is dominated by the FAANGs. If you want to change that, you have to displace some or all of the FAANGs. Otherwise the Web will still be centralized, i.e. dominated by the FAANGs. It's fine to build something, e.g. Diaspora, that attracts say 60K users. But that hasn't changed the big picture at all.

What I'm trying to do is to get people to think about how to displace the FAANGs instead of thinking about the neat-oh technology they think it'd be cool to build for themselves and a few friends. Decentralizing the Web: It Isn't About The Technology.

David - hhhmmm... Ok, I think I see where you're coming from now, and it's a good point. But for whatever reason, one word popped into my head after reading your response - Tesla. Maybe it'll take a social-media breach on the scale of Equifax to wake up the general populace to the consequences of Facebook et al, but regardless, like autonomous or electric-only cars, I think (purely personal opinion) that a large proportion of the Web will become decentralised (how 'large' is just a reflection of how naive you are I guess!), but my conviction is based on it simply being the more 'correct' thing to do - like electric cars (full disclosure: I'm a cyclist, and therefore inherently dislike all cars!). I know a counter-argument is 'but where's the money in decentralisation', but wasn't that exactly the argument 10 years ago with electric cars too (which seems so utterly myopic and 'missing the whole point' today)??

Tesla is an amazing achievement, but it's still a very small car company, a little more than 1/3 the size of Porsche. But the key point is that, right from the start with the Roadster, Elon Musk had an explanation for why people would buy the cars that wasn't simply that they were electric. They were fast and fun to drive.

"being the more 'correct' thing to do" is not an explanation for why people will migrate off the FAANGs.

"For the most part, Facebook and Google prevent you from using their products if you decline to agree to their entire terms of service. You cannot pick and choose what to agree to and still use their free services.

The GDPR changes that by requiring online companies, in some cases, to get permission from each individual user to collect, share and combine their data for use by advertisers. Companies will not be allowed to ban people from their services who decline to share their data for advertising purposes. There are 734 million EU residents who will soon be able to opt out of helping Facebook and Google make money. If companies do not comply with the new regulations they will face fines totaling four percent of their global revenues."

"These 9- and 10-figure invoices have made deathbed converts out of Big Tech, who are now in a mad scramble to comply with the GDPR before its fines of the greater of 4% of global total turnover or 20 million Euros kick in this May."

Adam Ludwin's A Letter To Jamie Dimon is worth a read. Even though I don't agree with all of it, he makes some very good points, such as:

"Since Ethereum is a platform, its value is ultimately a function of the value of the applications built on top. In other words, we can ask if Ethereum is useful by simply asking if anything that has been built on Ethereum is useful. For example, do we need censorship resistant prediction markets? Censorship resistant meme playing cards? Censorship resistant versions of YouTube or Twitter?

While it’s early, if none of the 730+ decentralized apps built on Ethereum so far seem useful, that may be telling. Even in year 1 of the web we had chat rooms, email, cat photos, and sports scores. What are the equivalent killer applications on Ethereum today?"

"Mining and oil companies exploit the physical environment; social media companies exploit the social environment. This is particularly nefarious because social media companies influence how people think and behave without them even being aware of it. This has far-reaching adverse consequences on the functioning of democracy, particularly on the integrity of elections.

The distinguishing feature of internet platform companies is that they are networks and they enjoy rising marginal returns; that accounts for their phenomenal growth. The network effect is truly unprecedented and transformative, but it is also unsustainable. It took Facebook eight and a half years to reach a billion users and half that time to reach the second billion. At this rate, Facebook will run out of people to convert in less than 3 years."

"The exceptional profitability of these companies is largely a function of their avoiding responsibility for– and avoiding paying for– the content on their platforms.

They claim they are merely distributing information. But the fact that they are near- monopoly distributors makes them public utilities and should subject them to more stringent regulations, aimed at preserving competition, innovation, and fair and open universal access."

"Why should we break up big tech? Not because the Four are evil and we’re good. It’s because we understand that the only way to ensure competition is to sometimes cut the tops off trees, just as we did with railroads and Ma Bell. This isn’t an indictment of the Four, or retribution, but recognition that a key part of a healthy economic cycle is pruning firms when they become invasive, cause premature death, and won’t let other firms emerge. The breakup of big tech should and will happen, because we’re capitalists."

"a court had found Facebook’s use of personal data to be illegal because the U.S. social media platform did not adequately secure the informed consent of its users.

The verdict, from a Berlin regional court, comes as Big Tech faces increasing scrutiny in Germany over its handling of sensitive personal data that enables it to micro-target online advertising.

The Federation of German Consumer Organisations (vzbv) said that Facebook's default settings and some of its terms of service were in breach of consumer law, and that the court had found parts of the consent to data usage to be invalid."

"The question of whether decentralized or centralized systems will win the next era of the internet reduces to who will build the most compelling products, which in turn reduces to who will get more high quality developers and entrepreneurs on their side."

The first part of the sentence is right, the second part is techno-optimism like most of the rest of the essay.

"Odds are high you'd see them emerge first in criminal enterprises, as ways of setting up entities that engage in nefarious activities but cannot be meaningfully punished (in human terms, anyway), even if they're caught, he argues. Given their corporate personhood in the US, they'd enjoy the rights to own property, to enter into contracts, to legal counsel, to free speech, and to buy politicians -- so they could wreak a lot of havoc."

Not that the current "slow AIs" can be "meaningfully punished" if they engage in "nefarious activities".

"But the decision in a case currently before the Supreme Court could block off that path, by effectively shielding big tech platforms from serious antitrust scrutiny. On Monday the Court heard Ohio v. American Express, a case centering on a technical but critical question about how to analyze harmful conduct by firms that serve multiple groups of users. Though the case concerns the credit card industry, it could have sweeping ramifications for the way in which antitrust law gets applied generally, especially with regards to the tech giants."

"Amazon is far from invulnerable. All the same old red flags are there—a puny 2.7 percent e-commerce profit in North America, massive outlays to establish delivery routes abroad—but few are paying attention. Anyone buying a share of Amazon stock today is agreeing to pay upfront for the next 180 years of profit. By one measure, it’s generating far less cash than investors believe. And its biggest risk may be the fear of its power in Washington, New York, and Brussels, a possible prelude to regulatory crackdown." [my emphasis]

"An index of 10 tech growth shares pushed its advance to 23 percent so far this year, giving the group an annualized return since early 2016 of 67 percent. That frenzied pace tops the Nasdaq Composite Index’s 66 percent return in the final two years of the dot-com bubble." writes Lu Wang at Bloomberg:

"In addition to the quartet of Facebook, Amazon, Netflix and Google, the NYSE index also includes Apple, Twitter, Alibaba, Baidu, Nvidia and Tesla. These companies have drawn money as investors bet that their dominance in areas from social media to e-commerce will foster faster growth. ... At 64 times earnings, the companies in the NYSE FANG+ Index are valued at a multiple that’s almost three times the broader gauge’s. That compared with 2.7 in March 2000."

"In 2007, the Guardian's Victor Keegan published "Will MySpace ever lose its monopoly?" in which he enumerated the unbridgeable moats and unscalable walls that "Rupert Murdoch's Myspace" had erected around itself, evaluating all the contenders to replace Myspace and finding them wanting."

I could be wrong! But that was then and this is now. Facebook is far bigger and more embedded than MySpace ever was.

In The Real Villain Behind Our New Gilded Age Eric Posner and Glen Weyl reinforce the meme that the absence of effective antitrust policy is the cause of many of society's ills, including inequality, slow economic growth and the FAANGs:

"the rise of the internet has bestowed enormous market power on the tech titans — Google, Facebook — that rely on networks to connect users. Yet again, antitrust enforcers have not stopped these new robber barons from buying up their nascent competitors. Facebook swallowed Instagram and WhatsApp; Google swallowed DoubleClick and Waze. This has allowed these firms to achieve near-monopolies over new services based on user data, such as training machine learning and artificial intelligence systems. As a result, antitrust authorities allowed the creation of the world’s most powerful oligopoly and the rampant exploitation of user data."

"It's not that a blockchain-based web isn't possible. After all, the original web was decentralised, too, and came with the privacy guarantees that blockchain-based options today purport to deliver. No, the problem is people.

As user interface designer Brennan Novak details, though the blockchain may solve the crypto crowd's privacy goals, it fails to offer something as secure and easy as a (yes) Facebook or Google login: 'The problem exists somewhere between the barrier to entry (user-interface design, technical difficulty to set up, and overall user experience) versus the perceived value of the tool, as seen by Joe Public and Joe Amateur Techie.'"

" Economists Erik Brynjolfsson, Felix Eggers and Avinash Gannamaneni have published an NBER paper (Sci-Hub mirror) detailing an experiment where they offered Americans varying sums to give up Facebook, and then used a less-rigorous means to estimate much much Americans valued other kinds of online services: maps, webmail, search, etc.

They concluded that 20% of Facebook's users value the service at less than a dollar a month, and at $38/month, half of Facebook's users would quit.

Search is the most valued online service -- the typical American won't give up search for less than $17,500/year -- while social media is the least valuable ($300)."

"In 1993, John Gilmore famously said that “The Internet interprets censorship as damage and routes around it.” That was technically true when he said it but only because the routing structure of the Internet was so distributed. As centralization increases, the Internet loses that robustness, and censorship by governments and companies becomes easier." from Censorship in the Age of Large Cloud Providers by Bruce Schneier.

I believe that current efforts to decentralize the Web won't be successful for economic reasons. Bruce adds another reason, because governments everywhere prefer that it be centralized. But that doesn't mean either of us think decentralizing the Web isn't important.

Felix Salmon's The False Tale of Amazon's Industry-Conquering Juggernaut makes the case that Amazon hasn't been as successful at disrupting established industries (except book publishing) as the stock market thinks. But that doesn't mean startups shouldn't be very afraid; because Amazon owns their computing infrastructure it has good visibility into which startups are worth buying or nuking.

"Up until just last year, it was impossible to add DuckDuckGo to Chrome on Android, and it is still impossible on Chrome on iOS. We are also not included in the default list of search options like we are in Safari, even though we are among the top search engines in many countries. The Google search widget is featured prominently on most Android builds and is impossible to change the search provider. For a long time it was also impossible to even remove this widget without installing a launcher that effectively changed the whole way the OS works. Their anti-competitive search behavior isn't limited to Android. Every time we update our Chrome browser extension, all of our users are faced with an official-looking dialogue asking them if they'd like to revert their search settings and disable the entire extension. Google also owns http://duck.com and points it directly at Google search, which consistently confuses DuckDuckGo users."

Senator Elizabeth Warren's Accountable Capitalism Act is a thoughtful attempt to re-program US companies' "Slow AIs". It would force companies with more than $1B in revenue to obtain a Federal rather than a state charter, eliminating the race to the bottom among states to charter companies.

40% of these companies' boards would have to be elected by workers. Directors and officers could not sell shares within 5 years of acquiring them, nor within 3 years of any stock buyback. Any political expenditures would require 75% votes of directors and shareholders. And, most importantly, it would permit the Federal government to revoke a company's charter for "a history of egregious and repeated illegal conduct and has failed to take meaningful steps to address its problems" such as those, for example, at Wells Fargo. States (cough, cough, Delaware) would never revoke a large company's charter for fear of sparking an exodus.

"Google is directly profiting by letting ad fraud run rampant at the expense of the companies who buy or sell ads on its platform.

However, Warner is just as mad about the FTC as he is about Google, claiming the FTC has failed to take action against the Mountain View-based company for more than two years since he and New York Democrat Senator Chuck Schumer first wrote the agency about Google's ad fraud problem."

" While Facebook's decision to smear critics instead of owning their own obvious dysfunction is clearly idiotic, much of the backlash has operated under the odd belief that Facebook's behavior is some kind of exception, not the norm. Countless companies employ think tanks, consultants, bogus news ops, PR firms, academics, and countless other organizations to spread falsehoods, pollute the public discourse, and smear their critics on a daily basis. It's a massive industry. Just ask the telecom sector.

"has been developing a theory of “information fiduciaries” for the past five years or so. The theory is motivated by the observation that ordinary people are enormously vulnerable to and dependent on the leading online platforms—Facebook, Google, Twitter, Uber, and the like. To mitigate this vulnerability and ensure these companies do not betray the trust people place in them, Balkin urges that we draw on principles of fiduciary obligation."

"disrupt the emerging consensus by identifying a number of lurking tensions and ambiguities in the theory of information fiduciaries, as well as a number of reasons to doubt the theory’s capacity to resolve them satisfactorily. Although we agree with Balkin that the harms stemming from dominant online platforms call for legal intervention, we question whether the concept of information fiduciaries is an adequate or apt response to the problems of information insecurity that he stresses, much less to more fundamental problems associated with outsized market share and business models built on pervasive surveillance."