
Since Edward Snowden stepped into the limelight from a hotel room in Hong Kong three years ago, use of the Tor anonymity network has grown massively. Journalists and activists have embraced the anonymity the network provides as a way to evade the mass surveillance under which we all now live, while citizens in countries with restrictive Internet censorship, like Turkey or Saudi Arabia, have turned to Tor in order to circumvent national firewalls. Law enforcement has been less enthusiastic, worrying that online anonymity also enables criminal activity.

Who can forget the now-famous "Tor stinks" slide that was part of the Snowden trove of leaked docs.

Cracks are beginning to show; a 2013 analysis by researchers at the US Naval Research Laboratory (NRL), who helped develop Tor in the first place, concluded that "80 percent of all types of users may be de-anonymised by a relatively moderate Tor-relay adversary within six months."

Despite this conclusion, the lead author of that research, Aaron Johnson of the NRL, tells Ars he would not describe Tor as broken—the issue is rather that it was never designed to be secure against the world’s most powerful adversaries in the first place.

"It may be that people's threat models have changed, and it's no longer appropriate for what they might have used it for years ago," he explains. "Tor hasn't changed, it's the world that's changed."

New threats

Tor's weakness to traffic analysis attacks is well-known. The original design documents highlight the system's vulnerability to a "global passive adversary" that can see all the traffic both entering and leaving the Tor network. Such an adversary could correlate that traffic and de-anonymise every user.

But as the Tor project's cofounder Nick Mathewson explains, the problem of "Tor-relay adversaries" running poisoned nodes means that a theoretical adversary of this kind is not the network's greatest threat.

"No adversary is truly global, but no adversary needs to be truly global," he says. "Eavesdropping on the entire Internet is a several-billion-dollar problem. Running a few computers to eavesdrop on a lot of traffic, a selective denial of service attack to drive traffic to your computers, that's like a tens-of-thousands-of-dollars problem."

At the most basic level, an attacker who runs two poisoned Tor nodes—one entry, one exit—is able to analyse traffic and thereby identify the tiny, unlucky percentage of users whose circuit happened to cross both of those nodes. At present the Tor network offers, out of a total of around 7,000 relays, around 2,000 guard (entry) nodes and around 1,000 exit nodes. So the odds of such an event happening are one in two million (1/2000 x 1/1000), give or take.
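That back-of-the-envelope figure is easy to check, assuming (as the article does) that guard and exit selection are roughly uniform and independent:

```python
# Probability that a single Tor circuit happens to use both a specific
# poisoned guard and a specific poisoned exit, assuming uniform and
# independent selection over the article's rough figures of ~2,000
# guards and ~1,000 exits.
guards, exits = 2000, 1000

p_circuit = (1 / guards) * (1 / exits)
print(p_circuit)  # 5e-07, i.e. one in two million
```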


But, as Bryan Ford, professor at the Swiss Federal Institute of Technology in Lausanne (EPFL), who leads the Decentralised/Distributed Systems (DeDiS) Lab, explains: "If the attacker can add enough entry and exit relays to represent, say, 10 percent of Tor's total entry-relay and exit-relay bandwidth respectively, then suddenly the attacker is able to de-anonymise about one percent of all Tor circuits via this kind of traffic analysis (10 percent x 10 percent)."

"Given that normal Web-browsing activity tends to open many Tor circuits concurrently (to different remote websites and HTTP servers) and over time (as you browse many different sites)," he adds, "this means that if you do any significant amount of Web browsing activity over Tor, and eventually open hundreds of different circuits over time, you can be virtually certain that such a poisoned-relay attacker will trivially be able to de-anonymise at least one of your Tor circuits."

For a dissident or journalist worried about a visit from the secret police, de-anonymisation could mean arrest, torture, or death.

As a result, these known weaknesses have prompted academic research into how Tor could be strengthened or even replaced by some new anonymity system. The priority for most researchers has been to find better ways to prevent traffic analysis. While a new anonymity system might be equally vulnerable to adversaries running poisoned nodes, better defences against traffic analysis would make those compromised relays much less useful and significantly raise the cost of de-anonymising users.

The biggest hurdle? Despite the caveats mentioned here, Tor remains one of the better solutions for online anonymity, supported and maintained by a strong community of developers and volunteers. Deploying and scaling something better than Tor in a real-world, non-academic environment is no small feat.

What Tor does really well

Tor was designed as a general-purpose anonymity network optimised for low-latency, TCP-only traffic. Web browsing was, and remains, the most important use case, as evidenced by the popularity of the Tor Browser Bundle. This popularity has created a large anonymity set in which to hide—the more people who use Tor, the more difficult it is to passively identify any particular user.

But that design comes at a cost. Web browsing requires low enough latency to be usable: the longer a webpage takes to load, the fewer the users who will tolerate the delay. To keep Web browsing fast enough, Tor sacrifices some anonymity for usability, forgoing defences such as cover traffic. Better to offer strong anonymity that many people will use than perfect anonymity that's too slow for most people's purposes, Tor's designers reasoned.

"There are plenty of places where if you're willing to trade off for more anonymity with higher latency and bandwidth you'd wind up with different designs," Mathewson says. "Something in that space is pretty promising. The biggest open question in that space is, 'what is the sweet spot?'

"Is chat still acceptable when we get into 20 seconds of delay?" he asks. "Is e-mail acceptable with a five-minute delay? How many users are willing to use that kind of a system?"

Mathewson says he's excited by some of the anonymity systems emerging today but cautions that they are all still at the academic research phase and not yet ready for end users to download and use.

Ford agrees: "The problem is taking the next big step beyond Tor. We've gotten to the point where we know significantly more secure is possible, but there's still a lot of development work to make it really usable."


There's another issue here. Tor didn't just take ten years to go from conception to something usable by normal human beings; it took that long to become well funded and to work through a *lot* of security issues that were discovered along the way (and continue to be uncovered by the world's security research community). Those came from scaling, new releases, all kinds of things.

And that's normal and desirable -- it means Tor was attracting more good people who were willing to hammer the hell out of the software for free and give patches and clue to the nonprofit -- and those researchers' normal hourly rate would be astronomical if they were consultants. That's the miracle of FOSS, loaves and fishes, in the security context in particular, right? Something the MSM and business press tend to gloss over.

No academic/research model of security software developed on a drawing board survives implementation and scale as a secure system. They all iterate, but some will end up collapsing.

Some of these contributions will end up as great contributions to the opsec of those who require privacy. Some of them may end up as discarded needles in a Haystack, which we hope we'll never end up finding again, once they are tested against actual threat environments. This isn't a diss against the organizations or programmers, it's just unknowable until the rubber hits the road.

I wish them all the best -- this is an environment where we need more choices, but less attitude. What we really need is better funding overall, more awareness, and more maturity. With more resources and discretion, we can do better for the entire world, in such an important field.

In my followup reading I did find something not mentioned in the article which helped me better understand Dissent and related protocols: namely, one key way they differ in network topology from a naive DC-net protocol. In a naive DC-net with n users, every user must set up and maintain a secret with all of the other n-1 users. That means the network topology of these secrets is a complete graph, and the total number of secrets that must be maintained equals the number of edges in the graph, n(n-1)/2, so the overall complexity is O(n^2). Every device needs to talk to every other device at least once, and adding or removing devices is not trivial. If setting up the network in the first place isn't scalable, it doesn't matter whether broadcasting is efficient.
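A toy illustration of the naive scheme described above (a sketch, not any production protocol): each pair of participants shares a random pad, every participant publishes the XOR of all its pads, and the sender additionally XORs in the message. XORing everyone's published output cancels every pad (each appears exactly twice) and recovers the message without revealing who sent it:

```python
import secrets

def naive_dcnet_round(n, sender, message):
    """One round of a toy n-party DC-net over fixed-size messages.

    Every pair (i, j) shares a random pad -- n(n-1)/2 secrets in
    total, which is the O(n^2) setup cost described above.
    """
    size = len(message)
    pads = {(i, j): secrets.token_bytes(size)
            for i in range(n) for j in range(i + 1, n)}

    outputs = []
    for i in range(n):
        out = bytearray(size)
        for (a, b), pad in pads.items():
            if i in (a, b):  # XOR in every pad participant i shares
                out = bytearray(x ^ y for x, y in zip(out, pad))
        if i == sender:      # the sender also XORs in the message
            out = bytearray(x ^ y for x, y in zip(out, message))
        outputs.append(bytes(out))

    # Each pad appears in exactly two outputs, so XORing all outputs
    # cancels every pad and leaves only the message.
    result = bytearray(size)
    for out in outputs:
        result = bytearray(x ^ y for x, y in zip(result, out))
    return bytes(result)

msg = b"hello from an anonymous sender!"
assert naive_dcnet_round(5, sender=2, message=msg) == msg
```

Note that the pads dictionary is exactly the complete graph of secrets: for n = 5 it already holds 10 entries, and it grows quadratically from there.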

What the DC-net-inspired protocols discussed here do is implement a server/client topology. (Dissent calls this an "anytrust architecture.") Here there are no shared secrets between clients; instead, each client has a shared secret with m << n servers, vastly reducing the effort required to initialize and maintain secrets. Moreover, this setup can be shown to maintain anonymity between any two clients as long as at least one server with which they each maintain a secret is honest. So the complete-graph topology maximizes anonymity, but the server/client model is nearly as good unless almost all servers are malicious.
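The difference in setup cost is easy to quantify. With purely illustrative numbers (10,000 clients and 5 anytrust servers are assumptions for the sake of the arithmetic, not figures from the article):

```python
n = 10_000  # clients (illustrative assumption)
m = 5       # anytrust servers (illustrative assumption)

# Naive DC-net: one secret per pair of clients, O(n^2).
pairwise = n * (n - 1) // 2

# Anytrust: one secret per client-server pair, O(n*m).
anytrust = n * m

print(pairwise)  # 49995000 secrets
print(anytrust)  # 50000 secrets
```

Three orders of magnitude fewer secrets to set up and maintain, at the cost of trusting that at least one server stays honest.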

Compare that to Tor, where, as discussed in the article, a few malicious relays means at least some users are always being deanonymized (even if you can't choose which ones), while many malicious relays means this applies to a significant fraction of users.

In summary, it's not enough to have sufficiently scalable broadcast bandwidth and latency; the network topology must also allow for efficient initialization. In this regard the client/server modification of DC-nets is a massive improvement over global user-to-user secrets.

Edit: Removed a claim about how servers are added or removed that I'm not actually sure is true.

I'd be interested in statistics on the type of usage for Tor. Sure, political dissent and simple desire for anonymity is one, but how much traffic on Tor is related to illegal activities?

How much democratic freedom are you willing to risk/give up at the cost of blocking, if not preventing, illegal or quasi-legal activities? The ethical conundrum heads down a very deep rabbit hole very quickly.

It does, it just doesn't automatically kick me there when I log in like it does on the US site. And since the US site does, I rarely think to change it over myself.

The typical sequence looks like this:

1. Visit and log in to Ars US -> auto redir to HTTPS.
2. Click an Ars UK link -> Ars US login not recognized -> kicked down to HTTP.
3. Log in to Ars UK -> No redir to HTTPS.
4. Forget to manually change over to HTTPS.

Benefits of Tor over I2P

- Much bigger user base; much more visibility in the academic and hacker communities; benefits from formal studies of anonymity, resistance, and performance; has a non-anonymous, visible, university-based leader
- Has already solved some scaling issues I2P has yet to address
- Has significant funding
- Has more developers, including several that are funded
- More resistant to state-level blocking due to TLS transport layer and bridges (I2P has proposals for "full restricted routes" but these are not yet implemented)
- Big enough that it has had to adapt to blocking and DOS attempts
- Designed and optimized for exit traffic, with a large number of exit nodes
- Better documentation, has formal papers and specifications, better website, many more translations
- More efficient with memory usage
- Tor client nodes have very low bandwidth overhead
- Centralized control reduces the complexity at each node and can efficiently address Sybil attacks
- A core of high capacity nodes provides higher throughput and lower latency
- C, not Java (ewww)

Benefits of I2P over Tor

- Designed and optimized for hidden services, which are much faster than in Tor
- Fully distributed and self-organizing
- Peers are selected by continuously profiling and ranking performance, rather than trusting claimed capacity
- Floodfill peers ("directory servers") are varying and untrusted, rather than hardcoded
- Small enough that it hasn't been blocked or DOSed much, or at all
- Peer-to-peer friendly
- Packet switched instead of circuit switched:
  - implicit transparent load balancing of messages across multiple peers, rather than a single path
  - resilience vs. failures by running multiple tunnels in parallel, plus rotating tunnels
  - scale each client's connections at O(1) instead of O(N) (Alice has e.g. 2 inbound tunnels that are used by all of the peers Alice is talking with, rather than a circuit for each)
- Unidirectional tunnels instead of bidirectional circuits, doubling the number of nodes a peer has to compromise to get the same information
- Protection against detecting client activity, even when an attacker is participating in the tunnel, as tunnels are used for more than simply passing end-to-end messages (e.g. netDb, tunnel management, tunnel testing)
- Tunnels in I2P are short-lived, decreasing the number of samples that an attacker can use to mount an active attack with, unlike circuits in Tor, which are typically long-lived
- I2P APIs are designed specifically for anonymity and security, while SOCKS is designed for functionality
- Essentially all peers participate in routing for others
- The bandwidth overhead of being a full peer is low; in Tor, client nodes don't require much bandwidth, but they don't fully participate in the mixnet
- Integrated automatic update mechanism
- Both TCP and UDP transports
- Java, not C (ewww)

Other potential benefits of I2P but not yet implemented

...and may never be implemented, so don't count on them!

- Defense vs. message count analysis by garlic wrapping multiple messages
- Defense vs. long-term intersection by adding delays at various hops (where the delays are not discernible by other hops)
- Various mixing strategies at the tunnel level (e.g. create a tunnel that will handle 500 messages/minute, where the endpoint will inject dummy messages if there are insufficient messages, etc.)

I'd be interested in statistics on the type of usage for Tor. Sure, political dissent and simple desire for anonymity is one, but how much traffic on Tor is related to illegal activities?

How much democratic freedom are you willing to risk/give up at the cost of blocking, if not preventing, illegal or quasi-legal activities? The ethical conundrum heads down a very deep rabbit hole very quickly.

I can't really say without knowing what the statistics are. If 99% of Tor traffic is used for drug trade, human smuggling, and kiddie porn I'm not sure it would be worth it.

I pretty much assume that of any and all projects like this that come from organizations that use public/government funding for anything. Either they threaten them with funding cuts or bribe them with a huge funding increase. You really can't totally trust any of this stuff to be 100% secure anymore.

That number of Tor nodes is surprisingly small. 7,000 relays, 2,000 entry and 1,000 exit points? I bet the NSA can spawn 10k servers on a whim.

Also, I think it's time to help. Does anybody know of an easy way to run a cheap DigitalOcean or Scaleway Tor node? Something I can start and forget, maybe check system updates once in a while, and just let it contribute to the scale of the network?

Ars is an interesting bunch. They ... are asking the government to do the same job they did in the past EXCEPT be completely anemic in the way they surveil people. Back in the days of phone communication you could 100% effectively monitor phones if you had a warrant.

The fuck...? Back in the days of landlines, you could not tell who had a conversation with whom at the diner, nor what they talked about. Now their cell phones can be tracked, and even their microphones turned on remotely, so LEO HAS VASTLY GREATER POWER TO SURVEIL THAN EVER BEFORE.

Le Blond reports that his team has implemented a working Herd prototype at MPI-SWS, and together with their colleagues from Northeastern University in the United States, have just raised half-a-million dollars from the US National Science Foundation to deploy Herd, Aqua, and other anonymity systems over the next three years. With funding in hand, Le Blond hopes to see the first Herd nodes online and ready for users in 2017.

So the new anonymous voice-over-IP platform that will "protect" users from government surveillance is being funded by the government? Alrighty then!

Alpenhorn checks in at just over 3,000 lines of Go. At scale, the network promises latency of ~37 seconds per message, assuming around a million simultaneous users, with a throughput of 60,000 messages per second.
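A quick sanity check on those figures (just arithmetic on the numbers quoted above, not data from the Alpenhorn paper itself): a network-wide throughput of 60,000 messages per second shared among a million users works out to each user sending roughly one message every 17 seconds, which is consistent with a latency measured in tens of seconds rather than milliseconds:

```python
users = 1_000_000     # simultaneous users, as quoted above
throughput = 60_000   # messages per second, network-wide

# Average interval between messages per user, if everyone sends
# at the maximum sustainable rate.
seconds_per_message = users / throughput
print(round(seconds_per_message, 1))  # 16.7 seconds
```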

I don't understand the math here.
Is that a 37 second latency per message?
Or is that a 37ms latency per message?

I'd be interested in statistics on the type of usage for Tor. Sure, political dissent and simple desire for anonymity is one, but how much traffic on Tor is related to illegal activities?

Define "illegal activities". What is illegal in one jurisdiction may not be illegal in another. For example, if a Thai uses Tor to insult the Thai monarchy, that's an "illegal activity", according to Thai law.

And what about "illegal activities" that millions of us don't want to be illegal, such as selling cannabis? If people can use Tor to avoid being arrested for doing that, then I'm all for it.

If you're referring to child porn, then this article says it may be 80% of hidden service traffic, which the Tor project says is about 2% of all Tor traffic. In other words, not very much.

I'd be interested in statistics on the type of usage for Tor. Sure, political dissent and simple desire for anonymity is one, but how much traffic on Tor is related to illegal activities?

How much democratic freedom are you willing to risk/give up at the cost of blocking, if not preventing, illegal or quasi-legal activities? The ethical conundrum heads down a very deep rabbit hole very quickly.

I can't really say without knowing what the statistics are. If 99% of Tor traffic is used for drug trade, human smuggling, and kiddie porn I'm not sure it would be worth it.

It's worth knowing at least.

Ars is an interesting bunch. They want:

1. Personal drive encryption to be sacrosanct. They want it unconstitutional to be forced to decrypt a drive.
2. They want a high-bar standard before a government agency can watch or "spy" on an individual, and strict and strong limits on how that "spying" can take place.
3. Digital means of communication that cannot be breached even by a warrant (isn't that what this article is about: hardening a way of communication so that even if there was a warrant you could not monitor what a person is doing?).
4. Oh, and for the government to get the bad guys.

Basically they are asking the government to do the same job they did in the past EXCEPT be completely anemic in the way they surveil people. Back in the days of phone communication you could 100% effectively monitor phones if you had a warrant. If Ars gets its wish list of wants for digital privacy, any criminal who uses a hardened tor, encrypts a drive and is smart about hardened cell phone communication will be a difficult nut to crack.

Hold my government accountable for mass surveillance? <Gasp!> How dare I? Yes, I want to have private communication, and I want to be able to encrypt or destroy my information without being forced to reveal it (much like burning a letter). That unsavory elements of society are able to use the same system to cause issues for law enforcement is an unfortunate side effect. I do not trust someone else with absolute control over my data to use it responsibly, even (perhaps especially) the government.

The FBI Director recently stated that he wants to have an "adult conversation" about encryption next year (rather patronizing, don't you think?). I can imagine what his conversation is going to consist of: more rhetoric about how we can't catch the bad guys without being able to access the good guys' data.

Could we possibly stop using 'aqua' in the context of a supposedly secure anonymizing service? That makes me think of Project Aquinas from Deus Ex, where something meant to provide free bandwidth and anonymity in fact gave Majestic 12 total informational surveillance of the entire globe.

This is the first time I've heard of differential privacy since Apple's WWDC. I bet Apple might be interested in that particular project. That could be a way to take the lead from academia where it runs out of gas and turn it into a real product. Google, I would think, would not be in favor of products that allow for anonymity online; that's the opposite of their business model. Correct me if I'm wrong about that.