
Cooking with Onions: Finding the OnionBalance

This blog post is the first part of the Cooking with Onions series which aims to highlight various interesting developments on the .onion space. This particular post presents a technique for efficiently scaling busy onion services.

The need for scaling

Onion services have been around for a while. Over the past few years, they have been deployed by many serious websites, including major media organizations (such as the Washington Post), search engines (such as DuckDuckGo), and critical Internet infrastructure (e.g. PGP keyservers). This has been a great opportunity for us, the development team, since our code has been hardened and tested by the sheer volume of clients that use it every day.

This recent widespread usage also gave us greater insights on the various scalability issues that onion service operators face when they try to take their service to the next level. More users means more load to the onion service, and there is only so much that a single machine can handle. The scalability of the onion service protocol has been a topic of interest to us for a while, and recently we've made advancements in this area by releasing a tool called OnionBalance.

So what is OnionBalance?

OnionBalance is software designed and written by Donncha O'Cearbhaill as part of Tor's Summer of Privacy 2015. It allows onion service operators to achieve the property of high availability by allowing multiple machines to handle requests for a single onion service. You can think of it as the onion service equivalent of load balancing using round-robin DNS.

How OnionBalance works

Consider Alice, an onion operator, who wants to load balance her overloaded onion service using OnionBalance.

She starts by setting up multiple identical instances of that onion service in multiple machines, makes a list of their onion addresses, and passes the list to OnionBalance. OnionBalance then fetches their descriptors, extracts their introduction points, and publishes a "super-descriptor" containing all their introduction points. Alice now passes to her users the onion address that corresponds to the "super-descriptor". Multiple OnionBalance instances can be run with the same configuration to provide redundancy when publishing the super descriptor.
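For illustration, Alice's OnionBalance configuration might look roughly like the sketch below; the key path and instance addresses are placeholders, not real values:

```yaml
# config.yaml for the OnionBalance management server (illustrative sketch)
services:
  - key: /home/alice/onionbalance/private_key   # key for the public "super" onion address
    instances:
      - address: instanceonionaddr1             # placeholder backend instance addresses
      - address: instanceonionaddr2
      - address: instanceonionaddr3
```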

When Bob, a client, wants to visit Alice's onion service, his Tor client will pick a random introduction point out of the super-descriptor and use it to connect to the onion service. That introduction point can correspond to any of the onion service instances, and this way the client load gets spread out.
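The load-spreading effect of random introduction-point selection can be sketched in a few lines of Python; the introduction-point names below are placeholders, not real descriptor contents:

```python
import random
from collections import Counter

# Hypothetical introduction points, tagged by which backend
# instance published them (placeholder names, not real descriptors).
intro_points = ["instance1:ipA", "instance1:ipB",
                "instance2:ipA", "instance2:ipB",
                "instance3:ipA", "instance3:ipB"]

def pick_intro_point() -> str:
    """A client picks one introduction point uniformly at random
    from the super-descriptor, so load spreads across instances."""
    return random.choice(intro_points)

# Simulate many client connections and count per-instance load.
loads = Counter(pick_intro_point().split(":")[0] for _ in range(60_000))
print(loads)  # each instance receives roughly a third of the load
```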

With OnionBalance, the "super-descriptor" can be published from a different machine than the one serving the onion service content. The onion service's private key can therefore be kept in a more isolated location, reducing the risk of key compromise.

Millions of users beg to differ, and the only thing that is dead is democratic ideals based on the actions of three letter agencies gone wild.

We should welcome a battle-scarred Tor network and browser being attacked constantly by the Feds, as it just pushes developments further e.g. onion balancing, increased scaling of the onion network, greater adoption by websites of .onions that solve the CA debacle, quantum computer-resistant cypher development, (eventual) padding of Tor traffic and more.

Tor and Tor browser are not going anywhere, and potential competitors run a distant second in terms of user population and actual functionality.

If the retards running major internet infrastructure and websites actually invested heavily in the Tor network and shifted operations to .onions, the Tor network can be the blueprint for the new and vastly improved internet that is required, because the http/https model right now is f**ked security-wise.

To extend your response to the OP:
Let us not forget that most if not all of the LE hacks we know about have been compromises of endpoint security (compromising the server at the application level, exploiting browsers and distributing malicious payloads, or often both), not of the transport layer. The biggest compromise of the Tor protocol I can think of off the top of my head was the CMU hack that abused the RELAY_EARLY flag. That one was serious indeed, and did result in some arrests, but these kinds of attacks seem relatively rare so far.

I believe that increasing pressure will be placed on relay operators. They will become more liable for the traffic they carry. It may even become a crime to knowingly help others to remain anonymous.

I already carry a biometrically verified identification card with a number to use in online transactions. I believe it is planned that that number will be used as a login to access the Internet from within my country.

Unique, trackable, and revocable licenses that are necessary to get online are coming.

Interestingly enough, some of the deleted comments quoted from the GCHQ troll manuals leaked by Snowden, which show that short, repetitive FUD comments offering no support for their claim that "resistance is useless" [sic] are standard practice for GCHQ comment trolls.

Other deleted comments explored the habits of RU and CN funded comment trolls.

Some days it seems that the only point on which FVEY, RU, and CN agree is that they insist on doing all the thinking for their own citizens; they reject any attempt by "the rabble" to define themselves, and especially any attempt to express dissent in an effective manner that might actually reduce the power and influence of the socioeconomic-political elite.

Yeah, I note there is a kind of Murphy's law (I can't find its name) along the lines of "the effort to undo damage is greater than the effort taken to cause it." It can take ages to repair the damage caused by very short acts of shitposting. Here, I just saw a very quick way to reverse the intent of the original shitpost. Not rational, but effective. Shame a long comment interposed itself, mitigating the impact...

I suspect that one was from @movrcx himself; his tweets were saying similar things at the time.

On the other hand, if it is NSA/TAO operators, what a doss of a job! Maybe I'm missing my vocation: what's the pay like? Where do I apply!?

The output of SHA1 is 160 bits long. To make handling the URLs more convenient we only use the first half of the hash, so 80 bits remain. Taking advantage of the birthday attack, the effective entropy can be reduced to 40 bits.
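The derivation described above (take the SHA-1 of the service's DER-encoded public key, keep the first 80 bits, base32-encode the result) can be sketched as follows; the input here is a placeholder, not a real key:

```python
import base64
import hashlib

def v2_onion_address(pubkey_der: bytes) -> str:
    """Sketch of the v2 onion-address derivation: base32-encode
    the first 80 bits (10 bytes) of the SHA-1 digest of the
    service's DER-encoded public key."""
    digest = hashlib.sha1(pubkey_der).digest()
    return base64.b32encode(digest[:10]).decode("ascii").lower() + ".onion"

print(v2_onion_address(b"placeholder public key"))  # 16 base32 chars + ".onion"
```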

I think the birthday attack you described above only applies when you are trying to generate two arbitrary onion addresses that collide. That's not that strong of an attack, because it does not allow you to impersonate specific onion services.
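To see the birthday bound concretely, here is a small simulation with the hash truncated far below 80 bits so it runs instantly; the principle (a collision among random inputs appears after about 2**(n/2) draws for an n-bit hash) is the same:

```python
import hashlib
import os

def truncated_sha1(data: bytes, bits: int) -> int:
    # Keep only the top 'bits' bits of the SHA-1 digest,
    # modeling a truncated hash like the v2 onion address.
    return int.from_bytes(hashlib.sha1(data).digest(), "big") >> (160 - bits)

def birthday_collision(bits: int) -> int:
    """Draw random inputs until two distinct ones share a truncated
    hash; return how many draws were needed (expected ~2**(bits/2))."""
    seen = {}
    draws = 0
    while True:
        data = os.urandom(8)
        draws += 1
        h = truncated_sha1(data, bits)
        if h in seen and seen[h] != data:
            return draws
        seen[h] = data

# With a 16-bit truncation a collision typically appears after a few
# hundred draws (~2**8), far sooner than the 2**16 a targeted
# second preimage would need.
print(birthday_collision(16))
```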

Now, if you are trying to create a collision with a _specific_ onion address (targeted), the birthday attack argument does not work.

For the consensus to reject such a colliding (40-bit) hash, I assume the entire (80-bit) key hash (or the entire key itself?) is remembered by the directories on a "first claimed, first assigned" basis.

If they were remembered forever, then the entire onion directory would be pretty vulnerable to malicious overload. Right? So how long are the claims remembered?

Looking at the Tor design documents, I failed to figure out what an onion's TTL (Time To Live) in the consensus would be. I only found the requirement for the onion operator to keep publishing on a regular interval (recommended at least every 20 hours, if I remember correctly).

Does it also imply that, whenever the operator fails to do so, the onion's "40-bit hash lease" would end up being forgotten and could be obtained again (maliciously or not doesn't really matter here) by such a colliding (40-bit) hash key, as soon as a couple of days after the original publisher ceased to publish?

If the OP fails to do so in time, I understand the onion's public key could disappear from the consensus entirely as soon as 6 hours after the deadline. From that moment on, the (80-bit) onion address could theoretically be "granted" again to an entirely different public key with a colliding onion address, for as long as the new one kept being published.

I understand such a collision is considered "unlikely" (malicious or not) due to the 80-bit entropy. Still, is this correct?

Do directories already implement any mechanism to detect, remember and publish a colliding address (at least to clients querying one)? Has any one ever been detected, or reported manually, to date?

I am no expert, but I am starting to hope that if scaling issues can be miraculously solved without compromising security, onion services could perhaps solve the horrid problems with CAs (cf the grotesque abuses which the spooks chortle about at ISS World conferences).

Making standard online purchases requires identifying information, so you can't really make them anonymously via the Tor network. Your best bet is to look at cryptocurrencies such as Bitcoin, which would allow for such purchases where supported.

Alas, Tor Project cannot do much to alter the tendency of many website maintainers to reflexively ban all Tor nodes, or to otherwise discourage their customers from using Tor. In some countries, websites may also be struggling to obey government mandates to block Tor.

You were probably talking about electronic currency payments, but if you are trying to send credit card information over an unencrypted connection (from the exit node to the destination website), that could be dangerous.

It looks like sabotage. Onion services are only needed if both users and servers need anonymity. The ones who run lots of 1-hop onions are easy to de-anonymize, and de-anonymizing services reduces the anonymity of users. Why don't you make Tor mark such servers as less secure, and why don't you discourage their use? If you are not about anonymity and privacy, then please stop claiming so.

Recent Updates

Hi! There's a new alpha release available for download. If you build Tor from source, you can download the source code for 0.3.3.2-alpha from the usual place on the website. Packages should be available over the coming weeks, with a new alpha Tor Browser release some time in February.

Remember, this is an alpha release: you should only run this if you'd like to find and report more bugs than usual.

Tor 0.3.3.2-alpha is the second alpha in the 0.3.3.x series. It introduces a mechanism to handle the high loads that many relay operators have been reporting recently. It also fixes several bugs in older releases. If this new code proves reliable, we plan to backport it to older supported release series.

Changes in version 0.3.3.2-alpha - 2018-02-10

Major features (denial-of-service mitigation):

Give relays some defenses against the recent network overload. We start with three defenses (default parameters in parentheses). First: if a single client address makes too many concurrent connections (>100), hang up on further connections. Second: if a single client address makes circuits too quickly (more than 3 per second, with an allowed burst of 90) while also having too many connections open (3), refuse new create cells for the next while (1-2 hours). Third: if a client asks to establish a rendezvous point to you directly, ignore the request. These defenses can be manually controlled by new torrc options, but relays will also take guidance from consensus parameters, so there's no need to configure anything manually. Implements ticket 24902.
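The second defense behaves like a classic token bucket. The sketch below is illustrative only, not Tor's actual implementation, using the default parameters quoted above (3 circuits per second with an allowed burst of 90):

```python
import time

class CircuitRateLimiter:
    """Token-bucket model of the per-client-address circuit-rate
    defense: allow up to 'burst' circuits at once, refilling at
    'rate' tokens per second. Illustrative sketch only."""

    def __init__(self, rate: float = 3.0, burst: float = 90.0):
        self.rate = rate
        self.burst = burst
        self.tokens = burst          # bucket starts full
        self.last = time.monotonic()

    def allow_create_cell(self) -> bool:
        # Refill proportionally to elapsed time, capped at the burst.
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False               # over the rate: refuse the create cell

limiter = CircuitRateLimiter()
allowed = [limiter.allow_create_cell() for _ in range(91)]
print(allowed.count(True))  # the burst of 90 passes; request 91 is refused
```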

Major bugfixes (netflow padding):

Stop adding unneeded channel padding right after we finish flushing to a connection that has been trying to flush for many seconds. Instead, treat all partial or complete flushes as activity on the channel, which will defer the time until we need to add padding. This fix should resolve confusing and scary log messages like "Channel padding timeout scheduled 221453ms in the past." Fixes bug 22212; bugfix on 0.3.1.1-alpha.