Interesting... It's like node watermarking.
>
> As a side note, this offers some interesting possibilities for using multiple
> passphrases and tracking where they propagate through the network.
> This might be a weakness, or a method of exorcising rogue peers based
> on which passphrases they used, and subsequently which parts of the
> network might be compromised...
>
> Any thoughts?
>
--
Justin Chapweske, Onion Networks
http://onionnetworks.com/

I was thinking about initial introduction in Tristero-China and other
filter-avoidance networks. Introduction is done out of band to prevent
the _evil party_ from abusing an initial introduction mechanism to
locate and target relay nodes.

Right now most systems only require an IP and port for initial
introduction, which is provided out of band. However, there is nothing
to prevent a malicious entity (China Gov?) from simply scanning
entire subnets looking for relays. They would attempt a handshake on
a port, and if it succeeded, they would have found a relay, which they
could then shut down, arrest the user over, etc.

To protect against this each node could require a "passphrase" to be
included in the initial introduction connection request. This passphrase
or key/token would be different for each node, and passed around out
of band like the IP and Port of the node itself.

In this manner you can avoid the subnet scanning vulnerabilities while
keeping the initial introduction simple and secure.

One problem this creates is during transitive introduction, for example
when a node requests their subset of peers from an AChord index. The
passphrase must then be provided during transitive introduction in
addition to the IP and Port information. This should not be a big deal,
but something to account for during implementation.

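A minimal sketch of the passphrase-gated introduction described above. The names (`Node`, `make_proof`) and the challenge/HMAC scheme are my own assumptions about how the check might be done; the text only requires that a passphrase accompany the introduction request.

```python
import hashlib
import hmac
import os

# Hypothetical sketch: each node holds a passphrase that is distributed
# out of band along with its IP and port. An introduction request must
# carry an HMAC proof of knowledge of the passphrase, so a subnet
# scanner that connects blindly is simply refused.

class Node:
    def __init__(self, passphrase: bytes):
        self.passphrase = passphrase

    def challenge(self) -> bytes:
        # Fresh nonce per connection attempt, so proofs cannot be replayed.
        return os.urandom(16)

    def verify_introduction(self, nonce: bytes, proof: bytes) -> bool:
        expected = hmac.new(self.passphrase, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, proof)

def make_proof(passphrase: bytes, nonce: bytes) -> bytes:
    # What a legitimate peer, who received the passphrase out of band, sends.
    return hmac.new(passphrase, nonce, hashlib.sha256).digest()

node = Node(b"per-node secret")
nonce = node.challenge()
assert node.verify_introduction(nonce, make_proof(b"per-node secret", nonce))
assert not node.verify_introduction(nonce, make_proof(b"scanner guess", nonce))
```

During transitive introduction, the passphrase would simply be handed over alongside the IP and port, as noted above.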
As a side note, this offers some interesting possibilities for using
multiple passphrases and tracking where they propagate through the
network. This might be a weakness, or a method of exorcising rogue
peers based on which passphrases they used, and subsequently which
parts of the network might be compromised...

Any thoughts?
coderman@...

I've been thinking about Brandon's proposal. The problem it has is that
the node that serves a request is chosen based on the hash of the URL,
which doesn't make a lot of sense to me. By merging the AChord proposal
with the Crowds stuff, I came up with the following idea, which improves
upon the original by reducing reliance on the nodes certain URLs hash to.

* Proposal
Get the nodes to form a Chord-like circle based on the hash of their IP.
When a node has a request to make, it picks a random node from the set it
knows and forwards it. That node then either fulfills the request (if it
can) or passes it on to another node it knows. This continues recursively
until the request is fulfilled and the response goes back the same way, with
potential caching along the way.

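The random walk with caching on the return path could look roughly like this. `Peer`, `fetch`, and the `max_depth` cutoff are illustrative assumptions, not part of the proposal.

```python
import random

class Peer:
    """One node in the Crowds-style random walk sketched above."""

    def __init__(self, name, known=None):
        self.name = name
        self.known = known or []   # the small set of peers this node knows
        self.cache = {}

    def fetch(self, url, depth=0, max_depth=10):
        # Fulfill the request from cache if we can.
        if url in self.cache:
            return self.cache[url]
        if depth >= max_depth or not self.known:
            # Final node: retrieve the page directly (stubbed out here).
            body = f"<contents of {url}>"
        else:
            # Flip a coin: forward to a random peer we know.
            body = random.choice(self.known).fetch(url, depth + 1, max_depth)
        self.cache[url] = body         # cache on the way back
        return body

a = Peer("a")
b = Peer("b", known=[a])
c = Peer("c", known=[b])
c.fetch("http://example.com/")   # walks c -> b -> a; all three cache it
```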
The Chord could be modified in such a way that a host behind the firewall
wouldn't actually join the Chord network. (This assumes nodes know whether
they're being blocked or not.) This would provide greater protection against
mapping the network, since nodes behind the firewall (assumedly nodes we are
most trying to protect) are simply not part of the network.

We could protect the nodes outside the firewall by limiting the number of
nodes that a user can discover to a small number (possibly one or two).
Since the system isn't based on hashing URLs, this shouldn't degrade
performance very much, but may cause a lot of load on some nodes.

This would also allow nodes behind the firewall to use the system without a
specialized client. All they would need to know is the address of one or
more (hopefully unblocked) nodes. Then the user could type the URLs of these
nodes into a standard web browser and get back a "please retrieve this page
for me" form, like the FProxy system.

* Issues
This system might be slower since the final node is unlikely to have the
requested page in its cache. On the other hand, it'll be faster to get to
the final node since we flip coins instead of routing. Also, it will be
faster/safer in the case that the final node is overloaded, or worse, an
enemy.

Hope this is helpful,
--
[ "Aaron Swartz" ; <mailto:me@...> ; <http://www.aaronsw.com/> ]

FYI, there is a project, under the auspices of the cDc, called
"hacktivismo" or "Project X", which acts as an SSL mixnet
HTTP proxy. I haven't been following the progress in detail
for several months, but it is consistently progressing, to the
best of my knowledge.

The basic idea is to proxy HTTP connections through hosts which can
bypass the firewall. This problem has a couple of parts. First of all,
you need to be able to quickly find a node which is accessible by you
(i.e. not blacklisted) but also able to reach the desired host (i.e. on
the other side of the firewall). Also, you do not want the enemy, who
may also be running a node, to get the master list of all nodes as it
could then add them all to the blacklist.

So, what we need is an unmappable network (i.e. you can only see part of
the network and have trouble finding out about the whole network) which
also quickly lets us find a node with a particular property (it's on the
other side of the firewall).

My best suggestion for this is the Anonymized Chord system. I'm not
going to go into it and how it differs from MIT Chord, as that is in a
different paper also currently being written; I'll just describe its
properties. Anonymized Chord has approximately the same properties as
Chord with the added property that it is difficult to find out about
nodes other than your "ideal set". Your ideal set of nodes is the 160
nodes in the network that most closely match the 160 slots your node has
to fill in its address table. The slots are algorithmically generated
from a node's IP, so it is evident when talking to any node what its
ideal set is. Since it is difficult to harvest IPs which are not in your
ideal set, harvesting a large number of IPs requires running a large
number of nodes, each on a different IP. Since Freenet apparently works
in practice and this is a more difficult attack than would be necessary
to accomplish the same task with Freenet, this system should
sufficiently defend against mapping the network.

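A sketch of how the 160 address-table slots might be generated from a node's IP, assuming standard Chord-style finger targets (slot i aims at (h + 2^i) mod 2^160, where h is the SHA1 of the IP). The actual Anonymized Chord paper may derive them differently.

```python
import hashlib

BITS = 160  # SHA1 output size, hence 160 slots

def node_id(ip: str) -> int:
    """160-bit identifier derived from the node's IP."""
    return int.from_bytes(hashlib.sha1(ip.encode()).digest(), "big")

def slot_targets(ip: str):
    """The 160 ideal IDs this node wants in its address table.

    Anyone who knows the IP can recompute these, which is why a node's
    ideal set is evident, while IDs outside it remain hard to harvest.
    """
    h = node_id(ip)
    return [(h + (1 << i)) % (1 << BITS) for i in range(BITS)]

targets = slot_targets("10.0.0.1")
assert len(targets) == 160 and len(set(targets)) == 160
```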
The useful property of a Chord network being used in this system is that
it allows a key to be looked up in O(log n) hops, so it scales very well
to a large number of nodes. The question which remains to be answered is
how you can generate a key which will lead you to a node on the other
side of the firewall.

Typically, a key is a 160-bit SHA1 hash. The network routes to the node
whose IP's SHA1 hash most closely matches the key out of all of the
nodes in the network.

My suggestion is to make each node an HTTP proxy in which the URL being
requested is hashed to generate the key. Of course, this is not a
complete system, but it does have a useful property. It distributes the
load for various URLs among different nodes. It even distributes
multiple requests to different URLs on the same host to different nodes,
aiding slightly in defeating traffic analysis.

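A toy version of that URL-to-node mapping: hash the URL to a 160-bit key and route to the node whose IP hash is closest. Plain absolute distance stands in for Chord's clockwise ring distance here, purely for illustration.

```python
import hashlib

def sha1_int(s: str) -> int:
    return int.from_bytes(hashlib.sha1(s.encode()).digest(), "big")

def responsible_node(url: str, node_ips):
    """Return the IP whose hash most closely matches the URL's key."""
    key = sha1_int(url)
    return min(node_ips, key=lambda ip: abs(sha1_int(ip) - key))

ips = ["1.2.3.4", "5.6.7.8", "9.10.11.12"]
# Different URLs, even on the same host, can land on different proxies:
print(responsible_node("http://www.cnn.com/", ips))
print(responsible_node("http://www.cnn.com/world/", ips))
```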
One problem with this scheme is that a very popular URL, for instance
http://www.cnn.com/, will have all of its traffic proxied through a
single node. This could be a lot of load on a node. However, this is
easily fixed by making each proxy into a caching proxy. In fact, an
existing caching proxy could be used as the basis for this system as the
only difference in this system is that instead of contacting the host
directly it might request it from another proxy instead. The semantics
of caching, expiry, etc. have all already been worked out, except for
one problem: how do you make sure that none of the caches mess up the
data? That is an orthogonal problem of how to have untrusted caches and
will be dealt with separately.

One thing is still missing from this system. While it can find a node to
use as a proxy, it doesn't find a node which is on the other side of the
firewall to use as a proxy. Somehow a key must be generated which leads
you to the other side of the firewall. To describe the solution to
this, I must first describe the concept of a jurisdiction.

A jurisdiction is a set of IPs with a particular set of restrictions on
what sites they can contact. Here are some example jurisdictions: China,
Saudi Arabia, USA, Latvia, Sealand. The goal is to find a node in the
same jurisdiction as the site you want to request. If the node which
requests the site is in the same jurisdiction as the site then it should
be able to get to it.

So obviously in order for this system to work the jurisdiction needs to
be in the key. If the jurisdiction is encoded in the most significant
byte of the key then Chord routing will quickly hop through the network
of nodes to the proper jurisdiction and from there proceed to find the
node assigned to the particular URL, for purposes of load balancing.

The problem of putting the jurisdiction in the keys has several parts.
First of all, each node has to be assigned a jurisdiction. So there
needs to be a master list of jurisdictions and what bits they correspond
to. I suggest allocating one byte of the key to jurisdictions, giving
256 possible ones. Each node can set its jurisdiction byte when the
software is installed.

Next, a jurisdiction byte must be assigned to each key being requested.
So when a user requests a URL a jurisdiction byte needs to be tacked on
in order for the proxy to work properly.

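Composing such a key could be as simple as overwriting the most significant byte of the URL's SHA1 with the jurisdiction byte. The jurisdiction numbers below are invented for illustration; the text only specifies a master list of up to 256 of them.

```python
import hashlib

# Hypothetical jurisdiction table; real assignments would come from the
# master list the text describes.
JURISDICTIONS = {"china": 0x01, "saudi_arabia": 0x02, "usa": 0x03}

def make_key(jurisdiction: str, url: str) -> bytes:
    """160-bit key: jurisdiction in the top byte, URL hash in the rest.

    Chord routing on this key first converges on the jurisdiction and
    then on the node responsible for the particular URL, giving the
    load balancing described above.
    """
    digest = hashlib.sha1(url.encode()).digest()
    return bytes([JURISDICTIONS[jurisdiction]]) + digest[1:]

key = make_key("china", "http://www.cnn.com/")
assert len(key) == 20 and key[0] == 0x01
```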
There are several ways to generate the jurisdiction byte for the URL.
First of all, the user can select which jurisdiction they are browsing
in via the UI that they use to control the proxy. With a properly
designed UI, this need not be much of a pain. Also,
people tend to generally browse things in one jurisdiction at a time.
Another method would be to look up host names in a blacklist file
maintained by a central authority.

There is also the possibility to use the services which map IPs to
geographical locations. The problem with these services is that they
generally require access to a centralized and non-free database. Access
to those services could be blocked.

Anyway, this has gotten too long and so I'll just go ahead and post it
and then follow up later with some clarifications.

The goal is to allow people to view banned websites, particularly in
cases such as China, where doing so is not just an illicit activity but
actually technically challenging due to a firewall blocking traffic to
banned sites. The aim is to find a technical solution to this problem
which is actually deployable in the real world.

Discussion I have had with Freenet users in China and the people behind
Red Rover has yielded some insight into the practical possibilities
regarding firewall evasion in China. I'd be interested in hearing what
anyone might know regarding how other restrictive legal jurisdictions
differ from China.

The first important point is that China (and indeed most censoring
firewalls) works with blacklists. So sites that move around a lot can
evade being blocked. Of course if the channel advertising their current
locations is publicly known then it will either get added to the
blacklist or else the firewall maintainers will subscribe to the channel
and continuously update the blacklist.

Another important point is that people are using Freenet in China right
now and it works for them. So while there are a lot of theoretical
issues with potential systems, such as whether or not encryption is okay
and how hard it will be to get the software into the firewalled area, we
can at least for the moment ignore them. If Freenet is working for the
people, then any system with Freenet's problems may still work for the
people.

However, having talked to some Chinese Freenet users, it is obvious that
Freenet is not entirely suitable. They want to be able to browse
arbitrary websites, not just ones hosted in Freenet specifically.

So, taking the knowledge that I have about what can actually be deployed
in China, I have designed a system which I think will best suit the
actual needs of the people attempting to circumvent the Chinese firewall
and also anyone else with similar constraints attempting to access
blocked content.