Posted by Zonk on Saturday December 30, 2006 @02:49AM
from the i-seee-you dept.

Virtual_Raider writes "Wired is carrying a story about a method developed by security researchers to identify computers hiding behind anonymity services. From the article: 'His victim is the Onion Router, or "Tor" — a sophisticated privacy system that lets users surf the web anonymously. Tor encrypts a user's traffic, and bounces it through multiple servers, so the final destination doesn't know where it came from. Murdoch set up a Tor network at Cambridge to test his technique, which works like this: If an attacker wants to learn the IP address of a hidden server on the Tor network, he'll suddenly request something difficult or intensive from that server. The added load will cause it to warm up.'"

The temperature increase is only the means to an end: the added server load heats the chip, and the heat makes the clock skew. The heat itself is never detected, so the summary is very misleading. The idea is to load the server enough that its timestamps begin to drift, and that drift can be detected.
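To make the timestamp idea concrete, here is a minimal sketch of how skew could be estimated from a series of observations. It is not the researcher's actual method: it assumes the attacker has already collected (own clock, server timestamp) pairs in seconds, the name `estimate_skew_ppm` and the 50 ppm synthetic server are made up for illustration, and a plain least-squares fit stands in for whatever fitting the real attack uses.

```python
def estimate_skew_ppm(samples):
    """Estimate clock skew in parts per million.

    samples: list of (local_time_s, remote_timestamp_s) pairs.
    The skew is the slope of (remote - local) plotted against local time.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = [local for local, _ in samples]
    offsets = [remote - local for local, remote in samples]
    mean_x = sum(xs) / n
    mean_o = sum(offsets) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples must span more than one instant")
    sxo = sum((x - mean_x) * (o - mean_o) for x, o in zip(xs, offsets))
    return sxo / sxx * 1e6


# Synthetic server (an assumption): its clock runs 50 ppm fast and
# started 1000 s ahead of ours. One sample per minute for an hour.
samples = [(float(t), 1000.0 + t * (1 + 50e-6)) for t in range(0, 3600, 60)]
skew_ppm = estimate_skew_ppm(samples)
print(f"estimated skew: {skew_ppm:.2f} ppm")
```

The constant 1000 s offset drops out of the slope, which is why the attack does not care whether the server's clock is actually right, only how fast it drifts.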

Of course, the defense to this attack is probably something along the lines of:

First, if the computer is sensibly cooled (ie: not by convection currents) then heating will be minimal.

Second, if you use a high-precision clock-chip, the chip will be tens or hundreds of times more accurate than the system time, so the drift will be entirely absorbed through the loss of accuracy.

Third, a defender worried about such an attack would use an oven-controlled oscillator for the clock, which means the temperature is whatever you want it to be. You can deliberately vary it to produce errors, or compensate for external temperature changes. Either way, you can be quite invisible to this method.

Fourth, the TOR network should be using an external time source (eg: NTP) that is not included in the TOR tunnel - ie: it's out-of-band - which means that the computers can automagically correct drift. If the computers are REALLY good, they'd correct drift on a second-order or third-order basis, rather than as a constant, so that you adapt how you read the clock to the shift in drift.

The idea of using some sort of timing attack against such a network is interesting. There are probably better methods, though.

One idea that springs to mind is that such P2P systems use caches. If you could generate enough requests to flood the cache system, you can force any computer to query nearby computers, where the latency will be roughly equal to the number of hops along the critical path. It then becomes similar to the game of "Black Box", where you try to map particles by throwing rays in and seeing what happens. If you have a sufficiently large latency map from a sufficiently large number of entrance points, you should be able to derive the whole of the exposed topology of the P2P network and be able to identify which of those servers carry what data.

(Think about it. Those of us in Open Source have all done reverse engineering, we have all tried to wrest the secrets of some black box we can't see the inside of, and eventually we have all succeeded in doing so. Our interpretation may not 100% match the internals literally, but they WILL 100% match the internals logically. And in the end, that's all that matters.)

First, if the computer is sensibly cooled (ie: not by convection currents) then heating will be minimal.

The computers I tested it with were normal desktop machines. They all had fans, and in some cases were thermostatically controlled. The differences in temperature were only 1–2 C, but that could be remotely detected.

Second, if you use a high-precision clock-chip, the chip will be tens or hundreds of times more accurate than the system time

One idea that springs to mind is that such P2P systems use caches. If you could generate enough requests to flood the cache system, you can force any computer to query nearby computers, where the latency will be roughly equal to the number of hops along the critical path. It then becomes similar to the game of "Black Box", where you try to map particles by throwing rays in and seeing what happens. If you have a sufficiently large latency map from a sufficiently large number of entrance points, you should be able to derive the whole of the exposed topology of the P2P network and be able to identify which of those servers carry what data.

Nice idea, but it wouldn't work on Tor. The topology of the router network depends on who is using it, as routing paths are decided by the machines using the Tor network to remain anonymous, not by the routers themselves. In the case of a hidden service on Tor, a directory server is used to associate a .onion TLD with several routing paths the clients can use to contact the server. Little information can be derived from the routing paths themselves, as the address of each router in the sequence is encr

Probing a theoretically ideal anonymous P2P network can be done if certain conditions are met. Step through the following and see what you think.

If you connect to each and every R in the set of R' (the total number of edge-connected nodes in the network) and cache-flood prior to making your standard test query, and repeat the process R'/3 times, then the mean value of M over all these tests will be the maximum possible radius of the topology. It doesn't matter what test query you use or where in the topol

There will be exactly one minimal fit for these conditions and this will be the topology of the network.

This won't work on Tor, for three reasons. First, there is no overall network topology. The routers merely act on routing instructions passed onto them via the client; they don't make connections autonomously, like, for instance, the nodes in a Gnutella network would do. Second, the hidden servers are not actually part of the Tor network; the routers merely act as middle men, stopping direct communication between the server and the client. Thirdly, I'm not aware of any caching that goes on between router

You don't need to know every complete path, so the number of possible permutations is something you can work around. Think of the tables used by the nodes for routing as one gigantic divided secret. It is possible to prove that for a divided secret, you need only know one part more than 1/3 of all the parts before the secret is weak enough to be considered compromised. The question, then, is purely one of how to gain access to these tables.

Think of the tables used by the nodes for routing as one gigantic divided secret. It is possible to prove that for a divided secret, you need only know one part more than 1/3 of all the parts before the secret is weak enough to be considered compromised. The question, then, is purely one of how to gain access to these tables.

I don't think that's quite right. If there are 3 pieces of data encrypted with three different keys, then knowing what one of those pieces of information is doesn't necessarily help figure out what the other two pieces are.

Tor is pleasingly clever in the way it goes about ensuring anonymity. Each router in the Tor network publishes its IP address and public key on a directory server. The client picks a random sequence of router addresses, R1 to RN, and corresponding public keys, P1 to PN. It then encrypts

"(Think about it. Those of us in Open Source have all done reverse engineering, we have all tried to wrest the secrets of some black box we can't see the inside of, and eventually we have all succeeded in doing so. Our interpretation may not 100% match the internals literally, but they WILL 100% match the internals logically. And in the end, that's all that matters.)"

In many ways I agree, but literal != logical. If I spoof the behaviour you look for, I could 'frame' another server for my processes. Log

You forgot the simplest one that will defeat all attempts at timestamp fingerprinting...

Lie about the time. As long as it monotonically increases between packets, and stays within a few
seconds of accurate, everything goes smoothly (for most general-purpose data traffic - Obviously
this would completely screw up something like an NTP query).

Lying about the time works to a degree, but you can only lie in a positive direction. Eventually, with enough tries, it may be possible to figure out that there is a value you NEVER go below. That is the maximum the real time can be. If it's too much below that, however, anything that is time-order dependent risks getting screwed up.
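The "value you NEVER go below" argument can be sketched with a quick simulation. Everything here is an assumption for illustration: the server adds a random lie of 0 to 2 seconds to every reported time, the attacker is idealised as knowing the true time exactly (no network delay), and `probe_offsets` is a made-up helper.

```python
import random


def probe_offsets(n_probes, max_lie_s, rng):
    """Simulate a server that only ever reports a time ahead of the truth.

    Each probe yields (reported_time - true_time), drawn uniformly
    from [0, max_lie_s].
    """
    return [rng.uniform(0.0, max_lie_s) for _ in range(n_probes)]


rng = random.Random(0)                 # fixed seed so runs are repeatable
offsets = probe_offsets(1000, 2.0, rng)

# The attacker just remembers the smallest lie ever observed. Because the
# server never lies backwards, that minimum creeps down towards zero.
floor_estimate = min(offsets)
print(f"smallest lie seen after 1000 probes: {floor_estimate:.4f} s")
```

With enough probes the floor sits very close to the real time, so a positive-only lie mostly costs the attacker patience rather than accuracy.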

Now, this isn't to say you can't seriously screw with the network's perception of time. For example, you could channel bond multiple VPN connections into a single super-VPN co

Other potential solutions include preventing machines from reporting local time (through HTTP? - I'm not clear the attacker learns the time in the first place; neither TCP nor IP have time information in the headers, it seems) or preventing hidden servers from talking on the public Internet.

For most hidden services, either should be feasible. Timing doesn't seem that important anyway, given the inherent latency of the Tor network.

This memo presents a set of TCP extensions to improve performance
over large bandwidth*delay product paths and to provide reliable
operation over very high-speed paths. It defines new TCP options for
scaled windows and timestamps, which are designed to provide
compatible interworking with TCP's that do not implement the
extensions. The timestamps are used for two distinct mechanisms:
RTTM (Round Trip Time Measurement) and PAWS (Protect Against Wrapped

RFC1323 is not part of tcp/ip. It is an optional extension that some systems could choose to implement. A system does not have to implement these options.
Leave RFC1323 options turned off at the operating system level, and you won't reveal information about the system time keeping in that manner.

However, there is a possibility the TOR and other applications themselves reveal the timestamp, say the applications ordinarily include it in messages passed from one peer to another (or from server to client

What about always using 100% of your CPU? I run the BOINC [berkeley.edu] client for the Rosetta@HOME [bakerlab.org] project and tell it to crunch as much data as it can with idle CPU time. It is ALWAYS up and running. So, if I have this running on a machine that also uses Tor then the "create extra CPU load" method would fail.

Please explain exactly how the CPU will know what priority the scheduler has assigned to a given process.

I don't specifically know "how". What I do know is that it's a fact, and quite easy to demonstrate.

Start a long-running CPU-intensive program (my preference is mencoder) at a low priority, and monitor the CPU temperature. After you've given it plenty of time to cool down (a day or more if you like) start the program at the default or higher priority, and you'll see the temperature is significantly high

Have a look at this blog posting [lightbluetouchpaper.org] for why adding random noise will not prevent the attack. Essentially, random noise doesn't change the average skew, since the computer doesn't have an independent reference clock. By taking a moving average over time, the noise can be detected and removed.
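A rough simulation of that averaging argument, under assumed numbers: a 50 ppm skew, up to 5 ms of zero-mean random fuzz on every timestamp, and one sample every 10 seconds for two hours. The helper names are hypothetical, and comparing the averages of the two halves of the trace is just one simple way to average the noise away, not necessarily what the blog post does.

```python
import random
from statistics import fmean

TRUE_SKEW_PPM = 50.0   # assumed underlying skew
NOISE_S = 0.005        # assumed fuzz: up to +/- 5 ms per timestamp


def noisy_offsets(duration_s, step_s, rng):
    """Return (time, offset) pairs: linear drift plus zero-mean noise."""
    samples = []
    for t in range(0, duration_s, step_s):
        noise = rng.uniform(-NOISE_S, NOISE_S)
        samples.append((float(t), t * TRUE_SKEW_PPM * 1e-6 + noise))
    return samples


def skew_from_half_means(samples):
    """Average each half of the trace, then take the slope between them."""
    half = len(samples) // 2
    first, second = samples[:half], samples[half:]
    dt = fmean(t for t, _ in second) - fmean(t for t, _ in first)
    do = fmean(o for _, o in second) - fmean(o for _, o in first)
    return do / dt * 1e6


rng = random.Random(1)                 # fixed seed so runs are repeatable
samples = noisy_offsets(7200, 10, rng)
recovered_ppm = skew_from_half_means(samples)
print(f"recovered skew: {recovered_ppm:.2f} ppm")
```

Each sample's noise is about ten times larger than the drift accumulated between samples, yet the averaged estimate still lands close to the assumed skew, because the noise has no trend of its own.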

Randomizing the clock of systems serving Tor traffic would render this attack worthless. Since this and other such attacks are based on analyzing very small changes in the target system clock, even a tiny amount of randomization or pseudo-randomization would be effective.

Although it would certainly make it more difficult, it would not be an absolute defense against the identification of the PC. Identification of a PC that is using this defense may not occur in 30 seconds after a single challenge, but could

I misread the title the first time, the joke being I do heat my office with computers. I have three of them in the room and the 4800 dual core puts out a fair amount of heat on its own, keeping it toasty compared to the rest of the house. I used to have a dual 300 that got so hot you couldn't touch the side of the case. I literally put a box fan on that one to keep it running.

I have a PDP-11/73 [kicks-ass.net] that warms up the workshop quite nicely. I don't know why, but despite drawing around 400W (just the same as my PC) it throws out a lot more heat. Of course if you fire up the big RL02s it gets a lot noisier and the current draw goes up. Just the PDP-11 on its own is quieter than my PC, too, despite having four 5" fans.

I misread the title the first time, the joke being I do heat my office with computers. I have three of them in the room and the 4800 dual core puts out a fair amount of heat on its own, keeping it toasty compared to the rest of the house....

I did the same "back in the day" when I got my first personal Unix box - an Altos 68000 - one of a crowd of generic Motorola 60x0 unix boxes that came out before PCs squeezed them out. With a meg of RAM and an 8" hard drive it put out enough heat to keep the computer r

Ok, so if I am using Tor, presumably I've got clients behind these servers.... so according to the article, he can detect a server? What good does that do him? That doesn't identify *MY* machine the client which is actually doing the browsing. So, he can see which server is running Tor... couldn't he just portscan to find that out?

Hidden services are something different from a Tor user. A hidden server is reachable via some hostname in the .onion TLD and provides services like HTTP, just like in the non-Tor network. It's basically an anonymous server instead of an anonymous client.

Obviously someone who is unaware of the millions of machines that are routinely overheated by overload... most machines running graphics intensive applications and all machines running BOINC.
Bad thinking, but wishful thinking often is.
Davis

1. Create a minor botnet
2. DDoS a server, not enough to kill it but slow it down a lot
3. Measure response times to hidden service
4. If all requests using different paths now are slow, you got it

Also, that attack scales to detect multiple hidden sites simultaneously - hit one server, request ten sites and see who answers quickly and who does not. It's just a consequence of depending on one machine. The only way you could totally avoid that is to not have services at all, only a distributed datastore like e.g. Freenet.

Since date and time information isn't included in TCP/IP packets, this kind of attack won't work for all services. Assuming that the "hidden servers" in question are HTTP servers, there is a rather simple workaround: simply disable sending the "Date" header. This can probably be accomplished with mod_headers [apache.org] in Apache, but I've never tried using it myself. Oddly enough, the server would still be standards compliant [w3.org]. Obviously, servers that leak the current time by some other means would still be vulnerable.

A simpler, less precise attack of this nature would simply be to continuously ping the suspected server via both Tor and the public internet. If they (reproducibly) fail at the same time (and we could launch a denial-of-service attack to make it fail), they're probably the same machine. Attacks of this nature might even be able to confirm if a hidden server is on the same network as another computer.... But any of these attacks require someone to suspect you of running the server in the first place—and if they do, you probably have bigger problems to worry about.

The bottom line is, as Tor's manual clearly indicates [eff.org], having a hidden server machine accessible from both Tor and the internet is a bad thing. Operators of hidden services should use a dedicated machine and block all incoming traffic (on all TCP and UDP ports) that is not via Tor.

Actually, it is [ietf.org], and this is what I mainly use, but initial sequence numbers also incorporate a timer. If both are unavailable, the link between packet emission and timer interrupts will still show up the clock skew.

If you leave a process running in the background consuming 100% of your cpu all the time, like setiathome or distributed.net, then your system won't get hotter, rather it will just be processing something else to load the cpu and still generating the same amount of heat.

What if there were a time sync server in the setup whose whole purpose in life is to keep track of the time?

Have no other apps running on it, so that it has negligible system load. All the other systems in the TOR could be set up to sync their time with it every few seconds, i.e. before clock drift becomes detectable. Might check each and every second so as to intentionally cause a collision on the time server and add some randomness. Or, do a time sync every random(1..10) seconds. Or, use multiple NIC

This theoretical attack is based on using clock skew (previously covered on /.) to identify systems.

The correct defense is the same as the last time:

a) Make sure that there is no system clock skew, by running Network Time Protocol (NTP) on all servers.

b) Make sure that all externally visible timestamps are based on the system clock.

Part (b) is the only difficult step, since many current IP stacks use a private counter/clock instead of the system clock, presumably to reduce the overhead of providing timestamps. I know that Linus T have discussed using user-level library code to provide microsecond resolution (or better) timestamps, with very low overhead:

The library code can just query the cpu/system timer, multiply by the current scale factor (which depends on things like dynamically variable cpu clock frequency), and add the base time which was stored by the OS on the last HW clock interrupt: Total runtime, including call/return overhead can be below 100 clock cycles, which is fast enough to use it everywhere timestamps are needed:
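As a rough illustration of that arithmetic (not the NetWare or Linux code, which isn't shown here), the calculation might look like the sketch below. `ClockSnapshot`, its field names and the 2 GHz example are all assumptions; in a real system the snapshot would be published by the OS on each clock interrupt and the counter would be read straight from the CPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockSnapshot:
    """Values the OS is assumed to store at the last hardware clock interrupt."""
    base_time_ns: int     # system time at the interrupt, in nanoseconds
    base_counter: int     # CPU cycle counter value at the interrupt
    ns_per_tick: float    # current scale factor (depends on CPU frequency)


def timestamp_ns(snapshot: ClockSnapshot, counter_now: int) -> int:
    """Base time plus the scaled number of ticks elapsed since the snapshot."""
    elapsed_ticks = counter_now - snapshot.base_counter
    return snapshot.base_time_ns + round(elapsed_ticks * snapshot.ns_per_tick)


# Example: a 2 GHz CPU (0.5 ns per tick), 4000 cycles after the interrupt.
snap = ClockSnapshot(base_time_ns=1_000_000_000, base_counter=5_000, ns_per_tick=0.5)
stamp = timestamp_ns(snap, counter_now=9_000)
print(stamp)
```

Because every visible timestamp is derived from the NTP-disciplined base time, the only drift an outsider could see is whatever accumulates between two interrupts.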

BTW, I wrote asm code to do exactly this inside Novell's NetWare OS a little over 10 years ago. In NetWare these timestamps were used by the Packet Burst algorithms which optimized packet transmission rates.

What I haven't seen mentioned yet is: Won't this break down if more than one investigator is running this attack on a network? What if several people try this trick against a group of servers? How would they know the time skew was due to THEIR query? What if this is the best trick ever so everyone trying to track down a computer uses it ;)

Couldn't they detect whatever the popular trick is to increase temp and have the computer try and skew others on the network. I don't suppose you would want to do it random

Not one. You have to know a finite set of computers that are a Tor network. In my reading of the article it seems that without this finite set you fall victim to the "16 per 1000 that have the same skew" problem.

Without knowing as well that all systems are skewed differently you also have a problem. What if you grabbed a random set of 32, with 2 groups of 12 and one of 8 with identical skews?

And you must have missed the part that it's not the timestamp he measures, but the change in timestamp over a period of time that correlates to what he has the remote server do. That's a lot more telling.

Not at all, people are making too many assumptions about what is not written. All it says is that he tests the skew caused by heating up the crystal which takes several hours to do. It says nothing about testing the skew while the system is "idle" because in reality there's no way for him to know if the system is actually idle or not. His system is all about making sure there is a load and then testing the skew while it's hot.

Not that I think this sort of thing is really going to become anything more than an interesting proof-of-concept anytime soon, but couldn't you combat this by having a local NTP server for your server farm, and then setting the servers to update from that server at frequent intervals (say every 5 sec or so)? It would waste cycles on the machines and generate some extra load on the network, but it would keep the clocks from ever drifting far, and it would narrow the window in which you'd be able to detect drift to something pretty small.

couldn't you combat this by having a local NTP server for your server farm, and then setting the servers to update from that server at frequent intervals (say every 5 sec or so)? It would waste cycles on the machines and generate some extra load on the network, but it would keep the clocks from ever drifting far, and it would narrow the window in which you'd be able to detect drift to something pretty small.

Wouldn't this create a possible vulnerability where a malicious host could forge a packet containing wrong time and having all other computers updating the clock to that time? or are the hosts somehow able to validate the packets?

I agree it wouldn't do much in terms of damage, but I would guess it would make it possible to bypass a few time-restricted activities, logging, etc.

The article is very low on information on how he proposes to locate a computer. Yes clock skew would help, but you need to locate the machine somehow. And on top of that he thinks that more traffic equals higher load on the cpu. This isn't necessarily true, in a closed environment you might be able to do it, but on a global scale I can't see how this would help you unless you got global knowledge of the network, and if you do, sybil [google.com] attack is a lot easier to do.

One must remember TOR doesn't guarantee strong anonymity, for that you need something like Herbivore [cornell.edu].

Exactly. This is kind of like the whole NP-Complete space. It's hard to find the right answer, but once you've found the right answer, it can be verified in polynomial time. Same thing here. It's a verification exploit, not a location exploit. It can, with a sufficiently large number of tests, verify that the host you think is providing the information really is. However, unless you can simultaneously track the heat emissions from every computer in the world (and somehow process that much information

From which we learn that: The system consists of approximately 27,000 lines of Java and C code, 2,000 of which comprise the GUI for anonymous filesharing and a helper application for k-anonymous chat while the rest form the core system. (Section 5: Performance)

There are a few reasons why this is so. First of all, this was, as far as I know, a proof of concept. Secondly, in order to send one bit, you need to transmit 2k+1 bits (from memory, should be in the same section as the one you mentioned), which is very expensive. And due to the protocol it's very easy to do some DoS attacks on a clique; granted, you can try to switch to a new clique, but you can spend a lot of time trying to get your message sent.

On the other hand, Tor can be used by simply configuring the user's application to use a known Tor entry point as a proxy server. This configuration can be removed when the user is done, leaving little or no tracks. In this way, Tor can be used by any system that supports TCP/IP and SSL.

This is slightly offtopic, but I didn't realize that you could use the TOR network in this way. Can you expand on this? I thought in order to use TOR, you had to install the TOR software package on the end-user's machine, and

That was the point of our project, to make it easier to use. With a linux router running iptables and our software all TCP/IP will be captured and transmitted on TOR (Or whatever subset of TCP/IP communication you want it to grab)

Or, just have the Tor server toss a little random data between two ports from time to time, as is done to secure other hardware where this technique has been tried in the past (like smartcards). Or just run WoW in the foreground;)

You measure clock skew before, during, and after you hit the hidden service. If the change in clock skew happens at the same time you load the server, that indicates that it's probably the correct server.
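A sketch of that before/during/after comparison, under stated assumptions: the offsets are noise-free, skew in each window is taken from its first and last sample only, and the 0.5 ppm threshold, the 1.5 ppm jump under load and all function names are invented for the example.

```python
def window_skew_ppm(samples):
    """Skew over one window from its first and last (time_s, offset_s) sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples per window")
    (t0, o0), (t1, o1) = samples[0], samples[-1]
    if t1 == t0:
        raise ValueError("window must span more than one instant")
    return (o1 - o0) / (t1 - t0) * 1e6


def load_correlates(before, during, after, threshold_ppm=0.5):
    """True if skew shifts while loaded and returns to baseline afterwards."""
    s_before, s_during, s_after = map(window_skew_ppm, (before, during, after))
    baseline = (s_before + s_after) / 2
    shifted = abs(s_during - baseline) > threshold_ppm
    recovered = abs(s_before - s_after) < threshold_ppm
    return shifted and recovered


def make_window(t_start, n, step_s, skew_ppm, start_offset):
    """Synthetic (time, offset) samples drifting at a constant skew."""
    return [(t_start + i * step_s, start_offset + i * step_s * skew_ppm * 1e-6)
            for i in range(n)]


# Suspect machine: 50 ppm when idle, 51.5 ppm while we hammer the hidden service.
before = make_window(0, 60, 60, 50.0, 0.0)
during = make_window(3600, 60, 60, 51.5, before[-1][1])
after = make_window(7200, 60, 60, 50.0, during[-1][1])
suspect_matches = load_correlates(before, during, after)

# Unrelated machine: skew never changes, whatever we do to the hidden service.
flat = make_window(3600, 60, 60, 50.0, before[-1][1])
bystander_matches = load_correlates(before, flat, after)
print(suspect_matches, bystander_matches)
```

The point is that the absolute skew is never the evidence; only the change that lines up with the attacker's own load pattern is.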

Read what you just said. Skew is a distortion of measurement. In normal operation there is no distortion, only when the crystal is heated. So by definition there is only one possible value for the skew and it's the change from before to after the crystal has been heated.

Sure there's clock skew normally. I know that my computer doesn't have a caesium-133 atom inside of it. As such, the clock is inaccurate and bound to vary relative to the correct time. I have noticed that it has been up to a couple of minutes off. Right now, as I updated it from an NTP server, it was 4 seconds off. It has to become inaccurate to have that problem.

You really should've read TFA in this case. Apparently, heating up the box causes fluctuations in system time which this chap claims to be able to detect in a meaningful way. There's more to it - interesting read.

then, consider the fact that you found "You must be new here" a novel response - at least novel enough for you to use it. let me just say, *You* must be new here. :P

P.S. i hope the recursive irony - including my ID and the parent posters ID - is self evident. no need for recursive "*You* must be new here" replies. please think of the children.

P.P.S. i don't really think recursion is the right word. but the fact that an 'older' user is declared 'new' by a newer user on each child post should lead to a division by zero, a black hole, or at least a bizarro world somewhere... or it might just be my bed time.

P.P.S. i don't really think recursion is the right word. but the fact that an 'older' user is declared 'new' by a newer user on each child post should lead to a division by zero, a black hole, or at least a bizarro world somewhere... or it might just be my bed time.

I'll take issue with your usage of the word "older"; I'll have you know that, at a measly 23 years old, I'm probably younger than /. users with a higher UID number.

And I'm too tired to really care that I really don't need to get involved in another log(UID)-based pissing match. (But hey, isn't that what posting on Slashdot at 2:30AM is all about? Besides, I already made a constructive comment over in the article about embedding DB authentication credentials on software.)

His software lets you pinpoint servers in the anon TOR network. Good trick, but ultimately useless (since it's the user's computer you are trying to find).

Of course the other problem is "giving it a heavy load": define heavy load. Is it just a little more than usual? Or does it mean you have to heat the board (he goes off the system clock, maintained by a frequency crystal on the MB)? Most data centres, I would think, would be fairly efficient at routing even high heat loads out of enclosures and away from the machine.

And then, whoever he does this to can sue him for DoSing their machine. If they can prove (and it's not overly difficult) that heat damages computer parts, he can be nabbed for wilful destruction of property as well, since his whole exercise heats the machine for no other reason than locating it.

Then of course, the only way to "heat up" said computer is to do it through the TOR API, which I am guessing most anon servers are built to handle very well (since that would be their primary task).

Oh, and this of course neglects to take into account that your TOR requests may be handled by many many servers in a cluster, each one heating and skewing at different rates...

Ok, it's late on a Saturday afternoon and I can poke that many holes in his trick (even if only one is at all real). Gimme a good 2-3 hours with some energy drinks in me and I can find more, I am sure ^_^

If he can prove it works (and successfully do something useful with it) in the real world, then it would be a better story.

I picture this attack being used as part of an ongoing investigation. They have a target and they just need some pattern analysis to secure the warrant. Over a month-long investigation, they could glean a lot of info by throwing up very specific requests and seeing if your hard drive springs to life or your CPU spikes.

In most cases, they wouldn't even need to be near your house. A well-positioned amp-meter with remote sensing could tell you if the CPU suddenly needed more power.

There are CPU frequency-shifting programs (for Linux: cpufreq) that allow the computer's user to change the CPU's frequency to his/her liking. One could easily set the frequency lower than the original maximum, so that spikes can't be detected.

Add to the above approach, keeping the clock in sync, as others have noted.

A well-positioned amp-meter with remote sensing could tell you if the CPU suddenly needed more power.

Somehow I don't think that would meet the standard for evidence...

You need to measure tiny variations in current caused by one device, mixed in with the haystack of all the other electric devices in your house... Most of which can vary significantly from moment to moment.