Wednesday, October 01, 2008

TCP DoS (probably) real

This post by RSnake describing a dangerous new DoS attack is probably real.

These guys haven't published their tool, so there is no independent confirmation of their results. Therefore, my immediate reaction is skepticism: things like this tend to be hype. However, after listening to their audio interview, I believe they are probably right. They have been working deep within TCP stacks. If such problems exist, then they would certainly have come across them.

The problem, in a nutshell, is that they can open a TCP connection that will never be closed. The only way to get rid of such connections is to reboot the server. This means that I can connect to the Internet with a dialup connection, then quickly take down www.google.com (or any other server) by maxing out the number of connections.

They describe one mechanism. A TCP stack tries to figure out the maximum speed of your connection, in order to slow down data transmission so that packets won't be dropped. One technique they describe is to behave as if their connection were getting slower and slower to the point that the TCP stack is tricked into believing it will take years to complete the transmission of data. This forces the TCP stack to keep trying for years to send just a few bytes.
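To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The researchers have not published details, so everything in it is an assumption for illustration: the function name `drain_time_seconds` is made up, and the model simply supposes that the attacker lets the server deliver one byte per 60-second probe.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def drain_time_seconds(pending_bytes, bytes_per_probe, probe_interval_s):
    """Time for the server to finish sending, under the assumed model.

    Hypothetical model: the peer accepts `bytes_per_probe` bytes each time
    the server probes it, and probes happen every `probe_interval_s` seconds.
    """
    probes = math.ceil(pending_bytes / bytes_per_probe)
    return probes * probe_interval_s


# Assumed figures: 1 byte accepted per 60-second probe.
for pending in (64 * 1024, 1024 * 1024):
    seconds = drain_time_seconds(pending, 1, 60)
    print(f"{pending} bytes pending: {seconds / SECONDS_PER_YEAR:.2f} years")
```

Under these assumed figures, a single 1 MB response would keep the connection's resources tied up for about two years. The real attack may work quite differently; the point is only that a stack willing to wait on a slow peer can be made to wait a very long time.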

How do we fix these problems? The problem is a resource leak, like the more common memory leak. Back when I worked on the TCP stack for the Proventia IPS, we designed the code and test cases to deal with exactly this sort of resource leak. The trick is to create billions of connections with special tools like this, then verify that once everything is gone, you have indeed gone back to zero resources.
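The shape of such a test can be sketched in a few lines of Python. This is not the Proventia code or any real TCP stack: `ToyStack`, `LeakyStack`, and `leak_test` are hypothetical names, and the "resources" are plain counters standing in for timers and buffers. The leaky variant deliberately forgets to free a timer when a connection is torn down in one particular state.

```python
STATES = ["ESTABLISHED", "FIN_WAIT_1", "FIN_WAIT_2",
          "CLOSE_WAIT", "LAST_ACK", "CLOSING"]


class ToyStack:
    """Hypothetical stand-in for a TCP stack's resource accounting."""

    def __init__(self):
        self.connections = {}  # conn_id -> current state name
        self.timers = 0
        self.buffers = 0

    def open(self, conn_id):
        self.connections[conn_id] = "ESTABLISHED"
        self.timers += 1
        self.buffers += 1

    def advance(self, conn_id, state):
        self.connections[conn_id] = state

    def close(self, conn_id):
        # A correct stack frees everything, whatever state the connection is in.
        del self.connections[conn_id]
        self.timers -= 1
        self.buffers -= 1

    def resources_in_use(self):
        return len(self.connections) + self.timers + self.buffers


class LeakyStack(ToyStack):
    """Same toy stack, with a deliberate bug in one state."""

    def close(self, conn_id):
        state = self.connections.pop(conn_id)
        self.buffers -= 1
        if state != "FIN_WAIT_2":  # bug: the timer is never freed in FIN_WAIT_2
            self.timers -= 1


def leak_test(stack, n):
    """Open n connections, park them in varied states, close them all,
    and return how many resources are still held (0 means no leak)."""
    for i in range(n):
        stack.open(i)
        stack.advance(i, STATES[i % len(STATES)])
    for i in range(n):
        stack.close(i)
    return stack.resources_in_use()


print("correct stack leaked:", leak_test(ToyStack(), 600))
print("leaky stack leaked:  ", leak_test(LeakyStack(), 600))
```

The key design point is that the test closes connections from every state, not just ESTABLISHED, and then checks a single number. Any nonzero remainder means some state forgot to clean up after itself.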

EDIT: When I say "leaving the connection open", I refer to how we see the connection from the point of view of kernel resources. The socket, though, is probably closed. That's the essence of this problem: closing the socket doesn't necessarily free the connection resources if the connection is in a certain state. This makes the DoS different from things like the "LaBrea Tar Pit" that DoSes applications by forcing them to leave the socket open.

11 comments:

I recall this technique you've described from the late 90s, for use in honeypots. Only it was used then to prevent malicious software from attacking other machines, because it would be tied up, glued to one or more honeypots.

Not quite, though I think I discovered them both around the same time. Perhaps it was a variant of La Brea, because I remember discussion about adapting it to use a similar MAC address collection and spoofing idea. Instead of just a SYN/ACK followed by silence, it monkeyed with slow start to trap connections for much longer. The details escape me, right now.

One user on dial-up taking down Google is a bit extreme, but take a thousand users on broadband and it sounds like you can make some real impact, and maybe hit Yahoo! and other large sites at the same time. I need to listen to the podcast again; I missed the mention of tricking the system into keeping that connection open, but that makes a lot of sense.

>"When I say "leaving the connection open", I refer to how we see the connection from the point of view of kernel resources. The socket, though, is probably closed. That's the essence of this problem: closing the socket doesn't necessarily free the connection resources if the connection is in a certain state."

Please. We're all grown-ups here. You might as well say "2MSL" since we all know it's what you mean.

It's 2MSL in the TIME_WAIT state. However, there are lots of other states between ESTABLISHED and CLOSED for which there is no specific duration.

The situation is complicated by new features, such as "selective ACK". What happens if I selectively ACK your FIN, but not the byte of data before it?

In the TCP state table, every state leads to CLOSE -- but there is no line for how it gets there. Depending upon the implementation, it may leave some resources hanging (such as timers, mutexes, and memory).

As it looks, devices in the middle [FW, IPS, sniffers and what have you] might be the biggest problem as they often try to track 'state' over connections they do not control and most likely do this for many servers.

It seems that we are dealing with an old TCP vulnerability; I suffered that kind of attack about a year ago: http://www.bsdforums.org/forums/showthread.php?t=49708 All Linux kernels (and standard iptables) are still affected.

Not sure whether Robert E. Lee and Jack C. Louis have "discovered" the same TCP vulnerability; therefore awaiting technical details regarding this issue.

In any case: a good network packet filter (I use OpenBSD's PF) solves the problem described in my post.