Posted by Soulskill on Tuesday October 23, 2012 @04:12PM
from the throwing-textbooks-at-each-other-is-high-throughput dept.

MrSeb writes "A team of researchers from MIT, Caltech, Harvard, and universities in Europe has devised a way of boosting the performance of wireless networks by up to 10 times, without increasing transmission power, adding more base stations, or using more wireless spectrum. The researchers' creation, coded TCP, is a novel way of transmitting data so that lost packets don't result in higher latency or re-sent data. With coded TCP, blocks of packets are clumped together and then transformed into algebraic equations (PDF) that describe the packets. If part of the message is lost, the receiver can solve the equations to derive the missing data. The process of solving the equations is simple and linear, meaning it doesn't require much processing on the part of the router/smartphone/laptop. In testing, coded TCP resulted in some dramatic improvements. MIT found that campus WiFi (2% packet loss) jumped from 1Mbps to 16Mbps. On a fast-moving train (5% packet loss), the connection speed jumped from 0.5Mbps to 13.5Mbps. Moving forward, coded TCP is expected to have huge repercussions on the performance of LTE and WiFi networks, and the technology has already been commercially licensed to several hardware makers."

By definition, UDP sessions don't have delivery guarantees like TCP does. That's what TCP does! It provides a mechanism for clients to ensure integrity and ordering of received packets. NetBIOS is an encapsulated protocol over IP, which uses TCP to ensure delivery. ICMP... really? Are you really asking for delivery correction on multi-packet ICMP? For real? You do realize that fragmented ICMP is a no-no, right, and that ICMP should be wholly contained in single-packet messages?

While it's true that it wouldn't work for UDP, clearing out the traffic normally consumed by TCP requests and responses would improve UDP performance by making the medium more available, even if the coded TCP method has no direct bearing on UDP. (It is up to the UDP session members to negotiate and handle lost datagrams, not the network stack. UDP is intended for custom user protocols that can't easily live inside a TCP/IP packet, like large video or streaming audio feeds. Normally these protocols can deal with loss, and the burden of ensuring 100% delivery comes at prohibitive performance costs, so UDP with acceptable loss is ideal.)

How is it not compression? It reduces the data size being transferred and is recoverable on the other end.

No, it slightly increases the data size being transferred, thus allowing it to be recoverable on the other end if there are minor losses.

Here's an example of how it might work. Say you have a packet that holds 1024 bytes of payload data, plus a few extra for overhead. (Probably not realistic, but this is just to lay out the principles involved.) Now, you could send all 1024 bytes as straight data, but then if even 1 bit is wrong, the whole packet must be re-sent, adding latency. Instead, you send (say) only 896 bytes of actual data, and 128 bytes of recovery data. You break up the data into 64-byte blocks. Thus you have 14 blocks of actual data. The other 2 blocks consist of recovery data, generated by some sort of mathematical equation too complicated to describe here (and which frankly I don't understand myself). Here's the trick: on the receiving end, any 14 of the 16 blocks is enough to recover the whole 896-byte original datagram. Doesn't matter which 2 blocks are bad, as long as no more than 2 are bad, you can recover the whole thing.
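To make the "any 14 of 16 blocks" idea concrete, here is a minimal sketch of one way such a code can work. It is not the scheme from the article: it assumes a toy polynomial erasure code over the integers mod 257 (real Reed-Solomon implementations typically work in GF(2^8) so every symbol fits in one byte), and the names `encode`, `decode` and `lagrange_eval` are made up for illustration. Each byte position across the 14 data blocks defines a polynomial of degree at most 13, and the 2 recovery blocks are extra evaluations of that polynomial.

```python
P = 257  # smallest prime above 255, so every byte value is a field element

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial through points [(xi, yi)], mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P  # needs Python 3.8+
    return total

def encode(data_blocks, n_total):
    """Return n_total blocks: the data blocks plus recovery blocks.
    Recovery symbols can be 256, so they are kept as ints, not bytes."""
    k, size = len(data_blocks), len(data_blocks[0])
    blocks = [list(b) for b in data_blocks]
    for x in range(k, n_total):
        blocks.append([
            lagrange_eval([(i, data_blocks[i][pos]) for i in range(k)], x)
            for pos in range(size)
        ])
    return blocks

def decode(received, k):
    """received maps block index -> block; any k entries are enough."""
    idxs = sorted(received)[:k]
    size = len(received[idxs[0]])
    out = []
    for x in range(k):
        if x in received:  # data block arrived intact
            out.append(bytes(received[x]))
        else:              # rebuild it from any k surviving blocks
            out.append(bytes(
                lagrange_eval([(i, received[i][pos]) for i in idxs], x)
                for pos in range(size)
            ))
    return out
```

Splitting 896 bytes into 14 blocks of 64, calling `encode(blocks, 16)`, throwing away any two of the 16 results and passing the rest to `decode(received, 14)` gives back the original 14 blocks. The cost is exactly what the comment above describes: 128 extra bytes on the wire.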

This could be useful in an environment where packet loss is very high. A similar method is currently used when transmitting large binary files on Usenet, since many Usenet servers do not have 100% propagation and/or retention.

How is it not compression? It reduces the data size being transferred and is recoverable on the other end. Maybe I'm not an expert, but isn't that _exactly_ the definition of compression?

It doesn't make it smaller - in fact, it will make the data larger. It gives improved performance because of the way TCP responds to dropped packets:

(1) Normally the receiver has to notice the dropped packet, notify the sender, and wait for the packet to be retransmitted - meaning that the data in question (and any data after it in the stream) is delayed by at least one round-trip. With this scheme, there is enough redundancy in the data that the receiver can reconstruct the missing data provided not too much is lost, improving the latency.

(2) TCP responds to packet loss by assuming that it is an indication of link congestion, and slowing down transmission. With wired links, this is a good assumption, and results in TCP using the full bandwidth of the link fairly smoothly. With wireless links, however, you can get loss due to random interference, and so TCP will often end up going slower than it needs to as a result. The error correction allows this to be avoided too.
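Point (2) can be illustrated with a toy model. The sketch below is not real TCP: it assumes a bare additive-increase/multiplicative-decrease loop (grow the window by one packet per loss-free round trip, halve it when any packet in the round is lost), and the function name, window cap and round count are arbitrary choices for illustration. It only shows how purely random loss, with no congestion at all, keeps the average window small.

```python
import random

def aimd_average_window(loss_rate, rounds=10_000, max_window=64, seed=1):
    """Average congestion window (in packets) of a toy AIMD sender."""
    rng = random.Random(seed)
    window, total = 1.0, 0.0
    for _ in range(rounds):
        total += window
        # Did at least one packet sent in this round trip get lost?
        lost = any(rng.random() < loss_rate for _ in range(int(window)))
        if lost:
            window = max(1.0, window / 2)          # treat loss as congestion
        else:
            window = min(max_window, window + 1)   # probe for more bandwidth
    return total / rounds

if __name__ == "__main__":
    for rate in (0.0, 0.02, 0.05):
        print(f"loss {rate:.0%}: average window {aimd_average_window(rate):.1f}")
```

In this model the lossless sender sits at the window cap almost the whole time, while the 2% and 5% loss cases hover far below it. If the receiver can repair those losses itself, the sender never sees them and the window stays open.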

That's kinda the point. Crappy signal results in high packet loss. If you can recover lost packets through some recipient-side magic (clever math, apparently) rather than retransmission, you avoid the overhead of another round trip, and get higher throughput as a result. This cuts down massively on latency (huge win) and should also decrease network congestion significantly.

I'm trying to think of a way to put this in the requisite car analogy, but don't honestly know enough about the lower-level parts of the network stack to do so without sounding like an idiot to someone sufficiently informed. But I'm sure there's something about a car exploding and all traffic backs up for tens of miles along the way;)

This is not plain FEC for point-to-point communication. This is based on network coding, e.g., see http://en.wikipedia.org/wiki/Linear_network_coding [wikipedia.org] and how it can increase the capacity in the butterfly network over traditional packet routing schemes, counter to our intuition for flow networks/water pipes.

Network coding has been a fairly hot research topic in information theory and coding theory over the last few years, and it is fairly revolutionary in my opinion. It is still early days in terms of practical coding schemes and implementations.
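For anyone who hasn't seen the butterfly network, the textbook example fits in a few lines. This is a sketch of that classic example only, not of coded TCP, and the function name is made up. A source wants to deliver two messages, a and b, to two sinks. Each sink gets one message directly over a side link, and both share a single bottleneck link in the middle. Plain routing has to pick a or b for the bottleneck; network coding sends a XOR b instead, and each sink recovers the message it is missing.

```python
def butterfly(a: bytes, b: bytes):
    """Deliver both equal-length messages to both sinks with one bottleneck use."""
    xor = lambda p, q: bytes(x ^ y for x, y in zip(p, q))
    coded = xor(a, b)             # what the middle node puts on the bottleneck
    sink1 = (a, xor(a, coded))    # sink 1 hears a directly, derives b
    sink2 = (xor(b, coded), b)    # sink 2 hears b directly, derives a
    return sink1, sink2
```

Both sinks end up with (a, b) even though the shared link carried a single packet, which is the part that runs against the water-pipe intuition.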

This might actually hurt them then because they charge by what was transmitted, not by what was received.

Yeah, but you have to consider how they do math. You assume they calculate profit based on usage and rates. The reality is they calculate the rate based on the desired profit and usage. So when you use less data (fewer retransmits) they will just charge you more for the bits that get through.

This is not simple data compression or error control coding. This is network coding, e.g., see http://en.wikipedia.org/wiki/Linear_network_coding [wikipedia.org] and how it can increase the capacity in the butterfly network over traditional packet routing schemes, counter to our intuition for flow networks/water pipes.

It is a fairly hot research topic that has been around for the last few years, and it is fairly revolutionary. It is still early days in terms of practical coding schemes and implementations.

There is no compression. It's RX error correction. This, seemingly, will reduce latency and increase effective throughput because you are now spending less time in a RX/TX retransmit cycle or a TX-TO retransmit cycle. As such, it allows more time for TCP window scaling to open up, even in the face of lost packets. In turn, a larger window means higher throughput with less protocol overhead.

Don't be so sure. Lots of people make tons of money in this country without doing any math. Look at contractors (plumbers, HVAC repair, etc.); they make really good money without any real education and without having to do any math. They probably make out better than most engineers. Or look at Mitt Romney: you think he's ever had to do any math in his business? Heck, go all the way back to Thomas Edison: that guy didn't understand math either (that's why he hated AC power). And he got filthy rich basically by using a brute-force method.

I work in Satellite Communications and we use these algorithms by default. We call it forward error correction. I would assume that they would use a much less aggressive algorithm, maybe 1 correction bit per 8-16 bits rather than the 1 for 1 data, or 1 for 2 data, that you see for dirty links in space operations. See http://en.wikipedia.org/wiki/Viterbi_algorithm [wikipedia.org] or http://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction [wikipedia.org].
The additional throughput comes not from compression but from not having to resend entire packets over TCP, and probably also from preventing the TCP window from resetting.

Most of the posters here seem to misunderstand what this is; it is NOT just a simple Forward Error Correction scheme.

TCP has two design decisions (as pointed out by others) that are totally wrong for a WiFi environment where random packet loss is a normal and expected operating condition:

1. A lost packet means a collision or network congestion, therefore the proper response to a lost packet notification is to slow down the transmit rate

2. When packet #2 is lost, a plain cumulative ACK can't say "I got packets 1,3-1094932, please just re-send #2". Unless both ends support selective acknowledgment (SACK), the sender may end up retransmitting packets of which the client already has perfectly cromulent copies, and the stream stalls until the hole is filled.

This new scheme reconstructs packet #2 in this scenario by using the FEC data in the other packets. This allows the link to tolerate a certain % of packet loss without requiring any re-transmits, thus all those packets from 3 upwards don't have to be retransmitted. It also greatly reduces latency as reconstructing packet #2 is faster (due to the computationally efficient FEC algorithm) than requesting a retransmit. This also prevents the TCP link from scaling back its transmit rate, further improving performance.

It's definitely clever. One of the downsides of relying on older technology (TCP in this case) is when it makes fundamental assumptions that are completely, horribly wrong for new applications (WiFi).

To those who ask "why not just do this at the link layer?": because then you are wasting the effort on protocols like UDP that may not want or need this kind of correction. It may also introduce delays that are unacceptable for certain applications (like VoIP). A 50ms delay is a great trade to avoid degrading your file transfer from 10Mbit to 0.5Mbit, but is completely unacceptable during a VoIP call or a multi-player FPS. Personally I'd like to see this kind of tech rolled into the next version of TCP to make it an official part of the standard... then again I'd like to see the standard MTU size increased given the ubiquity of gigabit ethernet these days, but that still hasn't happened as far as I know, due to incompatibilities, interop issues, etc.

The problem is that TCP's congestion control [wikipedia.org] is very poorly suited to wireless communications. TCP is designed with the mindset that if everything is working well, there is no packet loss, and any packet loss is a sign of congestion or other abnormal network patterns. For wireless links, this is simply untrue, so TCP is far too conservative in how many packets it sends over wireless links. The article is about using error correction to fix that.

Yes, Raptor Codes are the specific ones used by NASA on their newest deep space missions link [arxiv.org]. Raptor codes are a type of fountain code [wikipedia.org].

Fountain codes are worth looking at if you haven't been keeping up with the latest and greatest Comp Sci developments in the last 15 years. With fountain codes you can break up a chunk of data into any number of unique packets. Any subset of those packets that adds up to slightly more than the size of the original data can be used to rebuild the original file.

So say I had a 1MB file to send to Mars. I run a fountain encoder on that and I tell the encoder to make me 10,000,000 packets, each 1KB in size, out of that 1MB file. So the fountain coder gives me 10GB of generated packets from that 1MB file. Now I spam those 10,000,000 packets across a really noisy connection. As soon as the receiver successfully receives any 1,001 of those packets (totalling up to just over 1MB worth of received packets) it can rebuild that file. I don't need to wait for retransmission requests or anything, and it doesn't matter which packets make it. As long as the receiver eventually receives slightly over X bytes of data, it can rebuild an X byte file.

Traditional error correction codes are great for correcting bit errors in a message that mostly gets there. Fountain codes, on the other hand, are great in situations where entire chunks of the desired message can be lost; they can avoid the retransmission of these lost packets entirely. The only issue is that they require redundancy in transmission in the first place.

It seems here they are grouping 50 packets of data together into 1 lot and making 50+R coded packets out of it where R is some number that's variable depending on how much loss they see. So they might send 60 coded packets. If any 50 of the 60 coded packets make it there they should have enough to rebuild the original 50 packets using fountain codes.
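The "send 50+R coded packets, rebuild from any 50" idea can be sketched with random linear coding. The code below is an illustration under loud assumptions, not the researchers' implementation and not a Raptor code: it uses random combinations over GF(2) (plain XOR), where real schemes may use a larger field or a structured fountain code, and `coded_packet`, `Decoder` and `transfer` are hypothetical names. Each coded packet is the XOR of a random subset of the source packets plus a bitmask saying which ones, and the receiver runs Gaussian elimination until it has as many independent equations as there are source packets.

```python
import random

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def coded_packet(packets, rng):
    """XOR a random non-empty subset of packets; bit i of mask marks packet i."""
    mask = 0
    while mask == 0:
        mask = rng.getrandbits(len(packets))
    payload = bytes(len(packets[0]))
    for i, pkt in enumerate(packets):
        if mask >> i & 1:
            payload = xor_bytes(payload, pkt)
    return mask, payload

class Decoder:
    """Incremental Gauss-Jordan elimination over GF(2)."""
    def __init__(self, k):
        self.k = k
        self.rows = {}  # pivot column -> (mask, payload), kept fully reduced

    def add(self, mask, payload):
        for pivot, (m, p) in self.rows.items():   # clear known pivot columns
            if mask >> pivot & 1:
                mask, payload = mask ^ m, xor_bytes(payload, p)
        if mask == 0:
            return                                # no new information
        pivot = mask.bit_length() - 1
        for q, (m, p) in list(self.rows.items()): # keep older rows reduced
            if m >> pivot & 1:
                self.rows[q] = (m ^ mask, xor_bytes(p, payload))
        self.rows[pivot] = (mask, payload)

    def done(self):
        return len(self.rows) == self.k

    def result(self):
        return [self.rows[i][1] for i in range(self.k)]

def transfer(packets, loss_rate, seed=0):
    """Keep sending coded packets over a lossy link until the block decodes."""
    rng = random.Random(seed)
    decoder, sent = Decoder(len(packets)), 0
    while not decoder.done():
        mask, payload = coded_packet(packets, rng)
        sent += 1
        if rng.random() >= loss_rate:   # packet survived the link
            decoder.add(mask, payload)
    return decoder.result(), sent
```

Calling `transfer(packets, 0.05)` on a block of 50 equal-sized packets returns the original 50 plus the number of coded packets that had to be sent. The receiver never asks for a specific packet again; it just needs enough independent ones, which is why the sender can pick R from the observed loss rate.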