CoDel knows the difference between good and bad queues—and how to handle them.

Bufferbloat is an amusing word for a less-than-delightful problem (we reported on it last year): the proliferation of ever larger buffers for IP packets in the routers, switches, modems, and other devices that connect users to the Internet and different parts of the Internet to each other. Packets stuck in buffers unnecessarily slow communication down. Van Jacobson, creator of the TCP congestion control mechanisms, and Kathleen Nichols now propose a general-purpose solution to the bufferbloat problem.

In their paper, Nichols and Jacobson point out the difference between a good and a bad queue. A good queue consists of buffered packets that arrive faster than the device in question can send them on their way; it quickly drains as packets are transmitted. Such queues are necessary to smooth over the inherent burstiness of arriving packets at bottlenecks in the network. Such bottlenecks can be between a fast LAN and a slower Internet connection, between a 1Gbps Ethernet link and a 100Mbps one, between a wired and a wireless network, and so on. Without buffering, bursts can't be accommodated, which makes it impossible to use the full capacity of a given network link.

A bad queue, on the other hand, is filled up at the same rate as packets are transmitted, so the queue never empties. In its steady state, the TCP protocol will release more packets as the reception of earlier packets is acknowledged. At this point, a queue at the bottleneck between the sender and the receiver will become a standing queue—filling up and draining at the same rate to stay the same size.

Suppose the bottleneck is 1Mbps, and the standing queue is ten packets. At 1Mbps, it takes 120 milliseconds to transmit those ten packets. Packets from other communication sessions now have to join the queue and wait for the ten packets that were already there to be transmitted, so now all communication is delayed by 0.12 seconds. For many types of communication, that's not a big deal, but it's not ideal for real-time communication such as VoIP or video conferencing, and can be deadly for many types of online games.
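
To make the arithmetic concrete, here is that calculation as a quick Python snippet; the 1,500-byte packet size is an assumption on our part, being the common Ethernet maximum:

    # Delay added by a standing queue of ten packets draining through
    # a 1 Mbps bottleneck (assuming 1,500-byte packets).
    packet_bits = 1500 * 8          # 12,000 bits per packet
    link_rate_bps = 1_000_000       # 1 Mbps bottleneck
    queue_packets = 10

    delay = queue_packets * packet_bits / link_rate_bps
    print(f"{delay * 1000:.0f} ms")  # -> 120 ms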

In an earlier rant on queues, Jacobson states that many researchers in the field fail to understand the problem: they assume that packets arrive randomly, according to a Poisson distribution. However, TCP, which carries more than 80 percent of the Internet's traffic, sends its packets anything but randomly; there is a control loop between the sender and the receiver. Based on this insight, Nichols and Jacobson came up with CoDel ("controlled delay") queue management.

Unlike other active queue management (AQM) algorithms, CoDel is not interested in the sizes of queues. Instead, it measures how long packets are buffered. Specifically, CoDel looks at the minimum time that packets spend in the queue. The maximum time relates to good queues, which resolve quickly, but a high minimum means that all packets are being delayed: a bad queue has built up. If the minimum stays above a certain threshold (the authors propose 5 milliseconds) for some time, CoDel drops (discards) packets, which signals TCP to reduce its transmission rate. If the minimum delay experienced by buffered packets is 5ms or less, CoDel does nothing.
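
In rough terms, CoDel's dequeue-side behavior looks something like the following Python sketch. This is a loose simplification for illustration, not the published algorithm: real CoDel tracks the minimum sojourn time over an interval (the authors suggest 100 milliseconds) and, once it starts dropping, increases its drop rate on a square-root schedule, which this sketch omits.

    import time
    from collections import deque

    TARGET = 0.005      # 5 ms: maximum acceptable standing-queue delay
    INTERVAL = 0.100    # delay must persist this long before we drop

    class CodelSketch:
        def __init__(self):
            self.q = deque()
            self.first_above = None   # when delay first exceeded TARGET

        def enqueue(self, packet):
            # Stamp each packet on arrival so dequeue can measure the
            # time it spent in the queue (its "sojourn time").
            self.q.append((time.monotonic(), packet))

        def dequeue(self):
            while self.q:
                arrived, packet = self.q.popleft()
                sojourn = time.monotonic() - arrived
                if sojourn < TARGET:
                    self.first_above = None   # queue drained: a good queue
                    return packet
                if self.first_above is None:
                    # Above target: note when, but give the queue one
                    # INTERVAL to drain on its own before dropping.
                    self.first_above = time.monotonic()
                    return packet
                if time.monotonic() - self.first_above < INTERVAL:
                    return packet
                # Delay has stayed above TARGET for a full INTERVAL: a
                # standing (bad) queue. Fall through without returning,
                # dropping this packet so TCP senders back off.
            return None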

In simulation results, CoDel compares favorably to the common RED (random early detect/discard) AQM algorithm in most regards. However, RED requires careful tuning of its parameters to work well, while CoDel works across a broad range of bandwidths without the need to change any settings. It also doesn't matter how much buffer space is available: CoDel kicks in whenever standing queues exceed five milliseconds. In addition, CoDel has some implementation advantages over other AQMs, as it does nearly all of its work at the dequeue stage (when packets are transmitted). It does require adding a timestamp to each individual packet as it is received, but even if the network hardware can't do this, timing information is directly available from a register on modern CPUs, so this requirement shouldn't be problematic.

The authors write that the algorithm is now mature enough for experimentation, and not just in open source IP router implementations. "Things would probably go fastest if we had some interested party who wanted to apply it, for example in the cable data edge network," Nichols told Ars. "We've had a 'do no harm' but reduce the standing queue philosophy in our testing. In particular, we were trying to keep the utilization high while reducing the persistent queue delay and not introducing any additional unfairness or problems beyond those that might already occur in a TCP-based network with a variety of flows. That will be what we are looking for in our testing."

Nichols believes that bufferbloat currently impacts the consumer edge the most. "This means the residential access networks, Wi-Fi hot spots, mobile device access networks. Van has seen data on the latter that convinces him that what is called congestion in the cellular network is actually bufferbloat."

The code that implements CoDel is available under a dual BSD/GPL license, so we may see the algorithm adopted in the foreseeable future.

Iljitsch van Beijnum
Iljitsch is a contributing writer at Ars Technica, where he covers network protocols as well as Apple topics. He is currently finishing his Ph.D. work at the telematics department of Universidad Carlos III de Madrid (UC3M) in Spain. Email: iljitsch.vanbeijnum@arstechnica.com / Twitter: @iljitsch

I disagree - a delay of 120ms is a disaster in any kind of communication except bulk transfers. Even for loading a webpage that is far, far too long. It might border on acceptable if it were the total time for the entire page, but it's the added time for every single packet, so it's going to be a disaster. It's about an order of magnitude too long to be called "no big deal for many types of communication" - the only thing you could still comfortably do is transfer big files.

Fitten, it's on the ACM portal, so you'll need either a personal ACM subscription or access through an institutional one, such as a university or large tech company.

Anything that improves on RED in any way is definitely appreciated. RED is such a pain to deal with, as every router you configure it on has to be tuned individually. From a skim of the paper, CoDel seems to be quite flexible in terms of routers, allowing rapid deployment.

It would definitely be interesting to see more of the data that was used to motivate this work, though.

The round table discussion linked below is a little less technical and arguably more interesting but I'm not sure if you all can read it. That said if you're in IT you really should have an ACM membership.

I wasn't talking about the link in the article itself. I was talking about the link in the first post of this thread. The link in the first post generally takes you to the Ars article (the full story).

Hopefully we'll see this quickly implemented in DD-WRT/Tomato/OpenWrt, etc. It seems pretty simple to implement and shouldn't have much overhead, so I'm hopeful. It would be great not to have to worry about my connection performing horribly because I've saturated my upload with a BitTorrent client or cloud backup.

On topic, what's it take for something like this to actually impact me as a consumer? Does every router and device between me and my Skype buddy have to implement some sort of AQM?

No. Just getting a good AQM at your router would make a big difference. I don't think the article did a very good job of explaining what's going on, so let me try.

Basically, BufferBloat (BB) is a problem created by the intersection of the way TCP works and the increased RAM buffers in networking gear. The algorithms that make up TCP are designed to do two things: (1) give all connections that share a link an equal proportion of the available resources, and (2) scale the speed of two-way transmission to match the slowest link on the path between two communicating network nodes. Both of these goals are important. In the first case, you don't want one TCP stream to be able to monopolize your Internet connection. The second goal matters because you don't want to waste bandwidth and overload routers. For example, if you have a very fast network connection but there is a slow link between you and whoever you are talking to, you don't want to constantly blast traffic at your maximum speed at the router connected to that slow link; the packets will just get dropped at the overloaded router servicing the slow link, wasting bandwidth.

So the way TCP accomplishes this is by starting to transmit data slowly, and with each received acknowledgment from the other end it exponentially increases the amount of data it transmits. TCP keeps increasing until a packet is dropped. This is a feedback signal to TCP that it has saturated its fair share of the available network path between it and the other end of the connection, at which point TCP backs off. Now, the important thing as far as bufferbloat is concerned is that TCP needs to detect that packet drop as soon as possible. If it does not, it will just keep ramping up its transmission speed.
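
Here's a toy illustration of that ramp-up/back-off loop (the numbers are made up, and real TCP switches from exponential slow start to linear growth past a threshold, which this skips):

    # Toy model of TCP's feedback loop: the congestion window grows
    # exponentially until a loss is detected, then the sender backs off.
    cwnd = 1              # congestion window, in packets per round trip
    CAPACITY = 100        # what the path can actually carry (made up)

    for rtt in range(12):
        if cwnd > CAPACITY:          # a drop finally tells us to slow down
            cwnd = max(1, cwnd // 2) # multiplicative back-off
        else:
            cwnd *= 2                # exponential ramp-up ("slow start")
        print(f"RTT {rtt:2d}: cwnd = {cwnd}")

    # The sooner a loss is reported, the sooner the ramp-up stops,
    # which is why buffers that hide losses for a long time are harmful.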

This is where bufferbloat comes into play. Networking gear makers started putting lots of cheap RAM in routers, switches, and even your NIC or its driver, the idea being that "packet loss is bad." This is a faulty assumption. For TCP to properly do its thing, it needs to be notified of packet loss as soon as possible. In a common BB scenario, a router should drop packets as quickly as possible once it starts getting congested, so that the TCP streams traversing it will back off and stop sending so much data. But with long queues, a lot of networking equipment will keep those packets queued for a long time and only drop packets when the buffer is full. All this extra time the packets spend waiting in a buffer is time during which the TCP streams traversing that router just keep ramping up their transmission speed, sending more and more data, until they eventually overwhelm the router.
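
You can see how badly big buffers delay that loss signal with some simple arithmetic (the rates here are made-up illustrative numbers):

    # How buffer size delays TCP's loss signal at a tail-drop queue.
    # Traffic arrives at 150 pkt/s and the link drains 100 pkt/s, so the
    # backlog grows at 50 pkt/s; the first drop (TCP's cue to back off)
    # only happens once the buffer is completely full.
    ARRIVAL, DRAIN = 150, 100   # packets per second (illustrative)

    for buf in (10, 100, 1000):           # "cheap RAM" growing buffers
        time_to_first_drop = buf / (ARRIVAL - DRAIN)
        standing_delay_ms = buf / DRAIN * 1000
        print(f"{buf:5d}-pkt buffer: first drop after {time_to_first_drop:5.1f} s, "
              f"queue delay {standing_delay_ms:6.0f} ms")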

So to sum up: bufferbloat is the problem of letting TCP ramp up its transmission speed far too high because it is not getting timely notice that packets were dropped.

The reason this came into existence is that buffers used to be expensive, so the queuing policy was simply to drop packets when the tiny little buffer was full. Then RAM got cheap, so the buffers grew, and so did the time packets sit waiting in them. This delay broke TCP's ability to self-govern its transmission rate.

What AQM is all about is deciding the best way to drop packets before the buffer is even full. This is important whenever there is congestion, but especially when you have asymmetry in speed; IOW, the standard home Internet connection, where a fast LAN connects to a slow Internet link.
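
As a crude example of the idea, here's roughly how RED-style early dropping works: drop with rising probability as the queue grows instead of waiting for it to fill. (The thresholds here are arbitrary illustrative values; having to tune them per link is exactly what makes RED a pain, and CoDel keys off packet delay instead.)

    import random

    def should_drop(queue_len, min_thresh=20, max_thresh=80):
        """Drop with rising probability between two occupancy thresholds,
        instead of only when the buffer is completely full (tail drop)."""
        if queue_len <= min_thresh:
            return False                 # queue still short: let it through
        if queue_len >= max_thresh:
            return True                  # deep standing queue: always drop
        frac = (queue_len - min_thresh) / (max_thresh - min_thresh)
        return random.random() < frac    # ramp the drop probability up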

The benefits of having proper AQM in your home router are almost instantly noticeable if you are maxing out your bandwidth. It makes a huge difference.

This is a very nice write-up of this research. The original article is pretty easy to parse, too.

I'd just like to point out that this article should really cite bufferbloat.net and the CeroWRT project, since it was the active interest from those groups that spurred these researchers to find a robust solution. I know that the lack of related links is a current hiccup with the new website design.

I disagree - a delay of 120ms is a disaster in any kind of communication except bulk transfers. Even for loading a webpage that is far, far too long. It might border on acceptable if it were the total time for the entire page, but it's the added time for every single packet, so it's going to be a disaster. It's about an order of magnitude too long to be called "no big deal for many types of communication" - the only thing you could still comfortably do is transfer big files.

If the system works, you should only start to see that kind of delay when the router is getting swamped.

Is data integrity lost with these dropped packets? I mean, with certain data types, as with compression, the precision of the data is not that important. But with programs, for example, a couple of missing bits can mean a squashed, non-functioning program.

Not at all. Each time the receiver gets a packet, it sends back a receipt. Until the sender sees that receipt, it will continue to send a new copy of the packet. At least, that is TCP. There is also UDP, however, which sends each packet only once. The latter is better used for things like streaming media, where each packet contains redundant data. This means that as long as the loss (or out-of-sequence packets) stays below a certain percentage per timeframe, the receiver can recreate the content of the lost packets from the data in those that do arrive.
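
A minimal sketch of that reliability loop, if it helps (sequence numbers plus acknowledgments and retransmission; real TCP adds windows, timers, checksums, and much more):

    # Why dropped packets don't corrupt TCP data: every segment is
    # numbered, and the sender retransmits anything not acknowledged.
    def deliver(segments, loss=lambda seq: False):
        received, acked = {}, set()
        while len(acked) < len(segments):
            for seq, data in enumerate(segments):
                if seq in acked:
                    continue
                if loss(seq):
                    continue            # dropped in transit: no ACK, resend later
                received[seq] = data    # receiver stores it...
                acked.add(seq)          # ...and the ACK reaches the sender
        return b"".join(received[i] for i in sorted(received))

    # Even if segment 1 is lost on its first two attempts, the
    # reassembled data is byte-for-byte identical to what was sent.
    drops = iter([1, 1])                # lose seq 1 twice, then deliver
    lossy = lambda seq: seq == 1 and next(drops, 0) == 1
    print(deliver([b"he", b"ll", b"o!"], loss=lossy))   # b'hello!'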

This specifically has to do with TCP congestion, as UDP, being a connectionless protocol, just sends as fast as the application is set to and the hardware can manage. UDP is used for a lot of real-time stuff like voice, where the reduced overhead of UDP is preferred and losing a packet in the data stream does not cause significant issues at the receiving end. The reason voice is a good fit is that when you are talking to someone on the phone, the last thing you want is for a packet of voice data that was sent to you 120ms ago to be resent by the transmission layer and come in out of order within the context of the conversation. Thus, UDP applications like voice will simply ignore any packets received out of order, continuing on as if they never came in. This can create choppy audio, but the audio stream always moves forward in time.
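
Something like this sketch, purely illustrative (real VoIP receivers use jitter buffers that are smarter about reordering and gaps):

    # A UDP voice receiver just keeps moving forward in time: packets
    # that arrive behind the playout point are ignored, not replayed.
    def playout(packets):
        next_seq = 0
        for seq, audio in packets:          # (sequence number, audio chunk)
            if seq < next_seq:
                continue                    # late/out-of-order: skip it
            if seq > next_seq:
                print(f"gap: lost {seq - next_seq} packet(s), audio may be choppy")
            print(f"play {audio!r}")
            next_seq = seq + 1

    # Packet 1 shows up after packet 2 was already played, so it's ignored.
    playout([(0, "Hel"), (2, "wor"), (1, "lo "), (3, "ld!")])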

UDP does not, in itself, have any congestion control mechanism. Protocols built atop UDP must handle congestion in their own way. Protocols atop UDP which transmit at a fixed rate, independent of congestion, can be troublesome. Real-time streaming protocols, including many Voice over IP protocols, have this property.

It's odd that there's a name for this; delay-based dropping has been around for a while and is commonly implemented. I guess there is a difference between a coordinated setup and messaging of this number versus a single node supporting it.

This has been known for a while and there are multiple papers on the subject. I know because I am the author of one! The issue has always been overcoming the immense inertia of existing protocol deployment (TCP) and backwards compatibility...