Queueing in the Linux Network Stack

Packet queues are a core component of any network stack or device. They
allow asynchronous modules to communicate with one another and can
increase performance, but they have the side effect of adding latency.
This article explains where IP packets are queued on the transmit path
of the Linux network stack, how new latency-reducing features, such as
BQL, operate and how to control buffering for reduced latency.

Figure 1. Simplified High-Level Overview of the Queues on the Transmit
Path of the Linux Network Stack

Driver Queue (aka Ring Buffer)

Between the IP stack and the network interface controller (NIC) lies the
driver queue. This queue is typically implemented as a first-in, first-out
(FIFO) ring buffer (http://en.wikipedia.org/wiki/Circular_buffer): just think of it as a
fixed-size buffer. The driver
queue does not contain the packet data. Instead, it consists of descriptors
that point to other data structures called socket kernel buffers (SKBs,
http://vger.kernel.org/%7Edavem/skb.html),
which hold the packet data and are used throughout the kernel.
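As a rough mental model, the descriptor ring can be sketched as below. This is
not actual kernel code: real descriptor formats are hardware-specific, the ring
size is illustrative, and the indices are managed jointly by the driver and the
NIC rather than by a single piece of software.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8  /* illustrative; real NICs commonly use 256 or more slots */

/* Each descriptor points at a separately allocated packet structure
 * (standing in for an SKB); the ring itself holds no packet data. */
struct descriptor {
    void  *skb;  /* the packet data structure */
    size_t len;  /* bytes of packet data */
};

struct driver_queue {
    struct descriptor ring[RING_SIZE];
    unsigned head;   /* next slot the driver/NIC consumes */
    unsigned tail;   /* next slot the IP stack fills */
    unsigned count;  /* occupied slots */
};

/* The IP stack queues a packet; fails when the ring is full. */
static int enqueue(struct driver_queue *q, void *skb, size_t len)
{
    if (q->count == RING_SIZE)
        return -1;  /* ring full: the stack must hold on to the packet */
    q->ring[q->tail].skb = skb;
    q->ring[q->tail].len = len;
    q->tail = (q->tail + 1) % RING_SIZE;
    q->count++;
    return 0;
}

/* The driver dequeues the oldest packet for transmission by the NIC. */
static void *dequeue(struct driver_queue *q, size_t *len)
{
    if (q->count == 0)
        return NULL;  /* nothing to transmit */
    void *skb = q->ring[q->head].skb;
    *len = q->ring[q->head].len;
    q->head = (q->head + 1) % RING_SIZE;
    q->count--;
    return skb;
}
```

The FIFO ordering is the important property: packets leave the driver queue in
exactly the order the IP stack enqueued them.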

The input source for the driver queue is the IP stack that queues IP
packets. The packets may be generated locally or received on one NIC to be
routed out another when the device is functioning as an IP router. Packets
added to the driver queue by the IP stack are dequeued by the hardware
driver and sent across a data bus to the NIC hardware for transmission.

The reason the driver queue exists is to ensure that whenever the
system has data to transmit, the data is available to the NIC for
immediate transmission. That is, the driver queue gives the IP stack a
location to queue data asynchronously from the operation of the
hardware. An alternative design would be for the NIC to ask the IP
stack for data whenever the physical medium is ready to transmit.
Because responding to this request cannot be instantaneous, this design
wastes valuable transmission opportunities, resulting in lower
throughput. The opposite of this design approach would be for the IP
stack to wait after a packet is created until the hardware is ready to
transmit. This also is not ideal, because the IP stack cannot move on
to other work.

Huge Packets from the Stack

Most NICs have a fixed maximum transmission unit (MTU), which is the
largest frame that can be transmitted over the physical medium. For Ethernet,
the default MTU is 1,500 bytes, but some Ethernet networks support Jumbo
Frames (http://en.wikipedia.org/wiki/Jumbo_frame) of up to 9,000 bytes. Inside the IP network stack, the MTU can
manifest as a limit on the size of the packets that are sent to the
device for transmission. For example, if an application writes 2,000
bytes to a TCP socket, the IP stack needs to create two IP packets to
keep each packet's size less than or equal to the 1,500-byte MTU. For large
data transfers, the comparably small MTU causes a large number of small
packets to be created and transferred through the driver queue.
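The arithmetic behind this splitting is a simple ceiling division. The sketch
below follows the article's simplification of capping each packet's size at the
MTU; in practice, TCP segmentation is driven by the MSS (the MTU minus the IP
and TCP header overhead, typically 1,460 bytes for a 1,500-byte Ethernet MTU).

```c
#include <assert.h>

/* Number of packets needed to carry a payload when each packet carries
 * at most max_per_packet bytes. Illustrative only: header overhead is
 * ignored, matching the simplified example in the text. */
static unsigned packets_needed(unsigned payload_bytes, unsigned max_per_packet)
{
    return (payload_bytes + max_per_packet - 1) / max_per_packet;  /* ceiling */
}
```

With a 1,500-byte cap, a 2,000-byte write becomes two packets, and a 9,000-byte
write becomes six.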

In order to avoid the overhead associated with a large number of packets
on the transmit path, the Linux kernel implements several optimizations:
TCP segmentation offload (TSO), UDP fragmentation offload (UFO) and
generic segmentation offload (GSO). All of these optimizations allow the
IP stack to create packets that are larger than the MTU of the outgoing
NIC. For IPv4, packets as large as the IPv4 maximum of 65,535 bytes can
be created and queued to the driver queue. In the case of TSO and UFO,
the NIC hardware takes responsibility for breaking the single large
packet into packets small enough to be transmitted on the physical
interface. For NICs without hardware support, GSO performs the same
operation in software immediately before queueing to the driver queue.
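A toy sketch of the payload-splitting step follows. This is only an
illustration of the idea: real TSO/GSO segmentation operates on SKBs and
replicates the protocol headers onto each resulting segment, which this model
ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Slice one oversized payload into segments of at most mss bytes,
 * writing each segment's length into out[]. Returns the number of
 * on-the-wire segments produced (capped at out_cap). */
static size_t segment(size_t total, size_t mss, size_t out[], size_t out_cap)
{
    size_t n = 0;
    while (total > 0 && n < out_cap) {
        size_t piece = total < mss ? total : mss;
        out[n++] = piece;
        total -= piece;
    }
    return n;
}
```

For example, a maximally sized 65,535-byte IPv4 packet with a 1,460-byte
segment payload is split into 45 segments, 44 full ones and a 1,295-byte tail.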

Recall from earlier that the driver queue contains a fixed number of
descriptors that each point to packets of varying sizes. Since TSO,
UFO and GSO allow for much larger packets, these optimizations have the
side effect of greatly increasing the number of bytes that can be queued
in the driver queue. Figure 3 illustrates this concept in contrast with
Figure 2.

Figure 3. Large packets can be sent to the NIC when TSO, UFO or GSO
are enabled. This can greatly increase the number of bytes in the
driver queue.
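The effect on queue depth is easy to quantify. Assuming an illustrative ring of
256 descriptors (actual ring sizes vary by NIC and driver), MTU-sized packets
bound the queue at 256 × 1,500 = 384,000 bytes, while 65,535-byte TSO/GSO
packets allow roughly 16MB to be queued:

```c
#include <assert.h>

/* Worst case: every descriptor points at a maximally sized packet.
 * The descriptor count of 256 used below is an assumption for
 * illustration, not a value from any particular driver. */
static unsigned long max_queued_bytes(unsigned long descriptors,
                                      unsigned long max_packet_bytes)
{
    return descriptors * max_packet_bytes;
}
```

The same number of descriptors can therefore represent vastly more buffered
data once the large-packet optimizations are enabled, which is exactly the
latency concern this article addresses.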

Although the focus of this article is the transmit path, it is worth
noting that Linux has receive-side optimizations that operate
similarly to TSO, UFO and GSO and share the goal of reducing per-packet
overhead. Specifically, generic receive offload (GRO,
http://vger.kernel.org/%7Edavem/cgi-bin/blog.cgi/2010/08/30) allows the NIC
driver to combine received packets into a single large packet that is
then passed to the IP stack. When the device forwards these large packets,
GRO allows the original packets to be reconstructed, which is necessary
to maintain the end-to-end nature of the IP packet flow. However, there
is one side effect: when the large packet is broken up, it results in
several packets for the flow being queued at once. This "micro-burst"
of packets can negatively impact inter-flow latency.

