This post is a partial excerpt from the QoS section of our IEWB-RS VOL1 V5. We’ll skip the discussion of how much of a legacy the Custom Queue is, and get stright to the working details. Custom queue feature is similar to WFQ in that it tries to share the bandwidth between packet flows using the max min approach: each flow class gets the guaranteed share proportional to its weight plus any class may claim the “unused” interface bandwidth. However, unlike WFQ, there are no dynamic conversations, just 16 static queues with configurable classification criteria. Custom Queue assigns byte counter to every queue (1500 bytes is the default value) and serves the queues in round-robin fashion, proportional to counters. Every dequeued packet decrements queue byte count by its size, until it drops down to zero. Custom Queuing feature supports additional system queue, number 0. System queue is the priority queue and is always served first, before of other (regular) queues. By default, system queue is used for layer 2 keepalives, but not for routing update packets (e.g. RIP, OSPF, EIGRP). Therefore, it’s recommended to map the routing update packets to system queue 0 manually, unless the interface is Frame-Relay, which uses special broadcast queue to send broadcasts. Note that all unclassified packets are by default assigned to queue 1 (e.g. routing updates will use this queue, unless mapped to some other queue), if the default queue number is not changed.

The limitation of round robin scheduling is that it can’t naturally dequeue less than one packet (quantum) from each queue. Since queues may have different average packet sizes (e.g. voice packets of 60 bytes and TCP packets of 1500 bytes) this may lead to undesirable bandwidth distribution ratio (e.g. if queue byte counter is 100 bytes and packet of 1500 bytes is in the queue, the packet will still be sent, since the counter is non zero). In order to make the distribution “fair”, every queue’s byte counter should be proportional to the queue’s average packet size. To make that happen, try to classify traffic so that packets in every queue have approximately the same packet size. Next, use the following example to calculate byte counters.

Consider the following scenario:

• Configure CQ as the queuing mechanics on the connection between R4 and R5 which is 128Kbps
• Ensure RIP routing updates use the system queue
• Voice packets should be guaranteed 30% of interface bandwidth.
• Classify voice packets based on their size of 60 bytes (G.729 codec)
• Allocate 60% of interface bandwidth to HTTP transfers from WWW servers on VLAN146
• The remaining 10% should be allocated to ICMP packets of average size 100 bytes
• Make sure the remaining traffic uses the same queue as ICMP packets
• Limit the queue size to 10 packets for all queues
• Set the IP MTU to 156 bytes in order to enforce the maximum serialization delay of 10ms

The voice packets have layer 3 size of 60 bytes (G.729 samples with 20 bytes per packet – 50pps), the TCP packets have layer 3 sizes 156 bytes (due to the requirement of setting MTU size to allow 10ms serialization) and ICMP packets have layer 3 sizes of 100 bytes. (Note that custom queue byte counter takes in consideration Layer 2 overhead. This is true with all legacy queuing techniques). For HDLC protocol the overhead is 4 bytes, thus the counters should be based on frames sizes of 64, 160 and 104 bytes respectively. The desired distribution ratio is 30:60:10 and the following procedure applies:

1) Because the byte counters must be proportional to packet size, the following should keep true: C1=a*64, C2=b*160; C3=c*104 where “a”, “b” and “c” are the multipliers we are looking for and C(i) is the byte-counter for queue “i”.

2) From the proportion C1:C2:C3=30:60:10 we quickly get that a=30/64=0,47; b=60/160=0,38; c=10/104=0,096. Now the problem is that those multiplies are fractional, and we can’t hold a fractional number of packets in a queue.

3) To get the optimal whole number, we must first normalize the fractional numbers, dividing by the smallest one. In our case this is c=0,096. Therefore now a1=a/c=4,89; b1=b/c=3,95; c1=1.

4) It is possible to simple round up the above numbers to the next integer (ceiling function) and use them as the number of packets that should be send every turn. However, what if we got numbers like 3,5 or 4,2? By rounding them up, we can get the integers that are far too larger than the initial values. To avoid this, we could scale (multiply) all numbers by some integer value, e.g. by 2 or 3, obtaining a value closer to integer.

5) In our case, it is even possible to multiply by 100 and get the exact whole numbers (489, 395 and 100 packets). However, this would yield excessively high byte counts, and will result in considerable delays between round-robin scheduler runs if queues are congested. Also, when calculating final byte counts, make sure your queue depths allow for big counters in case you still want to use them.

6) Therefore we just set multipliers to a=5, b=4, c=1. The resulting byte counters are C1=a*64=320, C2=b*160=640, C3=c*104=104. Let’s calculate the weights each queue obtains with this configuration;

This is what we want – the proportions of 30:60:10. Also, the maximum byte counter is 640 bytes, so the maximum possible delay would be based on 4×160 byte packets, resulting in 40ms value (not too VoIP friendly, huh).

When you configure custom-queue mappings of traffic classes to queue-ids, all statements are evaluated sequentially, and the first one matched by the packet determines the queue where it goes. Thus is, we you match packets less than 65 bytes (e.g. queue-list 1 protocol ip 1 lt 65) before you match ICMP packets (e.g. using an access-list), then all ICMP packets of size less that 65 bytes will be classified using the first statement. Pay special attention that the packet size criterion with legacy QoS commands is based on full layer 2 length – this is true with all legacy queuing techniques. In this case HDLC header is 4 bytes, and thus voice packets have full layer 2 size of 64 bytes. Also, when you specify a port number to be matched for CQ classification, it matches both source or destination port.

To verify your settings, configure R6 to source stream of G.729-sized packets to R5, measuring jitter and RTT. The G.729 stream consumes slightly more than 24Kbps of bandwidth. R5 should respond to those RTP probes. At the same time, configure R1 as HTTP server and allow other routers to access its IOS image in the flash memory.

Configure R5 to meter the incoming traffic rate. Create three traffic classes to match the packets generated by the SLA probe, WWW traffic and ICMP packets. Assign the classes to a policy map and attach it as a service policy to the serial interface of R5. Note that MQC matches L3 packet size (e.g. 60 bytes for G.729) without any L2 overhead.

Check the current queues utilization. Note that almost all queues have non-zero depth, and the queue for ICMP packets has the maximum amount of packet drops. Also note that the voice packets (queue 1) are queue too, but the drop count is very low.

It’s even possible to view the current queue contents. Use the command show queue[interface] [x/y] [qNumber]. Specify the custom queue number in range 0-16, where 0 is for the system queue. For example, look at queue 2 (WWW packets in this case) contents:

Look at R5 to see the resulting bandwidth allocation. Note that the voice class bandwidth is close to 24Kbps and it does not exceed its allocated share (0,3*128K). Therefore, other classes may claim the remaining bandwidth and this is explains why WWW traffic has more bandwidth than its guaranteed share (0,6*128K).

Note the ICMP packet flood gets 13Kbps of bandwidth and does not seriously impair other traffic flows, unlike with WFQ scheduling. This is due to the fact that each class has its separate FIFO queue and traffic flows don’t share the same buffer space (plus there is no congestive discard procedure). The aggressive ICMP traffic behavior does not seriously affect other queues, since the exceeding traffic is simply dropped.

In total, all classes draw approximately 120Kbps of L3 bandwidth, which is naturally less than the configured 128Kbps line rate

Clear counters on the serial interface, and configure the custom queue to use priority treatment for queue 1. This is possible by the means of lowest-custom option. This option specifies the number of custom queue, which is the first to start round robin scheduling. All queues up to this number will use priority scheduling, with queue 0 being the most important, queue 1 being the next important, etc. Alternatively, it’s possible to map the “voice” packets to queue 0, which is always priority.

Note, that even though this allows for enabling priority queue with CQ, it’s not recommended. The problem is that priority queues in CQ are not limited in any way, and thus may effectively starve other queues. Voice traffic should use either IP RTP Priority or LLQ mechanisms for priority queue

To summarize, Custom Queue aims to implement a version of max min bandwidth sharing. It allows classifying packets to queues, and assigning byte counters, that are effectively queue weights. We’ve seen that with classic round robin scheduling this may lead to unfair bandwidth allocation, unless average packets sizes are taken in account. Howeve, since IOS 12.1 the custom queue implementation actually uses something close to “Deficit Round Robin”, which tracks the amount of excessive bytes consumed by every queue and takes that in account in next scheduling rounds. Therefore, when configuring CQ in modern IOS versions, it’s not necessary to set byte counts proportional to average packets sizes. Still it’s nice to know how to compute those byte counters

About Petr Lapukhov, 4xCCIE/CCDE:

Petr Lapukhov's career in IT begain in 1988 with a focus on computer programming, and progressed into networking with his first exposure to Novell NetWare in 1991. Initially involved with Kazan State University's campus network support and UNIX system administration, he went through the path of becoming a networking consultant, taking part in many network deployment projects. Petr currently has over 12 years of experience working in the Cisco networking field, and is the only person in the world to have obtained four CCIEs in under two years, passing each on his first attempt. Petr is an exceptional case in that he has been working with all of the technologies covered in his four CCIE tracks (R&S, Security, SP, and Voice) on a daily basis for many years. When not actively teaching classes, developing self-paced products, studying for the CCDE Practical & the CCIE Storage Lab Exam, and completing his PhD in Applied Mathematics.

The same goes to all legacy QoS technologies You want see many of those around nowdays, although I still use FR PIPQ at same places

However, I thkink it’s really useful to see how technologies evolved in their historical perspective. CQ is a nice illustration of RR scheduling, and seeing how it works gives people more insights on max-min resource sharing in general.

Besides, nobody says that aloud, but CBWFQ is really just the good old WFQ with some “enhancements”. Likewise, LLQ maps directly to IP RTP Priority. However, you can hardly find any information on internals of CBWFQ: e.g. how it computes weights values, how it handles Link Queues, and how it works when you have flow-based WFQ and manual classes configured at the same time.

Based on that, I believe that introducing new concepts following the evolutionary steps is much easier to understand than starting a breakdown of a new technlogy from a scratch.

Why use packet size to classify VOIP traffic rather than an ACL? Wouldn’t any small packet match this queue (telnet)? Also, the requirement says to give 10% bandwidth to ICMP which is assigned to queue 3. But then queue 3 is made the default queue which would receive all unclassified traffic wouldn’t it? Why not create a separate default queue? Thanks.

Leave a Reply

Currently you have JavaScript disabled. In order to post comments, please make sure JavaScript and Cookies are enabled, and reload the page.Click here for instructions on how to enable JavaScript in your browser.