Engineering, Science, and Society

For Dave, the QoS update

I've been using FireQOS for my home network. Switching recently to gigabit fiber required a lot of reconfiguring of my internal network. In the process I discovered a few things:

Typical consumer level routers from even a few years ago can't even begin to handle a gigabit through their firewall. You need something with an x86 type processor or a very modern ARM based consumer router. My Buffalo router could push about 150Mbps through the firewall at most.

QoS is still important at gigabit speeds. You can push a lot of data into buffers very quickly. Furthermore, keeping things well paced actually allows you to go faster, because ACKs make it back to where they're going.

Don't forget the effect of crappy cables. Replace the patch cables you have lying around, the ones that came with whatever stuff you used to have, with something good. I made my own patch cables with a crimp tool and high quality Cat5e, and it also cleared up packet loss that may have been a problem before.

As Dave Taht suggested, switching from pfifo to fq_codel helped for the ssh connection class. In particular, I had been thinking of this class as mainly handling keystrokes and things for ssh sessions, but of course scp and rsync both like to push data over ssh. Because of that, I needed to put an fq_codel qdisc on the ssh class so my keystrokes would make it even when some rsync was going.
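To see why per-flow queueing helps the keystrokes, here is a toy sketch in Python. It is not fq_codel itself (which, among other things, hashes connections into queues, uses a byte-based deficit round robin, and runs CoDel on each queue); it only compares a single FIFO with a simple one-packet-per-flow round robin. The flow names and the backlog of 100 packets are made up for illustration.

```python
from collections import OrderedDict, deque


def fifo_position(packets, target_flow):
    """1-based dequeue position of target_flow's first packet in one shared FIFO."""
    for position, flow in enumerate(packets, start=1):
        if flow == target_flow:
            return position
    raise ValueError("flow not present in backlog")


def round_robin_position(packets, target_flow):
    """Same question, but with one queue per flow, served one packet per flow in turn."""
    queues = OrderedDict()
    for flow in packets:
        queues.setdefault(flow, deque()).append(flow)

    position = 0
    while any(queues.values()):
        for flow, queue in queues.items():
            if queue:
                queue.popleft()
                position += 1
                if flow == target_flow:
                    return position
    raise ValueError("flow not present in backlog")


# Hypothetical backlog: an rsync burst is already queued when a keystroke arrives.
backlog = ["rsync"] * 100 + ["keystroke"]

print("FIFO position of keystroke:       ", fifo_position(backlog, "keystroke"))
print("Round-robin position of keystroke:", round_robin_position(backlog, "keystroke"))
```

In the FIFO the keystroke waits behind the whole burst; with per-flow queues it goes out right after the first rsync packet.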

Too many things have changed at once for me to know whether fq_codel would have any effect on my VoIP RTP queue, but I suspect not. Every 0.02 seconds it'll send a single UDP packet for each call. Each packet is around 1000 bytes. There are typically 1-4 calls at most. They jump to the front of the line due to the QoS, so the queue is never going to have more than 1 or 2 packets in it. The overhead of fq_codel makes no sense when the queue never gets longer than 3 packets and never takes longer than about .00002 seconds to drain. If I have any issues though, I'll revisit.
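The drain-time figure can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the rough numbers from the paragraph above (about 1000 bytes per packet, one packet per call every 0.02 seconds, a gigabit link); it ignores Ethernet framing overhead, and the function names are just for illustration.

```python
LINK_BPS = 1_000_000_000   # gigabit link
PACKET_BYTES = 1000        # rough RTP packet size quoted above
PACKET_INTERVAL_S = 0.02   # one packet per call every 20 ms


def drain_time_s(packets, packet_bytes=PACKET_BYTES, link_bps=LINK_BPS):
    """Time to serialize `packets` back-to-back packets onto the link."""
    return packets * packet_bytes * 8 / link_bps


def link_busy_fraction(calls, interval_s=PACKET_INTERVAL_S):
    """Fraction of each packet interval the link spends sending voice."""
    return drain_time_s(calls) / interval_s


for packets in (1, 2, 3):
    print(f"{packets} packet(s) drain in {drain_time_s(packets) * 1e6:.1f} microseconds")

print(f"4 calls keep the link busy {link_busy_fraction(4) * 100:.2f}% of the time")
```

Three queued packets drain in a few tens of microseconds, against a 20,000 microsecond gap between packets of the same call, which is why the voice queue is almost always empty.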

thx for being willing to do the experiment, challenge your assumptions and find out AND document the edge cases in your setup!

we've been pushing fq technology to its limit with cake, recently adding de-natting and host isolation modes, so that *in theory*, all the work you just did is reduced to a single command line, that works without any classification at all, due to the 8 way set associative hash and mildly modified fq_codel algorithm.

For those who are trying to get the best quality voice over IP experience, I STRONGLY suggest several things:

1) Physically separate the phone traffic onto an ethernet link that doesn't share bandwidth with any device capable of using the full link bandwidth. So for example, I have my ATA connected to my Buffalo router, which is then also connected to the two WiFi radios. But the WiFi radios are 802.11n not ac and the full bandwidth of the gigabit link just can't be filled by WiFi hosts. That then links back to a managed switch. Previously I had also run my desktop computer through the switch on the Buffalo router. The desktop computer during an NFS file transfer CAN saturate the switch, and this is NOT a smart switch that will obey DSCP/CoS. Now I have a separate dumb switch for the printer and the desktop machine (and potentially any additional laptops etc I might want to wire in) and this desktop switch links directly to the smart switch.

2) Mark your voice traffic with DSCP=48 (this can be done in CSipSimple, and in most ATA devices). Although 46 is the standard for voice, Linux WiFi clients don't use the WMM VOICE queue for WiFi traffic with DSCP=46, but they do with 48.

3) Use a managed switch with layer 2 prioritization that puts DSCP=48 into the top queue and/or handles traffic from the voice LAN port via CoS. Even before you hit the router, where the Linux QoS stuff can help, you can still suffer from queue problems. Make things smart at the Ethernet layer as well.

4) Either use fireqos and several classes, including the use of DSCP to identify voice traffic, or definitely consider "cake" as Dave describes. That seems nice, and given the right options it uses the DSCP info. My main concern about the technical info Dave put up is that as link speeds increase you don't really need to scale up your voice bandwidth. Typically a site has a certain number of calls it needs to provision (for me, for example, it's probably about 4 or 5), and 1/4 of my gigabit link is WAY more than I ever want used for voice. I realize it will share its bandwidth, but I actually want to put a ceiling of a few Mbps on my voice traffic, way less than 25% for me, but perhaps even more than 25% for someone like my mother, who has a 1500 kbps ADSL uplink.

5) I'd recommend using a router with at least 2 bonded NICs and 2 VLANs. The alternative is one NIC on the WAN side and one NIC on the LAN side. In that situation, a single host on your LAN can saturate the LAN link, starving things at layer 2, AND you have no redundancy on either side of the router. Using a bonded link and separate WAN and LAN VLAN tags gives you redundancy on BOTH "sides" of the router, and reduces the possibility that a single LAN host will saturate your LAN link at layer 2. The switch is still a single point of failure, but switches are at least easy to plug into a UPS and generally operate for long periods without failure because they're so single-purpose.
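Points 2 and 4 above both involve a little arithmetic, sketched here in Python. The DSCP value sits in the upper six bits of the old IP TOS byte, so an application that sets its own marking has to shift it left by two bits; 48 is the CS6 code point and 46 is EF. The socket helper is a hypothetical example for software you write yourself (CSipSimple and ATAs expose this as a setting instead), and whether the mark survives depends on the OS and network. The per-call rate reuses the rough 1000 bytes every 0.02 seconds figure from earlier in the post, not a measured codec rate.

```python
import socket


def dscp_to_tos(dscp):
    """Convert a 6-bit DSCP value to the byte written into the IP TOS field."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2


def open_marked_udp_socket(dscp=48):
    """Hypothetical helper: a UDP socket whose outgoing packets carry `dscp`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


def voice_share(calls, link_bps, packet_bytes=1000, interval_s=0.02):
    """Return (voice bits per second, fraction of the link) for `calls` calls."""
    voice_bps = calls * packet_bytes * 8 / interval_s
    return voice_bps, voice_bps / link_bps


print("TOS byte for DSCP 48:", hex(dscp_to_tos(48)))
print("TOS byte for DSCP 46:", hex(dscp_to_tos(46)))

# Assumed scenarios: 5 calls on gigabit fiber, 1 call on a 1500 kbps ADSL uplink.
for label, link_bps, calls in [("gigabit fiber", 1_000_000_000, 5),
                               ("1500 kbps ADSL", 1_500_000, 1)]:
    bps, share = voice_share(calls, link_bps)
    print(f"{label}: {calls} call(s) need {bps / 1e6:.2f} Mbps, {share * 100:.1f}% of the link")
```

Under these assumptions, five calls use a fraction of a percent of the gigabit link, while a single call already exceeds 25% of the ADSL uplink, which is the case for a fixed ceiling in Mbps rather than a fixed percentage.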