Complex Traffic Shaping/Control

Tom Eastep

Arne Bernin

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version
1.2 or any later version published by the Free Software Foundation; with
no Invariant Sections, with no Front-Cover, and with no Back-Cover
Texts. A copy of the license is included in the section entitled
“GNU Free Documentation
License”.

Important

Traffic shaping is complex and the Shorewall community is not well
equipped to answer traffic shaping questions. So if you are the type of
person who needs "insert tab A into slot B" instructions for everything
that you do, then please don't try to implement traffic shaping using
Shorewall. You will just frustrate yourself and we won't be able to help
you.

Warning

Said another way, reading just Shorewall documentation is not going
to give you enough background to use this material.

At a minimum, you will need to refer to at least the following
additional information:

Introduction

Beginning with Shorewall 4.4.6, Shorewall includes two separate
implementations of traffic shaping. This document describes the original
implementation, which is complex and difficult to configure. A much simpler
version is described in Simple Traffic Shaping/Control
and is highly recommended unless you really need to delay certain traffic
passing through your firewall.

Shorewall has builtin support for traffic shaping and control. This
support does not cover all options available (and especially all
algorithms that can be used to queue traffic) in the Linux kernel but it
should fit most needs. If you are using your own script for traffic
control and you still want to use it in the future, you will find
information on how to do this, later in this
document. But for this to work, you will also need to enable
traffic shaping in the kernel and Shorewall as covered by the next
sections.

Linux traffic shaping and control

This section gives a brief introduction to how traffic control works
in the Linux kernel. Although this might be enough for configuring
it in the Shorewall configuration files, we strongly recommend that you
take a deeper look into the Linux
Advanced Routing and Shaping HOWTO. At the time of writing this,
the current version is 1.0.0.

Since kernel 2.2, Linux has extensive support for controlling
traffic. You can define different algorithms that are used to queue the
traffic before it leaves an interface. The standard one is called pfifo
and is (as the name suggests) of the type First In, First Out. This means
that it does not shape anything: if you have a connection that eats up all
your bandwidth, this queuing algorithm will not stop it from doing
so.

For Shorewall traffic shaping we use three algorithms: HTB
(Hierarchical Token Bucket), HFSC (Hierarchical Fair Service Curves) and
SFQ (Stochastic Fairness Queuing). SFQ is easy to explain: it just tries
to track your connections (TCP or UDP streams) and balances the traffic
between them. This normally works well. HTB and HFSC allow you to define a
set of classes, and you can put the traffic you want into these classes.
You can define minimum and maximum bandwidth settings for those classes
and order them hierarchically (the less prioritized classes only get
bandwidth if the more important ones have what they need). Additionally, HFSC
allows you to specify the maximum queuing delay that a packet may
experience. Shorewall builtin traffic shaping allows you to define these
classes (and their bandwidth limits), and it uses SFQ inside these classes
to make sure that different data streams are handled equally. If SFQ's
default notion of a 'stream' doesn't work well for you, you can change it
using the flow option described below.

You can shape incoming traffic through use of an
Intermediate Functional Block (IFB) device. See below. But beware: using an
IFB can result in queues building up both at your ISP's router and at your
own.

You shape and control outgoing traffic by assigning the traffic to
classes. Each class is associated with exactly one
network interface and has a number of attributes:

PRIORITY - Used to give preference to one class over another
when selecting a packet to send. The priority is a numeric value with
1 being the highest priority, 2 being the next highest, and so
on.

RATE - The minimum bandwidth this class should get, when the
traffic load rises. Classes with a higher priority (lower PRIORITY
value) are served even if there are others that have a guaranteed
bandwidth but have a lower priority (higher PRIORITY value).

CEIL - The maximum bandwidth the class is allowed to use when
the link is idle.

MARK - Netfilter has a facility for
marking packets. Packet marks have a numeric
value which is limited in Shorewall to the values 1-255 (1-16383 if
you set WIDE_TC_MARKS=Yes in shorewall.conf (5)). You
assign packet marks to different types of traffic using entries in the
/etc/shorewall/mangle file (Shorewall 4.6.0 or
later) or /etc/shorewall/tcrules (prior to
Shorewall 4.6.0).

Note

In Shorewall 4.4.26, WIDE_TC_MARKS was superseded by TC_BITS
which specifies the width in bits of the traffic shaping mark field.
The default is based on the setting of WIDE_TC_MARKS so as to
provide upward compatibility. See the Packet Marking using
/etc/shorewall/mangle article.
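
The two mark ranges quoted above follow directly from the width of the mark field. As a minimal sketch (the function name is hypothetical, and the assumption that the 1-255 range corresponds to an 8-bit field is inferred from the numbers rather than stated in this article):

```python
def max_tc_mark(tc_bits):
    """Largest mark value that fits in a field that is tc_bits wide."""
    if tc_bits < 1:
        raise ValueError("tc_bits must be at least 1")
    return (1 << tc_bits) - 1


# An 8-bit field (assumed default) and a 14-bit field (TC_BITS=14).
print(max_tc_mark(8))
print(max_tc_mark(14))
```

A 14-bit field yields the 16383 limit mentioned for WIDE_TC_MARKS=Yes.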

One class for each interface must be designated as the
default class. This is the class to which unmarked
traffic (packets to which you have not assigned a mark value in
/etc/shorewall/mangle) is assigned.

Netfilter also supports a mark value on each connection. You can
assign connection mark values in
/etc/shorewall/mangle
(/etc/shorewall/tcrules), you can copy the current
packet's mark to the connection mark (SAVE), or you can copy the
connection mark value to the current packet's mark (RESTORE). For more
information, see this
article.

Linux Kernel Configuration

You will need at least kernel 2.4.18 for this to work. Please take a
look at the following screenshot for the settings you need to enable. For
builtin support, you need the HTB scheduler, the Ingress scheduler, the
PRIO pseudoscheduler and the SFQ queue. The other scheduler and queue
algorithms are not needed.

This screen shot shows how I configured QoS in a 2.6.16
Kernel:

And here's my recommendation for a 2.6.21 kernel:

Enable TC support in Shorewall

You need this support whether you use the builtin support or whether
you provide your own tcstart script.

To enable the builtin traffic shaping and control in Shorewall, you
have to do the following:

Set TC_ENABLED to "Internal" in /etc/shorewall/shorewall.conf.
Setting TC_ENABLED=Yes causes
Shorewall to look for an external tcstart file (See a later section for details).

Setting the CLEAR_TC parameter in
/etc/shorewall/shorewall.conf to Yes
will clear the traffic shaping configuration during Shorewall
[re]start and Shorewall stop. This is normally what you want when
using the builtin support (and also if you use your own tcstart
script).

The other steps that follow depend on whether you use your own
script or the builtin solution. They will be explained in the
following sections.

Using builtin traffic shaping/control

Shorewall's builtin traffic shaping feature provides a thin layer on
top of the ingress qdisc, HTB and SFQ. That translation layer allows you
to:

Integrate the reloading of your traffic shaping configuration
with the reloading of your packet-filtering and marking
configuration.

Assign traffic to HTB or HFSC classes by TOS value.

Assign outgoing TCP ACK packets to an HTB or HFSC class.

Assign traffic to HTB and/or HFSC classes based on packet mark
value or based on packet contents.

Those few features are really all that builtin traffic
shaping/control provides; consequently, you need to understand HTB and/or
HFSC and Linux traffic shaping as well as Netfilter packet marking in
order to use the facility. Again, please see the links at top of this
article.

For defining bandwidths (for either devices or classes), please use
kbit or kbps (for kilobytes per second) and make sure there is NO space
between the number and the unit (it is 100kbit, not 100 kbit). You can
also use mbit, mbps or a raw number (which means bytes), but note that only
integer numbers are supported (0.5 is not
valid).
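
The rules above (integer only, no space before the unit) can be checked mechanically. The following is a minimal sketch, not Shorewall's own parser: the function name is hypothetical, and the decimal multipliers and the reading of a raw number as bytes per second are assumptions based on common tc conventions.

```python
import re

# Multipliers to bits per second. Decimal (1000-based) multipliers are an
# assumption; check the documentation of your tc version.
BITS_PER_UNIT = {
    "kbit": 1_000,        # kilobits per second
    "mbit": 1_000_000,    # megabits per second
    "kbps": 8_000,        # kilobytes per second
    "mbps": 8_000_000,    # megabytes per second
    None: 8,              # raw number, taken as bytes per second
}

_BANDWIDTH = re.compile(r"(\d+)(kbit|kbps|mbit|mbps)?")


def parse_bandwidth(text):
    """Return the bandwidth in bits per second, or raise ValueError."""
    match = _BANDWIDTH.fullmatch(text)
    if match is None:
        # Rejects "100 kbit" (space) and "0.5mbit" (not an integer).
        raise ValueError(f"invalid bandwidth: {text!r}")
    number, unit = match.groups()
    return int(number) * BITS_PER_UNIT[unit]
```

With this sketch, parse_bandwidth("100kbit") succeeds while parse_bandwidth("100 kbit") and parse_bandwidth("0.5mbit") raise ValueError.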

To properly configure the settings for your
devices, you need to find out the real upstream and downstream rates you
have. This is especially the case if you are using a DSL
connection or another type of connection that does not have a guaranteed
bandwidth. Don't trust the values your provider tells you for this;
measuring the real download speed is especially important! There are several
online tools that help you find out; search for "dsl speed test" on Google
(for Germany you can use arcor speed
check). Be sure to choose a test site located near you.

/etc/shorewall/tcdevices

This file allows you to define the incoming and outgoing bandwidth
for the devices on which you want traffic shaping to be enabled. That
means that if you want to use traffic shaping for a device, you have to
define it here. For additional information, see shorewall-tcdevices
(5).

Columns in the file are as follows:

INTERFACE - Name of interface. Each interface may be listed
only once in this file. You may NOT specify the name of an alias
(e.g., eth0:0) here; see FAQ #18.
You may NOT specify wildcards here; e.g., if you have multiple ppp
interfaces, you need to list each of them here! Shorewall will
determine if the device exists and will only configure the device if
it does exist. If it doesn't exist or it is DOWN, the following
warning is issued:

WARNING: Device <device name> is
not in the UP state -- traffic-shaping configuration
skipped

Shorewall assigns a sequential interface
number to each interface (the first entry in
/etc/shorewall/tcdevices is interface 1, the
second is interface 2 and so on). You can also explicitly specify the
interface number by prefixing the interface name with the number and
a colon (":"). Example: 1:eth0.

Warning

Device numbers are expressed in hexadecimal. So the device
following 9 is A, not 10.

IN-BANDWIDTH - The incoming bandwidth of that interface.
Please note that when you use this column, you are not traffic
shaping incoming traffic, as the traffic is already received before
you could do so. This column allows you to define the maximum
traffic allowed for this interface in total; if the rate is
exceeded, the excess packets are dropped. You want this mainly if
you have a DSL or cable connection, to avoid queuing at your
provider's side. If you don't want any traffic to be dropped, set this
to a value faster than your interface's maximum rate, or to 0
(zero).

To determine the optimum value for this setting, we recommend
that you start by setting it significantly below your measured
download bandwidth (20% or so). While downloading, measure the
ping response time from the firewall to the
upstream router as you gradually increase the setting. The optimal
setting is at the point beyond which the ping
time increases sharply as you increase the setting.

Note

For fast lines, the actual download speed may be well
below what you specify here. If you have this problem, then follow
the bandwidth with a ":" and a burst size.
The default burst is 10kb, but on my 50mbit line, I specify 200kb.
(50mbit:200kb).

OUT-BANDWIDTH - Specify the outgoing bandwidth of that
interface. This is the maximum speed your connection can handle. It
is also the speed you can refer to as "full" when you define the tc
classes. Outgoing traffic above this rate will be dropped.

OPTIONS — A comma-separated list of options from the following
list:

classify

If specified, classification of traffic into the various
classes is done by CLASSIFY entries in
/etc/shorewall/mangle
(/etc/shorewall/tcrules) or by entries in
/etc/shorewall/tcfilters. No MARK value
will be associated with classes on this interface.

linklayer

Added in Shorewall 4.5.6. Type of link (ethernet, atm,
adsl). When specified, causes scheduler packet size
manipulation as described in tc-stab (8). When this option is
given, the following options may also be given after
it:

mtu=mtu

The device MTU; default 2048 (will be rounded up
to a power of two)

mpu=mpubytes

Minimum packet size used in calculations. Smaller
packets will be rounded up to this size

tsize=tablesize

Size table entries; default is 512

overhead=overheadbytes

Number of overhead bytes per packet

REDIRECTED INTERFACES - Entries are appropriate in this column
only if the device in the INTERFACE column names an Intermediate
Functional Block (IFB). It lists
the physical interfaces that will have their input shaped using
classes defined on the IFB. Neither the IFB nor any of the
interfaces listed in this column may have an IN-BANDWIDTH specified.
You may specify zero (0) or a dash ("-") in the IN-BANDWIDTH
column.

IFB devices automatically get the classify option.

Example 1.

Suppose you are using PPP over Ethernet (DSL) and ppp0 is the
interface for this. The device has an outgoing bandwidth of 500kbit
and an incoming bandwidth of 6000kbit.

#INTERFACE    IN-BANDWIDTH    OUT-BANDWIDTH
ppp0          6000kbit        500kbit

/etc/shorewall/tcclasses

This file allows you to define the actual classes that are used to
split the outgoing traffic. For additional information, see shorewall-tcclasses
(5).

INTERFACE - Name of interface. Users may also specify the
interface number. Must match the name (or number) of an interface
with an entry in /etc/shorewall/tcdevices. If
the interface has the classify
option in /etc/shorewall/tcdevices, then the
interface name or number must be followed by a colon and a
class number. Examples: eth0:1, 4:9. Class
numbers must be unique for a given interface. Normally, all classes
defined here are sub-classes of a root class that is implicitly
defined from the entry in shorewall-tcdevices(5). You
can establish a class hierarchy by specifying a
parent class (e.g.,
interface:parent-class:class)
-- the number of a class that you have previously defined. The
sub-class may borrow unused bandwidth from its parent.

Warning

Class numbers are expressed in hexadecimal. So the class
following class 9 is A, not 10.

MARK - The mark value, which is an integer in the range 1-255
(1-16383 if you set WIDE_TC_MARKS=Yes or set TC_BITS=14 in shorewall.conf (5)). You
define these marks in the mangle or tcrules file, marking the
traffic you want to go into the queuing classes defined in here. You
can use the same marks for different interfaces. You must specify
"-" in this column if the device specified in the INTERFACE column
has the classify option in
/etc/shorewall/tcdevices.

Note

In Shorewall 4.4.26, WIDE_TC_MARKS was superseded by TC_BITS
which specifies the width in bits of the traffic shaping mark
field. The default is based on the setting of WIDE_TC_MARKS so as
to provide upward compatibility.

RATE - The minimum bandwidth this class should get when the
traffic load rises. Please note that classes with an equal or lesser
priority value are served first, even if there are others that
have a guaranteed bandwidth but a lower priority. If the sum of the RATEs
for all classes assigned to an INTERFACE exceeds that interface's
OUT-BANDWIDTH, then the OUT-BANDWIDTH limit will not be honored.

When using HFSC, this column may contain 1, 2 or 3 pieces of
information separated by colons (":"). In addition to the minimum
bandwidth, leaf classes may specify realtime criteria: DMAX (maximum
delay in milliseconds) and optionally UMAX (the largest packet
expected in the class). See below for
details.

CEIL - The maximum bandwidth this class is allowed to use when
the link is idle. Useful if you have traffic which can get full
speed when more important services (e.g. interactive like ssh) are
not used. You can use the value "full" in here for setting the
maximum bandwidth to the defined output bandwidth of that
interface.

PRIORITY - You have to define a priority for the class.
Packets in a class with a higher priority (= lesser value) are
handled before less prioritized ones. You can simply reuse the mark
value here if your mark values increase as the priority
decreases.

OPTIONS - A comma-separated list of options including the
following:

default - this is the default class for that interface
where all traffic should go, that is not classified
otherwise.

Note

defining default for exactly one class per interface is
mandatory!

tos-<tosname> - This lets you define a filter for
the given <tosname>, which is a value of the
Type Of Service bits in the IP packet that causes the packet
to go into this class. Please note that this filter overrides all
mark settings, so if you define a tos filter for a class, all
traffic having that TOS value will go into it regardless of the mark on
the packet. You can use the following for this option:

tos-minimize-delay (16)
tos-maximize-throughput (8)
tos-maximize-reliability (4)
tos-minimize-cost (2)
tos-normal-service (0)

Note

Each of these options is only valid for one class per interface.

tcp-ack - If defined, causes a tc filter to be created
that puts all TCP ACK packets on that interface that have a
size of <= 64 bytes into this class. This is useful for
speeding up downloads. Please note that the size of the ACK
packets is limited to 64 bytes because some applications (p2p, for
example) tend to make every packet an ACK packet, which would
cause them all to land here. We want only packets WITHOUT payload
to match, hence the size limit. Bigger packets just take their
normal way into the classes.

Note

This option is only valid for one class per interface.

occurs=number - Typically used with
an IPMARK entry in mangle or tcrules. Causes the rule to be
replicated for a total of number rules.
Each rule has a successively higher class number and mark value.

When 'occurs' is used:

The associated device may not have the 'classify'
option.

The class may not be the default class.

The class may not have any 'tos=' options (including
'tcp-ack').

The class should not specify a MARK value. If one is
specified, it will be ignored with a warning message.

The 'RATE' and 'CEIL' parameters apply to each instance of
the class. So the total RATE represented by an entry with
'occurs' will be the listed RATE multiplied by
number. For additional information, see
mangle (5)
or tcrules
(5).

flow=keys - Shorewall attaches an SFQ
queuing discipline to each leaf HTB and HFSC class. SFQ ensures
that each flow gets equal access to the
interface. The default definition of a flow corresponds roughly
to a Netfilter connection. So if one internal system is running
BitTorrent, for example, it can have lots of 'flows' and can
thus take up a larger share of the bandwidth than a system
having only a single active connection. The
flow classifier (module cls_flow) works around
this by letting you define what a 'flow' is. The classifier must
be used carefully or it can block off all traffic on an
interface! The flow option can be specified for an HTB or HFSC
leaf class (one that has no sub-classes). We recommend that you
use the following:

Shaping internet-bound traffic: flow=nfct-src

Shaping traffic bound for your local net: flow=dst

These will cause a 'flow' to consist of the traffic
to/from each internal system.

When more than one key is given, they must be enclosed in
parentheses and separated by commas.

To see a list of the possible flow keys, run this
command:

tc filter add flow help

Those that begin with "nfct-" are Netfilter connection
tracking fields. As shown above, we recommend flow=nfct-src;
that means that we want to use the source IP address
before SNAT as the key.

Note

Shorewall cannot determine ahead of time if the flow
classifier is available in your kernel (especially if it was
built into the kernel as opposed to being loaded as a module).
Consequently, you should check ahead of time to ensure that
both your kernel and 'tc' utility support the feature.

If 'flow' is supported, no output is produced;
otherwise, you will see:

FATAL: Module cls_flow not found.

If your kernel is not modularized or does not support
module autoloading, look at your kernel configuration (either
/proc/config.gz or the
.config file in /lib/modules/<kernel-version>/build/).

If 'flow' is supported, you will see: NET_CLS_FLOW=m or
NET_CLS_FLOW=y.

For modularized kernels, Shorewall will attempt to load
/lib/modules/<kernel-version>/net/sched/cls_flow.ko
by default.

pfifo - When specified for a leaf class, the pfifo queuing
discipline is applied to the class rather than the SFQ queuing
discipline.

limit=number - Added in Shorewall
4.4.3. When specified for a leaf class, specifies the maximum
number of packets that may be queued within the class. The
number must be > 2 and less than 128. If
not specified, the value 127 is assumed.

red=(redoption,...) - Added in
Shorewall 4.5.6. When specified on a leaf class, causes the
class to use the red queuing discipline rather than SFQ. See
tc-red (8) for additional information.

fq_codel[=(codeloption,...)] -
Added in Shorewall 4.5.12. When specified on a leaf class,
causes the class to use the FQ CODEL (Fair-queuing
Controlled-delay) queuing discipline rather than
SFQ. See tc-fq_codel (8) for additional information.

/etc/shorewall/mangle and /etc/shorewall/tcrules

Important

Unlike rules in the shorewall-rules(5) file,
evaluation of rules in this file will continue after a match. So the
final mark for each packet will be the one assigned by the LAST tcrule
that matches.

Also unlike rules in the shorewall-rules(5) file,
the mangle (tcrules) file is not stateful. So every packet that goes
into, out of or through your firewall is subject to entries in the
mangle (tcrules) file.

Because mangle (tcrules) entries are not stateful, it is
necessary to understand basic IP socket operation. Here is an edited
excerpt from a post on the Shorewall Users list:

For the purposes of this discussion, the world is separated
into clients and servers. Servers provide services to
clients.

When a server starts, it creates a socket and
binds the socket to an
address. For AF_INET (IPv4) and AF_INET6
(IPv6) sockets, that address is an ordered triple consisting of an
IPv4 or IPv6 address, a protocol, and possibly a port number. Port
numbers are only used when the protocol is TCP, UDP, SCTP or DCCP.
The protocol and port number used by a server are typically
well-known so that clients will be able to connect to it or send
datagrams to it. So SSH servers bind to TCP port 22, SMTP servers
bind to TCP port 25, etc. We will call this port the SERVER
PORT.

When a client wants to use the service provided by a server,
it also creates a socket and, like the server's socket, the
client's socket must be bound to an address. But in the case of
the client, the socket is usually given an automatic address
binding. For AF_INET and AF_INET6 sockets, the IP address is the
IP address of the client system (loose generalization) and the
port number is selected from a local port
range. On Linux systems, the local port range can be
seen by cat
/proc/sys/net/ipv4/ip_local_port_range. So it is not
possible to determine in advance what port the client will be
using. Whatever it is, we'll call it the CLIENT PORT.

Now:

Packets sent from the client to the server will
have:

SOURCE PORT = CLIENT PORT

DEST PORT = SERVER PORT

Packets sent from the server to the client will have:

SOURCE PORT = SERVER PORT

DEST PORT = CLIENT PORT

Since the SERVER PORT is generally the only port known ahead
of time, we must categorize traffic from the server to the client
using the SOURCE PORT.
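
The port reasoning above can be sketched in a few lines. This is only an illustration, not part of Shorewall: the function name and the port table are hypothetical, with the two server ports taken from the SSH and SMTP examples in the excerpt.

```python
# Well-known server ports from the excerpt above (SSH and SMTP).
SERVER_PORTS = {22: "ssh", 25: "smtp"}


def server_port_field(source_port, dest_port, server_ports=SERVER_PORTS):
    """Return which packet field carries the well-known SERVER PORT.

    "dest"   -> packet travels from client to server
    "source" -> packet travels from server to client
    None     -> neither port is a known server port
    """
    if dest_port in server_ports:
        return "dest"
    if source_port in server_ports:
        return "source"
    return None
```

A reply from an SSH server has source port 22 and an unpredictable destination port, so a rule that shapes such replies has to match on the source port.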

The fwmark classifier provides a convenient way to classify
packets for traffic shaping. The
/etc/shorewall/mangle
(/etc/shorewall/tcrules) file is used for
specifying these marks in a tabular fashion. For an in-depth look at the
packet marking facility in Netfilter/Shorewall, please see this article.

For marking forwarded traffic, you must
either set MARK_IN_FORWARD_CHAIN=Yes in shorewall.conf or use the :F
qualifier (see below).

See shorewall-mangle(5) and shorewall-tcrules(5) for a description
of the entries in these files. Note that the mangle file superseded the
tcrules file in Shorewall 4.6.0.

The following examples are for the mangle file.

Example 2.

All packets arriving on eth1 should be marked with 1. All
packets arriving on eth2 and eth3 should be marked with 2. All packets
originating on the firewall itself should be marked with 3.

This is a little more complex than otherwise expected. Since the
ipp2p module is unable to determine that all packets in a connection are
P2P packets, we mark the entire connection as P2P if any of the
packets are determined to match. We assume packet/connection mark 0 to
mean unclassified. Traffic originating on the firewall is not covered
by this example.

"If a packet hasn't been classified (packet mark is 0), copy
the connection mark to the packet mark. If the packet mark is set,
we're done. If the packet is P2P, set the packet mark to 4. If the
packet mark has been set, save it to the connection mark."
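
The four quoted steps can be traced with a small simulation. This is a sketch of the logic only, written in Python for illustration; the function name is hypothetical, and the actual marking is done by Netfilter rules, not by code like this.

```python
from typing import Tuple

P2P_MARK = 4  # mark value used for P2P traffic in the quoted rules


def mark_packet(packet_mark: int, conn_mark: int, is_p2p: bool) -> Tuple[int, int]:
    """Apply the four quoted steps; return (packet mark, connection mark)."""
    # Step 1: unclassified packet, so restore the connection mark.
    if packet_mark == 0:
        packet_mark = conn_mark
    # Step 2: a non-zero packet mark means we are done.
    if packet_mark != 0:
        return packet_mark, conn_mark
    # Step 3: the packet was recognized as P2P.
    if is_p2p:
        packet_mark = P2P_MARK
    # Step 4: save a newly set packet mark to the connection mark.
    if packet_mark != 0:
        conn_mark = packet_mark
    return packet_mark, conn_mark


# One connection: only the second packet is recognized as P2P.
conn = 0
marks = []
for recognized in (False, True, False):
    mark, conn = mark_packet(0, conn, recognized)
    marks.append(mark)
print(marks)
```

Once one packet of the connection has been recognized, every later packet inherits mark 4 from the connection mark, even if the packet itself is not recognized as P2P.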

Example 7.

Mark all forwarded VOIP connections with connection mark 1 and
ensure that all VOIP packets also receive that mark (assumes that
nf_conntrack_sip is loaded).

ppp devices

If you use ppp/pppoe/pppoa to connect to your Internet provider
and you use traffic shaping, you need to restart Shorewall traffic
shaping whenever the link comes up. The reason for this is that if the
ppp connection gets restarted (and it usually does this at least daily),
all “tc” filters/qdiscs related to that interface are
deleted.

The easiest way to achieve this is to restart Shorewall once
the link is up. To do so, add a small executable script
to “/etc/ppp/ip-up.d”.

#! /bin/sh
/sbin/shorewall refresh

Sharing a TC configuration between Shorewall and
Shorewall6

Beginning with Shorewall 4.4.15, the traffic-shaping configuration
in the tcdevices, tcclasses and tcfilters files can be shared between
Shorewall and Shorewall6. Only one of the products can control the
configuration but the other can configure CLASSIFY rules in its own
mangle or tcrules file that refer to the shared classes.

To define the configuration in Shorewall and share it with
Shorewall6:

If you need to define IPv6 tcfilter entries, do so in
/etc/shorewall/tcfilters. That file now allows entries that apply to
IPv6.

This gives Shorewall6 compilations access to the tcdevices and
tcclasses files, although it will create no output. That access allows
CLASSIFY rules in /etc/shorewall6/mangle to be validated against the TC
configuration.

In this configuration, it is Shorewall that controls TC
configuration (except for IPv6 mangle). You can reverse the settings in
the files if you want to control the configuration using
Shorewall6.

Per-IP Traffic Shaping

Some network administrators feel that they have to divvy up their
available bandwidth by IP address rather than by prioritizing the
traffic based on the type of traffic. This gets really awkward when
there are a large number of local IP addresses.

This section describes the Shorewall facility for making this
configuration less tedious (and a lot more efficient). Note that it
requires that you install
xtables-addons. So before you try this facility, we suggest that
first you add the following OPTION to each external interface described
in /etc/shorewall/tcdevices:

flow=nfct-src

If you shape traffic on your internal interface(s), then add this
to their entries:

flow=dst

You may find that this simple change is all that is needed to
control bandwidth hogs like BitTorrent. If it isn't, then proceed as
described in this section.

In a sense, the IPMARK target is more like an IPCLASSIFY target in
that the mark value is later interpreted as a class ID. A packet mark is
32 bits wide; so is a class ID. The major class
occupies the high-order 16 bits and the minor class
occupies the low-order 16 bits. So the class ID 1:4ff (remember that
class IDs are always in hex) is equivalent to a mark value of 0x104ff.
Remember that Shorewall uses the interface number as the
major number where the first interface in tcdevices
has major number 1, the second has
major number 2, and so on.
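
The correspondence between a class ID and a mark value can be written out as follows. This is a minimal sketch with hypothetical function names; it relies only on the layout described above (major number in the high-order 16 bits, minor number in the low-order 16 bits, both in hex).

```python
def classid_to_mark(classid):
    """Convert a class ID such as "1:4ff" (hex major:minor) to a mark value."""
    major, minor = classid.split(":")
    return (int(major, 16) << 16) | int(minor, 16)


def mark_to_classid(mark):
    """Convert a 32-bit mark value back to a hex major:minor class ID."""
    return f"{mark >> 16:x}:{mark & 0xFFFF:x}"


print(hex(classid_to_mark("1:4ff")))
print(mark_to_classid(0x104FF))
```

Applied to the example in the text, class ID 1:4ff maps to mark value 0x104ff and back.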

The IPMARK target assigns a mark to each matching packet based on
either the source or destination IP address. By default, it assigns
a mark value equal to the low-order 8 bits of the source address.

The syntax is as follows:

IPMARK[([{src|dst}][,[mask1][,[mask2][,[shift]]]])]

Default values are:

src

mask1 = 0xFF

mask2 = 0x00

shift = 0

src and dst specify whether the mark is to be based on
the source or destination address respectively. The selected address is
first shifted right by shift, then ANDed with
mask1 and then ORed with
mask2. The shift argument is
intended to be used primarily with IPv6 addresses.
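
The shift, AND and OR steps can be reproduced with a short calculation. This is a sketch of the arithmetic only, not of the xtables-addons implementation; the function name is hypothetical and the addresses are illustrative. The caller passes whichever address (source or destination) was selected.

```python
import ipaddress


def ipmark(address, mask1=0xFF, mask2=0x00, shift=0):
    """Compute the mark for the selected address.

    The address is shifted right by shift, ANDed with mask1,
    then ORed with mask2, using the default values listed above.
    """
    value = int(ipaddress.ip_address(address))
    return ((value >> shift) & mask1) | mask2


# Defaults: the mark is the low-order 8 bits of the address.
print(ipmark("192.168.1.6"))
# With mask2=0x10100 the same address yields class ID 1:106.
print(hex(ipmark("192.168.1.6", 0xFF, 0x10100)))
```

With the defaults, the result is simply the last octet of an IPv4 address.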

It is important to realize that, while class IDs are composed of a
major and a minor value, the
set of minor values must be unique. You must keep
this in mind when deciding how to map IP addresses to class IDs. For
example, suppose that your internal network is 192.168.1.0/29 (host IP
addresses 192.168.1.1 - 192.168.1.6). Your first notion might be to use
IPMARK(src,0xFF,0x10000) so as to produce class IDs 1:1 through 1:6. But
1:1 is the class ID of the base HTB class on interface 1. So you might
choose instead to use IPMARK(src,0xFF,0x10100) as shown in the example
above so as to avoid minor class 1.

The occurs option in
/etc/shorewall/tcclasses causes the class
definition to be replicated many times.

The syntax is:

occurs=number

When occurs is used:

The associated device may not have the classify option.

The class may not be the default class.

The class may not have any tos= options (including tcp-ack).

The class should not specify a MARK value. Any MARK value given is
ignored with a warning. The RATE and CEIL parameters apply to each
instance of the class. So the total RATE represented by an entry with
occurs will be the listed RATE
multiplied by number.

The above defines 6 classes with class IDs 0x101-0x106. Each class
has a guaranteed rate of 1kbit/second and a ceiling of 230kbit.
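
The effect of occurs on class numbers and on the total guaranteed rate can be sketched as follows. The function and its parameters are hypothetical; the values reproduce the six classes 0x101-0x106 with a RATE of 1kbit each that are described above.

```python
def expand_occurs(first_class, number, rate_kbit):
    """Return the replicated class numbers and the total RATE in kbit.

    first_class is the (hex) class number of the first instance; each
    further instance gets the next higher class number.
    """
    class_numbers = [first_class + i for i in range(number)]
    total_rate_kbit = rate_kbit * number
    return class_numbers, total_rate_kbit


classes, total = expand_occurs(0x101, 6, 1)
print([hex(c) for c in classes], total)
```

Because RATE applies to each instance, the six classes together reserve 6kbit.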

/etc/shorewall/mangle or
/etc/shorewall/tcrules:

#ACTION                     SOURCE            DEST
IPMARK(src,0xff,0x10100):F  192.168.1.0/29    eth0

This facility also alters the way in which Shorewall generates a
class number when none is given. Prior to the implementation of this
facility, the class number was constructed by concatenating the MARK
value with either '1' or '10'. '10' was used when there were more
than 10 devices defined in
/etc/shorewall/tcdevices.

With this facility, a new method is added; class numbers are
assigned sequentially beginning with 2. The WIDE_TC_MARKS option in
shorewall.conf selects which construction to use.
WIDE_TC_MARKS=No (the default) produces pre-Shorewall 4.4 behavior.
WIDE_TC_MARKS=Yes (TC_BITS >= 14 in Shorewall 4.4.26 and later)
produces the new behavior.

Real life examples

A Shorewall User's Experience

Configuration to replace Wondershaper

You are able to fully replace the wondershaper script by using
the builtin traffic control. In this example it is assumed that your
interface for your Internet connection is ppp0 (for DSL); if you use
another connection type, you have to change it. You also need to
change the settings in the tcdevices.wondershaper file to reflect your
line speed. The relevant lines of the config files follow here. Please
note that this is just a 1:1 replacement doing exactly what
wondershaper should do. You are free to change it.

Wondershaper allows you to define a set of hosts and/or ports
you want to classify as low priority. To achieve this, you have to
add these hosts to tcrules and set the mark to 3 (true if you use
the example configuration files).

Setting hosts to low priority

Let's assume the following settings from your old wondershaper
script (don't assume these example values are really useful, they
are only used for demonstrating ;-):

A simple setup

This is a simple setup for people sharing an Internet connection
among several computers. It simply shapes traffic
between 2 hosts which have the IP addresses 192.168.2.23 and
192.168.2.42.

tcdevices file

#INTERFACE IN_BANDWIDTH OUT_BANDWIDTH
ppp0 6000kbit 700kbit

We have 6mbit down and 700kbit upstream.

tcclasses file

We add a class for tcp ack packets with highest priority, so
that downloads are fast. The following 2 classes share most of the
bandwidth between the 2 hosts, if the connection is idle, they may
use full speed. As the hosts should be treated equally they have the
same priority. The last class is for the remaining traffic.

We mark icmp ping and replies so they will go into the fast
interactive class and set a mark for each host.

A Warning to Xen Users

If you are running traffic shaping in your dom0 and traffic shaping
doesn't seem to be limiting outgoing traffic properly, it may be due to
"checksum offloading" in your domU(s). Check the output of "shorewall show
tc". Here's an excerpt from the output of that command:

This problem will be corrected by disabling "checksum offloading" in
your domU(s) using the ethtool utility. See one of the Xen articles for
instructions.

An HFSC Example

As mentioned at the top of this article, there is an excellent
introduction to HFSC at http://linux-ip.net/articles/hfsc.en/.
At the end of that article are 'tc' commands that implement the
configuration in the article. Those tc commands correspond to the
following Shorewall traffic shaping configuration.

Where Did all of those Magic Numbers come from?

As you read the article, numbers seem to be introduced out of thin
air. I'll try to shed some light on those.

There is a very clear development of these numbers:

12ms to transfer a 1500b packet at 1000kbits/second.

100kbits per second with 1500b packets, requires 8 packets per
second.

A packet from class 1:12 must be sent every 120ms.

Total transmit delay can be no more than 132ms (120 +
12).
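The four figures above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes that "1500b" means 1500 bytes (12,000 bits), and the function name packet_time_ms is hypothetical.

```python
PACKET_BITS = 1500 * 8  # a 1500-byte packet is 12,000 bits


def packet_time_ms(rate_kbit: float) -> float:
    """Milliseconds needed to move one packet at rate_kbit kbits/second.

    A rate of N kbits/second is the same as N bits per millisecond.
    """
    return PACKET_BITS / rate_kbit


if __name__ == "__main__":
    link_ms = packet_time_ms(1000)       # time on the 1000kbit link
    class_ms = packet_time_ms(100)       # spacing of packets at 100kbit
    packets_per_second = 1000 / class_ms
    print(f"transmit time on the link: {link_ms} ms")
    print(f"packets per second at 100kbit: {packets_per_second:.2f}")
    print(f"interval between class 1:12 packets: {class_ms} ms")
    print(f"total transmit delay: {class_ms + link_ms} ms")
```

Note that the exact packet rate is about 8.33 per second, which the article rounds to 8. The 120ms interval corresponds to the unrounded figure.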

We then learn that the queuing latency can be reduced to 30ms if
we use a two-part service curve whose first part is 400kbits/second.
Where did those come from?

The latency is calculated from the rate. If it takes 12ms to
transmit a 1500-byte packet at 1000kbits/second, it takes 30ms to
transmit a 1500-byte packet at 400kbits/second.

For the slope of the first part of the service curve, in
theory we can pick any number between 100 (the rate of class 1:12)
and 500 (the rate of the parent class) with higher numbers providing
lower latency.

The final curious number is the latency for class 1:11 - 52.5ms.
It is a consequence of everything that has gone before.

To achieve 400kbits/second with 1500-byte packets, 33.33 packets
per second are required. So a packet from class 1:11 must be sent every
30 ms. As the article says, "...the maximum transmission delay of this
class increases from 30ms to a total of 52.5 ms.". So we are looking for
an additional 22.5 ms.

Assume that both class 1:11 and 1:12 transmit for 30 ms at
400kbits/second. That is a total of 800kbits/second for 30ms. So Class
1:11 is punished for the excess. How long is the punishment? The two
classes sent 24,000 bits in 30ms; they are only allowed 0.030 * 500,000
= 15,000. So they are 9,000 bits over their quota. The amount of time
required to transmit 9,000 bits at 400,000 bits/second is
22.5ms!
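The same reasoning can be written out as a short calculation. The sketch below simply restates the figures from the paragraphs above. The variable names are illustrative only, and rates are expressed in bits per millisecond (numerically equal to kbits/second) so that the arithmetic stays exact.

```python
PACKET_BITS = 1500 * 8        # a 1500-byte packet is 12,000 bits

FIRST_SLOPE = 400             # bits/ms, first part of the service curve
PARENT_RATE = 500             # bits/ms, rate of the parent class
BURST_MS = 30                 # both classes transmit for this long

# Spacing of class 1:11 packets at 400kbits/second.
interval_ms = PACKET_BITS / FIRST_SLOPE            # 30.0 ms

# Bits sent by classes 1:11 and 1:12 together versus the parent's allowance.
sent_bits = 2 * FIRST_SLOPE * BURST_MS             # 24,000 bits
allowed_bits = PARENT_RATE * BURST_MS              # 15,000 bits
excess_bits = sent_bits - allowed_bits             # 9,000 bits

# Time needed to work off the excess at 400kbits/second.
penalty_ms = excess_bits / FIRST_SLOPE             # 22.5 ms
total_delay_ms = interval_ms + penalty_ms          # 52.5 ms

if __name__ == "__main__":
    print(f"excess: {excess_bits} bits")
    print(f"penalty: {penalty_ms} ms, total delay: {total_delay_ms} ms")
```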

Intermediate Functional Block (IFB) Devices

The principles behind an IFB are fairly simple:

It looks like a network interface although it is never given an
IPv4 configuration.

Because it is a network interface, queuing disciplines can be
associated with an IFB.

The magic of an IFB comes in the fact that a filter may be defined
on a real network interface such that each packet that arrives on that
interface is queued for the IFB! In that way, the IFB provides a means for
shaping input traffic.

To use an IFB, you must have IFB support in your kernel
(configuration option CONFIG_IFB). Assuming that you have a modular
kernel, the name of the IFB module is 'ifb' and may be loaded using the
command modprobe ifb (if you have modprobe installed)
or insmod /path/to/module/ifb.

By default, two IFB devices (ifb0 and ifb1) are created. You can
control that using the numifbs option (e.g., modprobe ifb
numifbs=1).

To create a single IFB when Shorewall starts, place the following
two commands in /etc/shorewall/init:

modprobe ifb numifbs=1
ip link set ifb0 up

Entries in /etc/shorewall/mangle or
/etc/shorewall/tcrules have no effect on shaping
traffic through an IFB. To allow classification of such traffic, the
/etc/shorewall/tcfilters file has been added. Entries in that file create
u32 classification
rules.

/etc/shorewall/tcfilters

While this file was created to allow shaping of traffic through an
IFB, the file may be used for general traffic classification as well.
The file is similar to shorewall-mangle(5) with the
following key exceptions:

The first match determines the classification, whereas in the
mangle file, the last match determines the classification.

ipsets are not supported

DNS Names are not supported

Address ranges and lists are not supported

Exclusion is not supported.

Filters are applied to packets as they appear on the
wire. So incoming packets will not have DNAT applied yet
(the destination IP address will be the external address) and
outgoing packets will have had SNAT applied.

The last point warrants elaboration. When looking at traffic being
shaped by an IFB, there are two cases to consider:

Requests — packets being sent from remote clients to local
servers. These packets may undergo subsequent DNAT, either as a
result of entries in /etc/shorewall/nat or as a
result of DNAT or REDIRECT rules.

Requests redirected by this rule will have destination IP
address 206.124.146.177 and destination port 80.

Responses — packets being sent from remote servers to local
clients. These packets may undergo subsequent DNAT as a result of
entries in /etc/shorewall/nat or in
/etc/shorewall/masq. The packet's destination
IP address will be the external address specified in the
entry.

Example:
/etc/shorewall/masq:

#INTERFACE SOURCE ADDRESS
eth0 192.168.1.0/24 206.124.146.179

When running Shorewall 5.0.14 or later, the equivalent
/etc/shorewall/snat would be:

#ACTION SOURCE DEST ...
SNAT(206.124.146.179) 192.168.1.0/24 eth0

HTTP response packets corresponding to requests that fall
under that rule will have destination IP address 206.124.146.179 and
source port 80.

Beginning with Shorewall 4.4.15, both IPv4 and IPv6 rules can be
defined in this file. See shorewall-tcfilters (5)
for details.

Columns in the file are as follows. As in all Shorewall
configuration files, a hyphen ("-") may be used to indicate that no
value is supplied in the column.

CLASS

The interface name or number followed by a colon (":") and
the class number.

SOURCE

SOURCE IP address (host or network). DNS names are not
allowed.

DEST

DESTINATION IP address (host or network). DNS names are not
allowed.

PROTO

Protocol name or number.

DPORT

Comma-separated list of destination port names or numbers.
May only be specified if the protocol is TCP, UDP, SCTP or ICMP.
Port ranges are supported except for ICMP.

SPORT

Comma-separated list of source port names or numbers. May
only be specified if the protocol is TCP, UDP or SCTP. Port ranges
are supported.

TOS

Specifies the value of the TOS field. The value can be any
of the following:

tos-minimize-delay

tos-maximize-throughput

tos-maximize-reliability

tos-minimize-cost

tos-normal-service

hex-number

hex-number/hex-number

The hex-numbers must be exactly
two digits (e.g., 0x04).
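The accepted forms of the TOS column can be summarized with a small checker. This is only a sketch and not Shorewall's own parser: the function name is_valid_tos is hypothetical, and requiring a 0x prefix on hex numbers is an assumption taken from the 0x04 example above.

```python
import re

# Symbolic names listed for the TOS column.
TOS_NAMES = {
    "tos-minimize-delay",
    "tos-maximize-throughput",
    "tos-maximize-reliability",
    "tos-minimize-cost",
    "tos-normal-service",
}

# Assumption: a hex-number is "0x" followed by exactly two hex digits.
_HEX = r"0x[0-9a-fA-F]{2}"
_TOS_PATTERN = re.compile(rf"{_HEX}(/{_HEX})?")


def is_valid_tos(value: str) -> bool:
    """Return True for a symbolic name, hex-number, or hex-number/hex-number."""
    return value in TOS_NAMES or _TOS_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("tos-minimize-delay", "0x04", "0x10/0x10", "0x4"):
        print(candidate, is_valid_tos(candidate))
```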

LENGTH

Must be a power of 2 between 32 and 8192 inclusive. Packets
with a total length that is strictly less than the specified value
will match the rule.
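The two conditions on LENGTH (which values are allowed, and which packets match) can be sketched as follows. The function names are hypothetical and the code only restates the rule above. It is not how Shorewall itself implements the check.

```python
def is_valid_length(value: int) -> bool:
    """True if value is a power of 2 between 32 and 8192 inclusive."""
    in_range = 32 <= value <= 8192
    power_of_two = (value & (value - 1)) == 0
    return in_range and power_of_two


def length_matches(packet_length: int, length: int) -> bool:
    """True if a packet of packet_length bytes matches LENGTH=length."""
    return packet_length < length  # strictly less than


if __name__ == "__main__":
    print(is_valid_length(64), is_valid_length(48))       # True False
    print(length_matches(63, 64), length_matches(64, 64))  # True False
```

So with LENGTH set to 64, a 63-byte packet matches the rule while a 64-byte packet does not.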

Example:

I've used this configuration on my own firewall. The IFB portion
is more for test purposes than to serve any well-reasoned QOS
strategy.

Optionally supply an /etc/shorewall/tcclear script to stop
traffic shaping. That is usually unnecessary.

If your tcstart script uses the “fwmark”
classifier, you can mark packets using entries in
/etc/shorewall/mangle or /etc/shorewall/tcrules.

Traffic control outside Shorewall

To start traffic shaping when you bring up your network
interfaces, you will have to arrange for your traffic shaping
configuration script to be run at that time. How you do that is
distribution dependent and will not be covered here. You then
should:

Set TC_ENABLED=No and CLEAR_TC=No

If your script uses the “fwmark” classifier, you
can mark packets using entries in /etc/shorewall/mangle or
/etc/shorewall/tcrules.