Permission is granted to copy, distribute and/or modify this document under the
terms of the GNU Free Documentation License, Version 1.1; with the Invariant Sections
being "Introduction" and all sub-sections, with the Front-Cover Texts being "Original
Author: Oskar Andreasson", and with no Back-Cover Texts. A copy of the license is
included in the section entitled "GNU Free Documentation License".

All scripts in this tutorial are covered by the GNU General Public License. The
scripts are free source; you can redistribute them and/or modify them under the
terms of the GNU General Public License as published by the Free Software Foundation,
version 2 of the License.

These scripts are distributed in the hope that they will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License within this tutorial,
under the section entitled "GNU General Public License"; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Chapter 7. The state machine

This chapter deals with the state machine and explains it in detail. After
reading through it, you should have a complete understanding of how the state
machine works. We will also go through a large set of examples of how states
are dealt with within the state machine itself. These should clarify
everything in practice.

Introduction

The state machine is a special part within iptables that should
really not
be called the state machine at all, since it is really a connection
tracking
machine. However, most people recognize it under the first name.
Throughout
this chapter I will use these names more or less as if they were
synonymous.
This should not be overly confusing. Connection tracking is done to let
the
Netfilter framework know the state of a specific connection. Firewalls
that
implement this are generally called stateful firewalls. A stateful
firewall is
generally much more secure than non-stateful firewalls since it allows
us to
write much tighter rule-sets.

Within iptables, packets can be related to tracked connections
in four different so called states. These are known as
NEW, ESTABLISHED,
RELATED and INVALID. We will discuss
each of these in more depth later. With the --state
match we can easily control who or what is allowed to initiate new
sessions.

All connection tracking is done by a special framework within the kernel
called conntrack. conntrack may be loaded either as a module or as an internal
part of the kernel itself. Most of the time, we need and want more specific
connection tracking than the default conntrack engine can maintain. Because of
this, there are also more specific parts of conntrack that handle the TCP, UDP
or ICMP protocols, among others. These modules grab specific, unique
information from the packets so that they can keep track of each stream of
data. The information that conntrack gathers is then used to determine which
state the stream is currently in. For example, UDP streams are generally
uniquely identified by their destination IP address, source IP address,
destination port and source port.

In previous kernels, it was possible to turn defragmentation on and off.
However, since iptables and Netfilter were introduced, and connection tracking
in particular, this option has been removed. The reason is that connection
tracking cannot work properly without defragmenting packets, so
defragmentation has been incorporated into conntrack and is carried out
automatically. It cannot be turned off, except by turning off connection
tracking.

All connection tracking is handled in the PREROUTING chain, except for locally
generated packets, which are handled in the OUTPUT chain. What this means is
that iptables will do all recalculation of states and so on within the
PREROUTING chain. If we send the initial packet in a stream, the state gets
set to NEW within the OUTPUT chain, and when we receive a return packet, the
state gets changed in the PREROUTING chain to ESTABLISHED, and so on. If the
first packet is not originated by ourselves, the NEW state is of course set
within the PREROUTING chain. So, all state changes and calculations are done
within the PREROUTING and OUTPUT chains, by the conntrack code itself rather
than by any rule in the nat table.

The conntrack entries

Let's take a brief look at a conntrack entry and how to read it in
/proc/net/ip_conntrack. This file gives a list of all the current entries in
your conntrack database. If you have the ip_conntrack module loaded, a cat of
/proc/net/ip_conntrack might look like:

This example contains all the information that the conntrack module maintains
to know which state a specific connection is in. First of all, we have a
protocol, which in this case is tcp. Next comes the same protocol expressed as
its decimal protocol number. After this, we see how long this conntrack entry
has to live. This value is set to 117 seconds right now and is decremented
regularly until we see more traffic, at which point it is reset to the default
value for the state that the entry is in at that time. Next comes the actual
state that this entry is in at present. In the above-mentioned case we are
looking at a connection that is in the SYN_SENT state. The internal state of a
connection is slightly different from the states used externally with
iptables. The value SYN_SENT tells us that we are looking at a connection that
has only seen a TCP SYN packet in one direction. Next, we see the source IP
address, destination IP address, source port and destination port. At this
point we see a specific keyword that tells us that we have seen no return
traffic for this connection. Lastly, we see what we expect of return packets.
This information details the source IP address and destination IP address
(which are swapped, since the return packet is to be directed back to us). The
same goes for the source port and destination port of the connection. These
are the values that are of interest to us.
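
To make the field layout concrete, here is a minimal Python sketch that splits
one entry into the fields described above. The sample line is illustrative
only: its addresses and ports are invented (taken from a documentation address
range), and its layout is assumed to follow the description in this section
(protocol name, protocol number, timeout, state, original tuple, flag, then
the expected reply tuple). The function name parse_entry is hypothetical.

```python
import re

# Illustrative entry; addresses and ports are invented for this sketch.
SAMPLE = ("tcp      6 117 SYN_SENT src=192.0.2.6 dst=192.0.2.9 "
          "sport=32775 dport=22 [UNREPLIED] src=192.0.2.9 dst=192.0.2.6 "
          "sport=22 dport=32775")


def parse_entry(line):
    """Split a TCP conntrack line into the fields discussed in the text."""
    tokens = line.split()
    entry = {
        "proto": tokens[0],            # protocol name
        "proto_num": int(tokens[1]),   # same protocol, as a decimal number
        "timeout": int(tokens[2]),     # seconds this entry has left to live
        "state": tokens[3],            # internal conntrack state
        "flags": [t.strip("[]") for t in tokens if t.startswith("[")],
    }
    # The key=value pairs appear twice: first for the original direction,
    # then for the expected reply direction.
    pairs = re.findall(r"(src|dst|sport|dport)=(\S+)", line)
    entry["original"] = dict(pairs[:4])
    entry["reply"] = dict(pairs[4:8])
    return entry


entry = parse_entry(SAMPLE)
print(entry["state"], entry["flags"], entry["reply"])
```

Note how the reply tuple is simply the original tuple with source and
destination swapped, which is what lets conntrack recognize return traffic.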

The connection tracking entries may take on a series of different values, all
specified in the conntrack headers available in the
linux/include/netfilter-ipv4/ip_conntrack*.h files. These values depend on
which sub-protocol of IP we use. The TCP, UDP and ICMP protocols take specific
default values as specified in linux/include/netfilter-ipv4/ip_conntrack.h.
We will look closer at this when we look at each of the protocols; however, we
will not use them extensively in this chapter, since they are not used outside
of the conntrack internals. Also, depending on how this state changes, the
default value of the time until the connection is destroyed will also change.

Recently there was a new patch made available in iptables
patch-o-matic,
called tcp-window-tracking. This patch adds, among other things, all of
the above timeouts to special sysctl variables, which means that they
can
be changed on the fly, while the system is still running. Hence, this
makes it unnecessary to recompile the kernel every time you want to
change
the timeouts.

These timeouts can be altered by writing to the entries available in the
/proc/sys/net/ipv4/netfilter directory. You should look in particular at the
/proc/sys/net/ipv4/netfilter/ip_ct_* variables.

When a connection has seen traffic in both directions, the [UNREPLIED] flag
is removed from the conntrack entry. It is replaced by the [ASSURED] flag,
found close to the end of the entry. The [ASSURED] flag tells us that this
connection is assured and that it will not be erased if we reach the maximum
number of tracked connections. Thus, connections marked as [ASSURED] will not
be erased, contrary to the non-assured connections (those not marked as
[ASSURED]). How many connections the connection tracking table can hold
depends upon a variable that can be set through the ip-sysctl functions in
recent kernels. The default value held by this entry varies heavily depending
on how much memory you have. On 128 MB of RAM you will get 8192 possible
entries, and at 256 MB of RAM, you will get 16376 entries. You can read and
change this limit through the /proc/sys/net/ipv4/ip_conntrack_max setting.

A different way of doing this, that is more efficient, is to set the hashsize
option of the ip_conntrack module when it is loaded. Under normal
circumstances ip_conntrack_max equals 8 * hashsize. In other words, setting
the hashsize to 4096 will result in ip_conntrack_max being set to 32768
conntrack entries. An example of this would be:
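
The relationship between the two values can be sketched in a few lines of
Python. This only reproduces the arithmetic stated above (ip_conntrack_max
normally equals 8 * hashsize); the function names are hypothetical, and the
sketch does not touch the module or the proc file-system.

```python
# Ratio taken from the text: ip_conntrack_max is normally 8 * hashsize.
ENTRIES_PER_BUCKET = 8


def conntrack_max_for(hashsize):
    """Return the conntrack table limit implied by a given hashsize."""
    return ENTRIES_PER_BUCKET * hashsize


def hashsize_for(wanted_entries):
    """Smallest hashsize whose implied limit is at least wanted_entries."""
    return -(-wanted_entries // ENTRIES_PER_BUCKET)  # ceiling division


print(conntrack_max_for(4096))   # 32768
print(hashsize_for(8192))        # 1024
```

Working backwards with hashsize_for is handy when you know how many
connections you need to track and want to pick a matching hashsize.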

User-land states

As you have seen, packets may take on several different states within the
kernel itself, depending on what protocol we are talking about. However,
outside the kernel, we only have the four states described previously, plus
the special UNTRACKED state. These states are mainly used in conjunction with
the state match, which is then able to match packets based on their current
connection tracking state. The valid states are NEW, ESTABLISHED, RELATED and
INVALID, with UNTRACKED marking packets that were deliberately excluded from
tracking. The following table briefly explains each possible state.

Table 7-1. User-land states

State

Explanation

NEW

The NEW state tells us that the packet is
the first packet that we see. This means that the first packet that the
conntrack module sees, within a specific connection, will be matched.
For
example, if we see a SYN packet and it is the first
packet in a connection that we see, it will match. However, the packet
may as
well not be a SYN packet and still be considered
NEW. This may lead to certain problems in some instances,
but it may also be extremely helpful when we need to pick up lost
connections
from other firewalls, or when a connection has already timed out, but
in
reality is not closed.

ESTABLISHED

The ESTABLISHED state has seen traffic in both
directions and will then continuously match those packets.
ESTABLISHED connections are fairly easy to understand. The
only requirement to get into an ESTABLISHED state is that
one host sends a packet, and that it later on gets a reply from the
other
host. The NEW state will upon receipt of the reply packet
to or through the firewall change to the ESTABLISHED state.
ICMP reply messages can also be
considered as ESTABLISHED, if we created a packet
that in turn generated the reply ICMP message.

RELATED

The RELATED state is one of the more tricky
states. A connection is considered RELATED when it is
related to another already ESTABLISHED connection. What
this means, is that for a connection to be considered as
RELATED, we must first have a connection that is considered
ESTABLISHED. The ESTABLISHED connection
will then spawn a connection outside of the main connection. The newly
spawned
connection will then be considered RELATED, if the
conntrack module is able to understand that it is RELATED.
Some good examples of connections that can be considered as
RELATED are the FTP-data
connections that are considered RELATED to the
FTP control port, and the
DCC connections issued through
IRC. This could be used to allow
ICMP error messages, FTP
transfers and DCC's to work properly through the
firewall. Do note that most TCP protocols and some
UDP protocols that rely on this mechanism are quite
complex and send connection information within the payload of the
TCP or UDP data segments,
and hence require special helper modules to be correctly understood.

INVALID

The INVALID state means that the packet can't be identified or that it does
not have any state. This may be due to several reasons, such as the system
running out of memory or ICMP error messages that do not correspond to any
known connections. Generally, it is a good idea to DROP everything in this
state.

UNTRACKED

In brief, if a packet is marked within the raw table with the NOTRACK target,
then that packet will show up as UNTRACKED in the state machine. This also
means that no RELATED connections will be seen, so some caution must be taken
when dealing with UNTRACKED connections, since the state machine will not be
able to see related ICMP messages et cetera.

These states can be used together with the --state
match to match packets based on their connection tracking state. This
is what
makes the state machine so incredibly strong and efficient for our
firewall.
Previously, we often had to open up all ports above 1024 to let all
traffic
back into our local networks again. With the state machine in place
this is
not necessary any longer, since we can now just open up the firewall
for
return traffic and not for all kinds of other traffic.
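
The idea of opening the firewall for return traffic only can be modelled with
a small Python sketch. It is a toy decision function, not iptables itself: the
name decide and the two direction labels are hypothetical, and accepting
RELATED alongside ESTABLISHED is an assumption made for this example.

```python
def decide(direction, state):
    """Return 'ACCEPT' or 'DROP' for a packet.

    direction: 'out' (leaving the local network) or 'in' (coming back).
    state: the user-land state that the state match would report.
    """
    if state == "INVALID":
        return "DROP"                      # as recommended in Table 7-1
    if direction == "out":
        allowed = {"NEW", "ESTABLISHED", "RELATED"}
    else:
        allowed = {"ESTABLISHED", "RELATED"}   # return traffic only
    return "ACCEPT" if state in allowed else "DROP"


for direction, state in [("out", "NEW"), ("in", "ESTABLISHED"), ("in", "NEW")]:
    print(direction, state, decide(direction, state))
```

The last case is the important one: a NEW packet arriving from outside is
dropped, so nobody can initiate a session towards the local network, while
replies to our own sessions still get through without opening any port range.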

TCP connections

In this section and the upcoming ones, we will take a closer look at
the
states and how they are handled for each of the three basic protocols
TCP, UDP and
ICMP. Also, we will take a closer look at how
connections are handled per default, if they can not be classified as
either
of these three protocols. We have chosen to start out with the
TCP protocol since it is a stateful protocol in
itself, and has a lot of interesting details with regard to the state
machine
in iptables.

A TCP connection is always initiated with the 3-way
handshake, which establishes and negotiates the actual connection over
which
data will be sent. The whole session is begun with a
SYN packet, then a SYN/ACK
packet and finally an ACK packet to acknowledge the
whole session establishment. At this point the connection is
established and
able to start sending data. The big problem is, how does connection
tracking
hook up into this? Quite simply really.

As far as the user is concerned, connection tracking works basically the same
for all connection types. Have a look at the picture below to see exactly what
state the stream enters during the different stages of the connection. As you
can see, the connection tracking code does not really follow the flow of the
TCP connection, from the user's viewpoint. Once it has seen one packet (the
SYN), it considers the connection as NEW. Once it sees the return packet
(SYN/ACK), it considers the connection as ESTABLISHED. If you think about this
for a second, you will understand why. With this particular implementation,
you can allow NEW and ESTABLISHED packets to leave your local network, only
allow ESTABLISHED connections back, and that will work perfectly. Conversely,
if the connection tracking machine were to consider the whole connection
establishment as NEW, we would never really be able to stop outside
connections to our local network, since we would have to allow NEW packets
back in again. To make things more complicated, there are a number of other
internal states that are used for TCP connections inside the kernel, but which
are not available to us in User-land. Roughly, they follow the state standards
specified within RFC 793 - Transmission Control Protocol on pages 21-23. We
will consider these in more detail further along in this section.

As you can see, it is really quite simple, seen from the user's
point of view.
However, looking at the whole construction from the kernel's point of
view,
it's a little more difficult. Let's look at an example. Consider
exactly how
the connection states change in the
/proc/net/ip_conntrack table. The first state
is reported
upon receipt of the first SYN packet in a connection.

As you can see from the above entry, we have a precise state in
which a SYN
packet has been sent, (the SYN_SENT
flag is set), and to which as yet no reply has been sent (witness the
[UNREPLIED] flag). The next internal state
will be reached when we see another packet in the other direction.

Now we have received a corresponding SYN/ACK in return. As soon as this
packet has been received, the state changes once again, this time to SYN_RECV.
SYN_RECV tells us that the original SYN was delivered correctly and that the
SYN/ACK return packet also got through the firewall properly. Moreover, this
connection tracking entry has now seen traffic in both directions and is hence
considered as having been replied to. This is not shown explicitly; rather, it
is implied by the absence of the [UNREPLIED] flag seen above. The final step
will be reached once we have seen the final ACK in the 3-way handshake.

In the last example, we have received the final ACK in the 3-way handshake and
the connection has entered the ESTABLISHED state, as far as the internal
mechanisms of iptables are aware. Normally, the stream will be [ASSURED] by
now.
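
The three steps just described can be summarized in a short Python sketch that
pairs each handshake packet with the internal state conntrack records and the
user-land state the packet matches. The table is taken from the description in
this section; the names HANDSHAKE and track are hypothetical, and the sketch
covers the normal handshake only.

```python
# (packet seen, internal state afterwards, user-land state of that packet)
HANDSHAKE = [
    ("SYN",     "SYN_SENT",    "NEW"),
    ("SYN/ACK", "SYN_RECV",    "ESTABLISHED"),
    ("ACK",     "ESTABLISHED", "ESTABLISHED"),
]


def track(packets):
    """Return (internal, user-land) states for handshake packets in order."""
    result = []
    for seen, (expected, internal, userland) in zip(packets, HANDSHAKE):
        if seen != expected:
            raise ValueError(f"unexpected packet {seen!r}, wanted {expected!r}")
        result.append((internal, userland))
    return result


for internal, userland in track(["SYN", "SYN/ACK", "ACK"]):
    print(f"{internal:<12} {userland}")
```

The point to notice is that the user-land state flips to ESTABLISHED already
at the SYN/ACK, while the internal state needs one more packet to get there.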

A connection may also enter the ESTABLISHED state but not be [ASSURED]. This
happens if we have connection pickup turned on (this requires the
tcp-window-tracking patch, and ip_conntrack_tcp_loose to be set to 1 or
higher). Without the tcp-window-tracking patch, this behaviour is the default
and cannot be changed.

When a TCP connection is closed down, it is done in
the following way and takes the following states.

As you can see, the connection is never really closed until the last ACK is
sent. Do note that this picture only describes how it is closed down under
normal circumstances. A connection may also, for example, be closed by sending
a RST (reset) if the connection were to be refused. In this case, the
connection would be closed down immediately.

When the TCP connection has been closed down, the
connection enters the TIME_WAIT state, which
is per default set to 2 minutes. This is used so that all packets that
have
gotten out of order can still get through our rule-set, even after the
connection has already closed. This is used as a kind of buffer time so
that
packets that have gotten stuck in one or another congested router can
still
get to the firewall, or to the other end of the connection.

If the connection is reset by a RST packet,
the state is changed to CLOSE. This
means that the connection per default has 10 seconds before the whole
connection is definitely closed down. RST packets are
not acknowledged in any sense, and will break the connection directly.
There
are also other states than the ones we have told you about so far. Here
is the
complete list of possible states that a TCP stream
may take, and their timeout values.

Table 7-2. Internal states

State          Timeout value
-----          -------------
NONE           30 minutes
ESTABLISHED    5 days
SYN_SENT       2 minutes
SYN_RECV       60 seconds
FIN_WAIT       2 minutes
TIME_WAIT      2 minutes
CLOSE          10 seconds
CLOSE_WAIT     12 hours
LAST_ACK       30 seconds
LISTEN         2 minutes

These values are most definitely not absolute. They may change with
kernel
revisions, and they may also be changed via the proc file-system in the
/proc/sys/net/ipv4/netfilter/ip_ct_tcp_*
variables. The
default values should, however, be fairly well established in practice.
These
values are set in seconds. Early versions of the patch used jiffies
(which was a bug).
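
Since the proc entries take their values in seconds, it can help to see Table
7-2 converted to that unit. The following Python sketch is just that
conversion; the dictionary name TCP_TIMEOUTS is hypothetical, and the values
are the defaults listed in the table, which may differ on your kernel.

```python
MINUTE, HOUR, DAY = 60, 60 * 60, 24 * 60 * 60

# Defaults from Table 7-2, expressed in seconds.
TCP_TIMEOUTS = {
    "NONE": 30 * MINUTE,
    "ESTABLISHED": 5 * DAY,
    "SYN_SENT": 2 * MINUTE,
    "SYN_RECV": 60,
    "FIN_WAIT": 2 * MINUTE,
    "TIME_WAIT": 2 * MINUTE,
    "CLOSE": 10,
    "CLOSE_WAIT": 12 * HOUR,
    "LAST_ACK": 30,
    "LISTEN": 2 * MINUTE,
}

for state, seconds in TCP_TIMEOUTS.items():
    print(f"{state:<12} {seconds:>7}")
```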

Also note that the User-land side of the state machine does not look at the
TCP flags (i.e., RST, ACK and SYN) set in the TCP packets. This is generally
bad, since you may want to allow packets in the NEW state to get through the
firewall, but when you specify the NEW state, you will in most cases mean SYN
packets.

This is not what happens with the current state implementation; instead, even
a packet with no bit set, or with only an ACK flag, will count as NEW. This
can be used for redundant firewalling and so on, but it is generally extremely
bad on your home network, where you only have a single firewall. To get around
this behavior, you could use the command explained in the State NEW packets
but no SYN bit set section of the Common problems and questions appendix.
Another way is to install the tcp-window-tracking extension from patch-o-matic
and set /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_loose to zero, which
will make the firewall drop all NEW packets with anything but the SYN flag
set.

UDP connections

UDP connections are in themselves not stateful
connections, but rather stateless. There are several reasons why,
mainly
because they don't contain any connection establishment or connection
closing; most of all they lack sequencing. Receiving two
UDP datagrams in a specific order does not say
anything about the order in which they were sent. It is, however,
still possible to set states on the connections within the kernel.
Let's have
a look at how a connection can be tracked and how it might look in
conntrack.

As you can see, the connection is brought up almost exactly in the
same way as a TCP connection. That is, from the
user-land point of view. Internally, conntrack information looks quite
a bit
different, but intrinsically the details are the same. First of all,
let's
have a look at the entry after the initial UDP packet
has been sent.

As you can see from the first and second values, this is a UDP packet. The
first is the protocol name, and the second is the protocol number. This is
just the same as for TCP connections. The third value marks how many seconds
this state entry has to live. After this, we get the values of the packet that
we have seen and the future expectations of packets over this connection
reaching us from the initiating packet sender. These are the source,
destination, source port and destination port. At this point, the [UNREPLIED]
flag tells us that there has so far been no response to the packet. Finally,
we get a brief list of the expectations for returning packets. Do note that
the latter entries are in reverse order to the first values. The timeout at
this point is set to 30 seconds, as per default.

At this point the server has seen a reply to the first packet sent out and the
connection is now considered as ESTABLISHED. This is not shown in the
connection tracking entry, as you can see. The main difference is that the
[UNREPLIED] flag has now gone. Moreover, the default timeout has changed to
180 seconds, but in this example that has by now been decremented to 170
seconds; in 10 seconds' time, it will be 160 seconds. There is one thing still
missing, though, and that is the [ASSURED] flag described above. For the
[ASSURED] flag to be set on a tracked connection, there must have been a
legitimate reply packet to the NEW packet.

At this point, the connection has become assured. The connection looks
exactly the same as in the previous example. If this connection is not used
for 180 seconds, it times out. 180 seconds is a comparatively low value, but
should be sufficient for most uses. This value is reset to its full value for
each packet that matches the same entry and passes through the firewall, just
the same as for all of the internal states.
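
The timer behaviour described for UDP can be sketched as a small Python class.
It is a simplified model built only from the figures in this section (30
seconds before a reply, 180 seconds afterwards, refreshed by every matching
packet); the class and method names are hypothetical and it does not model the
[ASSURED] flag.

```python
UDP_TIMEOUT_UNREPLIED = 30    # seconds, before any reply has been seen
UDP_TIMEOUT_REPLIED = 180     # seconds, once traffic flows both ways


class UdpEntry:
    """Toy model of one UDP conntrack entry and its timeout."""

    def __init__(self):
        self.replied = False
        self.timeout = UDP_TIMEOUT_UNREPLIED

    def packet(self, is_reply):
        """Account for one matching packet and refresh the timer."""
        if is_reply:
            self.replied = True
        self.timeout = (UDP_TIMEOUT_REPLIED if self.replied
                        else UDP_TIMEOUT_UNREPLIED)

    def tick(self, seconds):
        """Let time pass; return False once the entry has expired."""
        self.timeout = max(0, self.timeout - seconds)
        return self.timeout > 0


entry = UdpEntry()
entry.packet(is_reply=True)   # reply seen: timer jumps to 180
entry.tick(10)
print(entry.timeout)          # 170
```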

ICMP connections

ICMP packets are far from a stateful stream, since
they are only used for controlling and should never establish any
connections.
There are four ICMP types that will generate return
packets however, and these have 2 different states. These
ICMP messages can take the NEW and
ESTABLISHED states. The ICMP types
we are talking about are Echo request and
reply, Timestamp request and
reply, Information request
and reply and finally Address mask
request and reply. Out of these, the
timestamp request and information
request are obsolete and could most probably just be dropped.
However, the Echo messages are used in several setups
such as pinging hosts. Address mask requests are not
used often, but could be useful at times and worth allowing. To get
an idea of how this could look, have a look at the following image.

As you can see in the above picture, the host sends an echo request to the
target, which is considered as NEW by the firewall. The target then responds
with an echo reply, which the firewall considers as state ESTABLISHED. When
the first echo request has been seen, the following state entry goes into
ip_conntrack.

This entry looks a little bit different from the standard states for TCP and
UDP, as you can see. The protocol is there, and the timeout, as well as the
source and destination addresses. The difference comes after that, however. We
now have 3 new fields called type, code and id. They are not special in any
way: the type field contains the ICMP type and the code field contains the
ICMP code. These are all available in the ICMP types appendix. The final id
field contains the ICMP ID. Each ICMP packet gets an ID set to it when it is
sent, and when the receiver gets the ICMP message, it sets the same ID within
the new ICMP message so that the sender will recognize the reply and will be
able to connect it with the correct ICMP request.

The next field, we once again recognize as the
[UNREPLIED] flag, which we have seen before.
Just as before, this flag tells us that we are currently looking at a
connection tracking entry that has seen only traffic in one direction.
Finally, we see the reply expectation for the reply
ICMP packet, which is the inversion of the original
source and destination IP addresses. As for the type and code, these
are
changed to the correct values for the return packet, so an echo request
is
changed to echo reply and so on. The ICMP ID is
preserved from the request packet.
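
How the reply expectation is derived from the request can be illustrated with
a few lines of Python. The type numbers are the standard ICMP values for the
four request/reply pairs named earlier in this section; the function name,
dictionary layout and sample addresses are hypothetical and only mirror the
fields described above.

```python
# ICMP request type -> matching reply type.
REPLY_TYPE = {
    8: 0,     # echo request        -> echo reply
    13: 14,   # timestamp request   -> timestamp reply
    15: 16,   # information request -> information reply
    17: 18,   # address mask request -> address mask reply
}


def expected_reply(src, dst, icmp_type, icmp_id, code=0):
    """Build the reply tuple conntrack would expect for an ICMP request."""
    if icmp_type not in REPLY_TYPE:
        raise ValueError(f"ICMP type {icmp_type} does not generate a reply")
    return {
        "src": dst,                        # addresses are swapped
        "dst": src,
        "type": REPLY_TYPE[icmp_type],     # request type becomes reply type
        "code": code,
        "id": icmp_id,                     # ID is preserved
    }


print(expected_reply("192.0.2.6", "192.0.2.10", icmp_type=8, icmp_id=33029))
```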

The reply packet is considered as being ESTABLISHED, as we
have already explained. However, we can know for sure that after the
ICMP reply, there will be absolutely no more legal
traffic in the same connection. For this reason, the connection
tracking entry
is destroyed once the reply has traveled all the way through the
Netfilter
structure.

In each of the above cases, the request is considered as
NEW, while the reply is considered as
ESTABLISHED. Let's consider this more closely. When the
firewall sees a request packet, it considers it as NEW.
When the host sends a reply packet to the request it is considered
ESTABLISHED.

Note that this means that the reply packet must match the
criterion given by
the connection tracking entry to be considered as established, just as
with
all other traffic types.

ICMP requests have a default timeout of 30 seconds, which you can change in
the /proc/sys/net/ipv4/netfilter/ip_ct_icmp_timeout entry. This should in
general be a good timeout value, since it will be able to catch most packets
in transit.

Another hugely important part of ICMP is the fact
that it is used to tell the hosts what happened to specific
UDP and TCP connections or
connection attempts. For this simple reason, ICMP replies will very
often be
recognized as RELATED to original connections or
connection attempts. A simple example would be the
ICMP Host unreachable or ICMP Network
unreachable. These should always be spawned back to our host if
it attempts an unsuccessful connection to some other host, but the
network or
host in question could be down, and hence the last router trying to
reach the
site in question will reply with an ICMP message
telling us about it. In this case, the ICMP reply is
considered as a RELATED packet. The following picture
should explain how it would look.

In the above example, we send out a SYN packet to a specific address. This is
considered as a NEW connection by the firewall. However, the network the
packet is trying to reach is unreachable, so a router returns a network
unreachable ICMP error to us. The connection tracking code can recognize this
packet as RELATED, thanks to the already added tracking entry, so the ICMP
reply is correctly sent to the client, which will then hopefully abort.
Meanwhile, the firewall has destroyed the connection tracking entry since it
knows this was an error message.

The same behavior as above is experienced with UDP
connections if they run into any problem like the above. All
ICMP messages sent in reply to
UDP connections are considered as
RELATED. Consider the following image.

This time a UDP packet is sent to the host. This UDP connection is considered
as NEW. However, the network is administratively prohibited by some firewall
or router on the way over. Hence, our firewall receives an ICMP Network
Prohibited in return. The firewall knows that this ICMP error message is
related to the already opened UDP connection and sends it as a RELATED packet
to the client. At this point, the firewall destroys the connection tracking
entry, and the client receives the ICMP message and should hopefully abort.

Default connections

In certain cases, the conntrack machine does not know how to handle
a specific
protocol. This happens if it does not know about that protocol in
particular,
or doesn't know how it works. In these cases, it goes back to a default
behavior. The default behavior is used on, for example,
NETBLT, MUX and
EGP. This behavior looks pretty much the
same as the UDP connection tracking. The first packet
is considered NEW, and reply traffic and so forth is
considered ESTABLISHED.

When the default behavior is used, all of these packets will attain
the same
default timeout value. This can be set via the
/proc/sys/net/ipv4/netfilter/ip_ct_generic_timeout
variable. The default value here is 600 seconds, or 10 minutes.
Depending on
what traffic you are trying to send over a link that uses the default
connection tracking behavior, this might need changing. Especially if
you are
bouncing traffic through satellites and such, which can take a long
time.

Untracked connections and the raw table

UNTRACKED is a rather special keyword when it comes to connection tracking in
Linux. Basically, it is used to match packets that have been marked in the raw
table not to be tracked.

The raw table was created specifically for this reason. In this
table, you set a NOTRACK mark on packets that you do not wish to track
in netfilter.

Notice how I say packets, not connection, since the mark is
actually set for each and every packet that enters. Otherwise, we would
still have to do some kind of tracking of the connection to know that
it should not be tracked.

As we have already stated in this chapter, conntrack and the state machine
are rather resource hungry. For this reason, it might sometimes be a good idea
to turn off connection tracking and the state machine.

One example would be if you have a heavily trafficked router that you want to
firewall the incoming and outgoing traffic on, but not the routed traffic. You
could then ACCEPT all packets destined for the firewall itself in the raw
table, and set the NOTRACK mark on all other traffic. This would then allow
you to have stateful matching on incoming traffic for the router itself, but
at the same time save processing power by not handling all the crossing
traffic.

Another example of when NOTRACK can be used is if you have a highly trafficked
web server and want to do stateful tracking, but don't want to waste
processing power on tracking the web traffic. You could then set up a rule
that turns off tracking for port 80 on all the locally owned IP addresses, or
the ones that are actually serving web traffic. You could then enjoy stateful
tracking on all other services, except for web traffic, which might save some
processing power on an already overloaded system.
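
As a rough illustration of the web server case, the Python sketch below
assembles the raw-table rules one might use, as argument lists. Nothing is
executed: the commands are only built and printed. The function name and the
sample address are hypothetical, and the choice to exempt both directions
(PREROUTING for incoming requests, OUTPUT for locally generated replies) is an
assumption of this sketch rather than something prescribed by the text.

```python
def notrack_web_rules(addresses, port=80):
    """Build iptables argument lists that exempt web traffic from tracking."""
    rules = []
    for addr in addresses:
        # Requests arriving at the web server.
        rules.append(["iptables", "-t", "raw", "-A", "PREROUTING",
                      "-p", "tcp", "-d", addr, "--dport", str(port),
                      "-j", "NOTRACK"])
        # Replies generated locally by the web server.
        rules.append(["iptables", "-t", "raw", "-A", "OUTPUT",
                      "-p", "tcp", "-s", addr, "--sport", str(port),
                      "-j", "NOTRACK"])
    return rules


for rule in notrack_web_rules(["192.0.2.10"]):
    print(" ".join(rule))
```

Keep in mind that packets exempted this way show up as UNTRACKED, so the
filter rules must accept them explicitly rather than rely on ESTABLISHED.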

There are, however, some problems with NOTRACK that you must take into
consideration. If a whole connection is set with NOTRACK, then you will not be
able to track related connections either. Conntrack and NAT helpers will
simply not work for untracked connections, nor will related ICMP errors be
recognized. In other words, you will have to open up for these manually. When
it comes to complex protocols such as FTP and SCTP et cetera, this can be very
hard to manage. As long as you are aware of this, however, you should be able
to handle it.

Complex protocols and connection tracking

Certain protocols are more complex than others. What this means when
it comes to connection tracking is that such protocols may be harder
to track correctly. Good examples of these are the ICQ, IRC and FTP
protocols. Each of these protocols carries information within the
actual data payload of the packets, and hence requires a special
connection tracking helper to function correctly.

This is a list of the complex protocols that have support inside the
Linux kernel, and the kernel version in which each was introduced.

Table 7-3. Complex protocols support

Protocol name     Kernel versions
FTP               2.3
IRC               2.3
TFTP              2.5
Amanda            2.5

Let's take the FTP protocol as the first example. The FTP protocol
first opens up a single connection that is called the FTP control
session. When we issue commands through this session, other ports are
opened to carry the rest of the data related to that specific command.
These connections can be made in two ways, either actively or
passively. When a connection is made actively, the FTP client sends the
server a port and IP address to connect to. After this, the FTP client
opens up the port and the server connects to that specified port,
normally from its own port 20 (the FTP-data port), and sends the data
over it.

The problem here is that the firewall will not know about these
extra connections, since they were negotiated within the actual payload
of the protocol data. Because of this, the firewall will be unable to
know
that it should let the server connect to the client over these specific
ports.
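
To make the problem concrete: in active mode the client announces its
address with the PORT command of the FTP control protocol (RFC 959),
whose argument is six comma-separated decimal numbers, four for the IP
address and two for the port. The sketch below decodes such an
argument. parse_ftp_port is a hypothetical function name, and the code
only illustrates the kind of information hidden in the payload; it is
not how the kernel itself does it.

```shell
# Sketch: decode the argument of an FTP "PORT h1,h2,h3,h4,p1,p2" command.
# The function body runs in a subshell so the IFS change stays local.
parse_ftp_port() (
    # Split the comma-separated numbers into positional parameters.
    IFS=,
    set -- $1
    # Exactly six numbers are expected; anything else is rejected.
    [ "$#" -eq 6 ] || exit 1
    # The first four numbers form the IP address,
    # and the port is p1 * 256 + p2.
    echo "$1.$2.$3.$4 $(( $5 * 256 + $6 ))"
)

parse_ftp_port "192,168,1,2,4,1"   # prints: 192.168.1.2 1025
```

A packet filter that only looks at IP and TCP headers never sees these
numbers, which is why the data connection to port 1025 in this example
would come as a surprise to it.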

The solution to this problem is to add a special helper to the
connection tracking module which will scan through the data in the
control connection for specific syntaxes and information. When it runs
into the correct information, it will add that specific information as
RELATED, and the firewall will be able to track the server's
connection back to the client thanks to that RELATED entry. Consider
the following picture to understand the states when the FTP server has
made the connection back to the client.

Passive FTP works the opposite way. The FTP client tells the server
that it wants some specific data, upon which the server replies with an
IP address and a port to connect to. The client will, upon receipt of
this data, connect to that specific port from a random unprivileged
port of its own (1024 or above), and get the data in question. If you
have an FTP server behind your firewall, you will in other words
require this helper module in addition to your standard iptables
modules to let clients on the Internet connect to the FTP server
properly. The same goes if you are extremely restrictive toward your
users, and only want to let them reach HTTP and FTP servers on the
Internet and block all other ports. Consider the following image and
its bearing on passive FTP.
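
With the FTP helper in place, the filter rules themselves can stay
short. The following sketch is for an FTP server behind the firewall
and rests on several assumptions: the FTP conntrack helper is already
loaded, FTP_SERVER is a made-up address for the server,
allow_ftp_server is a hypothetical function name, and IPTABLES is a
placeholder variable that can be set to "echo iptables" to print the
rules instead of applying them.

```shell
# Sketch: let clients reach an FTP server behind the firewall.
# FTP_SERVER is an example address (an assumption); adjust as needed.
IPTABLES="${IPTABLES:-iptables}"
FTP_SERVER="${FTP_SERVER:-192.168.1.20}"

allow_ftp_server() {
    # New control connections to the server's port 21.
    $IPTABLES -A FORWARD -p tcp -d "$FTP_SERVER" --dport 21 \
        -m state --state NEW -j ACCEPT
    # Data connections negotiated on the control session are RELATED
    # when they start and ESTABLISHED afterwards, in active as well as
    # passive mode, so no data ports need to be opened by hand.
    $IPTABLES -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT
}

# Run as root to install the rules:
# allow_ftp_server
```

Without the helper, the second rule would never see a RELATED data
connection, and the data ports would have to be opened manually.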

Some conntrack helpers are already available within the kernel
itself. More specifically, the FTP and IRC protocols have conntrack
helpers as of this writing. If you cannot find the conntrack helpers
that you need within the kernel itself, you should have a look at the
patch-o-matic tree within user-land iptables. The patch-o-matic tree
may contain more conntrack helpers, such as for the ntalk or H.323
protocols. If they are not available in the patch-o-matic tree, you
have a number of options. You can either look at the CVS source of
iptables, in case the helper has recently gone into that tree, or you
can contact the Netfilter-devel mailing list and ask if it is
available. If it is not, and there are no plans for adding it, you are
left to your own devices and would most probably want to read Rusty
Russell's Unreliable Netfilter Hacking HOW-TO, which is linked from
the Other resources and links appendix.

Conntrack helpers may either be statically compiled into the kernel or
built as modules. If they are compiled as modules, you can load them
with the modprobe command followed by the name of the helper module,
for example modprobe ip_conntrack_ftp.

Do note that connection tracking has nothing to do with NAT, and hence
you may require more modules if you are NAT'ing connections as well.
For example, if you want to NAT and track FTP connections, you would
need the NAT module as well. All NAT helpers have names starting with
ip_nat_, so for example the FTP NAT helper would be named ip_nat_ftp
and the IRC NAT helper would be named ip_nat_irc. The conntrack
helpers follow the same naming convention with the ip_conntrack_
prefix, and hence the IRC conntrack helper would be named
ip_conntrack_irc, while the FTP conntrack helper would be named
ip_conntrack_ftp.
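
The naming convention can be put to use in a small script such as the
sketch below. The function names helper_modules and load_helpers are
hypothetical, and MODPROBE is a placeholder variable that can be set to
"echo" to only print the module names. The sketch assumes the ip_
prefixed names described here; more recent kernels name these modules
with an nf_ prefix instead (nf_conntrack_ftp, nf_nat_ftp), so adjust
the prefixes to match your kernel.

```shell
# Sketch: derive helper module names from the naming convention.
MODPROBE="${MODPROBE:-modprobe}"

helper_modules() {
    # $1 is the protocol name, for example ftp or irc.
    echo "ip_conntrack_$1 ip_nat_$1"
}

load_helpers() {
    # Load the conntrack helper first, then the matching NAT helper.
    for mod in $(helper_modules "$1"); do
        $MODPROBE "$mod" || return 1
    done
}

# Run as root, for example:
# load_helpers ftp
```

If you do not NAT the connections in question, loading the conntrack
helper alone is enough.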

What's next?

This chapter has discussed how the state machine in netfilter works
and how it keeps state of different connections. The chapter has also
discussed how it is represented toward you, the end user, and what you
can do to alter its behavior, as well as different protocols that are
more complex to do connection tracking on, and how the different
conntrack helpers come into the picture.

The next chapter will discuss how to save and restore rulesets using
the iptables-save and iptables-restore programs distributed with the
iptables package. This approach has both pros and cons, and the
chapter will discuss them in detail.