Netfilter’s connection tracking system

Pablo Neira Ayuso

Pablo Neira Ayuso has an M.S. in computer science and has worked for several companies in the IT security industry, with a focus on open source solutions. Nowadays he is a full-time teacher and researcher at the University of Seville.

pneira@lsi.us.es

;login: vol. 31, no. 3, June 2006

Filtering policies based uniquely on packet header information are obsolete. These days, stateful firewalls provide advanced mechanisms to let sysadmins and security experts define more intelligent policies. This article describes the implementation details of the connection tracking system provided by the Netfilter project and also presents the required background to understand it, such as an understanding of the Netfilter framework. This article will be the perfect complement to understanding the subsystem that enables the stateful firewall available in any recent Linux kernel.

The Netfilter Framework

The Netfilter project was founded by Paul “Rusty” Russell during the 2.3.x development series. At that time the existing firewalling tool for Linux had serious drawbacks that required a full rewrite. Rusty decided to start from scratch and create the Netfilter framework, which comprises a set of hooks over the Linux network protocol stack. With the hooks, you can register kernel modules that do some kind of network packet handling at different stages.

Iptables, the popular firewalling tool for Linux, is commonly confused with the Netfilter framework itself. This is because iptables chains and hooks have the same names. But iptables is just a brick on top of the Netfilter framework.

Fortunately, Rusty spent considerable time writing documentation [1] that comes in handy for anyone willing to understand the framework, although at some point you will surely feel the need to get your hands dirty and look at the code to go further.

THE HOOKS AND THE CALLBACK FUNCTIONS

Netfilter inserts five hooks (Fig. 1) into the Linux networking stack to perform packet handling at different stages; these are the following:

- PREROUTING: All incoming packets, with no exceptions, hit this hook, which is reached before the routing decision and after all the IP header sanity checks are fulfilled. Network Address Port Translation (NAPT) and redirections, that is, Destination Network Address Translation (DNAT), are implemented in this hook.
- LOCAL INPUT: All the packets going to the local machine reach this hook. This is the last hook in the incoming path for the local machine traffic.
- FORWARD: Packets not going to the local machine (e.g., packets going through the firewall) reach this hook.
- LOCAL OUTPUT: This is the first hook in the outgoing packet path. Packets generated by the local machine always hit this hook.
- POSTROUTING: This hook is reached after the routing decision. Source Network Address Translation (SNAT) is registered to this hook. All the packets that leave the local machine reach this hook.

[Figure 1: Netfilter hooks (NF_IP_PRE_ROUTING, NF_IP_LOCAL_IN, NF_IP_FORWARD, NF_IP_LOCAL_OUT, NF_IP_POST_ROUTING)]

Therefore we can model three kinds of traffic flows, depending on the destination:

- Traffic going through the firewall, in other words, traffic not going to the local machine. Such traffic follows the path: PREROUTING -> FORWARD -> POSTROUTING.
- Incoming traffic to the firewall, for example, traffic for the local machine. Such traffic follows the path: PREROUTING -> INPUT.
- Outgoing traffic from the firewall: OUTPUT -> POSTROUTING.

One can register a callback function to a given hook. The prototype of the callback function is defined in the structure nf_hook_ops in netfilter.h. This structure contains the information about the hook to which the callback will be registered, together with the priority. Since you can register more than one callback to a given hook, the priority indicates which callback is issued first. The register operation is done via the function nf_register_hook(...).

The callbacks can return several different values that will be interpreted by the framework in the following ways:

- ACCEPT: Lets the packet keep traveling through the stack.
- DROP: Silently discards the packet.
- QUEUE: Passes the packet to userspace via the nf_queue facility. Thus a userspace program will do the packet handling for us.
- STOLEN: Silently holds the packet until something happens, so that it temporarily does not continue to travel through the stack. This is usually used to collect the fragments of fragmented IP packets.
- REPEAT: Forces the packet to reenter the hook.

In short, the framework provides a method for registering a callback function that does some kind of packet handling at any of the stages previously detailed. The framework takes the value returned by the callback and applies the policy based on this verdict.

If at this point you consider the information provided here to be insufficient and need more background about the Linux network stack, then consult the available documentation [2] about packet travel through the Linux network stack.

The Connection Tracking System and the Stateful Inspection

The days when packet filtering policies were based uniquely on the packet header information, such as the IP source, destination, and ports, are over. Over the years, this approach has been demonstrated to be insufficient protection against probes and denial-of-service attacks.
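
Before moving on to connection tracking, the hook and verdict mechanism of the Netfilter framework described earlier can be made concrete with a small userspace model. This is only a sketch in Python, not kernel code: the names Framework, HookOps, and path_for are hypothetical, the priority values are illustrative, and the handling of QUEUE, STOLEN, and REPEAT is simplified to "stop traversing and hand the verdict back." The one kernel behavior it assumes is that callbacks with a lower priority value run first.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, Dict, List


class Hook(Enum):
    PREROUTING = auto()
    LOCAL_INPUT = auto()
    FORWARD = auto()
    LOCAL_OUTPUT = auto()
    POSTROUTING = auto()


class Verdict(Enum):
    ACCEPT = auto()
    DROP = auto()
    QUEUE = auto()
    STOLEN = auto()
    REPEAT = auto()


@dataclass(order=True)
class HookOps:
    """Toy counterpart of nf_hook_ops: a callback plus its priority."""
    priority: int
    name: str = field(compare=False)
    fn: Callable[[dict], Verdict] = field(compare=False)


class Framework:
    def __init__(self) -> None:
        self.hooks: Dict[Hook, List[HookOps]] = {hook: [] for hook in Hook}

    def register(self, hook: Hook, ops: HookOps) -> None:
        # Keep each hook's callbacks ordered by priority (lowest value first).
        self.hooks[hook].append(ops)
        self.hooks[hook].sort()

    def run(self, hook: Hook, packet: dict) -> Verdict:
        # Call every callback in order; the first non-ACCEPT verdict wins.
        for ops in self.hooks[hook]:
            packet.setdefault("trace", []).append(ops.name)
            verdict = ops.fn(packet)
            if verdict is not Verdict.ACCEPT:
                return verdict
        return Verdict.ACCEPT


def path_for(local_source: bool, local_destination: bool) -> List[Hook]:
    """Return the hooks visited by each of the three traffic flows."""
    if local_source:
        return [Hook.LOCAL_OUTPUT, Hook.POSTROUTING]
    if local_destination:
        return [Hook.PREROUTING, Hook.LOCAL_INPUT]
    return [Hook.PREROUTING, Hook.FORWARD, Hook.POSTROUTING]


def tracker(packet: dict) -> Verdict:
    return Verdict.ACCEPT  # observes the packet, never filters


def drop_telnet(packet: dict) -> Verdict:
    return Verdict.DROP if packet["dport"] == 23 else Verdict.ACCEPT


fw = Framework()
fw.register(Hook.PREROUTING, HookOps(0, "drop_telnet", drop_telnet))
fw.register(Hook.PREROUTING, HookOps(-200, "tracker", tracker))

web = {"dport": 80}
telnet = {"dport": 23}
web_verdict = fw.run(Hook.PREROUTING, web)
telnet_verdict = fw.run(Hook.PREROUTING, telnet)
```

Although drop_telnet is registered first, tracker runs before it because of its lower priority value, which is the ordering a tracking callback needs if later callbacks are to rely on its result.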

Fortunately, nowadays sysadmins can offer few excuses for not performing stateful filtering in their firewalls. There are open source implementations available that can be used in production environments. In the case of Linux, this feature was added during the birth of the Netfilter project. Connection tracking is another brick built on top of the Netfilter framework.

Basically, the connection tracking system stores information about the state of a connection in a memory structure that contains the source and destination IP addresses, port number pairs, protocol types, state, and timeout. With this extra information, we can define more intelligent filtering policies.

Moreover, there are some application protocols, such as FTP, TFTP, IRC, and PPTP, that have aspects that are hard to track for a firewall that follows the traditional static filtering approach. The connection tracking system defines a mechanism to track such aspects, as will be described below.

The connection tracking system does not filter the packets themselves; the default behavior always lets the packets continue their travel through the network stack, although there are a couple of very specific exceptions where packets can be dropped (e.g., under memory exhaustion). So keep in mind that the connection tracking system just tracks packets; it does not filter.

STATES

The possible states defined for a connection are the following:

- NEW: The connection is starting. This state is reached if the packet is valid, that is, if it belongs to the valid sequence of initialization (e.g., in a TCP connection, a SYN packet is received), and if the firewall has only seen traffic in one direction (i.e., the firewall has not yet seen any reply packet).
- ESTABLISHED: The connection has been established. In other words, this state is reached when the firewall has seen two-way communication.
- RELATED: This is an expected connection. This state is further described below, in the section “Helpers and Expectations.”
- INVALID: This is a special state used for packets that do not follow the expected behavior of a connection. Optionally, the sysadmin can define rules in iptables to log and drop such packets. As stated previously, connection tracking does not filter packets but, rather, provides a way to filter them.

As you have surely noticed already, by following the approach described, even connectionless protocols such as UDP are treated as stateful. And, of course, these states have nothing to do with the TCP states.

THE BIG PICTURE

This article focuses mainly on the layer-3 independent connection tracking system implementation nf_conntrack, which has been available since Linux kernel 2.6.15 and is based on the IPv4-dependent ip_conntrack. Support for specific aspects of IPv4 and IPv6 is implemented in the modules nf_conntrack_ipv4 and nf_conntrack_ipv6, respectively.

Layer-4 protocol support is also implemented in separate modules. Currently, there is built-in support for TCP, UDP, ICMP, and optionally for SCTP. These protocol handlers track the concrete aspects of a given layer-4 protocol to ensure that connections evolve correctly and that nothing evil happens.

The module nf_conntrack_ipv4 registers four callback functions (Fig. 1) in several hooks. These callbacks live in the file nf_conntrack_core.c and take as a parameter the layer-3 protocol family, so basically they are the same for IPv6. The callbacks can be grouped into three families: the conntrack creation and lookup, the defragmented packets, and the helpers. The module nf_conntrack_ipv6 will not be further described in this document, since it is similar to the IPv4 variant.

IMPLEMENTATION ISSUES

BASIC STRUCTURE

The connection tracking system is an optional, modular, loadable subsystem, although it is always required by the NAT subsystem. It is implemented with a hash table (Fig. 2) to perform efficient lookups. Each bucket has a doubly linked list of hash tuples. There are two hash tuples for every connection: one for the original direction (i.e., packets coming from the point that started the connection) and one for the reply direction (i.e., reply packets going to the point that started the connection).

[Figure 2: Connection tracking structure]

A tuple represents the relevant information of a connection: IP source and IP destination, as well as layer-4 protocol information. Such tuples are embedded in a hash tuple. Both structures are defined in nf_conntrack_tuple.h.

The two hash tuples are embedded in the structure nf_conn, from this point onward referred to as conntrack, which is the structure that stores the state of a given connection. Therefore, a conntrack is the container of two hash tuples, and every hash tuple is the container of a tuple. This results in three layers of embedded structures.

A hash function is used to calculate the position where the hash tuple that represents the connection is supposed to be.
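
The three layers of embedded structures described above can be sketched in a few lines of Python. This is a model under stated assumptions rather than the kernel layout: ConnTuple, TupleHash, Conntrack, and ConntrackTable are hypothetical names, Python lists stand in for the doubly linked lists, the reply tuple is taken to be the plain inversion of the original one (which ignores NAT), and Python's built-in hash() is only a stand-in for the kernel's hash function.

```python
from dataclasses import dataclass
from typing import List, Optional

ORIGINAL, REPLY = 0, 1


@dataclass(frozen=True)
class ConnTuple:
    """Layer 1: the addressing information that identifies one direction."""
    src_ip: str
    dst_ip: str
    proto: str
    src_port: int
    dst_port: int

    def inverted(self) -> "ConnTuple":
        return ConnTuple(self.dst_ip, self.src_ip, self.proto,
                         self.dst_port, self.src_port)


class TupleHash:
    """Layer 2: the hash table entry that wraps a tuple."""
    def __init__(self, key: ConnTuple, direction: int, conntrack: "Conntrack"):
        self.key = key
        self.direction = direction
        self.conntrack = conntrack  # back-reference to the container


class Conntrack:
    """Layer 3: one connection, holding a hash tuple per direction."""
    def __init__(self, original: ConnTuple) -> None:
        self.tuplehash = [
            TupleHash(original, ORIGINAL, self),
            TupleHash(original.inverted(), REPLY, self),
        ]


class ConntrackTable:
    def __init__(self, size: int = 16) -> None:
        self.buckets: List[List[TupleHash]] = [[] for _ in range(size)]

    def _bucket(self, key: ConnTuple) -> List[TupleHash]:
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, ct: Conntrack) -> None:
        # Both directions are hashed independently into the same table.
        for th in ct.tuplehash:
            self._bucket(th.key).append(th)

    def find(self, key: ConnTuple) -> Optional[TupleHash]:
        for th in self._bucket(key):
            if th.key == key:
                return th
        return None


table = ConntrackTable()
request = ConnTuple("192.168.0.2", "10.0.0.1", "tcp", 40000, 80)
ct = Conntrack(request)
table.insert(ct)

# A reply packet is looked up by its own tuple and leads to the same conntrack.
reply = ConnTuple("10.0.0.1", "192.168.0.2", "tcp", 80, 40000)
hit = table.find(reply)
```

Because each direction has its own entry, a packet never needs to be "turned around" before the lookup: whichever way it travels, its tuple hashes to an entry that points back to the same conntrack and also records which direction matched.
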
The hash calculation takes as input parameters the relevant layer-3 and layer-4 protocol information. Currently, the function used is Jenkins’ hash [3].

The hash calculation is augmented with a random seed to avoid the potential performance drop should some malicious user hash-bomb a given hash chain, since this can result in a very long chain of hash tuples. However, the conntrack table has a limited maximum number of conntracks; if it fills up, the evicted conntrack will be the least recently used of a hash chain. The size of the conntrack table is tunable on module load or, alternatively, at kernel boot time.

THE CONNTRACK CREATION AND LOOKUP PROCESS

The callback nf_conntrack_in is registered in the PREROUTING hook. Some sanity checks are done at this stage to ensure that the packet is correct. Afterward, further checks take place during the conntrack lookup process. The subsystem tries to look up a conntrack that matches the packet received. If no conntrack is found, one is created. This mechanism is implemented in the function resolve_normal_ct.

If the packet belongs to a new connection, the conntrack just created will have the flag confirmed unset. The flag confirmed is set only if the conntrack is already in the hash table. This means that at this point no new conntracks are inserted. Such an insertion will happen once the packet leaves the framework successfully (i.e., when it arrives at the last hook without being dropped). The association between a packet and a conntrack is established by means of a pointer. If the pointer is null, then the packet belongs to an invalid connection. Iptables also allows us to untrack some connections. For that purpose, a dummy conntrack is used.

Finally, the callback nf_conntrack_confirm is registered in the LOCAL INPUT and POSTROUTING hooks. As you have already noticed, these are the last hooks in the exit path for the local and forwarded traffic, respectively. The confirmation process happens at this point: The conntrack is inserted in the hash table, the confirmed flag is set, and the associated timer is activated.

DEFRAGMENTED PACKET HANDLING

This work is done by the callback ipv4_conntrack_defrag, which gathers the fragments of fragmented packets. Once all of them are successfully received, the fragments continue their travel through the stack.

In the 2.4 kernel branch, the defragmented packets are linearized, that is, they are copied into contiguous memory. However, an optimization was introduced in kernel branch 2.6 to reduce the impact of this extra handling cost: The fragments are no longer copied into a linear space; instead, they are gathered and put in a list. Thus all handling must be fragment-aware. For example, if we need some information stored in the TCP packet header, we must first check whether the header is fragmented; if it is, then just the required information is copied to the stack. This is not actually a problem, since there are easy-to-use functions available, such as skb_header_pointer, that are fragment-aware and can linearize just the portion of data required in case the header is fragmented.
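
The idea behind such a fragment-aware accessor can be illustrated with a small Python model. This is a sketch of the technique only, not the kernel function: header_pointer is a hypothetical name, a packet is modeled as a list of byte chunks whose first element plays the role of the linear part, and the "copy to the stack" step becomes a copy into a freshly built buffer.

```python
import struct
from typing import List, Optional, Union

Header = Union[memoryview, bytes]


def header_pointer(chunks: List[bytes], offset: int, length: int) -> Optional[Header]:
    """Return `length` bytes starting at `offset`, copying only when needed."""
    head = chunks[0] if chunks else b""
    if offset + length <= len(head):
        # Fast path: the requested bytes are contiguous, so hand out a view.
        return memoryview(head)[offset:offset + length]

    # Slow path: gather just the requested bytes from the chunk list.
    buf = bytearray()
    pos = 0  # offset of the current chunk within the whole packet
    for chunk in chunks:
        start = max(offset - pos, 0)
        if start < len(chunk) and len(buf) < length:
            buf += chunk[start:start + (length - len(buf))]
        pos += len(chunk)
    return bytes(buf) if len(buf) == length else None


# 20 bytes of IP header followed by the TCP source and destination ports.
ip_header = bytes(20)
ports = struct.pack("!HH", 40000, 80)

linear = [ip_header + ports + b"payload"]
split = [ip_header + ports[:2], ports[2:] + b"payload"]

linear_ports = header_pointer(linear, 20, 4)
split_ports = header_pointer(split, 20, 4)
truncated = header_pointer([ip_header + ports[:2]], 20, 4)
```

In both layouts the caller can decode the ports with struct.unpack("!HH", ...) without knowing how the packet is stored; only the split layout pays for a four-byte copy, and a packet that is too short yields None instead of a partial header.
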
When the header is not fragmented, header checking does not incur any handling penalty.

HELPERS AND EXPECTATIONS

Some application-layer protocols have certain aspects that are difficult to track. For example, the File Transfer Protocol (FTP) passive mode uses port 21 for control operations to request some data from the server, but it uses TCP ports between 1024 and 65535 to receive the data requested instead of using the classical TCP port 20. This means that these two independent connections are inherently related. Therefore, the firewall requires extra information to filter this kind of protocol successfully.

[Figure 3: Relationship between a conntrack and an expectation]

The connection tracking system defines a mechanism called helpers that lets the system identify whether a connection is related to an existing one. To do so, it defines the concept of expectation. An expectation is a connection that is expected to happen in a period of time. It is defined as an nf_conntrack_expect structure in the nf_conntrack_core.h file.

The helper searches the packets for a set of patterns that contain the aspect that is hard to track. In the case of FTP, the helper looks for the port pattern that is sent in reply to the request to begin a passive mode connection (i.e., the PASV command). If the pattern is found, an expectation is created and is inserted in the global list of expectations (Fig. 3). Thus, the helper defines a profile of possible connections that will be expected.
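
To make the helper idea concrete, here is a minimal Python sketch of what an FTP helper does with the control channel. It is an illustration, not the kernel helper: ftp_helper, Expectation, and the 300-second lifetime are assumptions made for the example, the master conntrack is represented by a plain label, and the only protocol fact relied on is the standard passive-mode reply, "227 ... (h1,h2,h3,h4,p1,p2)", in which the data port is p1 * 256 + p2.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Address and port announced by the server in reply to PASV.
PASV_REPLY = re.compile(
    r"^227 .*?(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3})"
)


@dataclass
class Expectation:
    master: str          # label of the conntrack that created the expectation
    client_ip: str
    server_ip: str
    server_port: int
    expires_at: float    # expectations have a limited lifetime

    def matches(self, src_ip: str, dst_ip: str, dst_port: int, now: float) -> bool:
        # The client's source port is unknown in advance, so it is not compared.
        return (now < self.expires_at
                and (src_ip, dst_ip, dst_port)
                == (self.client_ip, self.server_ip, self.server_port))


def ftp_helper(master: str, client_ip: str, payload: str, now: float,
               expectations: List[Expectation],
               lifetime: float = 300.0) -> Optional[Expectation]:
    """Scan one control-channel reply and register an expectation if it fits."""
    found = PASV_REPLY.match(payload)
    if found is None:
        return None
    fields = [int(value) for value in found.groups()]
    if any(value > 255 for value in fields):
        return None  # not a valid address/port encoding
    h1, h2, h3, h4, p1, p2 = fields
    exp = Expectation(master, client_ip, f"{h1}.{h2}.{h3}.{h4}",
                      p1 * 256 + p2, now + lifetime)
    expectations.append(exp)
    return exp


expectations: List[Expectation] = []
exp = ftp_helper("ftp-control", "192.168.0.2",
                 "227 Entering Passive Mode (10,0,0,1,195,80)",
                 now=0.0, expectations=expectations)
ignored = ftp_helper("ftp-control", "192.168.0.2", "200 Command okay",
                     now=0.0, expectations=expectations)
```

With this in place, a new connection from 192.168.0.2 to 10.0.0.1 on the announced port can be recognized as belonging to the control connection for as long as the expectation has not expired, which is exactly the extra information a static filter lacks.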

An expectation has a limited lifetime. When a conntrack is created, the connection tracking system searches for matching expectations. If no match is found, it looks for a helper for this connection.

When the system finds a matching expectation, the new conntrack is related to the master conntrack that created such an expectation. For instance, in the case of the FTP passive mode, the conntrack that represents the traffic going to port 21 (control traffic) is the master conntrack, and the conntrack that represents the data traffic (e.g., traffic going to a high port) is related to the conntrack that represents the control traffic.

A helper is registered via nf_conntrack_helper_register, which adds a structure nf_conntrack_helper to a list of helpers.

Conclusions and Future Work

Netfilter’s connection tracking system is not a piece of software stuck in time. There is considerable interesting work in progress targeted at improving the existing implementation. It is worth mentioning that during the 4th Netfilter Workshop [4], some work on replacing the current hash table approach with a tree of hash tables [5] was presented. The preliminary performance tests look promising.

Fortunately, the subsystem described in this document is accessible not only from the kernel side. There exists a userspace library called libnetfilter_conntrack that provides a programming interface (API) to the in-kernel connection tracking state table.

With regard to the helpers, support for Internet telephony (VoIP) protocols such as H.323 is on the way. In addition, there is also some work in progress on providing the appropriate mechanisms to allow people to implement their own protocol helpers in userspace, a feature that Rusty dreamed of in the early days of the Netfilter project.

ACKNOWLEDGMENTS

I would like to thank Harald Welte and Patrick McHardy, as well as many others, for spending their precious time reviewing my contributions. Thanks are also owed to my Ph.D. director, Rafael M. Gasca (University of Seville, Spain), and to Laurent Lefevre and the RESO/LIP laboratory (ENS Lyon, France) for the student research period of February to July 2004.

REFERENCES

[1] Paul Russell and Harald Welte, “Netfilter Hacking How-to”: http://www.netfilter.org/documentation/HOWTO/netfilter-hacking-HOWTO.txt.
[2] Miguel Rio et al., “A Map of the Networking Code in Linux Kernel 2.4.20,” Technical Report DataTAG-2004-1, FP5/IST DataTAG Project, 2004.
[3] Bob Jenkins, “A Hash Function for Hash Table Lookup”: http://burtleburtle.net/bob/hash/doobs.html.
[4] 4th Netfilter Workshop, October 2005: http://workshop.netfilter.org/2005/.
[5] Martin Josefsson, “Hashtrie: An Early Experiment,” October 2005: http://workshop.netfilter.org/2005/presentations/martin.sxi.