The Linux Socket Filter: Sniffing Bytes over the Network

The Linux Socket Filter (LSF), a feature added to the kernel with the 2.2 release, can be programmed to let the kernel decide which packets an application should be granted access to. Here's how.

If you deal with network administration
or security management, or if you are merely curious about what is
passing by over your local network, grabbing some packets off the
network card can be a useful exercise. With a little bit of C
coding and a basic knowledge of networking, you will be able to
capture data even if it is not addressed to your machine. In this
article, we will refer to Ethernet networks, by far the most
widespread LAN technology. Also, for reasons that will be explained
later, we will assume that source and destination hosts belong to
the same LAN.

First off, we will briefly recall how a common Ethernet
network card works. Those of you who are already skilled in this
field may safely skip to the next paragraph. IP packets sourced
from users' applications are encapsulated into Ethernet frames
(this is the name given to packets when sent over an Ethernet
segment), which are just bigger lower-level packets containing the
original IP packet and some information needed to carry it to its
destination (see Figure 1). In particular, the destination IP
address is mapped to a 6-byte destination Ethernet address (often
called MAC address) through a mechanism called ARP. Thus, the frame
containing the packet travels from the source host to the
destination host over the cable that connects them. It is likely
that the frame will go through network devices such as hubs and
switches, but since we assumed no LAN borders are crossed, no
routers or gateways will be involved.

Figure 1. IP Packets as Ethernet Frames

No routing process happens at the Ethernet level. On a
shared-medium LAN (coaxial cable, or hosts connected through hubs),
the frame sent by the source host is not headed directly toward
the destination host; instead, it is copied over all the cables
that make up the LAN, and all the network cards see it passing
(see Figure 2). A switch, by contrast, normally forwards a frame
only to the port where the destination card is attached. Each
network card that sees the frame starts by reading its first six
bytes (which contain the above-mentioned destination MAC address),
but only one card will recognize its own address in the destination
field and pick up the frame. At this point, the frame is taken
apart by the network driver, and the original IP packet is
recovered and passed up to the receiving application through the
network protocol stack.

Figure 2. Sending Ethernet Frames over the LAN
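
The address check described above can be sketched in a few lines of
C. This is only an illustration of the decision a non-promiscuous
card makes, not driver code: the function name card_accepts_frame is
hypothetical, and the broadcast case (all six bytes set to 0xff) is
an addition to what the text describes. Multicast addresses are
ignored for brevity.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6   /* an Ethernet (MAC) address is six bytes long */

/* Hypothetical helper: returns 1 if a card whose address is `own`
 * would pick up the frame, 0 if it would let the frame pass by. */
int card_accepts_frame(const uint8_t *frame, size_t frame_len,
                       const uint8_t own[MAC_LEN])
{
    static const uint8_t broadcast[MAC_LEN] =
        { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    if (frame_len < MAC_LEN)
        return 0;               /* too short to hold a destination */

    /* The destination address is the first six bytes of the frame. */
    if (memcmp(frame, own, MAC_LEN) == 0)
        return 1;               /* addressed to this very card */
    if (memcmp(frame, broadcast, MAC_LEN) == 0)
        return 1;               /* broadcast: every card accepts it */
    return 0;
}
```

Putting a card in promiscuous mode amounts to making this check
always succeed.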

More precisely, the network driver will have a look at the
Protocol Type field inside the Ethernet frame header (see Figure 1)
and, based on that value, forward the packet to the appropriate
protocol receiving function. Most of the time the protocol will be
IP, and the receiving function will take off the IP header and pass
the payload up to the UDP- or TCP-receiving functions. These
protocols, in turn, will pass it to the socket-handling functions,
which will eventually deliver packet data to the receiving
application in userland. During this trip, the packet loses all
network information related to it, such as the source addresses (IP
and MAC) and port, IP options, TCP parameters and so on.
Furthermore, if the destination host does not have an open socket
with the correct parameters, the packet will be discarded and never
make it to the application level.
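
To make the dispatching step more concrete, here is a rough sketch
of the decision chain just described. It is not how the kernel is
actually written (the kernel calls handlers registered by each
protocol); the function receiver_for_frame and the strings it
returns are made up for illustration. The offsets assume a plain
Ethernet header of 14 bytes followed by an IPv4 header, and the
numeric values are the standard ones: 0x0800 for IP and 0x0806 for
ARP in the Protocol Type field, and 1, 6 and 17 for ICMP, TCP and
UDP in the IP header.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HEADER_LEN 14       /* 6 dst + 6 src + 2 protocol type */
#define ETHERTYPE_IPV4 0x0800
#define ETHERTYPE_ARP  0x0806

/* Hypothetical helper: names the receiving function a frame would
 * be handed to, based only on its headers. */
const char *receiver_for_frame(const uint8_t *frame, size_t len)
{
    if (len < ETH_HEADER_LEN)
        return "drop";

    /* The Protocol Type field is stored in network byte order. */
    uint16_t type = (uint16_t)((frame[12] << 8) | frame[13]);
    if (type == ETHERTYPE_ARP)
        return "arp";
    if (type != ETHERTYPE_IPV4)
        return "other";

    const uint8_t *ip = frame + ETH_HEADER_LEN;
    if (len - ETH_HEADER_LEN < 20)
        return "drop";          /* shorter than a minimal IP header */

    switch (ip[9]) {            /* protocol field of the IP header */
    case 1:  return "icmp";
    case 6:  return "tcp";
    case 17: return "udp";
    default: return "ip-other";
    }
}
```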

As a consequence, we have two distinct issues in sniffing
packets over the network. One is related to Ethernet addressing: we
cannot read packets that are not destined for our host. The other
is related to protocol stack processing: for a packet not to be
discarded, we would need a listening socket for each and every
port. Furthermore, part of the packet information is lost during
protocol stack processing.

The first issue is not fundamental, since we may have no
interest in other hosts' packets and be content to sniff only the
packets directed to our machine. The second one, however, must be
solved. We will see how to address these issues separately,
starting with the latter.

The PF_PACKET Protocol

When you open a socket with the standard call sock =
socket(domain, type, protocol) you have to specify which
domain (or protocol family) you are going to use with that socket.
Commonly used families are PF_UNIX, for communications confined to
the local machine, and PF_INET, for communications based on IPv4
protocols. Furthermore, you have to specify a type for your socket
and possible values depend on the family you specified. Common
values for type, when dealing with the PF_INET family, include
SOCK_STREAM (typically associated with TCP) and SOCK_DGRAM
(associated with UDP). Socket types influence how packets are
handled by the kernel before being passed up to the application.
Finally, you specify the protocol that will handle the packets
flowing through the socket (more details on this can be found on
the socket(2) man page).
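
As a small illustration of the three arguments, the sketch below
opens a socket and then asks the kernel which type it recorded for
it. The helper name socket_type_of is hypothetical; passing 0 as
the protocol simply selects the default protocol for the given
family and type.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: opens a socket of the given family and type,
 * queries its type back from the kernel, then closes it.
 * Returns the socket type, or -1 on any error. */
int socket_type_of(int domain, int type)
{
    int sock = socket(domain, type, 0);   /* 0: default protocol */
    if (sock < 0)
        return -1;

    int actual = -1;
    socklen_t len = sizeof actual;
    if (getsockopt(sock, SOL_SOCKET, SO_TYPE, &actual, &len) < 0)
        actual = -1;

    close(sock);
    return actual;
}
```

For instance, socket_type_of(PF_INET, SOCK_STREAM) would open and
close what is normally a TCP socket.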

In recent versions of the Linux kernel (post-2.0 releases) a
new protocol family has been introduced, named PF_PACKET. This
family allows an application to send and receive packets dealing
directly with the network card driver, thus avoiding the usual
protocol stack-handling (e.g., IP/TCP or IP/UDP processing). That
is, any packet sent through the socket will be directly passed to
the Ethernet interface, and any packet received through the
interface will be directly passed to the application.

The PF_PACKET family supports two slightly different socket
types, SOCK_DGRAM and SOCK_RAW. The former leaves to the kernel the
burden of adding and removing Ethernet level headers. The latter
gives the application complete control over the Ethernet header.
The protocol field in the socket() call must match one of the
Ethernet IDs defined in /usr/include/linux/if_ether.h, which
represent the registered protocols that can be shipped in an
Ethernet frame; the value is passed in network byte order, that is,
wrapped in htons(). Unless dealing with very specific protocols,
you typically use ETH_P_IP, which encompasses all of the IP-suite
protocols (e.g., TCP, UDP, ICMP, raw IP and so on).

Because they have pretty serious security implications (for
example, you may forge a frame with a spoofed MAC address),
PF_PACKET-family sockets may be used only by root (more precisely,
on later kernels, by processes holding the CAP_NET_RAW capability).
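
Opening such a socket takes a single call. The following is a
minimal sketch, assuming a Linux system with the kernel headers
installed; the wrapper name open_packet_socket is ours. Note that
the Ethernet protocol ID is passed through htons(), and that the
call is expected to fail with a permission error when the program
is not run with sufficient privileges.

```c
#include <arpa/inet.h>        /* htons() */
#include <linux/if_ether.h>   /* ETH_P_IP */
#include <stdio.h>            /* perror() */
#include <sys/socket.h>       /* socket(), PF_PACKET, SOCK_RAW */

/* Hypothetical wrapper: `type` is SOCK_RAW to receive frames with
 * their Ethernet header, or SOCK_DGRAM to let the kernel strip it.
 * Returns the socket descriptor, or -1 on failure. */
int open_packet_socket(int type)
{
    int sock = socket(PF_PACKET, type, htons(ETH_P_IP));
    if (sock < 0)
        perror("socket(PF_PACKET)");   /* typically a permission error */
    return sock;
}
```

A descriptor obtained this way can then be read with recvfrom() or
read(), one frame per call.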

The PF_PACKET family easily solves the problem associated
with protocol stack-handling of our sniffed packets. Let's see it
do so with the example in Listing 1. We open a socket belonging to
the PF_PACKET family, specifying a SOCK_RAW socket type and
IP-related protocol type. Then we start reading from the socket
and, after a few sanity checks, we print out some information
extracted from the Ethernet level and IP level headers. By
cross-checking the printed addresses with the offsets in Figure 1,
you will see how easy it is for the application to get access to
network level data.
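
Listing 1 itself is not reproduced here. The sketch below is not
the author's code; it only illustrates the kind of sanity checks
and header decoding described above, applied to a buffer that has
already been read from a SOCK_RAW packet socket. The function name
describe_frame and the output format are hypothetical, and the
offsets assume an Ethernet header of 14 bytes followed by an IPv4
header.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define ETH_HDR_BYTES 14   /* 6 dst + 6 src + 2 protocol type */
#define IP_MIN_BYTES  20   /* IPv4 header without options */

/* Hypothetical helper: writes "src MAC -> dst MAC | src IP -> dst IP"
 * into `out`. Returns 0 on success, -1 if the frame fails a check. */
int describe_frame(const uint8_t *buf, size_t len,
                   char *out, size_t out_len)
{
    if (len < ETH_HDR_BYTES + IP_MIN_BYTES)
        return -1;                       /* truncated frame */
    if (buf[12] != 0x08 || buf[13] != 0x00)
        return -1;                       /* Protocol Type is not IP */

    const uint8_t *ip = buf + ETH_HDR_BYTES;
    if ((ip[0] >> 4) != 4 || (ip[0] & 0x0f) < 5)
        return -1;                       /* not IPv4, or bad length */

    /* Destination MAC comes first in the frame, source MAC second;
     * in the IP header, source is at offset 12, destination at 16. */
    int n = snprintf(out, out_len,
        "%02x:%02x:%02x:%02x:%02x:%02x -> "
        "%02x:%02x:%02x:%02x:%02x:%02x | "
        "%d.%d.%d.%d -> %d.%d.%d.%d",
        buf[6], buf[7], buf[8], buf[9], buf[10], buf[11],
        buf[0], buf[1], buf[2], buf[3], buf[4], buf[5],
        ip[12], ip[13], ip[14], ip[15],
        ip[16], ip[17], ip[18], ip[19]);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```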

Assuming that your machine is connected to an Ethernet LAN,
you can experiment with our short example by running it while
generating packets directed to your host from another machine (you
can ping or Telnet to your host). You will be able to see all the
packets directed to you, but you will not see any packet headed
toward other hosts.


Nice article. I like the simplicity of it. However, I am wondering whether this technique can be used to create firewalls. Can I discard packets based on criteria that I choose? Libpcap won't help because it creates a copy of the packet, so the packet still reaches its intended destination.

Excellent article! I'm writing from Venezuela, and I wanted to know how to sniff packets without using the PF_PACKET family. I ask because I need to do that without root permissions. Thanks.

I also found video tutorials on sniffing at www.security-freak.net . started by a Vivek Ramachandran, they are quite elaborate in coverage and literally spoon feed topics like sniffing, packet injection etc.

Great article!
I'd just like to point out that you should not use ioctl() for setting the promiscuous mode. If you do, you're responsible for disabling the promiscuous mode after you're done. Unfortunately, you have no way of knowing if another socket also requested the promiscuous mode while your code was running. Thus, resetting the Ethernet flags to the original value could mess things up.

Instead, you should use setsockopt() with SOL_PACKET, PACKET_ADD_MEMBERSHIP and have PACKET_MR_PROMISC as the argument. This way the kernel will track the promiscuous mode usage and turn it off automatically.
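
A minimal sketch of the setsockopt() approach suggested above, assuming a Linux system and an already opened PF_PACKET socket. The helper name enable_promiscuous is hypothetical; the structure and constants come from linux/if_packet.h.

```c
#include <linux/if_packet.h>  /* struct packet_mreq, PACKET_* */
#include <net/if.h>           /* if_nametoindex() */
#include <string.h>
#include <sys/socket.h>       /* setsockopt(), SOL_PACKET */

/* Hypothetical helper: asks the kernel to keep interface `ifname`
 * (e.g. "eth0") in promiscuous mode for as long as the packet
 * socket `sock` stays open. Returns 0 on success, -1 on failure. */
int enable_promiscuous(int sock, const char *ifname)
{
    struct packet_mreq mreq;

    memset(&mreq, 0, sizeof mreq);
    mreq.mr_ifindex = (int)if_nametoindex(ifname);
    if (mreq.mr_ifindex == 0)
        return -1;                      /* no such interface */
    mreq.mr_type = PACKET_MR_PROMISC;

    return setsockopt(sock, SOL_PACKET, PACKET_ADD_MEMBERSHIP,
                      &mreq, sizeof mreq);
}
```

Closing the socket drops the membership, so no explicit cleanup of the interface flags is needed.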

Thanks much for this informative article on a poorly documented subject. This tied together a lot of the bits and pieces I've been sifting through. I'd advise anyone seeking to learn more about creating their own filters to keep this article + source in one hand and the Van Jacobson/McCanne paper in the other. - britney_spears@hotpop.com

Actually, I am a bit new to socket programming. I want to read bytes from the socket using read() (I am using a Fedora environment), but when I execute the program it stops at the point where I call read() and does not give me anything. Could you please comment on this? Thanks in advance.