The kernel mechanisms for handling
network interfaces reside primarily in the ifnet,
if_data, ifaddr, and ifmultiaddr structures in
<net/if.h> and <net/if_var.h> and the
functions named above and defined in /sys/net/if.c. Those
interfaces which are intended to be used by user programs are
defined in <net/if.h>; these include the interface
flags, the if_data structure, and the structures defining
the appearance of interface-related messages on the route(4)
routing socket and in sysctl(3). The header file
<net/if_var.h> defines the kernel-internal interfaces,
including the ifnet, ifaddr, and ifmultiaddr
structures and the functions which manipulate them. (A few user
programs will need <net/if_var.h> because it is the
prerequisite of some other header file like
<netinet/if_ether.h>. Most references to those two
files in particular can be replaced by
<net/ethernet.h>.)

The system keeps a
linked list of interfaces using the TAILQ macros defined in
queue(3); this list is headed by a struct ifnethead called
ifnet. The elements of this list are of type struct
ifnet, and most kernel routines which manipulate interfaces as
such accept or return pointers to these structures. Each interface
structure contains an if_data structure, which contains
statistics and identifying information used by management programs,
and which is exported to user programs by way of the ifmib(4)
branch of the sysctl(3) MIB. Each interface also has a TAILQ of
interface addresses, described by ifaddr structures; the
head of the queue is always an AF_LINK address (see link_addr(3))
describing the link layer implemented by the interface (if any).
(Some trivial interfaces do not provide any link layer addresses;
this structure, while still present, serves only to identify the
interface name and index.)
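
The list layout described above can be sketched in user space with the queue(3) macros. This is an illustration only: the sk_ prefixed types, the two address-family tags, and the helper names are hypothetical stand-ins, not the kernel's definitions, and real code would need locking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Illustrative address-family tags; the kernel uses AF_LINK, AF_INET, ... */
enum { SK_AF_LINK = 1, SK_AF_INET = 2 };

/* Hypothetical, much reduced stand-in for struct ifaddr. */
struct sk_ifaddr {
	int ifa_family;
	TAILQ_ENTRY(sk_ifaddr) ifa_link;	/* queue(3) glue */
};
TAILQ_HEAD(sk_ifaddrhead, sk_ifaddr);

/* Hypothetical, much reduced stand-in for struct ifnet. */
struct sk_ifnet {
	char if_xname[16];
	struct sk_ifaddrhead if_addrhead;	/* per-interface address list */
	TAILQ_ENTRY(sk_ifnet) if_link;		/* queue(3) glue */
};
TAILQ_HEAD(sk_ifnethead, sk_ifnet);

/*
 * Link an interface into the list. The link-level address is placed
 * at the head of the address list, as the text above describes.
 */
void
sk_attach(struct sk_ifnethead *head, struct sk_ifnet *ifp, const char *name,
    struct sk_ifaddr *lladdr)
{
	assert(strlen(name) < sizeof(ifp->if_xname));
	strcpy(ifp->if_xname, name);
	TAILQ_INIT(&ifp->if_addrhead);
	lladdr->ifa_family = SK_AF_LINK;
	TAILQ_INSERT_HEAD(&ifp->if_addrhead, lladdr, ifa_link);
	TAILQ_INSERT_TAIL(head, ifp, if_link);
}

/* Walk the interface list looking for a name; NULL if not found. */
struct sk_ifnet *
sk_lookup(struct sk_ifnethead *head, const char *name)
{
	struct sk_ifnet *ifp;

	TAILQ_FOREACH(ifp, head, if_link)
		if (strcmp(ifp->if_xname, name) == 0)
			return (ifp);
	return (NULL);
}
```

Network-layer addresses added later would go on the tail of if_addrhead, so the first element stays the link-level one.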

Finally, those
interfaces supporting reception of multicast datagrams have a TAILQ
of multicast group memberships, described by ifmultiaddr
structures. These memberships are reference-counted.

Interfaces are also
associated with an output queue, defined as a struct
ifqueue; this structure is used to hold packets while the
interface is in the process of sending another.

The ifnet structure
The fields of struct ifnet are as follows:

if_softc

(void *) A
pointer to the driver’s private state block. (Initialized by
driver.)

if_l2com

(void *) A
pointer to the common data for the interface’s layer 2
protocol. (Initialized by if_alloc().)

if_link

(TAILQ_ENTRY(ifnet)) queue(3) macro glue.

if_xname

(char *) The
name of the interface (e.g., ‘‘fxp0’’ or
‘‘lo0’’). (Initialized by driver.)

if_dname

(const char *)
The name of the driver. (Initialized by driver.)

if_dunit

(int) A unique
number assigned to each interface managed by a particular driver.
Drivers may choose to set this to IF_DUNIT_NONE if a unit number is
not associated with the device. (Initialized by driver.)

if_addrhead

(struct
ifaddrhead) The head of the queue(3) TAILQ containing the list
of addresses assigned to this interface.

if_pcount

(int) A count of
promiscuous listeners on this interface, used to reference-count
the IFF_PROMISC flag.

if_index

(u_short) A
unique number assigned to each interface in sequence as it is
attached. This number can be used in a struct sockaddr_dl to
refer to a particular interface by index (see link_addr(3)).
(Initialized by if_alloc().)

if_timer

(short) Number
of seconds until the watchdog timer if_watchdog() is called,
or zero if the timer is disabled. (Set by driver, decremented by
generic watchdog code.)

if_linkmib

(void *) A
pointer to an interface-specific MIB structure exported by
ifmib(4). (Initialized by driver.)

if_linkmiblen

(size_t) The
size of said structure. (Initialized by driver.)

if_data

(struct if_data)
More statistics and information; see The if_data structure,
below. (Initialized by driver, manipulated by both driver and
generic code.)

if_snd

(struct ifqueue)
The output queue. (Manipulated by driver.)
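
The if_timer countdown described above can be sketched as a once-per-second tick. This is a user-space illustration under stated assumptions: sk_ifnet, sk_tick() and the resets counter are hypothetical, and the kernel's generic watchdog code may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the watchdog-related ifnet fields. */
struct sk_ifnet {
	short if_timer;				/* seconds left, 0 = disabled */
	void (*if_watchdog)(struct sk_ifnet *);
	int resets;				/* demo-only counter */
};

/*
 * One tick of the generic watchdog code: a disabled timer is left
 * alone, a running timer is decremented, and the driver routine is
 * called when the count reaches zero.
 */
void
sk_tick(struct sk_ifnet *ifp)
{
	assert(ifp != NULL);
	if (ifp->if_timer == 0 || --ifp->if_timer > 0)
		return;
	if (ifp->if_watchdog != NULL)
		ifp->if_watchdog(ifp);
}

/* Hypothetical driver routine; a real one would reset the hardware. */
void
sk_watchdog(struct sk_ifnet *ifp)
{
	ifp->resets++;
}
```

A driver arms the timer by storing a non-zero value in if_timer when it starts a transmission, and clears it when the transmission completes.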

There are in addition a
number of function pointers which the driver must initialize to
complete its interface with the generic interface layer:

if_input()

Pass a packet to an appropriate upper
layer as determined from the link-layer header of the packet. This
routine is to be called from an interrupt handler or used to
emulate reception of a packet on this interface. A single function
implementing if_input() can be shared among multiple drivers
utilizing the same link-layer framing, e.g., Ethernet.

if_output()

Output a packet on interface
ifp, or queue it on the output queue if the interface is
already active.

if_start()

Start queued output on an interface.
This function is exposed in order to provide for some interface
classes to share an if_output() among all drivers.
if_start() may only be called when the IFF_OACTIVE flag is
not set. (Thus, IFF_OACTIVE does not literally mean that output is
active, but rather that the device’s internal output queue is
full.)

if_done()

Not used. We are not even sure what it
was ever for. The prototype is faked.

if_ioctl()

Process interface-related ioctl(2)
requests (defined in <sys/sockio.h>). Preliminary
processing is done by the generic routine ifioctl() to check
for appropriate privileges, locate the interface being manipulated,
and perform certain generic operations like twiddling flags and
flushing queues. See the description of ifioctl() below for
more information.

if_watchdog()

Routine called by the generic code when
the watchdog timer, if_timer, expires. Usually this will
reset the interface.

if_init()

Initialize and bring up the hardware,
e.g., reset the chip and the watchdog timer and enable the receiver
unit. Should mark the interface running, but not active
(IFF_RUNNING, ~IFF_OACTIVE).

if_resolvemulti()

Check the requested multicast group
membership, addr, for validity, and if necessary compute a
link-layer group which corresponds to that address which is
returned in *retsa. Returns zero on success, or an error
code on failure.
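
The relationship between if_output(), if_start() and IFF_OACTIVE can be sketched as follows. Packets are reduced to counters; the flag value, the queue limit, the number of hardware slots, and every sk_ name are assumptions made for this illustration, not kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SK_IFF_OACTIVE	0x1	/* illustrative value only */
#define SK_IFQ_MAXLEN	4	/* illustrative software queue limit */
#define SK_HW_SLOTS	2	/* hypothetical device transmit slots */

struct sk_ifnet {
	int if_flags;
	int snd_len;		/* packets waiting on the output queue */
	int hw_used;		/* device transmit slots in use */
	void (*if_start)(struct sk_ifnet *);
};

/* Driver start routine: feed the device until its queue is full. */
void
sk_start(struct sk_ifnet *ifp)
{
	while (ifp->snd_len > 0) {
		if (ifp->hw_used == SK_HW_SLOTS) {
			ifp->if_flags |= SK_IFF_OACTIVE;
			return;
		}
		ifp->snd_len--;
		ifp->hw_used++;
	}
}

/* Shared output routine: enqueue, then start unless OACTIVE is set. */
int
sk_output(struct sk_ifnet *ifp)
{
	assert(ifp != NULL && ifp->if_start != NULL);
	if (ifp->snd_len == SK_IFQ_MAXLEN)
		return (ENOBUFS);
	ifp->snd_len++;
	if ((ifp->if_flags & SK_IFF_OACTIVE) == 0)
		ifp->if_start(ifp);
	return (0);
}

/* Transmit-complete handler: free a slot and restart output. */
void
sk_txeof(struct sk_ifnet *ifp)
{
	if (ifp->hw_used > 0)
		ifp->hw_used--;
	ifp->if_flags &= ~SK_IFF_OACTIVE;
	ifp->if_start(ifp);
}
```

Note that the flag is set only once the device refuses another packet, which is why it means that the device queue is full rather than that output is in progress.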

Interface Flags
Interface flags are used for a number of different purposes. Some
flags simply indicate information about the type of interface and
its capabilities; others are dynamically manipulated to reflect the
current state of the interface. Flags of the former kind are marked
〈S〉 in this table; the latter are marked
〈D〉.

The macro IFF_CANTCHANGE
defines the bits which cannot be set by a user program using the
SIOCSIFFLAGS command to ioctl(2); these are indicated by an
asterisk (‘*’) in the following listing.

IFF_UP

〈D〉 The
interface has been configured up by the user-level code.

IFF_BROADCAST

〈S*〉 The
interface supports broadcast.

IFF_DEBUG

〈D〉 Used to
enable/disable driver debugging code.

IFF_LOOPBACK

〈S〉 The
interface is a loopback device.

IFF_POINTOPOINT

〈S*〉 The
interface is point-to-point; ‘‘broadcast’’
address is actually the address of the other end.

IFF_RUNNING

〈D*〉 The
interface has been configured and dynamic resources were
successfully allocated. Probably only useful internal to the
interface.

IFF_NOARP

〈D〉 Disable
network address resolution on this interface.

IFF_PROMISC

〈D*〉 This
interface is in promiscuous mode.

IFF_PPROMISC

〈D〉 This
interface is in the permanently promiscuous mode (implies
IFF_PROMISC).

IFF_ALLMULTI

〈D*〉 This
interface is in all-multicasts mode (used by multicast
routers).

IFF_OACTIVE

〈D*〉 The
interface’s hardware output queue (if any) is full; output
packets are to be queued.

IFF_SIMPLEX

〈S*〉 The
interface cannot hear its own transmissions.

IFF_LINK0
IFF_LINK1
IFF_LINK2

〈D〉 Control
flags for the link layer. (Currently abused to select among
multiple physical layers on some devices.)

IFF_MULTICAST

〈S*〉 This
interface supports multicast.

IFF_POLLING

〈D*〉 The
interface is in polling(4) mode. See Interface Capabilities
Flags for details.
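
The effect of IFF_CANTCHANGE on a SIOCSIFFLAGS request can be sketched as a simple mask operation. The bit values and the particular set of protected flags below are illustrative assumptions; the real definitions are in <net/if.h>.

```c
#include <assert.h>

/* Illustrative bit values only. */
#define SK_IFF_UP		0x01
#define SK_IFF_DEBUG		0x02
#define SK_IFF_RUNNING		0x04
#define SK_IFF_MULTICAST	0x08

/* A reduced, hypothetical set of flags that user programs cannot set. */
#define SK_IFF_CANTCHANGE	(SK_IFF_RUNNING | SK_IFF_MULTICAST)

/*
 * Merge a user request into the current flags: protected bits keep
 * their current value, all other bits are taken from the request.
 */
int
sk_apply_user_flags(int cur, int requested)
{
	int result;

	result = (cur & SK_IFF_CANTCHANGE) |
	    (requested & ~SK_IFF_CANTCHANGE);
	assert((result & SK_IFF_CANTCHANGE) == (cur & SK_IFF_CANTCHANGE));
	return (result);
}
```

A request that tries to clear a protected flag, or to set one that is currently clear, is therefore silently ignored for that bit rather than rejected.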

Interface Capabilities Flags
Interface capabilities are specialized features an interface may or
may not support. These capabilities are very hardware-specific; when
enabled, they allow specific network processing to be offloaded to the
interface, or offer a particular feature for use by other parts of the
kernel.

It should be stressed
that a capability can be completely uncontrolled (i.e., stay always
enabled with no way to disable it) or allow limited control over
itself (e.g., depend on another capability’s state). Such
peculiarities are determined solely by the hardware and driver of a
particular interface. Only the driver knows
whether and how the interface capabilities can be controlled.
Consequently, capabilities flags in if_capenable should
never be modified directly by kernel code other than the interface
driver. The command SIOCSIFCAP to ifioctl() is the dedicated
means to attempt altering if_capenable on an interface.
Userland code shall use ioctl(2).
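
The division of labour for SIOCSIFCAP can be sketched as a generic check followed by a driver decision. The capability bit values, the sk_ names, and the choice of EINVAL as the error are assumptions of this sketch; the hypothetical driver shown cannot turn receive checksumming off, which illustrates an uncontrolled capability.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative bit values only. */
#define SK_IFCAP_RXCSUM		0x1
#define SK_IFCAP_TXCSUM		0x2
#define SK_IFCAP_POLLING	0x4

struct sk_ifnet {
	int if_capabilities;	/* what the interface supports */
	int if_capenable;	/* what is currently enabled */
	int (*if_ioctl)(struct sk_ifnet *, int reqcap);
};

/* Generic part: refuse unsupported bits, then defer to the driver. */
int
sk_set_capenable(struct sk_ifnet *ifp, int reqcap)
{
	assert(ifp != NULL && ifp->if_ioctl != NULL);
	if ((reqcap & ~ifp->if_capabilities) != 0)
		return (EINVAL);
	return (ifp->if_ioctl(ifp, reqcap));
}

/*
 * Hypothetical driver part: only the driver writes if_capenable.
 * This device always validates receive checksums when it supports it.
 */
int
sk_drv_ioctl(struct sk_ifnet *ifp, int reqcap)
{
	ifp->if_capenable = reqcap |
	    (ifp->if_capabilities & SK_IFCAP_RXCSUM);
	return (0);
}
```

The caller should read if_capenable back after the request, since the driver may have granted something other than what was asked for.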

The following
capabilities are currently supported by the system:

IFCAP_NETCONS

This interface can be a
network console.

IFCAP_POLLING

This interface supports
polling(4). See below for details.

IFCAP_RXCSUM

This interface can do
checksum validation on receiving data. Some interfaces do not have
sufficient buffer storage to store frames above a certain MTU-size
completely. The driver for the interface might disable hardware
checksum validation if the MTU is set above the hardcoded
limit.

IFCAP_TXCSUM

This interface can do
checksum calculation on transmitting data.

IFCAP_HWCSUM

A shorthand for
(IFCAP_RXCSUM | IFCAP_TXCSUM).

IFCAP_VLAN_HWTAGGING

This interface can do
VLAN tagging on output and demultiplex frames by their VLAN tag on
input.

IFCAP_VLAN_MTU

The vlan(4) driver can
operate over this interface in software tagging mode without having
to decrease MTU on vlan(4) interfaces below 1500 bytes. This
implies the ability of this interface to cope with frames somewhat
longer than permitted by the Ethernet specification.

IFCAP_JUMBO_MTU

This Ethernet interface
can transmit and receive frames up to 9000 bytes long.

The ability of advanced
network interfaces to offload certain computational tasks from the
host CPU to the board is limited mostly to TCP/IP. Therefore a
separate field associated with an interface (see
ifnet.if_data.ifi_hwassist below) keeps a detailed
description of its enabled capabilities specific to TCP/IP
processing. The TCP/IP module consults the field to see which tasks
can be done on an outgoing packet by the interface. The
flags defined for that field are a superset of those for
mbuf.m_pkthdr.csum_flags, namely:

CSUM_IP

The interface will
compute IP checksums.

CSUM_TCP

The interface will
compute TCP checksums.

CSUM_UDP

The interface will
compute UDP checksums.

CSUM_IP_FRAGS

The interface can
compute a TCP or UDP checksum for a packet fragmented by the host
CPU. Makes sense only along with CSUM_TCP or CSUM_UDP.

CSUM_FRAGMENT

The interface will do
the fragmentation of IP packets if necessary. The host CPU does not
need to care about MTU on this interface as long as a packet to
transmit through it is an IP one and it does not exceed the size of
the hardware buffer.

An interface notifies
the TCP/IP module about the tasks the former has performed on an
incoming packet by setting the corresponding flags in the
field mbuf.m_pkthdr.csum_flags of the mbuf chain
containing the packet. See mbuf(9) for details.
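
On the output side, the way a protocol module might consult the hardware-assist field can be sketched as a split of the requested checksum work. The bit values and the function name are illustrative assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values only. */
#define SK_CSUM_IP	0x1
#define SK_CSUM_TCP	0x2
#define SK_CSUM_UDP	0x4

/*
 * Split the checksum work wanted for an outgoing packet. The return
 * value is what the host must compute in software; *deferred receives
 * what may be left in the packet header flags for the interface.
 */
int
sk_split_csum(int wanted, int if_hwassist, int *deferred)
{
	assert(deferred != NULL);
	*deferred = wanted & if_hwassist;
	return (wanted & ~if_hwassist);
}
```

For example, a TCP segment needs both an IP and a TCP checksum; on an interface that assists only with IP checksums, the TCP checksum is computed by the host and only the IP checksum is deferred.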

The capability of a
network interface to operate in polling(4) mode involves several
flags in different global variables and per-interface fields.
First, there is a system-wide sysctl(8) master switch named
kern.polling.enable, which can toggle polling(4) globally.
If that variable is set to non-zero, polling(4) will be used on
those devices where it is enabled individually. Otherwise,
polling(4) will not be used in the system. Second, the capability
flag IFCAP_POLLING set in the interface’s if_capabilities
indicates support for polling(4) on the particular interface. If
set in if_capabilities, the same flag can be marked or
cleared in the interface’s if_capenable, thus
initiating a switch of the interface to polling(4) mode or interrupt
mode, respectively. The actual mode change will occur at an
implementation-specific moment in the future, e.g., during the next
interrupt or polling(4) cycle. And finally, if the mode transition
has been successful, the flag IFF_POLLING is marked or cleared in
the interface’s if_flags to indicate the current mode
of the interface.
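
The three levels involved in polling(4) mode can be sketched as a single decision made by the driver at its next interrupt or poll cycle. The flag values and sk_sync_polling() are hypothetical; a real driver would also reprogram its interrupt mask at this point.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values only. */
#define SK_IFCAP_POLLING	0x1
#define SK_IFF_POLLING		0x1

struct sk_ifnet {
	int if_capabilities;
	int if_capenable;
	int if_flags;
};

/*
 * Bring IFF_POLLING in line with the requested mode. The argument
 * kern_polling_enable stands in for the system-wide master switch.
 */
void
sk_sync_polling(struct sk_ifnet *ifp, int kern_polling_enable)
{
	int want;

	assert(ifp != NULL);
	want = kern_polling_enable != 0 &&
	    (ifp->if_capabilities & SK_IFCAP_POLLING) != 0 &&
	    (ifp->if_capenable & SK_IFCAP_POLLING) != 0;
	if (want)
		ifp->if_flags |= SK_IFF_POLLING;
	else
		ifp->if_flags &= ~SK_IFF_POLLING;
}
```

Code that needs to know the mode actually in effect should test IFF_POLLING in if_flags, not the request recorded in if_capenable.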

The if_data Structure
In 4.4BSD, a subset of the interface information believed to be of
interest to management stations was segregated from the
ifnet structure and moved into its own if_data
structure to facilitate its use by user programs. The following
elements of the if_data structure are initialized by the
interface and are not expected to change significantly over the
course of normal operation:

ifi_type

(u_char) The
type of the interface, as defined in <net/if_types.h>
and described below in the Interface Types section.

ifi_physical

(u_char)
Intended to represent a selection of physical layers on devices
which support more than one; never implemented.

ifi_addrlen

(u_char) Length
of a link-layer address on this device, or zero if there are none.
Used to initialize the address length field in sockaddr_dl
structures referring to this interface.

ifi_hdrlen

(u_char) Maximum
length of any link-layer header which might be prepended by the
driver to a packet before transmission. The generic code computes
the maximum over all interfaces and uses that value to influence
the placement of data in mbufs to attempt to ensure that
there is always sufficient space to prepend a link-layer header
without allocating an additional mbuf.

ifi_datalen

(u_char) Length
of the if_data structure. Allows some stabilization of the
routing socket ABI in the face of increases in the length of
struct if_data.

ifi_mtu

(u_long) The
maximum transmission unit of the medium, exclusive of any
link-layer overhead.

ifi_metric

(u_long) A
dimensionless metric interpreted by a user-mode routing
process.

ifi_baudrate

(u_long) The
line rate of the interface, in bits per second.

ifi_hwassist

(u_long) A
detailed interpretation of the capabilities to offload
computational tasks for outgoing packets. The interface
driver must keep this field in accord with the current value of
if_capenable.

ifi_epoch

(time_t) The
system uptime when the interface was attached or the statistics below
were reset. This is intended to be used to set the SNMP variable
ifCounterDiscontinuityTime. It may also be used to determine
if two successive queries for an interface of the same index have
returned results for the same interface.

The structure
additionally contains generic statistics applicable to a variety of
different interface types (except as noted, all members are of type
u_long):

ifi_link_state

(u_char) The
current link state of Ethernet interfaces. See the Interface
Link States section for possible values.

ifi_ipackets

Number of packets
received.

ifi_ierrors

Number of receive
errors detected (e.g., FCS errors, DMA overruns, etc.). More
detailed breakdowns can often be had by way of a link-specific
MIB.

ifi_opackets

Number of packets
transmitted.

ifi_oerrors

Number of output errors
detected (e.g., late collisions, DMA overruns, etc.). More detailed
breakdowns can often be had by way of a link-specific MIB.

ifi_collisions

Total number of
collisions detected on output for CSMA interfaces. (This member is
sometimes [ab]used by other types of interfaces for other output
error counts.)

ifi_ibytes

Total traffic received,
in bytes.

ifi_obytes

Total traffic
transmitted, in bytes.

ifi_imcasts

Number of packets
received which were sent by link-layer multicast.

ifi_omcasts

Number of packets sent
by link-layer multicast.

ifi_iqdrops

Number of packets
dropped on input. Rarely implemented.

ifi_noproto

Number of packets
received for unknown network-layer protocol.

ifi_lastchange

(struct timeval)
The time of the last administrative change to the interface (as
required for SNMP).

Interface Types
The header file <net/if_types.h> defines symbolic
constants for a number of different types of interfaces. The most
common are:

IFT_OTHER

none of the
following

IFT_ETHER

Ethernet

IFT_ISO88023

ISO 8802-3 CSMA/CD

IFT_ISO88024

ISO 8802-4 Token
Bus

IFT_ISO88025

ISO 8802-5 Token
Ring

IFT_ISO88026

ISO 8802-6 DQDB MAN

IFT_FDDI

FDDI

IFT_PPP

Internet Point-to-Point
Protocol (ppp(8))

IFT_LOOP

The loopback (lo(4))
interface

IFT_SLIP

Serial Line IP

IFT_PARA

Parallel-port IP
(‘‘PLIP’’)

IFT_ATM

Asynchronous Transfer
Mode

Interface Link States
The following link states are currently defined:

LINK_STATE_UNKNOWN

The link is in an
invalid or unknown state.

LINK_STATE_DOWN

The link is down.

LINK_STATE_UP

The link is up.

The ifaddr Structure
Every interface is associated with a list (or, rather, a TAILQ) of
addresses, rooted at the interface structure’s
if_addrlist member. The first element in this list is always
an AF_LINK address representing the interface itself; multi-access
network drivers should complete this structure by filling in their
link-layer addresses after calling if_attach(). Other
members of the structure represent network-layer addresses which
have been configured by means of the SIOCAIFADDR command to
ioctl(2), called on a socket of the appropriate protocol family.
The elements of this list consist of ifaddr structures. Most
protocols will declare their own protocol-specific interface
address structures, but all begin with a struct ifaddr which
provides the most-commonly-needed functionality across all
protocols. Interface addresses are reference-counted.

The members of struct
ifaddr are as follows:

ifa_addr

(struct sockaddr
*) The local address of the interface.

ifa_dstaddr

(struct sockaddr
*) The remote address of point-to-point interfaces, and the
broadcast address of broadcast interfaces. (ifa_broadaddr is
a macro for ifa_dstaddr.)

ifa_netmask

(struct sockaddr
*) The network mask for multi-access interfaces, and the
confusion generator for point-to-point interfaces.

ifa_ifp

(struct ifnet *)
A link back to the interface structure.

ifa_link

(TAILQ_ENTRY(ifaddr)) queue(3) glue for list of
addresses on each interface.

ifa_rtrequest

See below.

ifa_flags

(u_short) Some
of the flags which would be used for a route representing this
address in the route table.

ifa_refcnt

(short) The
reference count.

ifa_metric

(int) A metric
associated with this interface address, for the use of some
external routing protocol.

References to
ifaddr structures are gained manually, by incrementing the
ifa_refcnt member. References are released by calling either
the ifafree() function or the IFAFREE() macro.
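
The reference discipline can be sketched in user space as follows. The convention that a new address starts with one reference and is freed when the count drops to zero is an assumption of this sketch, as are the sk_ names and the freed_flag member, which exists only so that the example can observe the release.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for struct ifaddr. */
struct sk_ifaddr {
	short ifa_refcnt;
	int *freed_flag;	/* demo-only: set when storage is released */
};

/* Allocate an address holding the creator's reference. */
struct sk_ifaddr *
sk_ifa_alloc(int *freed_flag)
{
	struct sk_ifaddr *ifa;

	ifa = calloc(1, sizeof(*ifa));
	if (ifa != NULL) {
		ifa->ifa_refcnt = 1;
		ifa->freed_flag = freed_flag;
	}
	return (ifa);
}

/* Gain a reference manually, by incrementing the count. */
void
sk_ifa_ref(struct sk_ifaddr *ifa)
{
	assert(ifa != NULL && ifa->ifa_refcnt > 0);
	ifa->ifa_refcnt++;
}

/* Drop a reference; the last holder releases the storage. */
void
sk_ifa_free(struct sk_ifaddr *ifa)
{
	assert(ifa != NULL && ifa->ifa_refcnt > 0);
	if (--ifa->ifa_refcnt == 0) {
		*ifa->freed_flag = 1;
		free(ifa);
	}
}
```

Every pointer to an address that is stored beyond the current call should be paired with one reference and, eventually, one release.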

ifa_rtrequest()
is a pointer to a function which receives callouts from the routing
code (rtrequest()) to perform link-layer-specific actions
upon requests to add, resolve, or delete routes. The cmd
argument indicates the request in question: RTM_ADD, RTM_RESOLVE,
or RTM_DELETE. The rt argument is the route in question; the
dst argument is the specific destination being manipulated
for RTM_RESOLVE, or a null pointer otherwise.

FUNCTIONS

The functions provided by the generic
interface code can be divided into two groups: those which
manipulate interfaces, and those which manipulate interface
addresses. In addition to these functions, there may also be
link-layer support routines which are used by a number of drivers
implementing a specific link layer over different hardware; see the
documentation for that link layer for more details.

The ifmultiaddr Structure
Every multicast-capable interface is associated with a list of
multicast group memberships, which indicate at a low level which
link-layer multicast addresses (if any) should be accepted, and at
a high level, in which network-layer multicast groups a user
process has expressed interest.

The elements of the
structure are as follows:

ifma_link

(LIST_ENTRY(ifmultiaddr)) queue(3) macro glue.

ifma_addr

(struct sockaddr
*) A pointer to the address which this record represents. The
memberships for various address families are stored in arbitrary
order.

ifma_lladdr

(struct sockaddr
*) A pointer to the link-layer multicast address, if any, to
which the network-layer multicast address in ifma_addr is
mapped, else a null pointer. If this element is non-null, this
membership also holds an invisible reference to another membership
for that link-layer address.

ifma_refcount

(u_int) A
reference count of requests for this particular membership.

Interface Manipulation Functions

if_alloc()

Allocate and initialize struct
ifnet. Initialization includes the allocation of an interface
index and may include the allocation of a type-specific
structure in if_l2com.

if_attach()

Link the specified interface ifp
into the list of network interfaces. Also initialize the list of
addresses on that interface, and create a link-layer ifaddr
structure to be the first element in that list. (A pointer to this
address structure is saved in the global array ifnet_addrs.)
The ifp must have been allocated by if_alloc().

if_detach()

Shut down and unlink the specified
ifp from the interface list.

if_free()

Free the given ifp back to the
system. The interface must have been previously detached if it was
ever attached.

if_free_type()

Identical to if_free() except
that the given type is used to free if_l2com instead
of the type in if_type. This is intended for use with
drivers that change their interface type.

if_down()

Mark the interface ifp as down
(i.e., IFF_UP is not set), flush its output queue, notify protocols
of the transition, and generate a message from the route(4) routing
socket.

if_up()

Mark the interface ifp as up,
notify protocols of the transition, and generate a message from the
route(4) routing socket.

ifpromisc()

Add or remove a promiscuous reference
to ifp. If pswitch is true, add a reference; if it is
false, remove a reference. On reference count transitions from zero
to one and one to zero, set the IFF_PROMISC flag appropriately and
call if_ioctl() to set up the interface in the desired
mode.

if_allmulti()

As ifpromisc(), but for the
all-multicasts (IFF_ALLMULTI) flag instead of the promiscuous
flag.

ifunit()

Return an ifnet pointer for the
interface named name.

ifioctl()

Process the ioctl request cmd,
issued on socket so by thread td, with data parameter
data. This is the main routine for handling all interface
configuration requests from user mode. It is ordinarily only called
from the socket-layer ioctl(2) handler, and only for commands with
class ‘i’. Any unrecognized commands will be passed
down to socket so’s protocol for further
interpretation. The following commands are handled by
ifioctl():

SIOCGIFCONF
OSIOCGIFCONF

Get interface
configuration. (No call-down to driver.)

SIOCSIFNAME

Set the interface name.
RTM_IFANNOUNCE departure and arrival messages are sent so that
routing code that relies on the interface name will update its
interface list. Caller must have appropriate privilege. (No
call-down to driver.)

SIOCSIFCAP

Enable or disable
interface capabilities. Caller must have appropriate privilege.
Before a call to the driver-specific if_ioctl() routine, the
requested mask for enabled capabilities is checked against the mask
of capabilities supported by the interface, if_capabilities.
Requesting to enable an unsupported capability is invalid. The rest
is supposed to be done by the driver, which includes updating
if_capenable and if_data.ifi_hwassist
appropriately.

SIOCSIFFLAGS

Change interface flags.
Caller must have appropriate privilege. If a change to the IFF_UP
flag is requested, if_up() or if_down() is called as
appropriate. Flags listed in IFF_CANTCHANGE are masked off, and the
field if_flags in the interface structure is updated.
Finally, the driver if_ioctl() routine is called to perform
any setup requested.

SIOCSIFMETRIC
SIOCSIFPHYS

Change interface metric
or medium. Caller must have appropriate privilege.

SIOCSIFMTU

Change interface MTU.
Caller must have appropriate privilege. MTU values less than 72 or
greater than 65535 are considered invalid. The driver
if_ioctl() routine is called to implement the change; it is
responsible for any additional sanity checking and for actually
modifying the MTU in the interface structure.

SIOCADDMULTI
SIOCDELMULTI

Add or delete permanent
multicast group memberships on the interface. Caller must have
appropriate privilege. The if_addmulti() or
if_delmulti() function is called to perform the operation;
qq.v.

SIOCSIFDSTADDR
SIOCSIFADDR
SIOCSIFBRDADDR
SIOCSIFNETMASK

The socket’s
protocol control routine is called to implement the requested
action.

OSIOCGIFADDR
OSIOCGIFDSTADDR
OSIOCGIFBRDADDR
OSIOCGIFNETMASK

The socket’s
protocol control routine is called to implement the requested
action. On return, sockaddr structures are converted into
old-style (no sa_len member).
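
The two-stage check described under SIOCSIFMTU can be sketched as follows. The generic bounds are the ones given in the text; the sk_ names, the hw_max_mtu member, and the use of EINVAL are assumptions of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SK_IF_MINMTU	72
#define SK_IF_MAXMTU	65535

struct sk_ifnet {
	unsigned long if_mtu;
	unsigned long hw_max_mtu;	/* hypothetical driver limit */
};

/* Driver part: extra sanity checking, then the actual update. */
int
sk_drv_set_mtu(struct sk_ifnet *ifp, unsigned long mtu)
{
	if (mtu > ifp->hw_max_mtu)
		return (EINVAL);
	ifp->if_mtu = mtu;
	return (0);
}

/* Generic part: reject values outside the generic bounds. */
int
sk_ioctl_set_mtu(struct sk_ifnet *ifp, unsigned long mtu)
{
	assert(ifp != NULL);
	if (mtu < SK_IF_MINMTU || mtu > SK_IF_MAXMTU)
		return (EINVAL);
	return (sk_drv_set_mtu(ifp, mtu));
}
```

The generic code never writes the MTU itself, so a driver that rejects the value leaves the interface unchanged.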

if_down(),
ifioctl(), ifpromisc(), and if_up() must be
called at splnet() or higher.
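
The reference counting performed by ifpromisc() can be sketched as below. The flag value and the sk_ names are illustrative, the ioctl_calls counter stands in for the call to the driver's if_ioctl() routine, and the sketch omits the protection that splnet() provides in the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define SK_IFF_PROMISC	0x1	/* illustrative value only */

struct sk_ifnet {
	int if_flags;
	int if_pcount;		/* promiscuous listeners */
	int ioctl_calls;	/* demo-only: calls into the driver */
};

/*
 * Add (pswitch true) or remove (pswitch false) one promiscuous
 * reference. The flag and the driver are touched only on the
 * transitions from zero to one and from one to zero.
 */
int
sk_ifpromisc(struct sk_ifnet *ifp, int pswitch)
{
	assert(ifp != NULL);
	if (pswitch) {
		if (ifp->if_pcount++ != 0)
			return (0);
		ifp->if_flags |= SK_IFF_PROMISC;
	} else {
		assert(ifp->if_pcount > 0);
		if (--ifp->if_pcount > 0)
			return (0);
		ifp->if_flags &= ~SK_IFF_PROMISC;
	}
	ifp->ioctl_calls++;	/* stands in for if_ioctl() */
	return (0);
}
```

Two independent listeners can therefore each request and release promiscuous mode without knowing about each other.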

Interface Address Functions
Several functions exist to look up an interface address structure
given an address. ifa_ifwithaddr() returns an interface
address with either a local address or a broadcast address
precisely matching the parameter addr.
ifa_ifwithdstaddr() returns an interface address for a
point-to-point interface whose remote
(‘‘destination’’) address is
addr.

ifa_ifwithnet()
returns the most specific interface address which matches the
specified address, addr, subject to its configured netmask,
or a point-to-point interface address whose remote address is
addr if one is found.

ifaof_ifpforaddr() returns the most specific address
configured on interface ifp which matches address
addr, subject to its configured netmask. If the interface is
point-to-point, only an interface address whose remote address is
precisely addr will be returned.

All of these functions
return a null pointer if no such address can be found.
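
The notion of the most specific match can be sketched for IPv4-style addresses. This is a simplification: addresses are plain 32-bit values in host byte order held in an array, netmasks are assumed contiguous, point-to-point handling is left out, and the sk_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced interface address: IPv4 in host byte order. */
struct sk_ifaddr {
	uint32_t addr;
	uint32_t netmask;
};

/*
 * Return the entry whose network contains dst and whose netmask is
 * the most specific, or NULL if no entry matches.
 */
const struct sk_ifaddr *
sk_ifa_withnet(const struct sk_ifaddr *tab, size_t n, uint32_t dst)
{
	const struct sk_ifaddr *best = NULL;
	size_t i;

	assert(tab != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if ((dst & tab[i].netmask) != (tab[i].addr & tab[i].netmask))
			continue;
		/* With contiguous masks, a larger value is more specific. */
		if (best == NULL || tab[i].netmask > best->netmask)
			best = &tab[i];
	}
	return (best);
}
```

With addresses configured on 10.0.0.0/8 and 10.1.0.0/16, a lookup for 10.1.2.3 matches both and selects the /16 entry.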

Interface Multicast Address Functions
The if_addmulti(), if_delmulti(), and
ifmaof_ifpforaddr() functions provide support for requesting
and relinquishing multicast group memberships, and for querying an
interface’s membership list, respectively. The
if_addmulti() function takes a pointer to an interface,
ifp, and a generic address, sa. It also takes a
pointer to a struct ifmultiaddr * which is filled in on
successful return with the address of the group membership control
block. The if_addmulti() function performs the following
four-step process:

1.

Call the interface’s
if_resolvemulti() entry point to determine the link-layer
address, if any, corresponding to this membership request, and also
to give the link layer an opportunity to veto this membership
request should it so desire.

2.

Check the interface’s group
membership list for a pre-existing membership for this group. If
one is not found, allocate a new one; if one is, increment its
reference count.

3.

If the if_resolvemulti() routine
returned a link-layer address corresponding to the group, repeat
the previous step for that address as well.

4.

If the interface’s multicast
address filter needs to be changed because a new membership was
added, call the interface’s if_ioctl() routine (with a
cmd argument of SIOCADDMULTI) to request that it do so.

The if_delmulti()
function, given an interface ifp and an address, sa,
reverses this process. Both functions return zero on success, or a
standard error number on failure.
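
The four steps, and their reversal, can be sketched with a small fixed-size membership table. Everything here is a stand-in: addresses are 32-bit keys, the resolver is only loosely modelled on the IPv4-to-Ethernet mapping (it keeps the low 23 bits of the group), and the error numbers and sk_ names are choices made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAXMEMB	8

struct sk_ifmultiaddr {
	uint32_t key;			/* stand-in for the sockaddr */
	unsigned int ifma_refcount;	/* 0 marks a free slot */
};

struct sk_ifnet {
	struct sk_ifmultiaddr memb[SK_MAXMEMB];
	int filter_updates;		/* demo-only: if_ioctl() calls */
};

/* Step 1: validate the group and compute its link-layer address. */
int
sk_resolvemulti(uint32_t group, uint32_t *ll)
{
	if ((group & 0xf0000000u) != 0xe0000000u)
		return (EADDRNOTAVAIL);
	*ll = 0x80000000u | (group & 0x007fffffu);
	return (0);
}

/* Find or create a membership: 1 if new, 0 if existing, -1 if full. */
static int
sk_join(struct sk_ifnet *ifp, uint32_t key)
{
	struct sk_ifmultiaddr *slot = NULL;
	int i;

	for (i = 0; i < SK_MAXMEMB; i++) {
		if (ifp->memb[i].ifma_refcount == 0) {
			if (slot == NULL)
				slot = &ifp->memb[i];
		} else if (ifp->memb[i].key == key) {
			ifp->memb[i].ifma_refcount++;
			return (0);
		}
	}
	if (slot == NULL)
		return (-1);
	slot->key = key;
	slot->ifma_refcount = 1;
	return (1);
}

/* Drop one reference: remaining count, or -1 if not a member. */
static int
sk_leave(struct sk_ifnet *ifp, uint32_t key)
{
	int i;

	for (i = 0; i < SK_MAXMEMB; i++)
		if (ifp->memb[i].ifma_refcount != 0 &&
		    ifp->memb[i].key == key)
			return ((int)--ifp->memb[i].ifma_refcount);
	return (-1);
}

int
sk_addmulti(struct sk_ifnet *ifp, uint32_t group)
{
	uint32_t ll;
	int error, isnew;

	assert(ifp != NULL);
	if ((error = sk_resolvemulti(group, &ll)) != 0)	/* step 1 */
		return (error);
	if (sk_join(ifp, group) < 0)			/* step 2 */
		return (ENOBUFS);
	if ((isnew = sk_join(ifp, ll)) < 0) {		/* step 3 */
		(void)sk_leave(ifp, group);		/* undo step 2 */
		return (ENOBUFS);
	}
	if (isnew)					/* step 4 */
		ifp->filter_updates++;
	return (0);
}

int
sk_delmulti(struct sk_ifnet *ifp, uint32_t group)
{
	uint32_t ll;
	int error;

	assert(ifp != NULL);
	if ((error = sk_resolvemulti(group, &ll)) != 0)
		return (error);
	if (sk_leave(ifp, group) < 0)
		return (ENOENT);
	if (sk_leave(ifp, ll) == 0)	/* last user of this filter entry */
		ifp->filter_updates++;
	return (0);
}
```

Because several network-layer groups can map to one link-layer address, the hardware filter is touched only when the link-layer membership is first created or finally released.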

The
ifmaof_ifpforaddr() function examines the membership list of
interface ifp for an address matching addr, and
returns a pointer to that struct ifmultiaddr if one is
found, else it returns a null pointer.