The kernel mechanisms for handling network interfaces reside primarily
in the
.Vt ifnet , if_data , ifaddr ,
and
.Vt ifmultiaddr
structures in
.In net/if.h
and
.In net/if_var.h
and the functions named above and defined in
/sys/net/if.c.
Those interfaces which are intended to be used by user programs
are defined in
.In net/if.h ;
these include the interface flags, the
.Vt if_data
structure, and the structures defining the appearance of
interface-related messages on the
route(4)
routing socket and in
sysctl(3).
The header file
.In net/if_var.h
defines the kernel-internal interfaces, including the
.Vt ifnet , ifaddr ,
and
.Vt ifmultiaddr
structures and the functions which manipulate them.
(A few user programs will need
.In net/if_var.h
because it is the prerequisite of some other header file like
.In netinet/if_ether.h .
Most references to those two files in particular can be replaced by
.In net/ethernet.h . )

The system keeps a linked list of interfaces using the
TAILQ
macros defined in
queue(3);
this list is headed by a
.Vt struct ifnethead
called
ifnet.
The elements of this list are of type
.Vt struct ifnet ,
and most kernel routines which manipulate interfaces as such accept or
return pointers to these structures.
Each interface structure
contains an
.Vt if_data
structure, which contains statistics and identifying information used
by management programs, and which is exported to user programs by way
of the
ifmib(4)
branch of the
sysctl(3)
MIB.
Each interface also has a
TAILQ
of interface addresses, described by
.Vt ifaddr
structures; the head of the queue is always an
AF_LINK
address
(see
link_addr(3))
describing the link layer implemented by the interface (if any).
(Some trivial interfaces do not provide any link layer addresses;
this structure, while still present, serves only to identify the
interface name and index.)
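
The list layout described above can be modelled in userland with the same
queue(3) macros.
This is a minimal sketch, not kernel code: the toy_ structures, their member
names, the TOY_AF_LINK value, and both functions are hypothetical stand-ins,
and the real structures carry many more members.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define TOY_AF_LINK 18	/* illustrative family tag, not taken from a header */

/* Hypothetical stand-in for an interface address. */
struct toy_ifaddr {
	int			 family;
	TAILQ_ENTRY(toy_ifaddr)	 link;
};
TAILQ_HEAD(toy_ifaddrhead, toy_ifaddr);

/* Hypothetical stand-in for an interface. */
struct toy_ifnet {
	const char		*name;
	struct toy_ifaddrhead	 addrs;
	TAILQ_ENTRY(toy_ifnet)	 link;
};
TAILQ_HEAD(toy_ifnethead, toy_ifnet);

/* Append ifp to the list and make lladdr the first address on it. */
static void
toy_if_attach(struct toy_ifnethead *head, struct toy_ifnet *ifp,
    const char *name, struct toy_ifaddr *lladdr)
{
	assert(head != NULL && ifp != NULL && lladdr != NULL);
	ifp->name = name;
	TAILQ_INIT(&ifp->addrs);
	lladdr->family = TOY_AF_LINK;
	TAILQ_INSERT_HEAD(&ifp->addrs, lladdr, link);
	TAILQ_INSERT_TAIL(head, ifp, link);
}

/* Walk the interface list and return the entry called name, or NULL. */
static struct toy_ifnet *
toy_ifunit(struct toy_ifnethead *head, const char *name)
{
	struct toy_ifnet *ifp;

	TAILQ_FOREACH(ifp, head, link) {
		if (strcmp(ifp->name, name) == 0)
			return (ifp);
	}
	return (NULL);
}
```

Keeping the link-layer address at the head of each address queue means a
caller can always find it with TAILQ_FIRST, without searching.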

Finally, those interfaces supporting reception of multicast datagrams
have a
TAILQ
of multicast group memberships, described by
.Vt ifmultiaddr
structures.
These memberships are reference-counted.

Interfaces are also associated with an output queue, defined as a
.Vt struct ifqueue ;
this structure is used to hold packets while the interface is in the
process of sending another.

if_xname

(Vt char *)
The name of the interface,
(e.g.,
"fxp0"
or
"lo0").
(Initialized by driver.)

if_dname

(Vt const char *)
The name of the driver.
(Initialized by driver.)

if_dunit

(Vt int)
A unique number assigned to each interface managed by a particular
driver.
Drivers may choose to set this to
IF_DUNIT_NONE
if a unit number is not associated with the device.
(Initialized by driver.)

if_addrhead

(Vt struct ifaddrhead)
The head of the
queue(3)
TAILQ
containing the list of addresses assigned to this interface.

if_pcount

(Vt int)
A count of promiscuous listeners on this interface, used to
reference-count the
IFF_PROMISC
flag.

if_index

(Vt u_short)
A unique number assigned to each interface in sequence as it is
attached.
This number can be used in a
.Vt struct sockaddr_dl
to refer to a particular interface by index
(see
link_addr(3)).
(Initialized by
if_alloc.)

if_timer

(Vt short)
Number of seconds until the watchdog timer
if_watchdog
is called, or zero if the timer is disabled.
(Set by driver,
decremented by generic watchdog code.)

if_linkmib

(Vt void *)
A pointer to an interface-specific MIB structure exported by
ifmib(4).
(Initialized by driver.)

if_linkmiblen

(Vt size_t)
The size of said structure.
(Initialized by driver.)

if_data

(Vt struct if_data)
More statistics and information; see
The if_data structure,
below.
(Initialized by driver, manipulated by both driver and generic
code.)

if_snd

(Vt struct ifqueue)
The output queue.
(Manipulated by driver.)

There are in addition a number of function pointers which the driver
must initialize to complete its interface with the generic interface
layer:

if_input

Pass a packet to an appropriate upper layer as determined
from the link-layer header of the packet.
This routine is to be called from an interrupt handler or
used to emulate reception of a packet on this interface.
A single function implementing
if_input
can be shared among multiple drivers utilizing the same link-layer
framing, e.g., Ethernet.

if_output

Output a packet on interface
ifp,
or queue it on the output queue if the interface is already active.

if_start

Start queued output on an interface.
This function is exposed in
order to allow some interface classes to share an
if_output
among all drivers.
if_start
may only be called when the
IFF_OACTIVE
flag is not set.
(Thus,
IFF_OACTIVE
does not literally mean that output is active, but rather that the
devices internal output queue is full.)

if_done

Not used.
We are not even sure what it was ever for.
The prototype is faked.

if_ioctl

Process interface-related
ioctl(2)
requests
(defined in
.In sys/sockio.h ) .
Preliminary processing is done by the generic routine
ifioctl
to check for appropriate privileges, locate the interface being
manipulated, and perform certain generic operations like twiddling
flags and flushing queues.
See the description of
ifioctl
below for more information.

if_watchdog

Routine called by the generic code when the watchdog timer,
if_timer,
expires.
Usually this will reset the interface.

if_init

Initialize and bring up the hardware,
e.g., reset the chip and the watchdog timer and enable the receiver unit.
Should mark the interface running,
but not active
(IFF_RUNNING, ~IFF_OACTIVE).

if_resolvemulti

Check the requested multicast group membership,
addr,
for validity, and if necessary compute a link-layer group which
corresponds to that address, which is returned in
*retsa.
Returns zero on success, or an error code on failure.
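
The interplay between a shared if_output routine, the driver's if_start
routine, and the IFF_OACTIVE flag can be sketched as follows.
This is a userland model under stated assumptions: packets are reduced to
counters, and the toy_ names, the queue limits, the flag value, and the
transmit-complete hook are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_IFF_OACTIVE	0x1	/* illustrative bit value */
#define TOY_QLEN	8	/* software queue capacity (assumed) */
#define TOY_HW_SLOTS	2	/* hardware transmit slots (assumed) */

struct toy_ifnet {
	int	if_flags;
	int	snd_len;	/* packets waiting in the software queue */
	int	hw_used;	/* occupied hardware transmit slots */
	void	(*if_start)(struct toy_ifnet *);
};

/* Driver start routine: move packets to the hardware until it is full. */
static void
toy_start(struct toy_ifnet *ifp)
{
	while (ifp->snd_len > 0) {
		if (ifp->hw_used == TOY_HW_SLOTS) {
			ifp->if_flags |= TOY_IFF_OACTIVE;
			return;
		}
		ifp->snd_len--;
		ifp->hw_used++;
	}
}

/* Shared output routine: enqueue, then start output only if permitted. */
static int
toy_output(struct toy_ifnet *ifp)
{
	assert(ifp != NULL && ifp->if_start != NULL);
	if (ifp->snd_len == TOY_QLEN)
		return (-1);		/* software queue full */
	ifp->snd_len++;
	if ((ifp->if_flags & TOY_IFF_OACTIVE) == 0)
		ifp->if_start(ifp);
	return (0);
}

/* Transmit-complete handler: free a slot, clear the flag, restart. */
static void
toy_txdone(struct toy_ifnet *ifp)
{
	if (ifp->hw_used > 0)
		ifp->hw_used--;
	ifp->if_flags &= ~TOY_IFF_OACTIVE;
	ifp->if_start(ifp);
}
```

In this model the flag is set only when the hardware refuses another packet,
which matches the reading that it signals a full device queue rather than
output in progress.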

Interface flags are used for a number of different purposes.
Some
flags simply indicate information about the type of interface and its
capabilities; others are dynamically manipulated to reflect the
current state of the interface.
Flags of the former kind are marked
<S>
in this table; the latter are marked
<D>.

The macro
IFF_CANTCHANGE
defines the bits which cannot be set by a user program using the
SIOCSIFFLAGS
command to
ioctl(2);
these are indicated by an asterisk
(*)
in the following listing.

IFF_UP

<D>
The interface has been configured up by the user-level code.

IFF_BROADCAST

<S*>
The interface supports broadcast.

IFF_DEBUG

<D>
Used to enable/disable driver debugging code.

IFF_LOOPBACK

<S>
The interface is a loopback device.

IFF_POINTOPOINT

<S*>
The interface is point-to-point;
"broadcast"
address is actually the address of the other end.

IFF_RUNNING

<D*>
The interface has been configured and dynamic resources were
successfully allocated.
Probably only useful internal to the
interface.

IFF_NOARP

<D>
Disable network address resolution on this interface.

IFF_PROMISC

<D*>
This interface is in promiscuous mode.

IFF_PPROMISC

<D>
This interface is in the permanently promiscuous mode (implies
IFF_PROMISC).

IFF_ALLMULTI

<D*>
This interface is in all-multicasts mode (used by multicast routers).

IFF_OACTIVE

<D*>
The interfaces hardware output queue (if any) is full; output packets
are to be queued.

IFF_SIMPLEX

<S*>
The interface cannot hear its own transmissions.

IFF_LINK0 IFF_LINK1 IFF_LINK2

<D>
Control flags for the link layer.
(Currently abused to select among
multiple physical layers on some devices.)

IFF_MULTICAST

<S*>
This interface supports multicast.

IFF_POLLING

<D*>
The interface is in
polling(4)
mode.
See
Interface Capabilities Flags
for details.
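
The effect of IFF_CANTCHANGE on a flag update can be illustrated with a small
bit-mask sketch.
The TOY_ constants use made-up bit values and cover only a few of the flags
listed above, and toy_set_flags is a hypothetical helper, not a kernel
function.

```c
/* Illustrative bit assignments for this sketch only. */
#define TOY_IFF_UP		0x01
#define TOY_IFF_BROADCAST	0x02
#define TOY_IFF_DEBUG		0x04
#define TOY_IFF_RUNNING		0x08
#define TOY_IFF_OACTIVE		0x10

/* Flags marked with an asterisk in the listing. */
#define TOY_IFF_CANTCHANGE \
	(TOY_IFF_BROADCAST | TOY_IFF_RUNNING | TOY_IFF_OACTIVE)

/*
 * Combine the current flags with a user request: protected bits keep
 * their current value, all other bits take the requested value.
 */
static int
toy_set_flags(int cur, int requested)
{
	return ((cur & TOY_IFF_CANTCHANGE) |
	    (requested & ~TOY_IFF_CANTCHANGE));
}
```

A request that tries to set a protected bit is not rejected in this model;
the bit is simply ignored.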

Interface capabilities are specialized features an interface may
or may not support.
These capabilities are very hardware-specific
and, when enabled,
allow specific network processing to be offloaded to the interface
or offer a particular feature for use by other parts of the kernel.

It should be stressed that a capability can be completely
uncontrolled (i.e., stay always enabled with no way to disable it)
or allow limited control over itself (e.g., depend on another
capabilitys state.)
Such peculiarities are determined solely by the hardware and driver
of a particular interface.
Only the driver possesses
the knowledge on whether and how the interface capabilities
can be controlled.
Consequently, capabilities flags in
if_capenable
should never be modified directly by kernel code other than
the interface driver.
The command
SIOCSIFCAP
to
ifioctl
is the dedicated means to attempt altering
if_capenable
on an interface.
Userland code shall use
ioctl(2).

IFCAP_RXCSUM

This interface can do checksum validation on receiving data.
Some interfaces do not have sufficient buffer storage to store frames
above a certain MTU-size completely.
The driver for the interface might disable hardware checksum validation
if the MTU is set above the hardcoded limit.

IFCAP_TXCSUM

This interface can do checksum calculation on transmitting data.

IFCAP_HWCSUM

A shorthand for
(IFCAP_RXCSUM | IFCAP_TXCSUM).

IFCAP_VLAN_HWTAGGING

This interface can do VLAN tagging on output and
demultiplex frames by their VLAN tag on input.

IFCAP_VLAN_MTU

The
vlan(4)
driver can operate over this interface in software tagging mode
without having to decrease MTU on
vlan(4)
interfaces below 1500 bytes.
This implies the ability of this interface to cope with frames somewhat
longer than permitted by the Ethernet specification.

IFCAP_JUMBO_MTU

This Ethernet interface can transmit and receive frames up to
9000 bytes long.
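
The division of labour between the generic SIOCSIFCAP check and the driver
can be sketched as below.
This is a simplified userland model: the TOY_ bit values are illustrative,
and toy_driver_setcap stands in for a driver that accepts every supported
request, which real drivers need not do.

```c
#include <errno.h>

/* Illustrative bit assignments for this sketch only. */
#define TOY_IFCAP_RXCSUM	0x1
#define TOY_IFCAP_TXCSUM	0x2
#define TOY_IFCAP_HWCSUM	(TOY_IFCAP_RXCSUM | TOY_IFCAP_TXCSUM)
#define TOY_IFCAP_VLAN_MTU	0x4

struct toy_ifnet {
	int	if_capabilities;	/* what the interface supports */
	int	if_capenable;		/* what is currently enabled */
};

/* Hypothetical driver hook: only the driver writes if_capenable. */
static int
toy_driver_setcap(struct toy_ifnet *ifp, int reqcap)
{
	ifp->if_capenable = reqcap;
	return (0);
}

/* Generic layer: refuse unsupported bits, then defer to the driver. */
static int
toy_siocsifcap(struct toy_ifnet *ifp, int reqcap)
{
	if (reqcap & ~ifp->if_capabilities)
		return (EINVAL);
	return (toy_driver_setcap(ifp, reqcap));
}
```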

The ability of advanced network interfaces to offload certain
computational tasks from the host CPU to the board is limited
mostly to TCP/IP.
Therefore a separate field associated with an interface
(see
ifnet.if_data.ifi_hwassist
below)
keeps a detailed description of its enabled capabilities
specific to TCP/IP processing.
The TCP/IP module consults the field to see which tasks
can be done on an
outgoing
packet by the interface.
The flags defined for that field are a superset of those for
mbuf.m_pkthdr.csum_flags,
namely:

CSUM_IP

The interface will compute IP checksums.

CSUM_TCP

The interface will compute TCP checksums.

CSUM_UDP

The interface will compute UDP checksums.

CSUM_IP_FRAGS

The interface can compute a TCP or UDP checksum for a packet
fragmented by the host CPU.
Makes sense only along with
CSUM_TCP
or
CSUM_UDP.

CSUM_FRAGMENT

The interface will do the fragmentation of IP packets if necessary.
The host CPU does not need to care about MTU on this interface
as long as a packet to transmit through it is an IP one and it
does not exceed the size of the hardware buffer.

An interface notifies the TCP/IP module about the tasks
the former has performed on an
incoming
packet by setting the corresponding flags in the field
mbuf.m_pkthdr.csum_flags
of the
.Vt mbuf chain
containing the packet.
See
mbuf(9)
for details.

The capability of a network interface to operate in
polling(4)
mode involves several flags in different
global variables and per-interface fields.
First, there is a system-wide
sysctl(8)
master switch named
kern.polling.enable,
which can toggle
polling(4)
globally.
If that variable is set to non-zero,
polling(4)
will be used on those devices where it is enabled individually.
Otherwise,
polling(4)
will not be used in the system.
Second, the capability flag
IFCAP_POLLING
set in interfaces
if_capabilities
indicates support for
polling(4)
on the particular interface.
If set in
if_capabilities,
the same flag can be marked or cleared in the interface's
if_capenable,
thus initiating a switch of the interface to
polling(4)
mode or interrupt
mode, respectively.
The actual mode change will occur at an implementation-specific moment
in the future, e.g., during the next interrupt or
polling(4)
cycle.
And finally, if the mode transition has been successful, the flag
IFF_POLLING
is marked or cleared in the interfaces
if_flags
to indicate the current mode of the interface.
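
The three levels of polling control described above can be modelled as
follows.
This is a sketch under assumptions: toy_polling_enable stands in for the
kern.polling.enable switch, the bit values are illustrative, and
toy_mode_transition represents whatever the driver does at its next
interrupt or polling cycle.

```c
#define TOY_IFCAP_POLLING	0x1	/* illustrative bit value */
#define TOY_IFF_POLLING		0x1	/* illustrative bit value */

struct toy_ifnet {
	int	if_capabilities;
	int	if_capenable;
	int	if_flags;
};

/* Stand-in for the system-wide master switch. */
static int toy_polling_enable = 0;

/* Mark or clear the capability; fails if polling is not supported. */
static int
toy_request_polling(struct toy_ifnet *ifp, int on)
{
	if ((ifp->if_capabilities & TOY_IFCAP_POLLING) == 0)
		return (-1);
	if (on)
		ifp->if_capenable |= TOY_IFCAP_POLLING;
	else
		ifp->if_capenable &= ~TOY_IFCAP_POLLING;
	return (0);
}

/* Deferred mode change: the flag reflects the mode actually in use. */
static void
toy_mode_transition(struct toy_ifnet *ifp)
{
	if (toy_polling_enable &&
	    (ifp->if_capenable & TOY_IFCAP_POLLING) != 0)
		ifp->if_flags |= TOY_IFF_POLLING;
	else
		ifp->if_flags &= ~TOY_IFF_POLLING;
}
```

Separating the request from the transition mirrors the text: enabling the
capability only asks for a mode change, and the flag follows later.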

In
4.4BSD,
a subset of the interface information believed to be of interest to
management stations was segregated from the
.Vt ifnet
structure and moved into its own
.Vt if_data
structure to facilitate its use by user programs.
The following elements of the
.Vt if_data
structure are initialized by the interface and are not expected to change
significantly over the course of normal operation:

ifi_type

(Vt u_char)
The type of the interface, as defined in
.In net/if_types.h
and described below in the
Interface Types
section.

ifi_physical

(Vt u_char)
Intended to represent a selection of physical layers on devices which
support more than one; never implemented.

ifi_addrlen

(Vt u_char)
Length of a link-layer address on this device, or zero if there are
none.
Used to initialize the address length field in
.Vt sockaddr_dl
structures referring to this interface.

ifi_hdrlen

(Vt u_char)
Maximum length of any link-layer header which might be prepended by
the driver to a packet before transmission.
The generic code computes
the maximum over all interfaces and uses that value to influence the
placement of data in
.Vt mbuf Ns s
to attempt to ensure that there is always
sufficient space to prepend a link-layer header without allocating an
additional
.Vt mbuf .

ifi_datalen

(Vt u_char)
Length of the
.Vt if_data
structure.
Allows some stabilization of the routing socket ABI in the face of
increases in the length of
.Vt struct if_data .

ifi_mtu

(Vt u_long)
The maximum transmission unit of the medium, exclusive of any
link-layer overhead.

ifi_hwassist

(Vt u_long)
A detailed interpretation of the capabilities
to offload computational tasks for
outgoing
packets.
The interface driver must keep this field in accord with
the current value of
if_capenable.

ifi_epoch

(Vt time_t)
The system uptime when the interface was attached or the statistics
below were reset.
This is intended to be used to set the SNMP variable
ifCounterDiscontinuityTime.
It may also be used to determine if two successive queries for an
interface of the same index have returned results for the same
interface.

The structure additionally contains generic statistics applicable to a
variety of different interface types (except as noted, all members are
of type
.Vt u_long ) :

ifi_link_state

(Vt u_char)
The current link state of Ethernet interfaces.
See the
Interface Link States
section for possible values.

ifi_ipackets

Number of packets received.

ifi_ierrors

Number of receive errors detected (e.g., FCS errors, DMA overruns,
etc.).
More detailed breakdowns can often be had by way of a
link-specific MIB.

ifi_opackets

Number of packets transmitted.

ifi_oerrors

Number of output errors detected (e.g., late collisions, DMA overruns,
etc.).
More detailed breakdowns can often be had by way of a
link-specific MIB.

ifi_collisions

Total number of collisions detected on output for CSMA interfaces.
(This member is sometimes [ab]used by other types of interfaces for
other output error counts.)

ifi_ibytes

Total traffic received, in bytes.

ifi_obytes

Total traffic transmitted, in bytes.

ifi_imcasts

Number of packets received which were sent by link-layer multicast.

ifi_omcasts

Number of packets sent by link-layer multicast.

ifi_iqdrops

Number of packets dropped on input.
Rarely implemented.

ifi_noproto

Number of packets received for unknown network-layer protocol.

ifi_lastchange

(Vt struct timeval)
The time of the last administrative change to the interface (as required
for
SNMP).

Every interface is associated with a list
(or, rather, a
TAILQ)
of addresses, rooted at the interface structure's
if_addrhead
member.
The first element in this list is always an
AF_LINK
address representing the interface itself; multi-access network
drivers should complete this structure by filling in their link-layer
addresses after calling
if_attach.
Other members of the structure represent network-layer addresses which
have been configured by means of the
SIOCAIFADDR
command to
ioctl(2),
called on a socket of the appropriate protocol family.
The elements of this list consist of
.Vt ifaddr
structures.
Most protocols will declare their own protocol-specific
interface address structures, but all begin with a
.Vt struct ifaddr
which provides the most-commonly-needed functionality across all
protocols.
Interface addresses are reference-counted.

The members of
.Vt struct ifaddr
are as follows:

ifa_addr

(Vt struct sockaddr *)
The local address of the interface.

ifa_dstaddr

(Vt struct sockaddr *)
The remote address of point-to-point interfaces, and the broadcast
address of broadcast interfaces.
( ifa_broadaddr
is a macro for
ifa_dstaddr.)

ifa_link

(TAILQ_ENTRY(ifaddr))
queue(3)
glue for list of addresses on each interface.

ifa_rtrequest

See below.

ifa_flags

(Vt u_short)
Some of the flags which would be used for a route representing this
address in the route table.

ifa_refcnt

(Vt short)
The reference count.

ifa_metric

(Vt int)
A metric associated with this interface address, for the use of some
external routing protocol.

References to
.Vt ifaddr
structures are gained manually, by incrementing the
ifa_refcnt
member.
References are released by calling either the
ifafree
function or the
IFAFREE
macro.
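
Manual reference counting of this kind can be sketched in userland as below.
The convention used here, a count that starts at one and frees the structure
when it drops to zero, is an assumption made for the example and is not a
statement about how the kernel's ifafree counts.
All toy_ names are hypothetical.

```c
#include <stdlib.h>

struct toy_ifaddr {
	int	ifa_refcnt;
};

/* Number of structures released so far; kept only for demonstration. */
static int toy_freed;

/* Allocate an address holding one reference for the caller. */
static struct toy_ifaddr *
toy_ifa_alloc(void)
{
	struct toy_ifaddr *ifa;

	ifa = calloc(1, sizeof(*ifa));
	if (ifa != NULL)
		ifa->ifa_refcnt = 1;
	return (ifa);
}

/* Gain a reference by incrementing the count by hand. */
static void
toy_ifa_ref(struct toy_ifaddr *ifa)
{
	ifa->ifa_refcnt++;
}

/* Drop a reference; the last one releases the storage. */
static void
toy_ifafree(struct toy_ifaddr *ifa)
{
	if (--ifa->ifa_refcnt == 0) {
		free(ifa);
		toy_freed++;
	}
}
```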

ifa_rtrequest
is a pointer to a function which receives callouts from the routing
code
(rtrequest)
to perform link-layer-specific actions upon requests to add, resolve,
or delete routes.
The
cmd
argument indicates the request in question:
RTM_ADD, RTM_RESOLVE,
or
RTM_DELETE.
The
rt
argument is the route in question; the
dst
argument is the specific destination being manipulated
for
RTM_RESOLVE,
or a null pointer otherwise.

The functions provided by the generic interface code can be divided
into two groups: those which manipulate interfaces, and those which
manipulate interface addresses.
In addition to these functions, there
may also be link-layer support routines which are used by a number of
drivers implementing a specific link layer over different hardware;
see the documentation for that link layer for more details.

Every multicast-capable interface is associated with a list of
multicast group memberships, which indicate at a low level which
link-layer multicast addresses (if any) should be accepted, and at a
high level, in which network-layer multicast groups a user process has
expressed interest.

ifma_addr

(Vt struct sockaddr *)
A pointer to the address which this record represents.
The
memberships for various address families are stored in arbitrary
order.

ifma_lladdr

(Vt struct sockaddr *)
A pointer to the link-layer multicast address, if any, to which the
network-layer multicast address in
ifma_addr
is mapped, else a null pointer.
If this element is non-null, this
membership also holds an invisible reference to another membership for
that link-layer address.

ifma_refcount

(Vt u_int)
A reference count of requests for this particular membership.

if_alloc

Allocate and initialize
.Vt struct ifnet .
Initialization includes the allocation of an interface index and may
include the allocation of a
type
specific structure in
if_l2com.

if_attach

Link the specified interface
ifp
into the list of network interfaces.
Also initialize the list of
addresses on that interface, and create a link-layer
.Vt ifaddr
structure to be the first element in that list.
(A pointer to
this address structure is saved in the global array
ifnet_addrs.)
The
ifp
must have been allocated by
if_alloc.

if_detach

Shut down and unlink the specified
ifp
from the interface list.

if_free

Free the given
ifp
back to the system.
The interface must have been previously detached if it was ever attached.

if_free_type

Identical to
if_free
except that the given
type
is used to free
if_l2com
instead of the type in
if_type.
This is intended for use with drivers that change their interface type.

if_down

Mark the interface
ifp
as down (i.e.,
IFF_UP
is not set),
flush its output queue, notify protocols of the transition,
and generate a message from the
route(4)
routing socket.

if_up

Mark the interface
ifp
as up, notify protocols of the transition,
and generate a message from the
route(4)
routing socket.

ifpromisc

Add or remove a promiscuous reference to
ifp.
If
pswitch
is true, add a reference;
if it is false, remove a reference.
On reference count transitions
from zero to one and one to zero, set the
IFF_PROMISC
flag appropriately and call
if_ioctl
to set up the interface in the desired mode.

if_allmulti

As
ifpromisc,
but for the all-multicasts
(IFF_ALLMULTI)
flag instead of the promiscuous flag.

ifunit

Return an
.Vt ifnet
pointer for the interface named
name.

ifioctl

Process the ioctl request
cmd,
issued on socket
so
by thread
td,
with data parameter
data.
This is the main routine for handling all interface configuration
requests from user mode.
It is ordinarily only called from the socket-layer
ioctl(2)
handler, and only for commands with class
'i'.
Any unrecognized commands will be passed down to the protocol of socket
so
for further interpretation.
The following commands are handled by
ifioctl:

SIOCGIFCONF OSIOCGIFCONF

Get interface configuration.
(No call-down to driver.)

SIOCSIFNAME

Set the interface name.
RTM_IFANNOUNCE
departure and arrival messages are sent so that
routing code that relies on the interface name will update its interface
list.
Caller must have appropriate privilege.
(No call-down to driver.)

SIOCSIFCAP

Enable or disable interface capabilities.
Caller must have appropriate privilege.
Before a call to the driver-specific
if_ioctl
routine, the requested mask for enabled capabilities is checked
against the mask of capabilities supported by the interface,
if_capabilities.
Requesting to enable an unsupported capability is invalid.
The rest is supposed to be done by the driver,
which includes updating
if_capenable
and
if_data.ifi_hwassist
appropriately.

SIOCSIFFLAGS

Change interface flags.
Caller must have appropriate privilege.
If a change to the
IFF_UP
flag is requested,
if_up
or
if_down
is called as appropriate.
Flags listed in
IFF_CANTCHANGE
are masked off, and the field
if_flags
in the interface structure is updated.
Finally, the driver
if_ioctl
routine is called to perform any setup
requested.

SIOCSIFMETRIC SIOCSIFPHYS

Change interface metric or medium.
Caller must have appropriate privilege.

SIOCSIFMTU

Change interface MTU.
Caller must have appropriate privilege.
MTU
values less than 72 or greater than 65535 are considered invalid.
The driver
if_ioctl
routine is called to implement the change; it is responsible for any
additional sanity checking and for actually modifying the MTU in the
interface structure.

SIOCADDMULTI SIOCDELMULTI

Add or delete permanent multicast group memberships on the interface.
Caller must have appropriate privilege.
The
if_addmulti
or
if_delmulti
function is called to perform the operation; qq.v.

SIOCSIFDSTADDR SIOCSIFADDR SIOCSIFBRDADDR SIOCSIFNETMASK

The sockets protocol control routine is called to implement the
requested action.

OSIOCGIFADDR OSIOCGIFDSTADDR OSIOCGIFBRDADDR OSIOCGIFNETMASK

The sockets protocol control routine is called to implement the
requested action.
On return,
.Vt sockaddr
structures are converted into old-style (no
sa_len
member).
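
The split of SIOCSIFMTU handling between the generic range check and the
driver can be sketched as follows.
The bounds 72 and 65535 come from the description of SIOCSIFMTU; everything
else, including the toy_ names and the per-device hw_maxmtu limit, is a
hypothetical illustration.

```c
#include <errno.h>

#define TOY_IF_MINMTU	72
#define TOY_IF_MAXMTU	65535

struct toy_ifnet {
	unsigned long	if_mtu;
	unsigned long	hw_maxmtu;	/* assumed device-specific limit */
};

/* Driver side: extra sanity checking, then the actual update. */
static int
toy_driver_setmtu(struct toy_ifnet *ifp, unsigned long mtu)
{
	if (mtu > ifp->hw_maxmtu)
		return (EINVAL);
	ifp->if_mtu = mtu;
	return (0);
}

/* Generic side: reject values outside the globally valid range. */
static int
toy_siocsifmtu(struct toy_ifnet *ifp, unsigned long mtu)
{
	if (mtu < TOY_IF_MINMTU || mtu > TOY_IF_MAXMTU)
		return (EINVAL);
	return (toy_driver_setmtu(ifp, mtu));
}
```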

if_down,
ifioctl,
ifpromisc,
and
if_up
must be called at
splnet
or higher.
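
The reference counting performed by ifpromisc can be sketched in userland as
below.
This is a model only: the flag value is illustrative, toy_if_ioctl merely
counts how often the driver would be asked to reprogram the hardware, and the
rejection of an unbalanced removal is an assumption of the example.

```c
#define TOY_IFF_PROMISC	0x100	/* illustrative bit value */

struct toy_ifnet {
	int	if_flags;
	int	if_pcount;	/* promiscuous listeners */
	int	ioctl_calls;	/* times the driver was notified */
};

/* Stand-in for the driver's if_ioctl routine. */
static int
toy_if_ioctl(struct toy_ifnet *ifp)
{
	ifp->ioctl_calls++;
	return (0);
}

/* Add (pswitch != 0) or remove (pswitch == 0) a promiscuous reference. */
static int
toy_ifpromisc(struct toy_ifnet *ifp, int pswitch)
{
	if (pswitch) {
		if (ifp->if_pcount++ != 0)
			return (0);	/* already promiscuous */
		ifp->if_flags |= TOY_IFF_PROMISC;
	} else {
		if (ifp->if_pcount == 0)
			return (-1);	/* nothing to remove (assumed) */
		if (--ifp->if_pcount > 0)
			return (0);	/* other listeners remain */
		ifp->if_flags &= ~TOY_IFF_PROMISC;
	}
	/* Only the 0 -> 1 and 1 -> 0 transitions reach the driver. */
	return (toy_if_ioctl(ifp));
}
```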

Several functions exist to look up an interface address structure
given an address.
ifa_ifwithaddr
returns an interface address with either a local address or a
broadcast address precisely matching the parameter
addr.
ifa_ifwithdstaddr
returns an interface address for a point-to-point interface whose
remote
("destination")
address is
addr.

ifa_ifwithnet
returns the most specific interface address which matches the
specified address,
addr,
subject to its configured netmask, or a point-to-point interface
address whose remote address is
addr
if one is found.

ifaof_ifpforaddr
returns the most specific address configured on interface
ifp
which matches address
addr,
subject to its configured netmask.
If the interface is
point-to-point, only an interface address whose remote address is
precisely
addr
will be returned.

All of these functions return a null pointer if no such address can be
found.
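
The notion of the most specific match subject to a netmask can be illustrated
with a small lookup over IPv4-style addresses.
This sketch assumes 32-bit addresses in host byte order with contiguous
netmasks, so that a numerically larger mask is a longer prefix; the toy_
structure and function are hypothetical and ignore the point-to-point case.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_ifaddr {
	uint32_t	addr;	/* local address */
	uint32_t	mask;	/* contiguous netmask (assumed) */
};

/*
 * Return the entry whose network contains addr and whose mask is
 * longest, or NULL if no entry matches.
 */
static const struct toy_ifaddr *
toy_ifa_ifwithnet(const struct toy_ifaddr *tab, size_t n, uint32_t addr)
{
	const struct toy_ifaddr *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if ((addr & tab[i].mask) != (tab[i].addr & tab[i].mask))
			continue;
		if (best == NULL || tab[i].mask > best->mask)
			best = &tab[i];
	}
	return (best);
}
```

With entries for 10.0.0.1/8 and 10.1.2.1/24, a lookup of 10.1.2.7 matches
both networks and selects the /24 entry.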

The
if_addmulti,
if_delmulti,
and
ifmaof_ifpforaddr
functions provide support for requesting and relinquishing multicast
group memberships, and for querying an interface's membership list,
respectively.
The
if_addmulti
function takes a pointer to an interface,
ifp,
and a generic address,
sa.
It also takes a pointer to a
.Vt struct ifmultiaddr *
which is filled in on successful return with the address of the
group membership control block.
The
if_addmulti
function performs the following four-step process:

Call the interfaces
if_resolvemulti
entry point to determine the link-layer address, if any, corresponding
to this membership request, and also to give the link layer an
opportunity to veto this membership request should it so desire.

Check the interfaces group membership list for a pre-existing
membership for this group.
If one is not found, allocate a new one;
if one is, increment its reference count.

If the
if_resolvemulti
routine returned a link-layer address corresponding to the group,
repeat the previous step for that address as well.

If the interfaces multicast address filter needs to be changed
because a new membership was added, call the interfaces
if_ioctl
routine
(with a
cmd
argument of
SIOCADDMULTI)
to request that it do so.

The
if_delmulti
function, given an interface
ifp
and an address,
sa,
reverses this process.
Both functions return zero on success, or a
standard error number on failure.
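
The four-step process followed by if_addmulti can be modelled in userland as
below.
This is a deliberately simplified sketch: addresses are plain integers, the
membership list is a fixed array, toy_resolvemulti is an invented mapping
that rejects group 0, and a counter stands in for the SIOCADDMULTI call to
the driver.
None of the toy_ names exist in the kernel.

```c
#include <stddef.h>

#define TOY_MAXMEMB	8

struct toy_ifma {
	unsigned	 addr;
	int		 is_ll;		/* link-layer membership? */
	int		 refcount;	/* 0 means the slot is free */
	struct toy_ifma	*lladdr;
};

struct toy_ifnet {
	int		filter_updates;	/* stand-in for SIOCADDMULTI calls */
	struct toy_ifma	memb[TOY_MAXMEMB];
};

/* Step 1 stand-in: veto group 0, map others to a link-layer address. */
static int
toy_resolvemulti(unsigned group, unsigned *ll)
{
	if (group == 0)
		return (-1);
	*ll = group & 0xff;
	return (0);
}

/* Steps 2 and 3: bump an existing membership or allocate a new one. */
static struct toy_ifma *
toy_find_or_add(struct toy_ifnet *ifp, unsigned addr, int is_ll, int *added)
{
	struct toy_ifma *freep = NULL;

	for (int i = 0; i < TOY_MAXMEMB; i++) {
		struct toy_ifma *m = &ifp->memb[i];

		if (m->refcount > 0 && m->addr == addr && m->is_ll == is_ll) {
			m->refcount++;
			return (m);
		}
		if (m->refcount == 0 && freep == NULL)
			freep = m;
	}
	if (freep == NULL)
		return (NULL);
	freep->addr = addr;
	freep->is_ll = is_ll;
	freep->refcount = 1;
	freep->lladdr = NULL;
	*added = 1;
	return (freep);
}

static int
toy_if_addmulti(struct toy_ifnet *ifp, unsigned group,
    struct toy_ifma **retifma)
{
	struct toy_ifma *ifma, *llifma;
	unsigned ll;
	int added = 0;

	if (toy_resolvemulti(group, &ll) != 0)		/* step 1 */
		return (-1);
	ifma = toy_find_or_add(ifp, group, 0, &added);	/* step 2 */
	if (ifma == NULL)
		return (-1);
	llifma = toy_find_or_add(ifp, ll, 1, &added);	/* step 3 */
	if (llifma == NULL) {
		ifma->refcount--;			/* undo step 2 */
		return (-1);
	}
	ifma->lladdr = llifma;
	if (added)					/* step 4 */
		ifp->filter_updates++;
	*retifma = ifma;
	return (0);
}
```

Joining the same group twice only raises the reference counts in this model;
the filter is touched once, when the membership is first created.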

The
ifmaof_ifpforaddr
function examines the membership list of interface
ifp
for an address matching
addr,
and returns a pointer to that
.Vt struct ifmultiaddr
if one is found, else it returns a null pointer.