Manipulating the Networking Environment Using RTNETLINK

NETLINK is a facility in the Linux operating system for user-space
applications to communicate with the kernel. NETLINK is an extension
of the standard socket implementation. Using NETLINK, an application
can send/receive information to/from different kernel features, such as
networking, to check current status and control them.

In this article, I describe how a programmer can use the networking
environment manipulation capability of NETLINK known as RTNETLINK.
I discuss some areas of use of RTNETLINK, the relevant
socket operations, the functionality, how RTNETLINK messages are formed
and, finally, provide sample code that uses RTNETLINK. RTNETLINK for the
IP version 4 environment is referred to as NETLINK_ROUTE, and for the IP
version 6 environment, it is referred to as NETLINK_ROUTE6. The
explanations given here are applicable for both
IP versions 4 and 6.

Developers of network layer protocol handlers can use RTNETLINK to
modify and monitor different components of networking, such as
the routing table and network interfaces. There are many existing and
upcoming protocol standards at the Internet Engineering Task Force (IETF)
that can be implemented in user space. These implementations will
require manipulating the routing table and knowing what is being modified by
other processes. Some of these protocol categories are as follows:

Dynamic routing protocols: protocols of this category, including the Routing
Information Protocol (RIP), Open Shortest Path First (OSPF) and Exterior
Gateway Protocol (EGP), actively manage the routing environment of a
host while communicating with other equally capable hosts or routers in
the network or Internet.

Mobility protocols: hosts that are mobile and connect to different
networks at different times use protocols such as Mobile IP (MIP),
Session Initiation Protocol (SIP) and Network Mobility (NEMO) to manage
routing to maintain connectivity and continuity of communications.

Ad hoc networking protocols: hosts that are mobile and located in
places where there is no networking infrastructure, such as routers
and WLAN access points, require peer-to-peer communications with
differently configured hosts. Mobile computers of rescue workers in an
earthquake-struck area or other such emergencies can use ad hoc networking
protocols. These protocols, such as the Ad hoc On-demand Distance Vector
(AODV) and Optimized Link State Routing (OLSR), require managing the
routing to find and communicate with other hosts using neighboring hosts
as routers and gateways.

It helps reduce the complexity of the kernel code if you implement these
protocols in user space. Further, it simplifies the development and
testing of these protocols because of the availability of many user-space
development tools. Problems, such as kernel crashes, that are likely
with kernel-based code when testing or when used by end users will
not occur in a user-space protocol handler.

Socket Operations

The socket implementation of Linux allows two end points to
communicate. The socket API provides a standard set of functions and data
structures. With RTNETLINK, the two end points in communication are
user space and kernel space. The following sequence of socket calls has
to be made when manipulating the networking environment through
RTNETLINK:

Open socket.

Bind socket to local address (using process ID).

Send message to the other end point.

Receive message from the other end point.

Close socket.

The socket() function opens an unattached end point to communicate with
the kernel. The function prototype of this call is as follows:

int socket(int domain, int type, int protocol);

The domain refers to what type of socket is being used. For RTNETLINK,
we use AF_NETLINK (PF_NETLINK). type refers to the type of protocol
used when communicating. This can be raw (SOCK_RAW) or datagram
(SOCK_DGRAM); NETLINK does not distinguish between the two, so either can be
used. protocol refers to the exact NETLINK capability that we use;
in our case, it is NETLINK_ROUTE. If the socket was opened successfully,
this function returns a non-negative integer called the socket descriptor.
This descriptor is used in all subsequent RTNETLINK calls
until the socket is closed. On failure, -1 is
returned, and the system error variable errno, declared in errno.h, is set
to the appropriate error code.

The following is an example of a call to open an RTNETLINK socket:

int fd;
...
fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

Once the socket is opened, it has to be bound to a local address. The user
application can use a unique 32-bit ID to identify the local address. The
function prototype of bind is as follows:

int bind(int fd, const struct sockaddr *my_addr,
socklen_t addrlen);

To bind, the caller must provide a local address using the sockaddr_nl
structure. This structure in the linux/netlink.h #include file has the
following format:
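For reference, the declaration found in linux/netlink.h on current kernels is quoted in the comment of the sketch below. The helper fill_local_addr is a hypothetical name, used here only to show how the members are typically set for a single-threaded process:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/netlink.h>

/*
 * struct sockaddr_nl as declared in <linux/netlink.h>:
 *
 *   struct sockaddr_nl {
 *       __kernel_sa_family_t nl_family;   always AF_NETLINK
 *       unsigned short       nl_pad;      zero
 *       __u32                nl_pid;      port ID (often the process ID)
 *       __u32                nl_groups;   multicast groups mask
 *   };
 */

/* Hypothetical helper: fill in a local address for the calling process. */
void fill_local_addr(struct sockaddr_nl *la)
{
    assert(la != NULL);
    memset(la, 0, sizeof(*la));     /* clears nl_pad and nl_groups */
    la->nl_family = AF_NETLINK;
    la->nl_pid = (__u32)getpid();   /* unique ID taken from the process ID */
}
```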

The nl_pid must contain a unique ID, which can be created from the
return value of the getpid() function. This function returns the process ID of
the current user process that opened the RTNETLINK socket. But if our
process consists of multiple threads, each opening its own
RTNETLINK socket, a modified process ID can be used.

Once this structure is filled, the binding can be done. The bind function
returns zero if the operation succeeded. On failure, -1 is returned,
and the system error variable errno is set. The following
is an example of calling bind:
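A minimal sketch of such a call, assuming fd is a descriptor returned by socket(). The helper name bind_rtnetlink is hypothetical, and the address it fills in is handed back to the caller purely for illustration:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/netlink.h>

/* Hypothetical helper: bind an open RTNETLINK socket to a local address
 * identified by the process ID. The address used is left in *la.
 * Returns 0 on success, -1 on failure with errno set by bind(). */
int bind_rtnetlink(int fd, struct sockaddr_nl *la)
{
    assert(la != NULL);
    memset(la, 0, sizeof(*la));
    la->nl_family = AF_NETLINK;
    la->nl_pid = (__u32)getpid();

    return bind(fd, (struct sockaddr *)la, sizeof(*la));
}
```

A caller would normally check the result and report errno (for example with perror()) before giving up on the socket.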

If the operation you require is multicast-based, you must set
nl_groups to join the multicast group associated with the required
RTNETLINK operation. For example, if you want to be notified of
changes made to the routing table by other processes, you must set
nl_groups to the bitwise OR (|) of RTMGRP_IPV4_ROUTE and RTMGRP_NOTIFY.
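As a sketch of that case, the hypothetical helper below (the name fill_route_monitor_addr is my own) prepares a local address that subscribes to IPv4 routing table notifications. The group constants come from linux/rtnetlink.h:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: local address that joins the multicast groups
 * used to report IPv4 routing table changes. Pass it to bind(). */
void fill_route_monitor_addr(struct sockaddr_nl *la)
{
    assert(la != NULL);
    memset(la, 0, sizeof(*la));
    la->nl_family = AF_NETLINK;
    la->nl_pid = (__u32)getpid();
    la->nl_groups = RTMGRP_IPV4_ROUTE | RTMGRP_NOTIFY;
}
```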

Sending routing RTNETLINK messages to the kernel is done through the use
of the standard sendmsg() function of the socket interface. The following
is the prototype of this function:

ssize_t sendmsg(int fd, const struct msghdr *msg,
int flags);

msg is a pointer to a msghdr structure. The following is the format of
this structure:
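For reference, the members of struct msghdr as documented in the Linux sendmsg(2) manual page are listed in the comment of the sketch below. The helper clear_msghdr is a hypothetical name; it only shows the usual first step of zeroing the header before filling in the members that matter:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/*
 * struct msghdr, as documented in sendmsg(2) on Linux:
 *
 *   struct msghdr {
 *       void         *msg_name;        optional address
 *       socklen_t     msg_namelen;     size of address
 *       struct iovec *msg_iov;         scatter/gather array
 *       size_t        msg_iovlen;      number of elements in msg_iov
 *       void         *msg_control;     ancillary data
 *       size_t        msg_controllen;  ancillary data buffer length
 *       int           msg_flags;       flags on received message
 *   };
 */

/* Hypothetical helper: start from an all-zero message header. */
void clear_msghdr(struct msghdr *msg)
{
    assert(msg != NULL);
    memset(msg, 0, sizeof(*msg));
}
```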

The msg_name is a pointer to a variable of the type struct
sockaddr_nl. This is the destination address of the sendmsg()
function. Because this message is directed to the kernel, all variables of
sockaddr_nl will be initialized to zero, except the nl_family member
variable. The field msg_namelen should contain the size of a struct
sockaddr_nl.

msg_iov should contain a pointer to a struct iovec, which is filled with
the RTNETLINK message relevant to the request being made. The caller is
allowed to place multiple RTNETLINK requests, if required. msg_iovlen
holds the number of struct iovec structures that were placed in
msg_iov. The rest of the variables are initialized to zero.
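The following sketch shows one way these members could be filled for a single request. It assumes req already holds a complete RTNETLINK message of req_len bytes; the helper name prepare_kernel_msg and its parameters are hypothetical:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>

/* Hypothetical helper: describe one request buffer addressed to the kernel.
 * pa and iov must stay valid until sendmsg() has been called. */
void prepare_kernel_msg(struct msghdr *msg, struct sockaddr_nl *pa,
                        struct iovec *iov, void *req, size_t req_len)
{
    assert(msg != NULL && pa != NULL && iov != NULL);

    /* Destination is the kernel: everything zero except nl_family. */
    memset(pa, 0, sizeof(*pa));
    pa->nl_family = AF_NETLINK;

    /* A single iovec element pointing at the request. */
    iov->iov_base = req;
    iov->iov_len = req_len;

    /* Remaining members (control data, flags) stay zero. */
    memset(msg, 0, sizeof(*msg));
    msg->msg_name = pa;
    msg->msg_namelen = sizeof(*pa);
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
}
```

After this, the request would be sent with sendmsg(fd, &msg, 0), checking for a return value of -1.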

To receive RTNETLINK messages, the recv() function is used. Here is the
prototype of this function:

ssize_t recv(int fd, void *buf, size_t len,
int flags);

The second and third variables are a pointer to a buffer to place the
bytes read and the length of this buffer, respectively. For RTNETLINK, the buffer will
contain a set of RTNETLINK messages that have to be read one after
the other using a set of macros provided in the netlink.h and rtnetlink.h
#include files. flags is a set of flags to indicate how the receive
should be performed. For RTNETLINK, this simply can be initialized to zero.

Once the socket communications are complete, the socket has to be closed
using the close() function. Here's the prototype of this function:
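The standard prototype, declared in unistd.h, is quoted in the comment of the sketch below. The wrapper close_rtnetlink is a hypothetical helper that also invalidates the caller's descriptor so it cannot be reused by accident:

```c
#include <assert.h>
#include <unistd.h>

/*
 * Prototype, as declared in <unistd.h>:
 *
 *   int close(int fd);
 *
 * It returns 0 on success and -1 on failure, with errno set.
 */

/* Hypothetical helper: close the socket and mark the descriptor unused.
 * Returns the result of close(), or 0 if *fd was already invalid. */
int close_rtnetlink(int *fd)
{
    int rc;

    assert(fd != NULL);
    if (*fd < 0)
        return 0;       /* nothing to close */

    rc = close(*fd);
    *fd = -1;           /* guard against reuse of a stale descriptor */
    return rc;
}
```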

I am new to netlink socket programming. I am developing a simple application that informs me whenever an interface is brought up or down using ifup/ifdown/ifconfig, or when the cable is unplugged. The problem is that I get two packets for every up/down or cable-out event.

I am not able to solve the problem and don't understand why this happens.

Using this tutorial, I created a function to add a route. I believe I am populating all the necessary elements of the data structures. However, the route is added incorrectly: for any kind of route, the function only adds a 0.0.0.0 route with mask 255.255.255.255 and gateway 0.0.0.0. It does point to the correct interface that I specify in RTA_OIF.

Hello Asanga...
I am a newbie to Linux. As far as I have googled, I found only your material as a sample. Based on my understanding of your illustration, I have made the module below to get the destination address and the gateway address when I run "route add -host 192.168.2.45 gw 202.34.2.1".
I get the gateway address as 192.168.2.45 from the module, but I expect the gateway to be 202.34.2.1...

Hi, I currently have the problem that I want to get the IPv6 address of a device via rtnetlink. Your article was already very helpful, but I still cannot find out which fields of struct ifaddrmsg I have to fill in when I pass it with a request so that I get the IP that I am looking for. I set ifa_family to AF_INET6 and ifa_index to the device that I am looking at. Nevertheless, when parsing the "answer" buffer, so to speak, I get NLMSG_ERROR and nothing else. And there is the problem that the program never does more than one iteration in while(1) but also never leaves it... well, that is a different problem, I guess. Still, I would like to know why you put the second break condition in there; is it not always true? You didn't even set nl_groups in your program?
Sorry for my bad English, but it has been a frustrating day full of debugging.

Re. the second break: usually the end of a returned message is indicated by NLMSG_DONE. But when monitoring routing table changes, this does not work. Since the example code in this article is shared between the examples, that second break is also part of the loop.

I have a question regarding receiving route updates from the kernel.
I have a process that waits for any routing table changes. It is able to get updates whenever a route is added or deleted. It gets around 52 bytes of data.

When I add a new route entry I get 52 bytes, but it fails to enter the
"for(;NLMSG_OK(nlp, nll);nlp=NLMSG_NEXT(nlp, nll))" loop of read_reply() as given in this document. Moreover, when I try to print "nlp->nlmsg_type", it is always RTM_NEWROUTE, even though I deleted a route entry in my previous operation.

What I want is...
1) Whenever a new entry is added, the read_reply() function should print the entry that was added.
2) Whenever an entry is deleted from the routing table, it should print the entry that was deleted, and nlp->nlmsg_type should be RTM_DELROUTE so that I know the netlink message I got is the result of a delete operation.

As far as I know, there isn't any RTNETLINK command to flush the routing cache. But after looking at the source code of the "ip" command suite, I found that it writes -1 to
/proc/sys/net/ipv4/route/flush to flush the routing cache.

Hello,
this article helps me understand how to implement a protocol.
But some questions still confuse me.

If an application just wants to send messages through a specified NIC (e.g., the node has more than one NIC, such as LAN, WLAN, etc.), how can the application set this selectively?
Is it possible to set up more than one NIC for sending/receiving at the same time, or should this be done in different threads?

>If an application just wants to send message through a specified NIC ( e.g. the node has more than one NIC, like LAN, WLAN etc), how can the application just set this selectivly ?

I assume that you are asking about sending IP packets over an interface. If that is the case, you must use INET type sockets to do this.

> Is it able to set up more than one NIC for sending/receiving at the same time, or it should be done in different threads ?

What interface a packet takes is usually decided by the routing table, depending on the destination address of the packet. But I think INET sockets also have a facility to send packets from a given interface (through sendmsg()).

Your code is almost the same except for the rta_len addition. If there is no problem here, also check whether you can add the same route entry that you are trying to add programmatically using the ip route add command. A frequent problem when adding gateways to routes is that the gateway must be reachable.