23.1. Programming using C-API

Following contents of this section is contributed by John Wenker, Sr. Software Engineer Performance Technologies San Diego, CA USA http://www.pt.com/.

This section describes how to write IPv6 client-server applications under the Linux operating system. First thing's first, and credit must be given where it is due. The information contained in this section is derived from Chapters 2 through 4 of IPv6 Network Programming by Jun-ichiro itojun Hagino (ISBN 1-55558-318-0). The reader is encouraged to consult that book for more detailed information. It describes how to convert IPv4 applications to be IPv6 compatible in a protocol-independent way, and describes some of the common problems encountered during the conversion along with suggested solutions. At the time of this writing, this is the only book of which the author is aware that specifically addresses how to program IPv6 applications [since writing this section, the author has also become aware of the Porting applications to IPv6 HowTo by Eva M. Castro at http://jungla.dit.upm.es/~ecastro/IPv6-web/ipv6.html]. Unfortunately, of the almost 360 pages in the book, maybe 60 are actually useful (the chapters mentioned). Nevertheless, without the guidance of that book, the author would have been unable to perform his job duties or compose this HowTo. While most (but certainly not all) of the information in the Hagino book is available via the Linux 'man' pages, application programmers will save a significant amount of time and frustration by reading the indicated chapters of the book rather than searching through the 'man' pages and online documentation.

Other than the Hagino book, any other information presented in this HowTo was obtained through trial and error. Some items or explanations may not be entirely “correct” in the grand IPv6 scheme, but seem to work in practical application.

The discussion that follows assumes the reader is already experienced with the traditional TCP/IP socket API. For more information on traditional socket programming, the Internetworking with TCP/IP series of textbooks by Comer & Stevens is hard to beat, specifically Volume III: Client-Server Programming and Applications, Linux/POSIX Sockets Version (ISBN 0-13-032071-4). This HowTo also assumes that the reader has had at least a bare basic introduction to IPv6 and in particular the addressing scheme for network addresses (see Section 2.3).

23.1.1. Address Structures

This section provides a brief overview of the structures provided in the socket API to represent network addresses (or more specifically transport endpoints) when using the Internet protocols in a client-server application.

23.1.1.1. IPv4 sockaddr_in

In IPv4, network addresses are 32 bits long and define a network node. Addresses are written in dotted decimal notation, such as 192.0.2.1, where each number represents eight bits of the address. Such an IPv4 address is represented by the struct sockaddr_in data type, which is defined in <netinet/in.h>.

The sin_family component indicates the address family. For IPv4 addresses, this is always set to AF_INET. The sin_addr field contains the 32-bit network address (in network byte order). Finally, the sin_port component represents the transport layer port number (in network byte order). Readers should already be familiar with this structure, as this is the standard IPv4 address structure.

23.1.1.2. IPv6 sockaddr_in6

The biggest feature of IPv6 is its increased address space. Instead of 32-bit network addresses, IPv6 allots 128 bits to an address. Addresses are written in colon-hex notation of the form fe80::2c0:8cff:fe01:2345, where each hex number separated by colons represents 16 bits of the address. Two consecutive colons indicate a string of consecutive zeros for brevity, and at most only one double-colon may appear in the address. IPv6 addresses are represented by the struct sockaddr_in6 data type, also defined in <netinet/in.h>.

The sin6_family, sin6_port, and sin6_addr components of the structure have the same meaning as the corresponding fields in the sockaddr_in structure. However, the sin6_family member is set to AF_INET6 for IPv6 addresses, and the sin6_addr field holds a 128-bit address instead of only 32 bits.

The sin6_flowinfo field is used for flow control, but is not yet standardized and can be ignored.

The sin6_scope_id field has an odd use, and it seems (at least to this naïve author) that the IPv6 designers took a huge step backwards when devising this. Apparently, 128-bit IPv6 network addresses are not unique. For example, it is possible to have two hosts, on separate networks, with the same link-local address (see Figure 1). In order to pass information to a specific host, more than just the network address is required; the scope identifier must also be specified. In Linux, the network interface name is used for the scope identifier (e.g. “eth0”) [be warned that the scope identifier is implementation dependent!]. Use the ifconfig(1M) command to display a list of active network interfaces.

A colon-hex network address can be augmented with the scope identifier to produce a "scoped address”. The percent sign ('%') is used to delimit the network address from the scope identifier. For example, fe80::1%eth0 is a scoped IPv6 address where fe80::1 represents the 128-bit network address and eth0 is the network interface (i.e. the scope identifier). Thus, if a host resides on two networks, such as Host B in example below, the user now has to know which path to take in order to get to a particular host. In Figure 1, Host B addresses Host A using the scoped address fe80::1%eth0, while Host C is addressed with fe80::1%eth1.

Getting back to the sockaddr_in6 structure, its sin6_scope_id field contains the index of the network interface on which a host may be found. Server applications will have this field set automatically by the socket API when they accept a connection or receive a datagram. For client applications, if a scoped address is passed as the node parameter to getaddrinfo(3) (described later in this HowTo), then the sin6_scope_id field will be filled in correctly by the system upon return from the function; if a scoped address is not supplied, then the sin6_scope_id field must be explicitly set by the client software prior to attempting to communicate with the remote server. The if_nametoindex(3) function is used to translate a network interface name into its corresponding index. It is declared in <net/if.h>.

23.1.1.3. Generic Addresses

As any programmer familiar with the traditional TCP/IP socket API knows, several socket functions deal with "generic" pointers. For example, a pointer to a generic struct sockaddr data type is passed as a parameter to some socket functions (such as connect(2) or bind(2)) rather than a pointer to a specific address type. Be careful... the sockaddr_in6 structure is larger than the generic sockaddr structure! Thus, if your program receives a generic address whose actual type is unknown (e.g. it could be an IPv4 address structure or an IPv6 address structure), you must supply sufficient storage to hold the entire address. The struct sockaddr_storage data type is defined in <bits/socket.h> for this purpose [do not #include this file directly within an application; use <sys/socket.h> as usual, and <bits/socket.h> will be implicitly included].

For example, consider the recvfrom(2) system call, which is used to receive a message from a remote peer. Its function prototype is:

The from parameter points to a generic sockaddr structure. If data can be received from an IPv6 peer on the socket referenced by s, then from should point to a data type of struct sockaddr_storage, as in the following dummy example:

As seen in the above example, ss (a struct sockaddr_storage data object) is used to receive the peer address information, but it's address is typecast to a generic struct sockaddr* pointer in the call to recvfrom(2).

23.1.2. Lookup Functions

Traditionally, hostname and service name resolution were performed by functions such as gethostbyname(3) and getservbyname(3). These traditional lookup functions are still available, but they are not forward compatible to IPv6. Instead, the IPv6 socket API provides new lookup functions that consolidate the functionality of several traditional functions. These new lookup functions are also backward compatible with IPv4, so a programmer can use the same translation algorithm in an application for both the IPv4 and IPv6 protocols. This is an important feature, because obviously a global IPv6 infrastructure isn't going to be put in place overnight. Thus, during the transition period from IPv4 to IPv6, client-server applications should be designed with the flexibility to handle both protocols simultaneously. The example programs at the end of this chapter do just that.

The primary lookup function in the new socket API is getaddrinfo(3). Its prototype is as follows.

The node parameter is a pointer to the hostname or IP address being translated. The referenced string can be a hostname, IPv4 dotted decimal address, or IPv6 colon-hex address (possibly scoped). The service parameter is a pointer to the transport layer's service name or port number. It can be specified as a name found in /etc/services or a decimal number. getaddrinfo(3) resolves the host/service combination and returns a list of address records; a pointer to the list is placed in the location pointed at by res. For example, suppose a host can be identified by both an IPv4 and IPv6 address, and that the indicated service has both a TCP entry and UDP entry in /etc/services. In such a scenario, it is not inconceivable that four address records are returned; one for TCP/IPv6, one for UDP/IPv6, one for TCP/IPv4, and one for UDP/IPv4.

The definition for struct addrinfo is found in <netdb.h> (as is the declaration for getaddrinfo(3) and the other functions described in this section). The structure has the following format:

Consult the 'man' page for getaddrinfo(3) for detailed information about the various fields; this HowTo only describes a subset of them, and only to the extent necessary for normal IPv6 programming.

The ai_family, ai_socktype, and ai_protocol fields have the exact same meaning as the parameters to the socket(2) system call. The ai_family field indicates the protocol family (not the address family) associated with the record, and will be PF_INET6 for IPv6 or PF_INET for IPv4. The ai_socktype parameter indicates the type of socket to which the record corresponds; SOCK_STREAM for a reliable connection-oriented byte-stream or SOCK_DGRAM for connectionless communication. The ai_protocol field specifies the underlying transport protocol for the record.

The ai_addr field points to a generic struct sockaddr object. Depending on the value in the ai_family field, it will point to either a struct sockaddr_in (PF_INET) or a struct sockaddr_in6 (PF_INET6). The ai_addrlen field contains the size of the object pointed at by the ai_addr field.

As mentioned, getaddrinfo(3) returns a list of address records. The ai_next field points to the next record in the list.

The hints parameter to getaddrinfo(3) is also of type struct addrinfo and acts as a filter for the address records returned in res. If hints is NULL, all matching records are returned; but if hints is non-NULL, the referenced structure gives "hints" to getaddrinfo(3) about which records to return. Only the ai_flags, ai_family, ai_socktype, and ai_protocol fields are significant in the hints structure, and all other fields should be set to zero.

Programs can use hints->ai_family to specify the protocol family. For example, if it is set to PF_INET6, then only IPv6 address records are returned. Likewise, setting hints->ai_family to PF_INET results in only IPv4 address records being returned. If an application wants both IPv4 and IPv6 records, the field should be set to PF_UNSPEC.

The hints->socktype field can be set to SOCK_STREAM to return only records that correspond to connection-oriented byte streams, SOCK_DGRAM to return only records corresponding to connectionless communication, or 0 to return both.

For the Internet protocols, there is only one protocol associated with connection-oriented sockets (TCP) and one protocol associated with connectionless sockets (UDP), so setting hints->ai_socktype to SOCK_STREAM or SOCK_DGRAM is the same as saying, "Give me only TCP records," or "Give me only UDP records," respectively. With that in mind, the hints->ai_protocol field isn't really that important with the Internet protocols, and pretty much mirrors the hints->ai_socktype field. Nevertheless, hints->ai_protocol can be set to IPPROTO_TCP to return only TCP records, IPPROTO_UDP to return only UDP records, or 0 for both.

The node or service parameter to gethostbyname(3) can be NULL, but not both. If node is NULL, then the ai_flags field of the hints parameter specifies how the network address in a returned record is set (i.e. the sin_addr or sin6_addr field of the object pointed at by the ai_addr component in a returned record). If the AI_PASSIVE flag is set in hints, then the returned network addresses are left unresolved (all zeros). This is how server applications would use getaddrinfo(3). If the flag is not set, then the address is set to the local loopback address (::1 for IPv6 or 127.0.0.1 for IPv4). This is one way a client application can specify that the target server is running on the same machine as the client. If the service parameter is NULL, the port number in the returned address records remains unresolved.

The getaddrinfo(3) function returns zero on success, or an error code. In the case of an error, the gai_strerror(3) function is used to obtain a character pointer to an error message corresponding to the error code, just like strerror(3) does in the standard 'C' library.

Once the address list is no longer needed, it must be freed by the application. This is done with the freeaddrinfo(3) function.

The last function that will be mentioned in this section is getnameinfo(3). This function is the inverse of getaddrinfo(3); it is used to create a string representation of the hostname and service from a generic struct sockaddr data object. It has the following prototype.

The sa parameter points to the address structure in question, and salen contains its size. The host parameter points to a buffer where the null-terminated hostname string is placed, and the hostlen parameter is the size of that buffer. If there is no hostname that corresponds to the address, then the network address (dotted decimal or colon-hex) is placed in host. Likewise, the serv parameter points to a buffer where the null-terminated service name string (or port number) is placed, and the servlen parameter is the size of that buffer. The flags parameter modifies the function's behavior; in particular, the NI_NUMERICHOST flag indicates that the converted hostname should always be formatted in numeric form (i.e. dotted decimal or colon-hex), and the NI_NUMERICSERV flag indicates that the converted service should always be in numeric form (i.e. the port number).

The symbols NI_MAXHOST and NI_MAXSERV are available to applications and represent the maximum size of any converted hostname or service name, respectively. Use these when declaring output buffers for getnameinfo(3).

23.1.3. Quirks Encountered

Before jumping into the programming examples, there are several quirks in IPv6 of which the reader should be aware. The more significant ones (in addition to the non-uniqueness of IPv6 network addresses already discussed) are described in the paragraphs below.

23.1.3.1. IPv4 Mapped Addresses

For security reasons that this author won't pretend to understand, "IPv4 mapped addresses" should not be allowed in IPv6-capable server applications. To put it in terms that everyone can understand, this simply means that a server should not accept IPv4 traffic on an IPv6 socket (an otherwise legal operation). An IPv4 mapped address is a mixed-format address of the form:

::ffff:192.0.2.1

where the first portion is in IPv6 colon-hex format and the last portion is in IPv4 dotted decimal notation. The dotted decimal IPv4 address is the actual network address, but it is being mapped into an IPv6 compatible format.

To prevent IPv4 mapped addresses from being accepted on an IPv6 socket, server applications must explicitly set the IPV6_V6ONLY socket option on all IPv6 sockets created [the Hagino book implies that this is only a concern with server applications. However, it has been observed during testing that if a client application uses an IPv4 mapped address to specify the target server, and the target server has IPv4 mapped addresses disabled, the connection still completes regardless. On the server side, the connection endpoint is an IPv4 socket as desired; but on the client side, the connection endpoint is an IPv6 socket. Setting the IPV6_V6ONLY socket option on the client side as well as the server side prevents any connection from being established at all.]. There's only one problem. Apparently, IPV6_V6ONLY isn't defined on all systems [or at least it wasn't in 2005 when the Hagino book was written]. The server example at the end of this chapter provides a method for handling this problem.

If IPv4 traffic cannot be handled on IPv6 sockets, then that implies that server applications must open both an IPv4 and IPv6 socket for a particular network service if it wants to handle requests from either protocol. This goes back to the flexibility issue mentioned earlier. If getaddrinfo(3) returns multiple address records, then server applications should traverse the list and open a passive socket for each address provided.

23.1.3.2. Cannot Specify the Scope Identifier in /etc/hosts

It is possible to assign a hostname to an IPv6 network address in /etc/hosts. For example, the following is an excerpt from the /etc/hosts file on the author's development system.

The "localhost" and "pt141" hostnames can be translated to either an IPv4 or IPv6 network address. So, for example, if "pt141" is passed as the node parameter to getaddrinfo(3), the function returns both an IPv4 and IPv6 address record for the host (assuming the behavior hasn't been modified by the hints parameter). Unfortunately, a scoped address cannot be used in /etc/hosts. Doing so results in getaddrinfo(3) returning only the IPv4 record.

23.1.3.3. Client & Server Residing on the Same Machine

Suppose a machine has the IPv4 address 192.0.2.1. A client application running on that machine can connect to a server application on the same machine by using either the local loopback address (127.0.0.1) or the network address (192.0.2.1) as the target server. Much to this author's surprise (and dismay), it turns out that an IPv6 client application cannot connect to a server application on the same machine if it uses the network address of that machine as the target; it must use the local loopback address (::1).

23.1.4. Putting It All Together (A Client-Server Programming Example)

Now it's time to put everything discussed thus far together into a sample client-server application. The remainder of this section is devoted to a remote time-of-day application (the 'daytime' Internet service) [I noticed that Ms. Castro used a 'daytime' example in her Porting applications to IPv6 HowTo. For the record, the source code presented here is original, developed from scratch, and any similarity between it and any other publicly available 'daytime' example is purely coincidental.]. The source code presented in this section was developed and tested on a RedHat Linux release using the 2.6 kernel (2.6.9 to be specific). Readers may use the source code freely, so long as proper credit is attributed; but of course the standard disclaimer must be given first:

Although the sample source code is believed to be free of errors, the author makes no guarantees as to its reliability, especially considering that some error paths were intentionally omitted for brevity. Use it at your own risk!

When you get right down to it, there really aren't that many differences between IPv4 and IPv6 applications. The trick is to code IPv6 applications in a protocol-independent manner, such that they can handle both IPv4 and IPv6 simultaneously and transparently. This sample application does just that. The only protocol-dependent code in the example occurs when printing network addresses in verbose mode; but only after the ai_family field in the addrinfo structure has been checked, so the programs know exactly what type of address they're handling at the time.

23.1.4.1. 'Daytime' Server Code

The server code is found in file tod6d.c (time-of-day IPv6 daemon). Once built, the server may be started using the following command syntax (assuming tod6d is the executable file):

tod6d [-v] [service]

ARGUMENTS:

service

The service (or well-known port) on which to listen. Default is "daytime".

OPTIONS:

-v

Turn on verbose mode.

The server handles both TCP and UDP requests on the network. The server source code contained in tod6d.c follows:

23.1.4.2. 'Daytime' TCP Client Code

The TCP client code is found in file tod6tc.c (time-of-day IPv6 TCP client). Once built, the TCP client may be started using the following command syntax (assuming tod6tc is the executable file):

tod6tc [-v] [-s scope_id] [host [service]]

ARGUMENTS:

host

The hostname or IP address (dotted decimal or colon-hex) of the remote host providing the service. Default is "localhost".

service

The TCP service (or well-known port number) to which a connection attempt is made. Default is "daytime".

OPTIONS:

-s

This option is only meaningful for IPv6 addresses, and is used to set the scope identifier (i.e. the network interface on which to establish the connection). Default is "eth0". If host is a scoped address, this option is ignored.

23.1.4.3. 'Daytime' UDP Client Code

The UDP client code is found in file tod6uc.c (time-of-day IPv6 UDP client). It is almost an exact duplicate of the TCP client (and in fact was derived from it), but is included in this HowTo for completeness. Once built, the UDP client may be started using the following command syntax (assuming tod6uc is the executable file):

tod6uc [-v] [-s scope_id] [host [service]]

ARGUMENTS:

host

The hostname or IP address (dotted decimal or colon-hex) of the remote host providing the service. Default is "localhost".

service

The UDP service (or well-known port number) to which datagrams are sent. Default is "daytime".

OPTIONS:

-s

This option is only meaningful for IPv6 addresses, and is used to set the scope identifier (i.e. the network interface on which to exchange datagrams). Default is "eth0". If host is a scoped address, this option is ignored.