I noticed NetBSD now has an IP_PKTINFO socket option and code
to support delivering the struct in_pktinfo information to
recvmsg(), but lacks the code to process this option on the
sending side of the protocol. It is clear that the purpose
of IP_PKTINFO is to provide a way for UDP, and datagram protocols
generally, to implement the requirement in RFC 1123 section 2.3
that UDP service applications return responses with the IP source
address set to the "specific destination address" of the corresponding
request (the ipi_spec_dst member in the Linux version of the structure
is named for that so it is hard to miss the connection with this bit
of standardese). The IP_PKTINFO option is hence of no use for its
purpose without sendmsg() support for the control message struct
in_pktinfo, but on NetBSD including this (or anything) in the
sendmsg() msg_control buffer for a UDP socket results in sendmsg()
returning an error.
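
For concreteness, here is a minimal sketch of what the sending side looks like from an application, assuming the NetBSD layout of struct in_pktinfo (ipi_addr and ipi_ifindex only). The function names set_pktinfo_source() and send_from() are made up for illustration. Note that Linux takes the source address from ipi_spec_dst rather than ipi_addr, so this exact code would not select the source there.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc: expose struct in_pktinfo; harmless elsewhere */
#endif
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: point msg at cbuf and store a single IP_PKTINFO
 * control message in it asking for src as the source address.
 * Returns 0 on success, -1 if cbuf is too small.
 */
int
set_pktinfo_source(struct msghdr *msg, void *cbuf, size_t cbuflen,
    struct in_addr src)
{
	struct in_pktinfo pi;
	struct cmsghdr *cm;

	if (cbuflen < CMSG_SPACE(sizeof(pi)))
		return -1;
	memset(cbuf, 0, cbuflen);
	msg->msg_control = cbuf;
	msg->msg_controllen = CMSG_SPACE(sizeof(pi));

	cm = CMSG_FIRSTHDR(msg);
	assert(cm != NULL);
	cm->cmsg_level = IPPROTO_IP;
	cm->cmsg_type = IP_PKTINFO;
	cm->cmsg_len = CMSG_LEN(sizeof(pi));

	memset(&pi, 0, sizeof(pi));
	pi.ipi_addr = src;	/* assumption: ipi_addr carries the source */
	memcpy(CMSG_DATA(cm), &pi, sizeof(pi));
	return 0;
}

/* Hypothetical helper: send one datagram to dst, requesting source src. */
ssize_t
send_from(int fd, const void *buf, size_t len,
    const struct sockaddr_in *dst, struct in_addr src)
{
	/* The union keeps the control buffer aligned for struct cmsghdr. */
	union { struct cmsghdr align; char buf[64]; } ctl;
	struct iovec iov;
	struct msghdr msg;

	iov.iov_base = (void *)buf;
	iov.iov_len = len;
	memset(&msg, 0, sizeof(msg));
	msg.msg_name = (void *)dst;
	msg.msg_namelen = sizeof(*dst);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	if (set_pktinfo_source(&msg, ctl.buf, sizeof(ctl.buf), src) == -1)
		return -1;
	return sendmsg(fd, &msg, 0);
}
```

On a system that implements the sending side, send_from() should produce a packet whose IP source address is src; on NetBSD as described above, the sendmsg() call would instead fail.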

This leads me to believe that NetBSD's IP_PKTINFO option, as it
stands, is probably of negative utility. The information delivered by it
to recvmsg() is already available on NetBSD with other options, but
applications seeing the existence of the IP_PKTINFO option would
reasonably expect that to mean that the sendmsg() side of this works
as well, as it does on the other systems which implement this. It
would be best if the sending support for the option were completed,
but if not it would probably be better if the option were removed.

This does raise the question of how sendmsg() control message
options should be implemented in general, since right now NetBSD
has no support for this. I believe the general intent of this
should be to allow a simple UDP (or other datagram) service to
be implemented by the server opening its listening sockets and
binding them to the local port, then setting the options on the
sockets to request relevant recvmsg() control information, and then
operating by taking the msg_name and msg_control blobs delivered
by recvmsg() along with a request and giving them unchanged to
the sendmsg() call which uses the information to build an
appropriate response. The service implementation shouldn't
necessarily have to parse the contents of the msg_control
blob; just passing it to sendmsg() should cause the right thing
to happen. That means the option numbers should be common
between recvmsg() and sendmsg(), and the form of the data
sendmsg() expects to see for each option should be that which
was written by recvmsg(). Also, sendmsg() should parse the control
message buffer for options relevant for the construction of the
outbound packet (IP_PKTINFO matters, as might IP_RECVRETOPTS
and IP_TOS if someone actually wanted those) but silently
ignore anything else. The original Berkeley code sort of
did that by silently dropping any msg_control options (it
used none of the options) but NetBSD at some point instead
made unknown sendmsg() control options an error, which is probably
less desirable behaviour.
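
A rough sketch of both halves of this, under stated assumptions: echo_once() is a made-up name for the application side, handing the recvmsg() blobs straight back to sendmsg(); find_pktinfo() is a made-up userland model of the tolerant scan that sendmsg() would do in the kernel. Treating a wrongly sized IP_PKTINFO as unknown, rather than as an error, is my assumption and not something settled above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc: expose struct in_pktinfo; harmless elsewhere */
#endif
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>

/*
 * Application side: receive one request and answer it with the same
 * payload, passing msg_name and msg_control back unchanged. The socket
 * is assumed to have had its recvmsg() options (e.g. IP_PKTINFO) set.
 */
ssize_t
echo_once(int fd)
{
	char data[1500];
	union { struct cmsghdr align; char buf[256]; } ctl;
	struct sockaddr_storage peer;
	struct iovec iov;
	struct msghdr msg;
	ssize_t n;

	iov.iov_base = data;
	iov.iov_len = sizeof(data);
	memset(&msg, 0, sizeof(msg));
	msg.msg_name = &peer;
	msg.msg_namelen = sizeof(peer);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	n = recvmsg(fd, &msg, 0);
	if (n == -1)
		return -1;
	/* recvmsg() already updated msg_namelen and msg_controllen. */
	iov.iov_len = (size_t)n;
	return sendmsg(fd, &msg, 0);
}

/*
 * Model of the sendmsg() side: walk the control buffer, pick out
 * IP_PKTINFO and silently skip anything else. Returns 1 if found.
 */
int
find_pktinfo(struct msghdr *msg, struct in_pktinfo *out)
{
	struct cmsghdr *cm;
	int found = 0;

	assert(msg != NULL && out != NULL);
	for (cm = CMSG_FIRSTHDR(msg); cm != NULL; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level != IPPROTO_IP || cm->cmsg_type != IP_PKTINFO)
			continue;	/* not relevant: ignore, do not fail */
		if (cm->cmsg_len != CMSG_LEN(sizeof(*out)))
			continue;	/* assumption: bad size is ignored too */
		memcpy(out, CMSG_DATA(cm), sizeof(*out));
		found = 1;
	}
	return found;
}
```

The point of echo_once() is that the service never looks inside the control buffer; all the interpretation lives in the scan that find_pktinfo() models.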

I also think the omission of the Linux ipi_spec_dst from
struct in_pktinfo is fine (Windows doesn't have it), but it
leads to the following algorithm for processing an IP_PKTINFO
option for sending to get the result RFC 1123 wants:
1. If ipi_addr is the local address of some interface in the
   box then use it as the source address in the packet being
   constructed.
2. Otherwise, if ipi_ifindex is non-zero, choose the source
   address of the packet as you would if the packet were being
   sent out this interface (if the destination address is a
   multicast or undirected broadcast address the packet should
   actually be sent out this interface in any case, so skipping
   to 3. will work too; otherwise the actual outgoing interface
   may be different).
3. Otherwise, choose the source address of the packet as it
   would be if the option weren't present.
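
The three steps above can be sketched as a small pure function. Everything here is a simplified model for illustration, not kernel code: struct ifaddr_entry, struct pktinfo_req and select_source() are hypothetical names, addresses are plain 32-bit values, and step 2 is reduced to "first address configured on that interface", whereas a real implementation would also take the destination into account.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one configured interface address. */
struct ifaddr_entry {
	unsigned int ifindex;	/* interface index, never 0 */
	uint32_t addr;		/* a local IPv4 address on that interface */
};

/* Hypothetical stand-in for the two members of struct in_pktinfo. */
struct pktinfo_req {
	uint32_t ipi_addr;
	unsigned int ipi_ifindex;
};

/*
 * Return the source address to use for an outgoing packet.
 * default_src is whatever would be chosen without the option.
 */
uint32_t
select_source(const struct pktinfo_req *pi, const struct ifaddr_entry *tab,
    size_t n, uint32_t default_src)
{
	size_t i;

	assert(pi != NULL && (tab != NULL || n == 0));

	/* Step 1: ipi_addr is a local address of some interface. */
	if (pi->ipi_addr != 0)
		for (i = 0; i < n; i++)
			if (tab[i].addr == pi->ipi_addr)
				return pi->ipi_addr;

	/* Step 2: an interface was named; use an address from it
	 * (simplification: the first one found in the table). */
	if (pi->ipi_ifindex != 0)
		for (i = 0; i < n; i++)
			if (tab[i].ifindex == pi->ipi_ifindex)
				return tab[i].addr;

	/* Step 3: behave as if the option were not present. */
	return default_src;
}
```

With a request that arrived on a unicast address, step 1 returns that address; with a request that arrived on a multicast or broadcast address, ipi_addr matches nothing local and step 2 falls back to the receiving interface.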

If this is made to work then the built-in inetd datagram services
should probably be changed from recvfrom()/sendto() to
recvmsg()/sendmsg() since they all sometimes do unexpected
things on multihomed hosts.

I also think the new IP_RECVPKTINFO option provides no value
and should be removed. The information it provides is already
available from recvmsg(), and it is of no particular use in
constructing an appropriate response packet.

Dennis Ferguson