npd6 Design Document

So what must npd6 do in functional terms?

See Neighbor Solicitations.

EITHER respond to them directly

OR respond via the existing mechanisms.

Log activity.

Report status.

Receive Neighbor Solicitations

The daemon needs to receive incoming neighbor solicitations from a designated port(s). I currently have no idea if this is an easy hook to make from user-space or whether we need to hook much lower. Kernel module? iptables? This one is the main area for investigation!

Act upon Neighbor Solicitations

OK, so we receive an NS. What do we do about it?

Validate

First off, we validate it to a small degree. The likely scenario here is that in a conf file we have defined our static IPv6 prefix (in my scenario this will be 64 bits). We check that the target matches that. Likely in the conf we also want options to either explicitly include only defined addresses (ranges?) as valid (i.e. define valid suffixes) and/or exclude certain addresses/ranges?
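A minimal sketch of that check, in Python purely for brevity (not the daemon's own code). The function name `is_valid_target` and the `include`/`exclude` parameters are hypothetical stand-ins for whatever the conf file ends up offering.

```python
import ipaddress


def is_valid_target(target, prefix, include=None, exclude=None):
    """Return True if an NS for `target` should be answered.

    prefix  -- the configured static prefix, e.g. "2001:db8:1:2::/64"
    include -- optional list of ranges; if given, target must be in one
    exclude -- optional list of ranges that are never answered
    (include/exclude are hypothetical config options.)
    """
    addr = ipaddress.IPv6Address(target)
    if addr not in ipaddress.IPv6Network(prefix):
        return False
    if exclude and any(addr in ipaddress.IPv6Network(r) for r in exclude):
        return False
    if include:
        return any(addr in ipaddress.IPv6Network(r) for r in include)
    return True


if __name__ == "__main__":
    print(is_valid_target("2001:db8:1:2::5", "2001:db8:1:2::/64"))
    print(is_valid_target("2001:db8:1:3::5", "2001:db8:1:2::/64"))
```

Exclusion is checked before inclusion here, so an address listed in both is rejected; that ordering is a design choice, not something the plan above dictates.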

Action

Key decision to be made is in what manner we act to a valid NS. The obvious two approaches are to either respond directly ourselves or instead to use the existing kernel mechanism to, in effect, add the address via the “ip -6 neigh proxy add…” command.

Pros and cons? Currently I have no idea how easy it is to spoof out a response. If easy, do I want to bother, when there’s an existing kernel mechanism to do so? So using that existing mechanism seems attractive. We then, however, face the issue of how to manage the proxied addresses via that mechanism. Have we already added it? Should we check? Or blindly (re)add it? If we are to be stateful about it, how do we handle a restart situation, since I currently do not know how to get the kernel to cough out the current state of proxied addresses…

Needs a little investigation. 🙂
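For the "check, then add" variant, here is a rough sketch that shells out to iproute2 (Python for brevity). It assumes that `ip -6 neigh show proxy` lists the current proxy entries one per line with the address as the first field; that output format is an assumption to verify on the target system. The helper names are hypothetical.

```python
import ipaddress
import subprocess


def proxy_add_cmd(addr, dev):
    return ["ip", "-6", "neigh", "add", "proxy", addr, "dev", dev]


def proxy_show_cmd(dev):
    return ["ip", "-6", "neigh", "show", "proxy", "dev", dev]


def parse_proxy_list(text):
    """Collect proxied addresses from `ip neigh show proxy` output.

    Assumption: one entry per line, address in the first column.
    """
    found = set()
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        try:
            found.add(str(ipaddress.IPv6Address(fields[0])))
        except ValueError:
            pass  # not an address; ignore the line
    return found


def ensure_proxied(addr, dev, run=subprocess.run):
    """Add a proxy entry unless it is already listed. Returns True if added."""
    addr = str(ipaddress.IPv6Address(addr))  # normalise the spelling
    listed = run(proxy_show_cmd(dev), capture_output=True, text=True, check=True)
    if addr in parse_proxy_list(listed.stdout):
        return False
    run(proxy_add_cmd(addr, dev), check=True)
    return True
```

Asking the kernel each time keeps the daemon stateless, which would sidestep the restart question at the cost of an extra process spawn per solicitation.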

Logging and info

Pretty much goes without saying that we want our daemon to log meaningful data. Also optional debug logging.

And if we’re going down the route of using the existing kernel neighbor proxy mechanism, then we would also want a feature to signal the daemon and have it spit out the current state of neighbor proxying.

9 June

So my first steps here will be to dig around the kernel code currently handling NS. Get the feel of what’s going on there and see how it looks. Will update back here with impressions.

Update 1:

Just started having a read of the radvd code. I note that it appears you can open a socket to hook ICMP6 messages, which is highly promising if true! Since NSs are ICMP6 it would make a really neat, easy, user-spacey way to integrate with them. Have to look more closely at the code. Also need to think about whether you could do this in parallel with radvd doing it too. Still, at least it’s a line of investigation to follow.

14 June

Just a brief update: I put together a skeleton npd6 prototype/framework with which to investigate. Did the dull part, setting up an environment for parameters, simple logging, debug, etc. Can now start really looking at the interesting bits.

16 June

So spent a lot of time looking at how netlink and IPv6 biff along together in Linux. Turns out the answer is: not so well! Reading the radvd code had given me false optimism, as it does use netlink to hook router advertisements. However, the summary seems to be that netlink and IPv6 are only really fully developed for IPv6 ICMP routing messages, and hence don’t seem to let me extend things to e.g. neighbor solicitations as we need.

So out of that blind alley, and back to good ol’ raw sockets. I can now happily hook any and all IPv6 ICMP and read the message. I don’t actually DO anything useful with it yet, but that’s no big deal. The fact that I can reliably (?) receive IPv6 ICMP in my user-space daemon is fine so far.

Next? Extend the processing of those received messages: spot the ones I’m interested in and decode them into debug logs.
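As a sketch of that decode step (Python for brevity; `decode_ns` is a hypothetical name): a Neighbor Solicitation is ICMPv6 type 135, code 0, followed by 4 reserved bytes, the 16-byte target address, and then options in type/length/value form with the length counted in units of 8 bytes. The input is assumed to start at the ICMPv6 header, which is what an ICMPv6 raw socket hands over.

```python
import ipaddress

ND_NEIGHBOR_SOLICIT = 135
OPT_SOURCE_LLADDR = 1  # Source Link-Layer Address option


def decode_ns(icmp):
    """Decode a Neighbor Solicitation; return a dict, or None if not a valid NS."""
    if len(icmp) < 24 or icmp[0] != ND_NEIGHBOR_SOLICIT or icmp[1] != 0:
        return None
    target = ipaddress.IPv6Address(icmp[8:24])
    options = {}
    pos = 24
    while pos + 2 <= len(icmp):
        opt_type = icmp[pos]
        opt_len = icmp[pos + 1] * 8  # length field is in 8-byte units
        if opt_len == 0 or pos + opt_len > len(icmp):
            return None  # malformed option: discard the whole message
        options[opt_type] = icmp[pos + 2:pos + opt_len]
        pos += opt_len
    return {"target": target, "source_lladdr": options.get(OPT_SOURCE_LLADDR)}
```

A debug log line can then be built straight from the returned dict, e.g. the target address and the sender's MAC when the option is present.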

23 June

OK, this rocks! The radvd code is an excellent crash course in IPv6 ICMP packet handling. Learned a lot about how best to play the cmsg, iovec, msghdr chain-game. Gave myself a headache in the process, but it’s good stuff. So my prototype is almost doing something useful now: it receives all NSs sent to the gateway, and can compare their target against the user-defined prefix. If the target and prefix match up to the full prefix length it moves on and replies. The logic is very simple at this early stage: we don’t filter out, white/blacklist, etc. If the target matches the prefix, we’re going to respond. At the moment the bulk of NSs I receive on this device are actually for the interface’s address itself, so the box NAs them in return anyway. So for now, we’ll send an extra (dupe) NA for these. Obviously that would not be the long-term behaviour, but for now no harm: ICMP is stateless, and a dupe response is no big deal. Also, as a fun extra, it lets me benchmark my code by measuring the delay between the kernel-generated NA reply and my daemon reply!!

The NA I send back is almost right. Few fields and bits I need to understand a bit more, but this daemon is almost ready for alpha!

Anyone fancy testing it soon….? (I’ve a colleague lined up to test on his home network, but would love to see it outside of the Free network.)

24 June

A very difficult issue pops up. Code so far can trap any unicast ICMP arriving at the gateway, and respond with an NA. Great! It works. Then remember that the NS coming in to us can be either directed unicast (what I’ve tested with so far) or, actually more likely, multicast. OK, no biggie. But then we see that the multicast is directed, in that the bottom 24 bits of the destination are copied from the target address contained within the solicitation. So… from the socket level we need to pick up these “directed multicasts” (or solicited-node messages). And that’s where things get tricky. To pick up a multicast I need (from socket level) to first join the multicast group. BUT these NSs, remember, don’t come in on a fixed multicast group. So you cannot join the multicast group in advance. It’s chicken + egg. You only know what multicast group will be used AFTER you’ve received a message sent to it… I cannot believe this. But that’s how it seems right now. And it makes me fear whether the existing IPv6 socket layer can let this problem be overcome… Ow. This is ugly.

The solicited-node address facilitates efficient querying of network nodes during address resolution. IPv6 uses the Neighbor Solicitation message to perform address resolution. In IPv4, the ARP Request frame is sent to the MAC-level broadcast, disturbing all nodes on the network segment regardless of whether a node is running IPv4. For IPv6, instead of disturbing all IPv6 nodes on the local link by using the local-link scope all-nodes address, the solicited-node multicast address is used as the Neighbor Solicitation message destination.

The solicited-node multicast address consists of the prefix FF02::1:FF00:0/104 and the last 24 bits of the IPv6 address that is being resolved.

The following steps show an example of how the solicited-node address is handled for the node with the link-local IPv6 address of FE80::2AA:FF:FE28:9C5A, for which the corresponding solicited-node address is FF02::1:FF28:9C5A:

To resolve the FE80::2AA:FF:FE28:9C5A address to its link layer address, a node sends a Neighbor Solicitation message to the solicited-node address of FF02::1:FF28:9C5A.

The node using the address of FE80::2AA:FF:FE28:9C5A is listening for multicast traffic at the solicited-node address FF02::1:FF28:9C5A. For interfaces that correspond to a physical network adapter, it has registered the corresponding multicast address with the network adapter.

As shown in this example, by using the solicited-node multicast address, address resolution that commonly occurs on a link can occur without disturbing all network nodes. In fact, very few nodes are disturbed during address resolution. Because of the relationship between the network interface MAC address, the IPv6 interface ID, and the solicited-node address, in practice, the solicited-node address acts as a pseudo-unicast address for efficient address resolution.
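The mapping described above is easy to reproduce; here is a small sketch (Python for brevity, function names hypothetical) using the example address from the text. It also derives the Ethernet multicast MAC the frame would be sent to, which for IPv6 multicast is 33:33 followed by the low 32 bits of the group address.

```python
import ipaddress

SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node(addr):
    """Solicited-node multicast address: ff02::1:ff00:0/104 + low 24 bits."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)


def multicast_mac(group):
    """Ethernet destination for an IPv6 multicast group: 33:33 + low 32 bits."""
    low32 = ipaddress.IPv6Address(group).packed[-4:]
    return ":".join("%02x" % b for b in b"\x33\x33" + low32)


if __name__ == "__main__":
    group = solicited_node("fe80::2aa:ff:fe28:9c5a")
    print(group)                 # ff02::1:ff28:9c5a
    print(multicast_mac(group))  # 33:33:ff:28:9c:5a
```

This also shows why the daemon cannot simply join the groups up front: a proxy covering a whole /64 would have to answer for any of the 2^24 possible solicited-node groups.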

tcpdump tip

Here’s a useful one, to tcpdump only NAs and NSs:

tcpdump -v -i eth0 'ip6[40] == 135 or ip6[40] == 136'

Ugly, ugly…

So turns out that, to the very best of my ability to discover, IPv6 sockets do not let me pick up any and all ICMP sent to the interface(s). If it’s to a multicast address, I must join the group beforehand. Which I cannot do. I have to say it’s a pretty glaring omission, IMHO. It’s “glaring” not maybe in the general case, but given IPv6 uses “unique multicast” addresses for Neighbor Solicitations, to not provide an IP-socket-level mechanism for receiving them is pretty crap.

So I assume I’m going to have to instead receive on a simple Packet level socket, and build a packet filter on it.

Packet filters

The more I look at it, the more it seems that until the IPv6 socket API expands, I’m going down this path. Easy enough to create a null filter. Done. 2 mins work. 🙂 Now to get to grips with the BSD Packet Filter (BPF) VM language to actually write the filter. I know that, in general, BPF semantics for IPv6 are problematic… However, I should be OK here, as I am just interested in IPv6 ICMP – I think that should be a straightforward enough packet dissect.

Update, 1 July: BPF VM syntax. Wow. Kinda funky, but now almost makes sense. Hope to have a working ICMP6 packet filter soon!!
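For a flavour of what such a filter looks like, here is a sketch of a classic BPF program that accepts only Neighbor Solicitations (Python for brevity; this is not the filter npd6 actually ships). It assumes Ethernet framing and no IPv6 extension headers, so the ICMPv6 type sits at a fixed offset of 54. The small `run_filter` interpreter is a hypothetical helper that handles only the four opcodes used, so the program can be checked without root; `attach` is the part that needs a real packet socket.

```python
import ctypes
import socket
import struct

# Classic BPF opcodes, as defined in the Linux BPF headers.
LDH_ABS = 0x28  # BPF_LD | BPF_H | BPF_ABS: load 16 bits at fixed offset
LDB_ABS = 0x30  # BPF_LD | BPF_B | BPF_ABS: load 8 bits at fixed offset
JEQ_K = 0x15    # BPF_JMP | BPF_JEQ | BPF_K: compare with constant
RET_K = 0x06    # BPF_RET | BPF_K: return constant (0 = drop)

SO_ATTACH_FILTER = 26
ETH_P_IPV6 = 0x86DD

# (code, jump-if-true, jump-if-false, k); jumps are relative to the next insn.
NS_FILTER = [
    (LDH_ABS, 0, 0, 12),          # A = EtherType
    (JEQ_K, 0, 5, ETH_P_IPV6),    # not IPv6 -> drop
    (LDB_ABS, 0, 0, 20),          # A = IPv6 Next Header
    (JEQ_K, 0, 3, 58),            # not ICMPv6 -> drop
    (LDB_ABS, 0, 0, 54),          # A = ICMPv6 type
    (JEQ_K, 0, 1, 135),           # not Neighbor Solicitation -> drop
    (RET_K, 0, 0, 0xFFFF),        # accept, up to 65535 bytes
    (RET_K, 0, 0, 0),             # drop
]


def pack_filter(prog):
    """Serialise to an array of struct sock_filter (u16, u8, u8, u32)."""
    return b"".join(struct.pack("HBBI", *insn) for insn in prog)


def run_filter(prog, frame):
    """Tiny interpreter for the opcodes above; returns bytes to accept."""
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        if code in (LDH_ABS, LDB_ABS):
            size = 2 if code == LDH_ABS else 1
            if k + size > len(frame):
                return 0  # out-of-bounds load drops the packet
            acc = int.from_bytes(frame[k:k + size], "big")
        elif code == JEQ_K:
            pc += jt if acc == k else jf
        elif code == RET_K:
            return k


def attach(sock, prog):
    """Attach the program to a packet socket (needs root to create one)."""
    buf = ctypes.create_string_buffer(pack_filter(prog))
    fprog = struct.pack("HP", len(prog), ctypes.addressof(buf))  # sock_fprog
    sock.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, fprog)
```

Usage would be along the lines of creating `socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_IPV6))` and calling `attach(sock, NS_FILTER)` on it.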

3 July

Yes. Can now reliably capture ALL ICMP neighbor solicitations on my gateway box. Now working on restoring the functionality lost by not having the socket feed us ICMP. Since we now get raw packets, we must work a bit harder to find out who sent each one, on which interface, etc. The pktinfo mechanism is not available to us (well, the mechanism itself is, but the context makes the information it supplies not useful), so we must do that manually. So lots of casting away of IPv6 headers and so forth!
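The manual dissection amounts to walking fixed offsets; a sketch follows (Python for brevity, `split_frame` being a hypothetical name). It assumes an Ethernet frame carrying IPv6 with no extension headers, so the ICMPv6 message starts right after the 14-byte Ethernet header and the 40-byte IPv6 header.

```python
import ipaddress

ETH_HLEN = 14
IPV6_HLEN = 40


def split_frame(frame):
    """Pull sender details and the ICMPv6 payload out of a raw frame."""
    if len(frame) < ETH_HLEN + IPV6_HLEN:
        return None
    ip6 = frame[ETH_HLEN:ETH_HLEN + IPV6_HLEN]
    payload_len = int.from_bytes(ip6[4:6], "big")
    start = ETH_HLEN + IPV6_HLEN
    return {
        "src_mac": ":".join("%02x" % b for b in frame[6:12]),
        "src_ip": ipaddress.IPv6Address(ip6[8:24]),
        "dst_ip": ipaddress.IPv6Address(ip6[24:40]),
        # Neighbor Discovery messages are only valid with hop limit 255.
        "hop_limit": ip6[7],
        "icmp": frame[start:start + payload_len],
    }
```

The receiving interface is not in the frame itself; with a packet socket it comes back in the address returned alongside the data by `recvfrom`.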

Multicast…

During development, so far I’ve almost always had tcpdump running on my WAN interface while testing code. Today I was not running it and noted, much to my confusion, that the daemon saw NO multicast neighbor solicitations. Makes sense: tcpdump sets the interface to promiscuous mode, so we see everything. Since many of the NSs are multicast, if we haven’t signed up to the multicast group (as per above ad nauseam, we can’t…) we don’t see them at the packet filter level.

For now (and maybe for ever!) the compromise is that the interface has the ALLMULTI flag set on it. Setting PROMISC permanently would be a touch too ugly, but I think nominally receiving all multicast is acceptable. In these days of switched point-to-point Ethernet, all of this gets fairly academic anyway, as being in effect now a non-shared medium in so many instances the “extra” traffic handled by setting PROMISC is low-to-zero. But anyway, the Linux interface option of ALLMULTI is just fine.
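Setting the flag from code is a read-modify-write of the interface flags. A sketch under Linux assumptions (Python for brevity; `set_allmulti` and `with_allmulti` are hypothetical names, and the ioctl numbers and flag value are the Linux ones). From a shell, `ip link set dev eth0 allmulticast on` does the same job.

```python
import fcntl
import socket
import struct

SIOCGIFFLAGS = 0x8913  # get interface flags (Linux)
SIOCSIFFLAGS = 0x8914  # set interface flags (Linux)
IFF_ALLMULTI = 0x200   # receive all multicast packets


def with_allmulti(flags, enable=True):
    """Return `flags` with the ALLMULTI bit set or cleared."""
    return flags | IFF_ALLMULTI if enable else flags & ~IFF_ALLMULTI


def set_allmulti(ifname, enable=True):
    """Toggle ALLMULTI on an interface; setting flags needs root."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # struct ifreq begins with a 16-byte name followed by the short flags.
        request = struct.pack("16sH", ifname.encode(), 0)
        reply = fcntl.ioctl(sock, SIOCGIFFLAGS, request)
        flags = struct.unpack("16sH", reply)[1]
        request = struct.pack("16sH", ifname.encode(), with_allmulti(flags, enable))
        fcntl.ioctl(sock, SIOCSIFFLAGS, request)
```

Reading the flags first matters: writing only the ALLMULTI bit would clear UP, RUNNING and friends on the interface.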

In fact, when I found out about the ALLMULTI interface flag, I even wondered whether it would let my original design of having an ICMP6 socket open work after all… But no – even in full PROMISC mode the ICMP6-level socket does not receive multicast unless the socket has explicitly joined the specific group first.

RFC time

OK, so we can now kick out a Neighbor Advertisement in response to a Neighbor Solicitation received on the multicast group address. Cool.

Now to pay attention to the nature of it. As per RFC 2461:

A node sends a Neighbor Advertisement in response to a valid Neighbor
Solicitation targeting one of the node's assigned addresses. The
Target Address of the advertisement is copied from the Target Address
of the solicitation. If the solicitation's IP Destination Address is
not a multicast address, the Target Link-Layer Address option MAY be
omitted; the neighboring node's cached value must already be current
in order for the solicitation to have been received. If the
solicitation's IP Destination Address is a multicast address, the
Target Link-Layer option MUST be included in the advertisement.
Furthermore, if the node is a router, it MUST set the Router flag to
one; otherwise it MUST set the flag to zero.

Soooooooooo, given all of that, the simplest logic will be to set the Target Link-Layer Address option for all our NAs.
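Put together, an NA that always carries the Target Link-Layer Address option is a 32-byte message. A sketch (Python for brevity; `build_na` is a hypothetical name and the flag values are left to the caller rather than asserted as correct for a proxy). The checksum is the usual ICMPv6 one over an IPv6 pseudo-header; when sending through an ICMPv6 raw socket on Linux the kernel fills it in, so computing it by hand matters mainly when building whole packets.

```python
import ipaddress
import struct

ND_NEIGHBOR_ADVERT = 136
OPT_TARGET_LLADDR = 2
IPPROTO_ICMPV6 = 58


def checksum(data):
    """16-bit one's-complement Internet checksum."""
    if len(data) % 2:
        data += b"\0"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src, dst, length):
    """IPv6 pseudo-header: src, dst, upper-layer length, next header."""
    return (ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed
            + struct.pack("!I3xB", length, IPPROTO_ICMPV6))


def build_na(src, dst, target, mac, router, solicited, override):
    """Build an NA with the Target Link-Layer Address option always set."""
    flags = (bool(router) << 31) | (bool(solicited) << 30) | (bool(override) << 29)
    msg = struct.pack("!BBHI", ND_NEIGHBOR_ADVERT, 0, 0, flags)
    msg += ipaddress.IPv6Address(target).packed   # copied from the NS
    msg += struct.pack("!BB6s", OPT_TARGET_LLADDR, 1, mac)  # length 1 = 8 bytes
    csum = checksum(pseudo_header(src, dst, len(msg)) + msg)
    return msg[:2] + struct.pack("!H", csum) + msg[4:]
```

Per the RFC text quoted above, `router` should be true when the replying node is a router, which is the case for a gateway box.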

11 August 2011

Haven’t updated this for a while – as the project has really taken off and is now living in its own right!!! My post at http://www.ipsidixit.net/2011/08/04/npd6/ indicates where it’s at and how to obtain it.