It’s useful to be able to set up miniature networks on a Linux machine, for development, testing or just for fun. Here we use veth devices and network namespaces to create a small virtual network, connected together with an Open vSwitch instance. I’m using a Raspberry Pi 3 for this, it’s less inconvenient when it goes wrong, but I don’t think anything is Pi specific (and I certainly wouldn’t recommend a Pi for serious routing applications).

A veth device pair is a virtual ethernet cable, packets sent on one end come out the other (and vice versa of course):

I could assign an IP to both ends and try to send traffic through the link, but since the system knows about both ends, the traffic would get sent directly to the destination interface. Instead, I need to hide one end in a network namespace, each namespace has a set of interfaces, routing tables etc. that are private to that namespace. Initially everything is in the global namespace and we can create a new namespace, which are often named after colours, with the ip command:

Note that veth0 is in state LOWERLAYERDOWN because veth1 is now DOWN (as is the local interface in the namespace). We can now assign addresses to veth0 and veth1 and make sure all the interfaces are up:

That’s all we need for the most basic setup. Now we’ll add a second namespace and connect everything together with a switch – we could use a normal Linux bridge for this, but it’s more fun to use Open vSwitch and later use some very basic Openflow commands to set up a learning switch.

For a more complicated setup it’s usually a good idea to enable IP forwarding, so while we remember:

but sadly the source address is still in the 10.0.0.0 subnet and it’s not surprising that the Google DNS server isn’t responding.

Now, part of this exercise is to find out about Open vSwitch and its capabilities and I would hope that they would include setting up simple NAT translation, but I have no idea how to do that right now, so we’ll just use IP tables, so set up NAT and make sure forwarding is enabled:

Before when we created the OVS bridge, it started in “Normal” mode, with a single flow rule that sends every incoming packet out of every interface (except the one that it came in on), so the bridge is acting like a hub. Setting “fail-mode=secure” means there are no default rules so all packets are dropped.

First, if we have been playing, it’s a good idea to clear the rule table:

ovs-ofctl del-flows ovsbr0

Now set up the learning rules. The idea is that when a packet comes in from a particular MAC address, the switch remembers which interface the packet arrived on, so when it wants to send a packet to that address, it can just send it on the interface recorded earlier. We can do a similar thing with the local interface so we don’t need to configure the rules to handle whatever the local MAC address is (maybe there is a better way to handle the local interface – comments welcome).

The first rule says that when a packet originating locally is received, ie. that is being sent from a local process, add a rule (to table 10) that says that when an incoming packet is received, addressed to same MAC address, put the value 0xFFFF in the lower 16 bits of register 0. The second is the same but for packets received from the other interfaces in the switch, add a rule that puts the interface number in register 0. Having added a rule, processing continues with table 1.

The first rule sends packets with a broadcast ethernet address directly through to table 2, the second rule goes through table 10 first – the idea being that if the packet is being sent to a known MAC address, table 10 will put the number of the interface in register 0, or 0xffff if it’s the local MAC address, or 0 if the interface hasn’t been learned yet.

Finally table 2 just sends packets off to the right place using the register 0 values:

Let’s now have a look at getting it all working with IPv6. Changing the address swapping code is fairly straightforward, and for some extra points we’ll add a facility for printing out the source and destination addresses of each packet forwarded, the correct function for doing this is now inet_ntop, which works for both v4 and v6 addresses.

This address is also the link-local address of my Wifi interface, but there is no ambiguity as we must specify which interface to use.

We can also add a private network address. IPv6 does not have the same concept of a private network as IPv4, instead we define Unique Local Addresses: append 0xfd to a random 10 digit hex global id and add an arbitrary 4 digit subnet identifier. Any random global id is fine – the idea is to ensure that any given network will have a different id from any other private network it is likely to come in contact with – we don’t need to worry about true global uniqueness though the Birthday Paradox tells us that we are likely to have a potential conflict with only about a million private networks (there might be lots of people out there with the same name and birthday as you, but you are unlikely to meet one of them at random).

We can generate our own random address, for example, using the method described in RFC4193, or use /dev/random:

$ hexdump -v -e '/1 "%02x"' -n 5 /dev/urandom; echo
2acd2c8bc4

or just copy a random sequence from somewhere on the Internet, for example, the one used here:

$ ip -6 route add fd2a:cd2c:8bc4:0::/64 dev tun0

This adds a local network with a global id of 2acd2c8bc4 and a subnet id of 0.

We can also define a larger subnet:

$ ip -6 route add fd2a:cd2c:8bc4:1100::/56 dev tun0

Now traffic to any IPv6 address of form fd2a:cd2c:8bc4:11xx:… will be sent to our TUN device:

Those ff02::2 addresses are for IPv6 router discovery. The rest are two TCP flows, one in each direction (we can get more detail from Wireshark, in particular, the relevant port numbers, but this gives the general idea).

TUN devices are much used for virtualization, VPNs, network testing programs, etc. A TUN device essentially is a network interface that also exists as a user space file descriptor, data sent to the interface can be read from the file descriptor, and data written to the file descriptor emerges from the network interface.

Here’s a simple example of their use. We create a TUN device that simulates an entire network, with traffic to each network address just routed back to the original host.

We want a TUN device (rather than TAP, essentially the same thing but at the ethernet level) and we don’t want packet information at the moment. We copy the name of the allocated device to the char array given as a parameter.

Now all our program needs to do is create the TUN device and sit in a loop copying packets: