Virtualizing the OpenBSD Routing Table

Introduction

The OpenBSD routing table can be carved into multiple virtual routing tables allowing complete logical separation of attached networks. This article gives a brief overview of rtables and explains how to successfully leak traffic between virtual routing domains.

The ability to virtualize the routing table in OpenBSD first appeared in version 4.6. Since then the functionality has matured nicely, with support for virtual routing tables now present in userland tools such as dhclient(8) and dhcpd(8) and in the routing protocol daemons ripd(8), ospfd(8), and bgpd(8). On the kernel side, pf(4) has been extended to filter packets based on the routing table they came in on and to move packets between routing tables. This article will concentrate on the latter, with examples of how to set up separate routing tables and leak traffic between them successfully.

Using separate routing tables is similar to using VRFs in Cisco IOS or routing instances in Juniper’s JUNOS. Multiple routing tables are created, each of which contains its own forwarding and ARP information. In OpenBSD, each routing table is called an “rtable”. Network interfaces can be bound to an rtable, which causes traffic going through the interface to be forwarded based on the information present in that rtable. When one or more interfaces are bound to an rtable, the rtable and all of the interfaces bound to it are called a routing domain, or “rdomain”.

Basic Configuration

Creating an rtable is done using route(8) with the -T argument.

# route -T 1 add 0.0.0.0/0 192.168.1.1

This creates rtable 1 if it doesn’t already exist and adds a default route to it.

Interfaces are bound to an rtable using ifconfig(8) with the “rdomain” keyword.

# ifconfig vic1 rdomain 1

This binds the vic1 interface to rtable 1.
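To confirm the result, the routing table and the interface binding can be inspected. This is a minimal sketch, assuming the vic1 interface and rtable 1 from the examples above. The exact output varies by release, so none is shown here.

```sh
# Show the IPv4 routes held in rtable 1 (-n skips name resolution)
route -T 1 -n show -inet

# Show vic1; an interface bound to a non-default routing domain
# should report that rdomain in its ifconfig output
ifconfig vic1
```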

To execute a command within a non-default rtable, use the route(8) command with the exec keyword.

$ route -T 1 exec telnet 10.5.3.29

This executes the telnet command within rtable 1. Certain commands such as ping(8) and arp(8) have their own command line arguments that will place them into an rtable (the -V argument in this case).
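For example, a short sketch of the two commands mentioned, assuming rtable 1 and the 192.168.1.1 gateway used earlier:

```sh
# Ping a host using the routes in rtable 1
ping -V 1 192.168.1.1

# List the ARP entries that belong to routing domain 1
arp -V 1 -an
```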

Setting up rdomains

By default, all interfaces on an OpenBSD host belong to rdomain 0. Traffic can flow freely between all interfaces (assuming the pf(4) ruleset allows it) without any special handling. Similarly, traffic can flow between all interfaces in the same non-default routing domain without any special handling (again, as long as the pf(4) ruleset passes this traffic).

[Figure: Two OpenBSD Routing Domains]

In this network, Host 1 and Host 2 both belong to rdomain 1. Routing domain 1 has routes to the 192.168.1/24 and 172.16.0/24 networks because they are directly attached so traffic between the two is forwarded without any special consideration. Host 1 and 2 cannot talk to Host 0 because Host 0 is connected to a separate routing domain.

As shown in the picture, pf(4) is used to connect routing domains. This is really powerful because pf(4) allows for very fine-grained packet matching which means you can be as specific or broad as you want when it comes to what traffic you want to pass between rdomains. Sending traffic between rdomains is done by using the rtable keyword in pf.conf.

pass in on vic1 to 172.16.2.0/24 rtable 0
pass out on vic0

This is the basic ruleset needed to allow Host 1 to initiate a connection to Host 0.

The rtable must be specified on the rule that matches traffic inbound to the OpenBSD router. As stated in the pf.conf(5) man page, the resulting route lookup will only work correctly if the rtable is specified on the inbound rule. This ruleset is not enough for traffic to flow bidirectionally. We also have to look at the routing entries within the source and destination routing domains.

The source routing domain, in this case rdomain 1, is easy. pf(4) will magically handle taking the packets out of rdomain 1 and sending them to rdomain 0, so we do not need a route for 172.16.2/24 in rtable 1. Reverse traffic is different. Routing domain 0 requires that a route be present for 192.168.1/24. The next-hop for this route isn’t really important; what matters is that the route is present in the rtable. If a route isn’t present, the route lookup will fail before pf(4) has a chance to move the packet into rdomain 1, and the return traffic will be dropped. Note that the route doesn’t have to be exactly 192.168.1/24. It could be 192.168/16 or even 0.0.0.0/0; the important part is that there is some route in rtable 0 that matches the network in rdomain 1.

# route -T 0 add 192.168.1/24 -iface 172.16.2.137

This is kind of a cheat. It creates a route for 192.168.1/24 as a connected route on the rdomain 0 interface. Obviously this isn’t correct, but it doesn’t really matter. It achieves the goal of getting a route into rtable 0. Host 1 can now successfully talk to Host 0.

An alternative to creating a “connected” route is to set the next-hop of the 192.168.1/24 route to the loopback IP.

# route -T 0 add 192.168.1/24 127.0.0.1

The loopback interface provides a really convenient place to point your reverse path routes.

The caveat with this is that pf(4) must be active on the loopback interface you create. The default pf.conf ruleset contains “set skip on lo” which disables pf(4) on each loopback interface and will result in return traffic being dropped. Be sure that your loopback isn’t being “skipped”.
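Putting the pieces so far together, a minimal pf.conf sketch for the two-domain example might look like the following. It assumes the interface names and networks used above (vic1 in rdomain 1, vic0 in rdomain 0) and that the stock skip line has simply been commented out. A real ruleset will need more than this.

```pf
# set skip on lo     # left disabled so pf(4) sees traffic on the loopback

# Move traffic arriving from rdomain 1 for 172.16.2.0/24 into rtable 0
pass in on vic1 to 172.16.2.0/24 rtable 0

# Let the moved traffic leave through the rdomain 0 interface
pass out on vic0
```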

The same idea works between two non-default routing domains.

[Figure: Three OpenBSD Routing Domains]

Creating a loopback interface in rdomain 2 so that Host 1 can talk to Host 2 would look like:
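A minimal sketch of those commands, assuming the loopback is named lo2 (the name the next paragraph uses) and that the return route needed in rdomain 2 is for Host 1's network, 192.168.1/24:

```sh
# Create lo2 and bind it to routing domain 2
ifconfig lo2 create
ifconfig lo2 rdomain 2

# Give it the usual loopback address
ifconfig lo2 inet 127.0.0.1

# Point the reverse-path route for Host 1's network at the loopback
route -T 2 add 192.168.1/24 127.0.0.1
```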

Since lo2 is created inside rdomain 2, the IP address assigned to it doesn’t conflict with lo0 in rdomain 0.

Another caveat with the pf(4) ruleset is that the states that get created by the rule that specifies the rtable must be “floating”.

If you’ve changed the “state-policy” option in your pf.conf from the default of “floating” then you must use the “floating” keyword in your inbound rule.

set state-policy if-bound
pass in on vic1 to 172.16.2.0/24 rtable 0 keep state (floating)
pass out on vic0

All of the above guidance also applies if you’re doing NAT on the outbound interface.

pass in on vic1 to 172.16.2.0/24 rtable 0
pass out on vic0 nat-to vic0

This ruleset would hide the 192.168.1/24 network from hosts in rdomain 0 by translating the source 192.168.1.x IP to the IP address on the vic0 interface. This might be necessary if there’s already a 192.168.1 network in rdomain 0. Even though you’re doing NAT, you still need a route in rdomain 0 that points back to the real source network (192.168.1/24) in rdomain 1.

Sample Use Cases

Routing domains can be used to isolate a test/dev network from production.

[Figure: Two OpenBSD Routing Domains]

In the sample network from earlier, rdomain 0 could be the production network with production servers and the users connected to it. Routing domain 1 could be a test network where applications and systems are put through testing before being moved into rdomain 0. In order to prevent the test systems from possibly affecting the production systems, they could be isolated in their own routing domain, ensuring that test traffic cannot get into the production network. In fact, the test network could even use the same IP addresses as the production network without them stepping on each other. A pf(4) ruleset could be written that lets management/administrative traffic from the production network into test. A ruleset could also be written that allows the test systems to talk to a specific management or file server in the production network. If overlapping IP space is used, traffic between the rdomains must be NAT’d as outlined above.

Routing domains can also be used to connect to multiple ISPs. Since userland tools such as dhclient(8) work properly within routing domains, each ISP interface could be put into its own routing domain without the risk of conflicting default routes.

[Figure: Three OpenBSD Routing Domains]

Here if vic1 is connected to ISP#1 and vic2 is connected to ISP#2, the pf(4) ruleset would control which ISP connection to use when users in rdomain 0 connect to the Internet. This provides a much more elegant solution than the outbound load balancing example I wrote about in the PF User’s Guide.
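As a rough sketch of such a ruleset, assume vic0 is the LAN interface in rdomain 0, vic1 and vic2 are already bound to rdomains 1 and 2, and both get their addresses from dhclient(8). The two LAN halves (172.16.2.0/25 and 172.16.2.128/25) are hypothetical and only there to show the split.

```pf
# Hypothetical split: one half of the LAN uses ISP#1, the other half ISP#2
pass in on vic0 from 172.16.2.0/25 rtable 1
pass in on vic0 from 172.16.2.128/25 rtable 2

# NAT to whatever address dhclient(8) currently holds on each ISP interface
pass out on vic1 nat-to (vic1)
pass out on vic2 nat-to (vic2)
```

The default route that dhclient(8) installs in each rtable should be enough for the return-path lookup described earlier, since any matching route will do.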

The only shared component of a multiple-dhclient(8) setup is the resolv.conf(5) file. Each copy of dhclient(8) will update the file as it renews its lease.

Conclusion

By virtualizing the OpenBSD routing table you can create virtual routers and/or firewalls within the same physical OpenBSD machine. Networks can be safely isolated from each other without having to worry about traffic crossing network boundaries or IP addresses overlapping. Routing domains can be created by binding one or more interfaces to a routing table so that all traffic crossing those interfaces is automatically forwarded based on the routes present in the virtualized routing table. Traffic can be leaked between routing domains by using the granular pf(4) packet matching syntax to allow policy-based communication between routing domains.

I haven’t actually tested tunnel interfaces in an rdomain. I would assume that if you’re doing something like

ifconfig gif0 rdomain 5
ifconfig gif0 tunnel 1.1.1.1 2.2.2.2

that your local 1.1.1.1 interface would also need to be in rdomain 5 and that you’d need a route to 2.2.2.2 in rdomain 5. Is that how you’re doing it?

Have you seen the tunneldomain ifconfig(8) option? It’ll let you place the inner tunnel traffic into an rdomain.

I’m curious now if the gif interface is put into an rdomain and a tunneldomain is not configured whether the inner tunnel traffic would be routed in rdomain 0 or in the gif’s rdomain. That’d be interesting to test.

In your example, 1.1.1.1 and 2.2.2.2 are in rdomain 0 by default. This was a wonderful way to use several rdomains with ipsec+gif before reyk explained how to use several enc(4) interfaces.
The tunneldomain option is used to place the 2 IP addresses above in the rdomain you want.
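For reference, here is a sketch of how the two settings combine, using the addresses from the example above and a hypothetical rtable 3 for the outer packets. It assumes the tunneldomain behaviour described in ifconfig(8): the encapsulated packets are routed in the named table, while the gif interface itself stays in its own rdomain.

```sh
# Inner (decapsulated) traffic is handled in rdomain 5
ifconfig gif0 rdomain 5

# Outer tunnel endpoints are looked up in rtable 3 (hypothetical) instead of rtable 0
ifconfig gif0 tunnel 1.1.1.1 2.2.2.2
ifconfig gif0 tunneldomain 3
```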

This is a great resource for using rdomains and really helps over using just the man pages. I hope that sometime this will get picked up in the FAQ. I have two questions though:

1) You mention needing to put a route into the domain for return traffic even if using NAT, but I haven’t run into that using 4.9. Was there a change? My test setup used rdomain 0 for the internal IF and rdomain 1 for the outside. The /etc/mygate is set to the IP of the internal IF for rdomain 0 and a default route pointing to the ISP router is added to rdomain 1 via /etc/hostname !directive. Does this work by accident, or should I add a lo2 and a route to the internal (rdomain 0) net in rdomain 1?

2) I have also noticed that any time I create an alias in any rdomain the entry is added to rdomain 0 route table and not the one I specify even with a complete line “family alias address netmask broadcast rdomain n” in my hostname file. Is that normal, or am I missing something?

Sorry for the kind of basic questions, but your post is the first I have run into that does a good job explaining. I am using rdomains to handle a dual-ISP setup, but it is a constant learning process for me.

A1. Your setup is actually working exactly as I described it should. The default route you have in rdomain 1 is enough to match the destination of the return traffic. So for a packet on the return path, it’ll be run through NAT so the packet has the internal, rdomain 0 IP as its destination address. That destination is looked up in rtable 1 and matches the default route there. Since the lookup was successful the process continues and pf moves the packet into rdomain 0 where it is sent to the end host.

A2. I didn’t actually test aliases. That’s a good one. To me, that looks like a bug. Possibly ifconfig is not passing the rdomain to the kernel or the kernel is not adding the host route in the proper rdomain.

Just did some quick testing with alias. I can’t reproduce what you’ve described. As long as my interface is already in an rdomain, all aliases are added correctly and routes are put into the proper rtable. This is on 4.9 and 5.0.

After talking out of band with Ilya, this turned out to be a case where sshd was running on the OpenBSD router where rdomains were configured and when trying to connect from a non-default rdomain to that sshd instance, the router was responding with a TCP RST.

One thing the article above doesn’t talk about is sockets. Sockets actually belong to an rdomain. If sshd is started like normal (ie, “/usr/sbin/sshd”), then it opens a listening socket in rdomain 0. As far as rdomain 1 is concerned, port 22 is not listening and the kernel will respond with a TCP RST to any inbound connection attempts to port 22. The solution is to launch an sshd instance for rdomain 1 (ie, “route -T1 exec /usr/sbin/sshd”). This sshd instance will open a listening socket, and because it’s been exec’d in rdomain 1, that socket will be associated with that rdomain.

Awesome, using routing domains enabled me to build a working proxy-ARP setup for my OpenBSD 5.0 router (ISP gw with x.y.z.1/24 directly on-link, no route from ISP to our own box, and ISP gw itself proxy-arp’s the entire internet to itself).

Separate routing domain for em0 (x.y.z.2/24) to ISP, populate it with `arp -s x.y.z.4-254 pub permanent`, and then set up pf `match in on … to … rtable …` rules on em0 and em1 (x.y.z.254/24 + pub arps for x.y.z.1-2).

It works perfectly for forwarded traffic – but fails for all locally originated traffic on the default routing domain :(

In this setup, rdomain 2 (em0 / ISP link) has the real default route (x.y.z.1), whereas rdomain 0 (em1 / LAN) has a fake default route (also for x.y.z.1, which is either alias’d or arp-proxied back to em1). Seemingly, locally generated traffic never ingresses on any interface, and `rtable …` rules only work on ingress :(

How should the default routing domain and its default route be set up to best keep outgoing traffic from the host itself working, in addition to forwarded traffic?

Hi Tero, thanks for explaining your unique setup. You are right, traffic on the router will not pass through the inbound pf rules and won’t be moved into a different rdomain. The software/daemons on the box need to specify which rdomain their socket should use when they create it; otherwise they are stuck in rdomain 0. Lots of OpenBSD daemons have command line or conf file options to set their rdomain. For everything else you’ll have to use “route -Tx exec”.

For simplicity and to avoid the issue you’re having, it’s probably best practice for everyone to keep the main ISP or the one that doesn’t do anything oddball in rdomain 0. It should be thought of as the default ISP and traffic is only leaked to other ISPs/rdomains as needed. This keeps the rule set simple and allows traffic off the router without issue.

Moving the WAN to rdomain 0 doesn’t help, because then the routes/hosts on your LAN/rdomainX become unreachable in the same way. Besides, in this case the WAN rdomain is full of /32 proxy-arp’d routes for all the LAN hosts, and those are all “fake”.

Well, this is starting to feel a little scary even for me, but I actually managed to hack this into a working state by applying a little NAT:

pass out on $lan_if from self to route lan-egress rtable $wan_domain nat-to $wan_if

$lan_if being the rdomain0/default route destination, and the default route having a ‘lan-egress’ -label attached – very neat way to do the basic `match out on $lan_if to route rtable $wan_domain` btw.

—

The remaining problem I still have is that TCP RSTs generated by `block out on $lan_if …` (i.e. after a `match in on $wan_if … rtable 0` change) are sent out on $lan_if… and the sender behind $wan_if ends up getting an `icmp host … unreachable` reply for blocked ports :/

1) TCP SYN comes in on em0, rdomain 2
2) `match in on em0 … rtable 0`
3) TCP SYN packet routes out on em1 in rdomain 0
4) `block out on em1 …`
5) pf generates a TCP RST… and sends it out on the rdomain 0 fake default route intended for leaking traffic to rdomain 2 (i.e. out on lo1) – which doesn’t go anywhere, since it never matches the `match in on em1 … rtable 2` -rule used for routing forwarded traffic.

All in all, it seems that routing traffic between domains doesn’t work all that well :/

Is there any way to work around this? The generated TCP RST in rdomain 0 doesn’t seem to go through pf at all.

pf doesn’t keep track of which rdomain a packet came from when it moves a packet into a new rdomain. This is why pf isn’t able to automatically route the RST back using the original rdomain. To avoid this you must filter your traffic on the input interface, before the incoming packet is marked with the new rdomain. And this would go for anybody using rdomain leaking, not just you.
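As an illustration of filtering on the inbound side, here is a sketch using the em0 (rdomain 2) and em1 (rdomain 0) interfaces from this thread. The destination network and port list are hypothetical placeholders and are not taken from the setup above.

```pf
# Reject unwanted traffic while it is still in em0's own rdomain, so any
# TCP RST is generated before the packet has been moved
block return in on em0

# Only the permitted traffic is moved into rtable 0 (placeholder network and ports)
pass in on em0 proto tcp to 10.10.0.0/24 port { 22 80 } rtable 0

# Everything that was moved has already been filtered, so just let it out
pass out on em1
```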

And you’re right, the comment box was getting awfully tight. I changed some CSS around so things look much better now. Thanks.

Hi, nice article, thanks.
I tried to set up dual ISP connections with rdomains, but failed to write the correct pf rules.
The example at http://www.openbsd.org/faq/pf/pools.html#outgoing uses route-to and round-robin to achieve load balancing. What about rtables? Is there a working example?

Well, I have made all the settings, and I’m able to transmit and receive data between domains. But how do I make something like:
“pass in on $int_if from $lan_net \
route-to { ($ext_if1 $ext_gw1), ($ext_if2 $ext_gw2) } \
round-robin”

with rtables?

pass in on vic1 to 0.0.0.0/0 rtable 0
pass out on vic0 nat-to vic0
pass in on vic1 to 0.0.0.0/0 rtable 1
pass out on vic2 nat-to vic2
How do I combine these two rules?

Keep in mind that rdomains aren’t designed to do what you’re doing. They’re meant to provide isolation at Layer 3 (and below). You’re trying to do round robin routing to balance Internet use. Stick with the ‘route-to’ method.

Actually, I am trying to load balance outbound connections across two ISPs.
Maybe this quote misled me:
“This provides a much more elegant solution than the outbound load balancing example I wrote about in the PF User’s Guide.”

Ah, I can see why that would be misleading. I meant more in the sense that the pf ruleset is a bit cleaner and that you can do the dual-ISP setup using dynamically assigned Internet IPs with rdomains whereas with route-to, you’re pretty well stuck needing static IPs (because you have to specify the gateway IPs in the ruleset).

Do you perhaps have an example of overlapping subnets for rdomain 1 and 2?

In my case, I’ve set up vlan10 in rdomain 10 and vlan11 in rdomain 11. Both have the same network, 10.0.0.0/18, and the same gateway IP of 10.0.0.1. I can ping both using ping -V10/11 10.0.0.1, as well as the hosts located on the 2 subnets (vlan10/host = 10.0.0.4 and vlan11/host = 10.0.0.2).

In pf.conf I’ve added the following:

pass in on vlan10 to 172.29.0.0/16 rtable 0
pass in on vlan11 to 172.29.0.0/16 rtable 0

This allows ping from both 10.0.0.2 and .4 hosts to my external interface em0 (172.29.43.239)

But now I would like to access the host 10.0.0.2 from a host 172.29.43.20 by accessing a natted IP of say 172.29.43.240->10.0.0.2 (rdomain 10).

Hey Joel,
I have a scenario with 2 routers (ospf+ipsec) and two different rdomains (there will be more soon).
However, just to make sure, is it possible to redistribute routes from all rdomains to the neighbor from a single ospfd instance?
I did not find any similar scenario to this so far.

Sure thing,
What I did was to set up the rdomains on one router only (e.g. Router-1) and create corresponding ospfd.conf files for those rdomains (e.g. ospfd0.conf).
On the edge router (Router-2) I have only the rdomain 0 propagating only the default route to the other one.

However I am stuck at a weird issue right now.
I’ve set up these multiple GRE interfaces on both (the ones for OSPF instances), and I have this weird behavior:

* When trying to ping the other end of the tunnel:
1 – The packet leaves the Router-1 from GRE interface (eg. gre0);
2 – The same packet arrives on Router-2 over another tunnel interface, which causes the traffic to be blocked by uRPF or any other antispoofing rules in place.

* The IPSec tunnel is in transport mode. With only one GRE tunnel over it, everything is OK, but with more than one I face the issue stated above.

The IPSec tunnel is over a /30 link between the two (rdomain 0)
gre1 eg rdomain 1
gre2 eg rdomain 2 and so on…

When I have only one gre tunnel, it works perfectly. When I bring up the other one, only the last one enabled works.
What happens: pinging from R1 (ping -V1 x.x.x.x w.w.w.w), the packet leaves the correct gre interface but arrives at the other end on gre2, for example (both have uRPF enabled, causing the packets to be blocked).

Are you using different outer IP addresses on the tunnel interfaces of R1? Other than that nothing obvious pops out but there’s still details I can’t see from what you’ve posted. I would try posting to the misc@ list and hope that someone else has done something similar.

So from this, it looks like R2 is seeing two tunnels coming from 10.0.0.1. Is that right? That could be it right there. R2 might be having trouble multiplexing the GRE traffic because the source address of each tunnel is the same.

I don’t really understand what you’re asking about haproxy. Can you try explaining it again? You say that you’re using haproxy in r0 and it must listen in r0, r1, and r2 and you’re wondering how to make it listen on all r0 interfaces? *confused*

Your English is just fine :) Thank you for the diagram, that makes things much clearer. I’m not sure if it’s possible to do what you need. If haproxy doesn’t support being virtualized, then your only choice is to run an instance in each rdomain. I also looked at relayd(8) in OpenBSD but it doesn’t seem to support virtualization either. (On a side note, have you evaluated relayd?)

Sorry, I have never used relayd before. But when I examined relayd, I saw that it has fewer backend balancing algorithms than haproxy. By the way, I couldn’t find any document on “relayd performance vs haproxy performance”
:)

Do you know if there are any unexpected surprises with vlan interfaces and different rtables? I am planning to make a lab in some time, but for now I have no idea how it works when you separate some vlans from others at one physical interface.

Putting vlan interfaces into different rtables should be no different than doing it with physical interfaces. You should find that things behave exactly as I described in this post. The rdomains on vlan interfaces will be in effect even if the vlans are riding on the same physical interface. Remember, rdomains provide separation at Layer 3. Anything below that is irrelevant. Good luck with your lab!

“Virtualizing” /etc/resolv.conf is not supported, is it? How could I set different resolvers for each rdomain? The nameserver is not reachable from all rdomains so processes in those rdomains can’t resolve hostnames.

I am attempting to use rdomains to provide isolation for multiple redundant subnets, for example rdomain 10 with 192.168.64.0/24 and rdomain 11 with 192.168.64.0/24. That part I understand, but then I want to do a one-to-one NAT or binat with interface em0 in rdomain 0 on 10.2.0.0/24. Rdomains 10 and 11 will have multiple servers, each needing its own 10.2.0.0/24 public IP:
server 1 10.2.0.2 rdomain 10 192.168.64.2
server 2 10.2.0.3 rdomain 10 192.168.64.3
server 3 10.2.0.4 rdomain 11 192.168.64.2
server 4 10.2.0.5 rdomain 11 192.168.64.3

Rdomains 10 and 11 don’t need to be able to communicate with each other, but it’s OK if they do so via rdomain 0. Here is what I have so far.

match out on em0 received-on vlan3020 from 10.2.0.2 nat-to 9.39.64.236

em0 is on the 10.2.0.0/24 network right? So this rule is trying to match traffic coming from 192.168.64.2 in rdomain 10 and going out em0 (in rdomain 0). I don’t understand the “from 10.2.0.2” part. Where is 192.168.64.2 getting NATed to 10.2.0.2? And then where does 9.39.64.236 come into play? It looks to me like this rule would never match because the source IP would never match 10.2.0.2.

You’re right, that statement is wrong. Let me try to explain. I am thankful for your response!

I need true one-to-one NAT that works in both directions. I am going to have multiple rdomains that use the same subnet, i.e. 192.168.64.0/24. I will have one public interface, em0, on for example 10.2.0.0/24. My logic problem is that in each of the rdomains (3020, 3021, 3025, etc.) I will have a 192.168.64.2 address which needs to be mapped to its own 10.2.0.0/24 address. The IP that you use to connect to a system from rdomain 0 needs to be the same IP that the system uses to access systems on the rdomain 0 side of the network. I don’t want to round robin or just NAT to the public interface address of em0:0. In order for this to work, I first need to be able to say: pass this traffic out to em0. Second, I have to know the private IP that was used and which rdomain it came from, so that I can NAT to the correct public IP on em0. To further clarify, there would also be an rdomain 3021 with another private IP of 192.168.64.2. I would need to distinguish between the two rdomains and private IPs to know which public IP on em0 each should be NATed to. To further complicate the situation, there will also be multiple systems in each rdomain (3020 may have up to 10 systems), so you can’t just say “if you came from rdomain 3020, NAT to this public IP on em0”. You have to know which private IP it is, even within the rdomain, so that you can map it to its corresponding IP on em0.

so something like.
#server to egress –need help here
pass in on vlan3020 to 10.2.0.0/24 rtable 0
match out on em0 received-on vlan3020 from 192.168.64.2 nat-to 10.2.0.2
match out on em0 received-on vlan3020 from 192.168.64.3 nat-to 10.2.0.4

pass in on vlan3021 to 10.2.0.0/24 rtable 0
match out on em0 received-on vlan3021 from 192.168.64.2 nat-to 10.2.0.3
match out on em0 received-on vlan3021 from 192.168.64.3 nat-to 10.2.0.5

#em0 to private this part makes sense and will work
match in on em0 to 10.2.0.2 rdr-to 192.168.64.2 rtable 3020
match in on em0 to 10.2.0.4 rdr-to 192.168.64.3 rtable 3020
match out on vlan3020 nat-to vlan3020

match in on em0 to 10.2.0.3 rdr-to 192.168.64.2 rtable 3021
match in on em0 to 10.2.0.5 rdr-to 192.168.64.3 rtable 3021
match out on vlan3021 nat-to vlan3021

Well, it has to be NATed in both directions. The problem with the above statements is that ‘match out on em0 received-on vlan3021 from 192.168.64.3 nat-to 10.2.0.5’ is not a valid statement, and I can’t figure out the correct syntax to make that work. I was hoping that you might have an idea. It seems like it should be supported, given all the other functionality that has been built around rdomains. I have looked at the pf.conf man page and tried, but haven’t figured it out yet. Hopefully it can be done and I will feel silly for struggling with it.

This is for a test environment where we don’t want to change the IPs of the systems. We also want to be able to monitor the traffic easily from those system across the network and be able to tell which system it is. If I nat to a single IP or round robin on em0 I lose a level of visibility.

So, I finally got the NAT part to work, but it’s still not fully working. I am using pass rules. The symptom I am seeing is that when I start the connection from the NATed side, it gets NATed, goes out, and starts the connection to the external system, but the response to the request uses the same rule it went out on, even though that rule is an out-only rule. I don’t really understand that. I have tried match rules, but haven’t gotten those to work either. If it’s a new connection it uses the correct rule, and I am able to access the NATed system from an external system. I would like to understand this. Or is it a bug, and if so, do you know how to report it?

Great article! Thanks for putting this together. For me personally you are slowly becoming the go-to source for OpenBSD related questions. Keep up the good work.

I am trying to use the rtable functionality in a 2 ISP setup in which part of the network uses one ISP and the rest is using the second one. I am running 5.3 in a HA setup with carp interfaces. I got stuck pretty early in my tests. I was able to create the additional rdomain 1 route table but when I try to add interfaces to it I get:
ifconfig em0 rdomain 1
ifconfig: SIOCSIFRDOMAIN: Invalid argument.
Have you seen this type of error message before? Do you think it is related to the fact that the interface I am trying to add to rdomain 1 has a carp interface associated with it? What would be the steps for migrating a non-multi-routing-table HA setup to a multiple-rdomain HA setup?

Gave it a try with removing the carp interface and trying to associate the interface with rdomain 1 again but still no go. So I do not think the error message is related to the presence of the carp interface.

Virtual or physical shouldn’t matter since the rdomain goo is entirely inside the network stack and doesn’t touch the physical drivers at all. I can add my em(4) interface to an rdomain without issues on a VM.

Made some progress after destroying the carp interface and rebooting. I have now the em0 interface in rdomain 1. I also added one vlan interface to rdomain 1. I will keep at it and will give you an update once this is done, but at this point it looks like the reboot helped.

Hi Joel
I am trying to essentially combine a bunch of NAT routers into one.
I have traffic being routed to me on em0 (10.47.207.0/25)
And I have vlans on the inside em1 using the same IP range (172.16.4.0/22)

– Does this happen on both rdomain 1 and 2?
– Can you double check the pf.conf rules you pasted above; the one comment says “rdomain 2” but the rules are for rtable 1. It also refers to vlan42. Just make sure it’s an accurate copy/paste.
– Have you checked your whole pf.conf to see if there’s a rule that is blocking connections initiated from inside->outside?

Hi yeah I see that there are some copy paste errors.
I have made new copies on slexy that I know are right.
Since the same problem appears on both 42 and 43 I have included only the stuff related to 42.

As I said, I can RDP or SSH into the machine with 10.47.207.4 from the outside and it can ping hosts on the outside, but can not http, ftp or anything else.
There are no other rules or firewalls to block this, as the router itself has no problem fetching a file with FTP.

Is it possible to send traffic out different interfaces based on src IP in FreeBSD 10? I’ve been trying to accomplish this, but can’t seem to find the right commands using pf. Any tips would be much appreciated.

Hey Mike. I’m not familiar with all of FreeBSD’s bells and whistles in their network stack, but with pf you’ve got options like route-to and reply-to which can policy route traffic. There might be equivalents in ipfw too.

To be honest, I don’t understand the reason behind the terminology. The way I think about it is:
– An rtable is a specific routing/ARP/ND table instance
– An rdomain is the collection of an rtable and the network interfaces that are bound together

The rdomain is the larger construct inside of which is an rtable. So far that works for me and helps me keep it straight.

Good morning. I work at the Instituto Tecnologico de Tuxtla Gutierrez in Chiapas, Mexico. It is a school with 4 outbound links, and I work in Ing. en Sistemas Computacionales (one of 8 careers). I have an OpenBSD box with 6 network interfaces, one of which carries vlans. I want to declare 4 rdomains:
rdomain1 : bge0 –>nat–>rl0
rdomain2 : vlan13 –>nat–>em3
rdomain3 : vlan14 –>nat–>em1
rdomain4 : vlan15 –>nat–>em0

In addition, I want vlan14 to go through a proxy filter (squid), and we need the bge0 LAN to access some sites in the em0 LAN.

Hi Jorge, I think for what you’re trying to do it would take more than a comment on a blog to get you going. You most likely need dedicated help from someone that can work through all the details of what you want to accomplish. Maybe you could try meeting at a local BSD user group and finding someone to help. Or maybe post an ad online for an OpenBSD consultant.

I think you could solve your issue by using rdomains, however I don’t think it’s necessary. You could simplify your config and get it working much more easily than by overlaying rdomains.

The issue you’re running into is an order of operations issue. The way the IP stack works is that ROUTING is first, NAT is after. Imagine a packet coming in on em4 from a server and going to the Internet. The first step is for the kernel to make a routing decision. You’ve got 3 default routes, but it appears the route going out em0 is being chosen. So the kernel puts the packet in the egress queue for em0 and runs the packet through pf again. _Now_ your NAT rule matches, since the packet is about to be sent on em0. And since your NAT rule for em0 says to nat to .70, that’s what happens.

That’s the logic of what’s happening. To fix it, you could do a few different things, but the easiest way in my opinion would be to simplify how you’ve arranged your Internet interfaces.

Instead of having 1 IP per interface, designate one interface for Internet and configure it with .70. Then add .71 and .72 as aliases on the same interface. This eliminates the routing complexity since you’ll now have exactly 1 egress interface to the Internet. All traffic (from servers and LAN) will hit that interface, and you can NAT the traffic very easily with your nat-to rules.
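A minimal sketch of what that could look like in pf.conf, assuming em0 is the single Internet interface and that .70 and .71 are already configured on it as the primary address and an alias. The 203.0.113.x addresses and both internal networks are placeholders, since the real ones aren't shown here.

```
# Hypothetical values; substitute the real public addresses and networks.
ext_if  = "em0"           # the one Internet-facing interface
servers = "10.0.4.0/24"   # placeholder for the server network
lan     = "10.0.1.0/24"   # placeholder for the LAN

# Routing always picks em0, so the source network alone decides
# which public address a connection is translated to.
match out on $ext_if inet from $servers to any nat-to 203.0.113.71
match out on $ext_if inet from $lan to any nat-to 203.0.113.70
pass out on $ext_if
```

Because there is only one default route, the routing decision and the NAT decision can no longer disagree.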

Hi Joel,
I was wondering if you could help me out. I have an openbsd gateway at home and I’m trying to use rdomains to make all traffic going to the internet or coming from the internet route out of the openbsd gateway through a freebsd server for some packet inspection (bro) before getting routed back to the gateway and onto the final destination.

I was able to get it working by having the internet facing pppoe connection and a vlan interface pointing to the freebsd server in rdomain 1 and the freebsd server with a second vlan pointing to rdomain 0 on the obsd gateway.

My problem is the openbsd gateway itself is exhibiting some odd behaviour. The gateway is serving dns via unbound and other clients on the network can get dns and access the internet etc. but the openbsd gateway itself can’t seem to resolve dns for itself. When logged into the gateway I can’t seem to ping my own interfaces by IP address (besides the loopback) however other hosts on the network can ping them.
internet <- pppoe0,rdomain1 -(obsd-gateway)-vlanX,rdomain1 freebsd vlanY,rdomain0 – (obsd-gateway) – vlan(a|b|c|etc)rdomain0
There are static routes and default routes in the corresponding rtables.

I could ping vlanX,rdomain1 interface from rdomain0 through freebsd, but that’s about it.
I tried going back to a flat single (rdomain0) routing table and realized I still couldn’t ping my own interfaces.
I thought I would reach out and see if you had any idea what could be going on, both with the gateway being unable to ping its own interfaces and with it being unable to resolve dns for itself when using rdomains.

Hi Joel, thanks for getting back to me. I managed to get it working by fixing my pf.conf file. I think my pf.conf file is getting too complicated, and a firewall rule overhaul is due. My gateway can resolve and ping out now though. However, the second issue, the gateway pinging its own interfaces within the default rdomain, is still a problem, and it was a problem before I started using rdomains. Anyways that’s not a big deal, just weird.

However now that I’ve successfully migrated to this configuration my ipsec vpn seems to be broken. Before moving to rdomains it worked fine with a config like this:
ipsec.conf:

pf.conf:
pass in quick log on $if_extern inet proto udp from any to $gate_static1 port $svc_ipsec keep state
pass in quick log on enc0 inet proto udp from any to $gate_static1 port 1701 keep state (if-bound) tagged IPSEC_IKE1_IN
pass quick log proto { esp, ah } from any to any

where $if_extern is my internet facing interface, $gate_static1= and $svc_ipsec is a macro port group for 500 and 4500

Anyways this is an L2TP/IPSEC configuration (for my and my wife’s android phones) which was working before I started using rdomains. Now that I’m using rdomains it’s not. From my reading it looks like it’s because my rdomain1 is my internet connection, which is the interface ipsec was listening on. It looks like it is possible to have the daemon start in rdomain1, but the npppd daemon doesn’t play well with rdomains yet. I thought I would try and get around this by port forwarding from rdomain1 to an internal gateway IP on rdomain0 (going through the freebsd server). Anyways, I can’t seem to get this working. I suspect I’m not configuring it right. Before, the ipsec tunnel was terminating on the edge gateway; now with rdomains it basically needs to terminate on the gateway on the internal rdomain (rdomain 0). I suspect this should be possible. Do you know what changes to my config are needed to make it work? Thanks again,

I think you only need proto ipencap for tunnel mode; I’ve been doing transport mode. It worked before with udp as the proto. Anyways, it doesn’t look like the problem is there, if you see the debug log further below. Here is my ipsec.conf config:
ike passive esp transport \
from publicip (rdom0rfc1918IP/30) to any \
main auth "hmac-sha2-256" enc "aes" group modp1024 \
quick auth "hmac-sha2-256" enc "aes" \
psk "notmyrealpsk" \
tag IPSEC_IKE1_IN

and the pf.conf rules:
pass in quick log on $if_extern inet proto udp from any to $gate_static1 port $svc_ipsec rdr-to $gate_vlan3 keep state
pass in quick log inet proto udp from any to $gate_static1 port 1701 rdr-to $gate_vlan3 keep state (if-bound) tagged IPSEC_IKE1_IN
pass quick log proto { esp, ah } from any to any

If I look at the logs with a bit of debugging on I can see the flows are getting established, same if I do ipsecctl -s all. However, if I run tcpdump -i enc0, or run the npppd daemon logging to stderr, I don’t see anything. I don’t think the traffic is getting to the npppd daemon. Any ideas?

Looking over my post I see some portions of the configs I posted were filtered out so my ipsec.conf config looks broken, probably because of the angle brackets I was using. I had substituted the values for my static IP and psk, so the lines that refer to them do have values there, like in ipsec.conf:
from mystaticip to any \

Hi Joel,
In case you are still around to answer questions on this, do you happen to know any IPv6? I tried replicating the first part using IPv6, but my packets from rdomain 0 back to rdomain xx seem to get dumped. Adding a route to localhost will just loop the packets there until the TTL reaches 0. I made a post on openbsd-misc describing the set-up, but without any luck so far: https://marc.info/?t=149917603400004&r=1&w=2

Interesting problem. I’m not totally surprised you’re seeing issues with IPv6 since the v6 stack gets far less exercise than the v4 stack. Unfortunately, the v6 stack does sometimes have different behavior or even latent bugs.

Out of curiosity, what version of OpenBSD is the router running?

What does ‘pfctl -vvss’ show for the icmp6 traffic at the time of the test?

Can you fiddle around with this route “route -T 0 add 2a01:7e8:35:fab::/64 ::1” and instead of pointing it to the loopback, point it to an interface, make it directly connected with -iface, and whatever else you can think of? The “time exceeded” message makes it look like the return traffic is looping inside the gateway.

I replicated the same issue myself. Given that my use case is policy based routing, I switched to pf’s “route-to” directive, which works okay for IPv6.

Anyway, I tampered with some options like pf’s “no state”, and something behaved differently with that option, and some messages showed up in dmesg (sorry I forgot to save the details, but that’s not informative either). None of these options really solved the problem with rtable, however.

The biggest clue I find here is that I can see the packets leaving, @3 for jumping rtable, @4 for sending the packet on to “the internet”, and @4 again for messaging towards “the internet”. I cannot see anything that would indicate the return traffic (like having @1 triggered a whole lot more).

I tried adding a route to the rdomain 75 network via the em1 interface address (it’s a /126, so I don’t really have a lot of wriggle room):
# route -T 0 add 2a01:7e8:35:fab::/64 -iface 2a01:7e8:1:800::2fe

Are you saying this worked prior to 6.1? There was an awful lot of work that happened in the network stack leading up to 6.1 (and the work continues for 6.2). So yeah, maybe this is a regression. I wonder if this still works for v4? I have a box here doing v4 inter-rdomain routing but I had trouble upgrading it to 6.1 so it’s still on 6.0.

Haven’t tried it with a version prior to 6.1, but I did a v4 version and that worked fine (on the same set-up). Actually, I didn’t need the return route on rdomain 0 on v4 for it to work, which surprised me a bit, but if a lot of work has been done, that could be why.

I would follow up on your post on misc with this information: v4 works just fine on 6.1 but v6 doesn’t with the equivalent rule set. Provide a copy of the ‘netstat -rnf inet6’ output and the output of your tcpdump on lo0 that shows the return traffic looping around. Maybe one of the v6 guys will be able to take a look.

So I’ve written a follow-up mail on misc, but there hasn’t been any reply. For now I have taken rdomains out of the equation, as IPv6 is more important. Thank you for all your help, I hope I’ll get it working in the future!

Hi Joel,
Is there a solution for swapping packets between rdomains on a bridge?
I have tried different pf rules, but it doesn’t work. It only works when the device is acting as a router, including nat and so on.
Setup: 2x physical interfaces and 1x vether on bridge0 in rdomain 0.
1x physical interface with an IP in rdomain 2 (no bridge, a simple interface). IP forwarding is enabled.

Now I want to match some traffic (8.8.8.8 for testing) on bridge0/rdomain 0 and send it out on rdomain 2.
Any ideas?

Based on the pf.conf doc, the behavior makes sense:
rtable
Used to select an alternate routing table for the routing lookup. Only effective before the route lookup happened, i.e. when filtering inbound.

So in bridge mode no route lookup is made, while in routing mode there is one.
Adding a rule …route-to 192.168.0.1 rtable 2 isn’t a solution.

Hello Joel,
I have changed the design back to simple “routing”, and it works for user traffic entering the device: it is forwarded to the right rdomain.
However, how can I catch traffic initiated by the router itself?
I do not have and don’t need a default GW. It looks like the pf rules don’t match or aren’t executed because there is no route.
Maybe we can switch to mail and only post the solution later.

I don’t think I understand. You don’t need a default route? Do you have specific routes installed for whatever destination(s) you need to reach?

I’m also confused why you’re trying to “catch” traffic coming from the router. Are you trying to use pf to redirect that traffic to a non-default rdomain? Why aren’t you either configuring the daemon to use the right rdomain or using ‘route -T exec’?

If you want to switch to email, my address is on the contact page (link at the top of the page).