I'd like to ask all of those who have serious in-depth knowledge of how the OpenBSD kernel works and how pf works to answer a possibly simple, possibly complicated question.

Basically, what I want to know is how multipath routing in the OpenBSD kernel works. Is it meant to work only for multiple routers on the same subnet, or can they be on separate subnets? I am trying to use only one NIC for egress traffic to the Internet. The two routes are actually two separate ISP connections (say cable and DSL), both connected to that single NIC via a switch. ISP1's IP is assigned to the NIC as the primary address, ISP2's IP is aliased onto the same NIC, and multipath default routes define the default gateway of each ISP connection.

What I'm basically trying to do is load balance two Internet connections with my OpenBSD firewall. With simple multipath routes it sort of works, but I suffer a whole heck of a lot of packet loss.
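
For reference, the single-NIC setup described above might look roughly like the commands below. This is only a sketch: the interface name em0, the /29 netmasks, and every address (taken from the documentation ranges) are placeholders, not the poster's actual values.

```shell
# Run as root on the firewall. All names and addresses are hypothetical.

# Primary address from ISP1 on the single egress NIC.
ifconfig em0 inet 192.0.2.2 netmask 255.255.255.248

# ISP2's address aliased onto the same NIC.
ifconfig em0 inet alias 198.51.100.2 netmask 255.255.255.248

# One multipath default route per ISP gateway.
route add -mpath default 192.0.2.1
route add -mpath default 198.51.100.1
```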

What I ultimately found using an Ethernet sniffer is that the firewall would sometimes attempt to send traffic from one ISP's IP address to the default gateway of the other ISP's connection. I assumed this would not happen, since each ISP connection has a small 5-IP subnet, I specified the netmask, and the default gateway for each ISP is within its own subnet. I made the seemingly logical assumption that it would route traffic from ISP1's IP to ISP1's default gateway and from ISP2's IP to ISP2's default gateway. But it doesn't seem to be doing that. It picks a default gateway seemingly at random, without taking into consideration the source IP it's using at the moment.
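
One way to check for this kind of mismatch, assuming the same placeholder interface and addresses as a hypothetical setup (em0, 192.0.2.2 for ISP1, gateways 192.0.2.1 and 198.51.100.1), is to compare the destination MAC address of outgoing frames against the MAC address of each gateway:

```shell
# Run as root. Interface name and addresses are hypothetical.

# Look up the MAC address of each ISP gateway (-n: no name resolution).
arp -n 192.0.2.1
arp -n 198.51.100.1

# Watch frames sourced from ISP1's address. -e prints the Ethernet
# header, so the destination MAC shows which gateway each frame was
# actually handed to. Stop with Ctrl-C.
tcpdump -n -e -i em0 src host 192.0.2.2
```

If frames with ISP1's source address carry the MAC address of ISP2's gateway, the traffic is leaving through the wrong connection.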

Why is this? Am I misunderstanding what multipath routes are meant for? Or is this a bug in the kernel/my configuration of multipath?

I found a solution that seems to work slightly better: using route-to rules in pf. I no longer suffer from packets going out via the wrong default gateway, but for this to work right I can only have the default gateway for ISP1's connection defined in the routing table.
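
The poster's actual rules are not shown, so the following is only a sketch of what route-to rules for two uplinks can look like. It is modeled on the style of the load-balancing example in the older pf FAQ, it is a fragment rather than a complete ruleset (no NAT or filtering), and it uses the route-to syntax that takes an (interface gateway) pair. OpenBSD 6.9 and later changed route-to to take only a next-hop address, so this form would need adjusting there. All interface names, addresses, and the scratch file path are assumptions.

```shell
# Write a hypothetical rule fragment to a scratch file. The quoted 'EOF'
# keeps the shell from expanding the pf macros.
cat > /tmp/pf-multiwan.conf <<'EOF'
int_if  = "em2"
lan_net = "192.168.0.0/24"
ext_if1 = "em0"
ext_if2 = "em1"
ext_gw1 = "192.0.2.1"
ext_gw2 = "198.51.100.1"

# Spread new LAN connections across both gateways.
pass in on $int_if from $lan_net \
    route-to { ($ext_if1 $ext_gw1), ($ext_if2 $ext_gw2) } round-robin

# Make sure packets leave via the uplink that owns their source address.
pass out on $ext_if1 from $ext_if2 route-to ($ext_if2 $ext_gw2)
pass out on $ext_if2 from $ext_if1 route-to ($ext_if1 $ext_gw1)
EOF

# Parse the file without loading it (-n) to catch syntax errors.
pfctl -nf /tmp/pf-multiwan.conf
```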

Is multipath routing only meant to work for routers on the same subnet?

I am trying to use only 1 NIC...and ISP1's IP assigned to the NIC as the primary IP and the IP of ISP2 aliased onto the same NIC

Alias IP addresses are not designed for routing. They merely give the NIC multiple IP addresses to respond to. But they will always respond with their "real" address as the origination.

That will -usually- work with stateful connection protocols like TCP, as there are TCP sequence numbers, but will of course fail with any stateless connections -- UDP and ICMP being the two that come immediately to mind.

(I discovered this with UDP on some aliased addresses years ago.)

Add another NIC, and use two separate subnets for Multipath routing. Avoid alias IP addresses except for TCP-only activity, such as web servers. FAQ 6.14 shows an example of equal cost routing -- using two interfaces.
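
A minimal sketch of that two-interface layout, assuming hypothetical interface names (em0, em1) and documentation-range addresses rather than anything taken from the FAQ or the poster's network:

```shell
# Run as root. Interface names and addresses are hypothetical.

# One NIC per ISP, each on its own subnet.
ifconfig em0 inet 192.0.2.2 netmask 255.255.255.248
ifconfig em1 inet 198.51.100.2 netmask 255.255.255.248

# Equal-cost default routes, one per gateway.
route add -mpath default 192.0.2.1
route add -mpath default 198.51.100.1

# Confirm that both default routes are present in the IPv4 table.
route -n show -inet
```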

Now I'd like to understand the behavior with two NICs installed for egress traffic, one for each ISP connection. With multipath default routes pointing to each ISP's respective default gateway, why does the box use the secondary ISP's connection exclusively?

If I issue $ ping 8.8.8.8

The ping times are appropriate for the usual latency of the secondary connection. This is confirmed by issuing $ traceroute 8.8.8.8 and watching traffic go down that route.

If I issue $ ping -I {IP of the primary connection} 8.8.8.8

The pings work, and have the usual latency that can be expected on that line. This can also be confirmed by issuing $ traceroute -s {IP of the primary connection} 8.8.8.8 and watching the traffic go down the primary line this time.

Then, to really make matters worse, if I issue $ ping -I {IP of the secondary connection} 8.8.8.8, it fails, as does $ traceroute -s {IP of the secondary connection} 8.8.8.8

Why? It should work if it works when not specifying any address. And when not specifying any address, shouldn't it pick a connection to use at random or using a round-robin method with multipath routing enabled, or am I misunderstanding something again?

It appears you did not follow FAQ 6.14, or did not follow it correctly. Note the section on sysctl settings.

From route(8), highlight mine:

Quote:

The optional -mpath modifier needs to be specified with the add command
to be able to enter multiple gateways for the same destination address
(multipath). When multiple routes exist for a destination, one route is
selected based on the source address of the packet. The sysctl(8)
variables net.inet.ip.multipath and net.inet6.ip6.multipath are used to
control multipath routing. If set to 1, multiple routes with the same
priority are used equally; if set to 0, the first route selected will be
used for subsequent packets to that destination regardless of source.
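
The sysctl variables named in that excerpt can be inspected and set as follows. This is a sketch of the usual procedure; whether to persist the settings in /etc/sysctl.conf is left to the administrator.

```shell
# Run as root.

# Show the current IPv4 multipath setting.
sysctl net.inet.ip.multipath

# Enable multipath for IPv4 and IPv6 on the running system.
sysctl net.inet.ip.multipath=1
sysctl net.inet6.ip6.multipath=1

# Optionally persist the settings across reboots.
echo 'net.inet.ip.multipath=1' >> /etc/sysctl.conf
echo 'net.inet6.ip6.multipath=1' >> /etc/sysctl.conf
```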

Right, I did this...hence my confusion...I did read that. What it conveniently leaves out is whether this is meant for multiple routes defined for a single source IP, or whether it still works if you have more than one IP, each with its own default gateway to the same basic (Internet) destination.

I believe your carp(4) implementation might be getting in the way. As configured, carp is not being utilized for load balancing. You might consider using real interfaces. See if it makes a difference.

If not, or for some real help instead of my fumblings, consider posting to misc@.

Next time, please enclose such pasted content in code tags. It makes things like routing tables much easier to read. As posted, I'm having trouble with it.

Sorry about not using the code tags...not familiar with that. Will have to spend some time on the forum's guides to learn how to do this.

As for the carp interfaces, I'm using them between two similarly configured boxes, not only to load balance two Internet connections but also so they act as redundant firewalls for each other. So technically they are being used. Whether they are working the way I expect...that's another thing. What makes you say they aren't being used, as currently configured?

Oh I apologize...I misunderstood. I am not trying to use carp for any load balancing. Just for fail-over...should one firewall die/get disconnected/administratively shutdown for whatever reason...the other takes over. This does work as expected..am quite happy with this.

I'm attempting to use Equal Cost Multipath routing to load balance traffic from the Internal LAN (and the firewall itself I would assume) between the two Internet connections. This is what is not working.

I would assume that each server at each end of a carp cluster must have the same routing tables.

If they do, then you have exceeded my capabilities to assist you further. Since no one else has jumped in, I recommend reaching out to the broader support community at the misc@ mailing list. If you have not used it before, please read http://www.openbsd.org/mail.html before posting there.