Load balancing between two routers with two FE links

Need a sanity check on this one - the layout is fairly simple: a router on each side (speaking BGP, full table), a GigE link on each side to a low-end Cisco switch, and two FastE links between the switches. Currently GigE is not available between the two switches (Cisco 3548 and 3560).

We initially thought this would be easy - set up EtherChannel and be done. No such luck, as the 3548 only supports per-MAC load balancing, which means all we really get is redundancy, since there's only one MAC address on either side of the link.
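To make the per-MAC problem concrete, here is a rough Python sketch. It assumes a simplified hash (XOR of the last byte of the source and destination MAC, modulo the number of links); the actual algorithm on the 3548 may differ, and the MAC addresses and function names are made up for illustration.

```python
def pick_link(src_mac: str, dst_mac: str, n_links: int = 2) -> int:
    """Toy per-MAC hash (an assumption, not the switch's real algorithm):
    XOR the last byte of each address, then take it modulo the link count."""
    src_low = int(src_mac.split(":")[-1], 16)
    dst_low = int(dst_mac.split(":")[-1], 16)
    return (src_low ^ dst_low) % n_links


# Hypothetical router MACs, one on each side of the switch-to-switch bundle.
ROUTER_A = "00:11:22:33:44:01"
ROUTER_B = "00:aa:bb:cc:dd:02"


def links_used(n_frames: int = 1000) -> set:
    """Return the set of links chosen for n_frames frames between the routers.
    Every frame carries the same MAC pair, whatever IP traffic is inside it."""
    return {pick_link(ROUTER_A, ROUTER_B) for _ in range(n_frames)}


if __name__ == "__main__":
    print(links_used())
```

Since the hash input never changes, every frame picks the same link; the second link only carries traffic if the first one fails.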

We're now looking at some silly hackish stuff like eBGP over two distinct paths, or simply setting up two sessions and letting the router load balance based on there being equal cost paths. Not really thrilled (or terribly familiar) with those options.

Not sure I understand the design, but if you have two routers with 'internet connections' taking full BGP internet route tables from their upstream neighbors, then on the back end you need either an iBGP session between them or some other routing protocol. How you connect the routers to accomplish that is up to you and your performance needs. If the routers have a 10 megabit internet link, for example, then 100 megabit between them is fine. If they have gigabit links to the internet, then 100 megabit between them is not sufficient, because path decisions (AS numbers that are 'closer' on one router than the other) may send traffic from one router across to the other to reach the internet.

We do what you describe with 2 routers that face the internet connected to 2 core routers that then connect to all the internal routers and distribution switches. The edge routers and core routers are connected with layer 3 links so everything is a routing decision made/updated in BGP (for keeping BGP tables for announcements etc.) or OSPF for our own network area (for the network of routers etc.).

Since you're talking full tables, this is pretty much your only option to guarantee two paths:

RTR---SW===SW---RTR

Split the RTR-SW and SW-RTR links into two VLANs using 802.1q encapsulation and create a trunk port on the switch with two VLANs (let's say 10 and 11). Then, for the SW=SW link, put one link into VLAN10 and the other link into VLAN11. Build two BGP peers -- one across each VLAN -- and set up BGP multipath (maximum-paths xxxx) accordingly.

If you Etherchannel and split with two sessions, you're still probably going to end up with all the traffic polarizing across a single link since the same MACs will be in play.

Depending on your IP space layout and what those routers are doing, you could potentially still enable L3 across the switches, let CEF load sharing do its thing for the transiting traffic, and just let iBGP pass through the router. It'd really depend on the overall network topology and the function of the devices to know if that would work and how to set it up, though, since the transiting traffic through those switches would have to be directed by something other than BGP (an IGP or statics).

Just to clarify, Uhlek's diagram is correct. Single router on each end, the folks on the other end of that link are providing internet transit, and our constraint is that there are no GigE ports available between us...

This is exactly what was suggested by our upstream. I understand the config, but I'm still very weak on how the actual load balancing happens. I understand that I'll see two equal cost routes to every destination our upstream advertises, but I don't understand how the forwarding decision is made. Is this going to be per-packet or per-destination?

It will be per-destination unless the device can support per-packet and you configure it to. Even if it does, though, you don't want per-packet because that causes all kinds of problems with out-of-order packets.

As far as what "per destination" means, that depends on the precise type of device and configuration. In general, though (I know CEF works this way on Cisco devices; not sure about other vendors), you're going to get a hash generated on at least the source and destination IP and possibly also the ports and protocol. That hash will determine the egress port to be used for that particular flow. That guarantees that a single flow will always use a single link in each direction (it may be asymmetric for the return traffic, but that's no big deal).
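A rough Python sketch of that flow hashing, for illustration only. It assumes CRC32 over the source and destination addresses as a stand-in hash; this is not CEF's actual algorithm, and the function names and addresses are hypothetical.

```python
import ipaddress
import zlib


def pick_path(src_ip: str, dst_ip: str, n_paths: int = 2) -> int:
    """Toy per-destination hash (stand-in for the real one):
    CRC32 over the packed source and destination addresses."""
    key = ipaddress.ip_address(src_ip).packed + ipaddress.ip_address(dst_ip).packed
    return zlib.crc32(key) % n_paths


def spread(n_flows: int = 1000, n_paths: int = 2) -> list:
    """Count how many distinct made-up flows land on each path."""
    counts = [0] * n_paths
    for i in range(n_flows):
        src = f"10.0.{i // 250}.{i % 250 + 1}"  # hypothetical inside hosts
        dst = f"198.51.100.{i % 200 + 1}"       # hypothetical outside hosts
        counts[pick_path(src, dst, n_paths)] += 1
    return counts


if __name__ == "__main__":
    # The same address pair always maps to the same path, so packets stay in order.
    print(pick_path("10.0.0.1", "198.51.100.7"))
    # Many different pairs should spread across both paths.
    print(spread())
```

Swapping the arguments (the return direction) hashes a different byte string, so the reverse flow can land on the other link. That is the asymmetry mentioned above. How even the split is depends on how many distinct address pairs are in play; a handful of heavy flows can still load one link more than the other.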

Ah, you can ignore my suggestion about GLBP, since it's not applicable.

I'd do what Uhlek has suggested in his last post. Depending on traffic flow, you can alter the hash mechanism to balance on various tags as necessary to get somewhat even distribution (if that's what you are looking for?).

I don't think you mentioned the model of router, but depending on that, it may not be using the main DRAM at all for the BGP tables, though another peer session certainly takes up some system memory. It really is nice that Cisco has stepped up the memory capacity a bit on their newer generation stuff. Taking full BGP internet tables on stuff that is just a few years old isn't even possible in some cases anymore.

So are these routers in different ASes? There's really nothing to this, just set up two eBGP sessions the same way you set up one, but on two separate VLANs over the separate FE links, then have maximum-paths 2 in the BGP config.

If they're in the same AS, use only one iBGP session (using loopbacks), but run OSPF or some such on the two VLANs between the routers and give the VLAN interfaces equal cost, and you're in business.

That's what we did and it's going well (note the scale differs on these graphs, also I care more about balance inbound than outbound):

Did I mention the building is one of NYC's largest "carrier hotels"? Apparently it's not like that in the whole building, but where we are the colo operation apparently decides who can and can't reach their customers (they're a Tier 1 with a "3" in their name).