I'm having some issues using lagg on a Dell server with FreeBSD 7.0. I'm trying to set up the two onboard NICs in lagg failover mode. I can get the lagg configuration to load via rc.conf, and on the surface everything looks to be loading correctly. The configuration can be seen using ifconfig, and I have a lagg interface with two NICs assigned to it. Yet when I simulate a failure by either unplugging the master/active NIC or downing the switchport it's connected to, I lose connectivity to the box. If I run ifconfig and then ping from the BSD server out to the net, the failover seems to work. Obviously not a good solution. I would first like to see if anyone can spot something I am doing wrong in my configuration. The following is what's contained in my rc.conf file with regard to lagg:

Has anyone successfully gotten lagg to work in FreeBSD 7.0? Is there something else required to get this to work correctly that I am missing? It really seems like the OS just cannot fail over the NICs. Perhaps a missing module or something? Thanks.

I've just started playing with carp(4) and lagg(4) in virtual machines. So far we've found a few problems with the way carp works in 7.0, and Max Laier is working on them. I haven't got to the lagg testing yet, but I'll post if we get it working or run into the same issues.

I have the same problem too. Whenever I test lagg failover, it will not fail over unless you issue "ifconfig" in the console at least once and ping any host from the machine; then the lagg works instantly. However, if you ping the lagg machine from another workstation, it will be up for a little while but eventually the ping will die out.

sequence:
1. ifconfig after removing bce0: it will still be OK
2. after plugging bce0 back in, it will still be OK
3. after detaching bce1, it will get stuck; plugging it back in requires "ifconfig" and "ping"
(and ping will respond only for a while)
4. the process repeats for whichever NIC you detach

conclusion:
- failover will only happen once
- any subsequent failover requires ifconfig and ping, and is not reliable.

OK, it's been some time, but I might have figured out what is going on here. I moved to FreeBSD 8.1-RELEASE on a Dell blade server with internal switches, and lagg is working better but not fully. I am not sure if this is a BSD issue or a problem with our switches, but this issue could crop up in certain environments as it does in mine. I have two switches, by the way. One lagg NIC goes to one switch, and the other lagg NIC goes to a secondary switch, so that we can recover from a switch failure as well as a NIC/link failure. These switches are then uplinked into our core switches.

It seems that when operating in failover mode, lagg does not have any mechanism to tell the network that the MAC for the lagg interface has moved. It needs to be able to pre-populate the CAM tables in the switch it is connected to when a failover occurs. This is required for the rest of the network to see that the layer 2 address (MAC) has moved and can now be reached on a different port/location on the network. Without this mechanism, the layer 2 topology no longer sees the MAC address of the lagg interface at all, since the default behavior (for Cisco switches, anyway) is to flush the CAM table entries associated with any port that goes down. Upstream switches keep sending traffic to the last recorded entry in their CAM table, but that entry is now missing on the downstream switches connected to the lagg host. This results in the lagg host being unreachable from the rest of the network.
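
To make the reasoning above concrete, here is a rough toy model in Python of what I think is happening. It is only a sketch: the class, the function names, the switch names and the MAC address are all made up for illustration, and it models nothing more than "the core switch remembers which downlink it last saw the MAC on". It is not how lagg or any real switch is implemented.

```python
HOST_MAC = "00:11:22:33:44:55"  # made-up MAC for the lagg interface


class CoreSwitch:
    """Toy layer 2 switch: remembers the downlink each MAC was last seen on."""

    def __init__(self, downlinks):
        self.downlinks = list(downlinks)
        self.cam = {}  # MAC -> downlink name

    def frame_from_host(self, mac, downlink):
        # Learning happens only when a frame from the host actually arrives.
        self.cam[mac] = downlink

    def age_out(self, mac):
        # Stands in for the CAM aging timer expiring.
        self.cam.pop(mac, None)

    def downlinks_for(self, mac):
        # Known MAC: forward on one downlink. Unknown MAC: flood to all.
        known = self.cam.get(mac)
        return [known] if known is not None else list(self.downlinks)


def host_reachable(core, mac, host_link_up):
    """True if any downlink the core would use still has a live host link."""
    return any(host_link_up[d] for d in core.downlinks_for(mac))


def simulate():
    core = CoreSwitch(["switch_a", "switch_b"])
    host_link_up = {"switch_a": True, "switch_b": True}
    results = {}

    # Normal operation: the host talks through the NIC on switch_a.
    core.frame_from_host(HOST_MAC, "switch_a")
    results["before_failure"] = host_reachable(core, HOST_MAC, host_link_up)

    # The active NIC fails; lagg fails over but sends nothing by itself.
    host_link_up["switch_a"] = False
    results["after_failure"] = host_reachable(core, HOST_MAC, host_link_up)

    # The host pings out through the NIC on switch_b, so the core relearns.
    core.frame_from_host(HOST_MAC, "switch_b")
    results["after_host_transmits"] = host_reachable(core, HOST_MAC, host_link_up)
    return results


if __name__ == "__main__":
    for step, ok in simulate().items():
        print(step, "reachable" if ok else "UNREACHABLE")
```

In this model the host stays unreachable after the failure until either it transmits something through the surviving NIC or the stale entry ages out of the core switch, which matches what I see on the real network.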

What I see is that once I issue a ping from the BSD server out to the network, the CAM table entries are repopulated in the switches and the host again becomes reachable from the rest of the network. On Cisco switches, the default CAM table aging timeout is 5 minutes, which results in a lagg-enabled host being inaccessible for a full 5 minutes until the stale MAC table entry for the lagg host times out. This obviously is not a very attractive scenario.
I think this is a flaw in the way lagg operates. I think some of the Windows drivers for NIC teaming issue a gratuitous ARP to repopulate the CAM entries when a primary NIC fails. I think lagg should do something similar.
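
For anyone not familiar with it, a gratuitous ARP is just a broadcast ARP request in which the sender and target IP are both the host's own address, so every switch on the path relearns the source MAC on the new port. A minimal sketch of what such a frame looks like, built in Python purely for illustration: the function name, MAC and IP are made up, the code only builds the bytes, and actually putting the frame on the wire (on FreeBSD that would need root and something like bpf) is left out.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Return a raw Ethernet frame carrying a gratuitous ARP request."""
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC as source, EtherType ARP.
    eth_header = b"\xff" * 6 + hw + struct.pack("!H", 0x0806)

    # ARP header: hardware type Ethernet (1), protocol type IPv4 (0x0800),
    # address lengths 6 and 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += hw + addr             # sender hardware and protocol address
    arp += b"\x00" * 6 + addr    # target hardware unset, target IP = our own IP
    return eth_header + arp


if __name__ == "__main__":
    frame = build_gratuitous_arp("00:11:22:33:44:55", "192.0.2.10")
    print(len(frame), frame.hex())
```

The part that matters for the switches is the Ethernet source address: any frame sourced from the lagg MAC on the newly active port would refresh the CAM tables, and a gratuitous ARP is simply a harmless broadcast that serves that purpose.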

Perhaps someone has a way to fix this issue (aside from running a ping from cron or something equally silly) or has some experience setting this up differently. I would prefer not to have to lower the CAM timers on my switches for this to work. Any help is appreciated.