Hello,
I have run into what may be an increasingly common situation, and it may
be solvable with some modifications to the balance-alb mode of the bonding
driver. I am using motherboards with dual GigE NICs and cheap switching
equipment which only supports a few trunks (hence making modes 0 and 2
practically unusable). If I understand the documentation, the balance-tlb
and balance-alb modes use the original hardware (HW) addresses of the
slaves, and frames are sent to other clients over whichever slave is
currently the least used. To balance receiving, each client randomly gives
the other clients the HW address of one of its slaves.
Using this method, balancing is only achieved when one machine
communicates with at least as many other machines as it has slaves;
otherwise not all of the interfaces are used. This method also has the
unfortunate result that booting machines over TFTP becomes difficult,
since TFTP in most boot loaders (all that I know of) is ARP based.
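Just to restate my reading of the existing transmit side in code form, it
amounts to something like the following. This is only a toy sketch to keep
the discussion concrete; the structure and names are invented and are not
taken from the driver sources.

struct toy_slave {
        const char *name;          /* e.g. "eth0" */
        unsigned long tx_bytes;    /* recent transmit load */
};

/* balance-tlb style choice: send the next flow out the slave that
 * is currently carrying the least traffic. */
static struct toy_slave *least_used_slave(struct toy_slave *s, int count)
{
        struct toy_slave *best = &s[0];
        int i;

        for (i = 1; i < count; i++)
                if (s[i].tx_bytes < best->tx_bytes)
                        best = &s[i];
        return best;
}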
What I propose is to create a new mode that combines balance-alb and
balance-rr. In this new mode the bonding interface keeps a full list of
all HW addresses for each IP in the ARP table. The slave and destination
address could then be selected in one of the following ways (a rough
sketch in C follows the list):
1) (MOD 2) For the simplest case (and the fastest decision making) of dual
GigE adapters (which are usually sequentially numbered), the decision can
be as simple as odd to odd and even to even.
eth0 <-> eth0
eth1 <-> eth1
2) (LO-HI) For non-sequential adapters (and not necessarily equal numbers
of adapters on each side), the ARP tables could be sorted numerically; the
lowest-numbered slave then transmits to the lowest-numbered address, the
next highest slave to the next highest address, and so on.
eth0? <-> eth0? (if it's the lowest)
eth1? <-> eth1? (if it's the highest)
eth2 <-> nothing (isn't used)
3) (LO-HI-RR) Like LO-HI, except that the rank of the slave and the rank
of the destination cycle through the entire ARP cache.
eth0 -> eth0
eth1 -> eth1
eth2 -> eth0
eth0 -> eth1
eth1 -> eth0
eth2 -> eth1
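To make the three rules concrete, here is a rough sketch in plain C. All
of the names, constants and structures are invented for illustration; this
is not a patch against the bonding driver, just the selection arithmetic.

#include <stdio.h>

#define NUM_SLAVES 3        /* local slaves, e.g. eth0..eth2         */
#define NUM_DESTS  2        /* HW addresses learned for the peer IP  */

/* 1) MOD 2: odd pairs with odd, even with even (dual-NIC case). */
static int select_mod2(int slave)
{
        return slave % 2;
}

/* 2) LO-HI: rank both lists numerically, pair lowest with lowest and
 *    so on; local slaves with no partner on the peer go unused. */
static int select_lo_hi(int slave_rank)
{
        if (slave_rank >= NUM_DESTS)
                return -1;              /* no partner: slave not used */
        return slave_rank;
}

/* 3) LO-HI-RR: a single packet counter drives both ranks, so every
 *    local slave eventually reaches every destination address. */
static void select_lo_hi_rr(unsigned long counter, int *slave, int *dest)
{
        *slave = counter % NUM_SLAVES;
        *dest  = counter % NUM_DESTS;
}

int main(void)
{
        unsigned long n;
        int s, d;

        /* MOD 2 pairing for the dual-NIC case. */
        for (s = 0; s < 2; s++)
                printf("eth%d <-> eth%d\n", s, select_mod2(s));

        /* LO-HI pairing: eth2 has no partner and is skipped. */
        for (s = 0; s < NUM_SLAVES; s++) {
                d = select_lo_hi(s);
                if (d >= 0)
                        printf("eth%d <-> eth%d\n", s, d);
        }

        /* Reproduces the LO-HI-RR sequence listed above. */
        for (n = 0; n < 6; n++) {
                select_lo_hi_rr(n, &s, &d);
                printf("eth%d -> eth%d\n", s, d);
        }
        return 0;
}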
Of course, the benefit of method 1 is that it is quick and simple for the
simple (and common) case of two adapters. Methods 1 and 2 also have the
advantage that each slave always communicates with the same slave on each
client, which helps with things like network boots and other communication
with non-bonded nodes. The benefit of all of these is that no special
switching equipment is needed, and the full aggregate speed can be
realized for machine-to-machine communication.
The only other way I know of to do this is IP-layer bonding using some
modified version of the Equalizer driver. I think it was called ipeql, but
it was written for a 2.1 kernel and has not been maintained.
--
------------------------------------------------------------
Anthony Ciani (aciani1@...)
Computational Condensed Matter Physics
Department of Physics, University of Illinois, Chicago
http://ciani.phy.uic.edu/~tony
------------------------------------------------------------