Changing ARP timers in Cumulus Linux

Environment

Cumulus Linux, version 3.3.2 and earlier

Background

Since the release of the Linux 2.2 kernel, the Address Resolution Protocol (ARP) for IPv4 has been aligned with the behavior of the Neighbor Discovery Protocol of IPv6. As a result, instead of relying upon the MAC address learned for a neighbor for a fixed interval, the Linux kernel transitions neighbors through multiple states (none, incomplete, reachable, stale, delay, probe), depending upon whether it knows about the neighbor, has recently seen traffic from the neighbor, or needs to ARP again for the neighbor.
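The states described above can be observed on a running switch. The following is a minimal sketch, not part of Cumulus Linux: the helper name count_neigh_states is hypothetical, and it assumes input in the line format printed by `ip -4 neigh show`, where the state is the last field of each line.

```shell
#!/bin/sh
# Count neighbor entries per state. Reads lines in the format printed by
# `ip -4 neigh show` on stdin; the state is taken from the last field.
count_neigh_states() {
    awk 'NF { count[$NF]++ } END { for (s in count) print s, count[s] }' | sort
}

# Typical use on a switch (not executed here):
#   ip -4 neigh show | count_neigh_states
```

Running it periodically gives a rough view of how many neighbors are reachable versus stale at a given moment.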

A successful ARP response places a neighbor in the reachable state and allows the kernel to forward packets to it directly. A neighbor stays in the reachable state as long as the kernel keeps receiving traffic from it*. If no traffic is received, the neighbor transitions from reachable to stale after a random interval between base_reachable_time_ms/2 and 3*base_reachable_time_ms/2. The default base_reachable_time_ms varies by version:

In Cumulus Linux 3.4.0 and later, it is 1080000 milliseconds (18 minutes)

In Cumulus Linux 3.0.0 through 3.3.2, it is 14400000 milliseconds (4 hours)
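To make the interval concrete, the bounds can be computed directly from the formula above. This is a small illustrative sketch; the function name reachable_window_ms is hypothetical.

```shell
#!/bin/sh
# Print the lower and upper bound (in milliseconds) of the
# reachable-to-stale interval for a given base_reachable_time_ms.
reachable_window_ms() {
    base=$1
    echo "$(( base / 2 )) $(( 3 * base / 2 ))"
}

reachable_window_ms 1080000    # default in 3.4.0 and later
reachable_window_ms 14400000   # default in 3.0.0 through 3.3.2
```

With the defaults, a neighbor becomes stale after 9 to 27 minutes in Cumulus Linux 3.4.0 and later, and after 2 to 6 hours in 3.0.0 through 3.3.2.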

Because a switch running Cumulus Linux is a network device, it interacts directly with end systems far less often than end systems interact with one another. As a result, after a successful ARP places a neighbor into the reachable state, Cumulus Linux may not interact with that neighbor again for long enough that it moves into the stale state. To keep neighbors in the reachable state, Cumulus Linux runs a background process (/usr/bin/neighmgrd) that tracks neighbors that move into the stale, delay, or probe state and attempts to refresh them before the Linux kernel removes them, and therefore before they are removed from the hardware forwarding table.

Resolution

If a different reachability time is preferred, modify base_reachable_time_ms. For example, for an ARP timeout of at most 30 minutes, set base_reachable_time_ms to 1200000. A given neighbor's ARP state then moves from reachable to stale at between 600000 and 1800000 milliseconds (10 and 30 minutes) of age.
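The suggested value follows from the upper bound of the interval, 3*base_reachable_time_ms/2. As a sketch of that arithmetic (the function name base_for_max_minutes is hypothetical), the base value for any desired maximum can be derived like this:

```shell
#!/bin/sh
# Given the longest acceptable reachable-to-stale delay in minutes, print
# the base_reachable_time_ms whose upper bound (3 * base / 2) equals it.
base_for_max_minutes() {
    max_ms=$(( $1 * 60 * 1000 ))
    echo $(( max_ms * 2 / 3 ))
}

base_for_max_minutes 30
```

For a 30 minute maximum this yields 1200000, matching the value suggested above.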

To temporarily set a new reachable timer, change the value for both the system default and every active interface:
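One way to apply a temporary change is sketched below. It is an assumption-laden example, not the article's original command: the function name set_base_reachable_time is hypothetical, and it relies on the standard Linux procfs layout, where /proc/sys/net/ipv4/neigh contains a "default" directory plus one directory per interface. The optional second argument exists only so the function can be tried against a scratch directory first.

```shell
#!/bin/sh
# Write a new base_reachable_time_ms (first argument) to the "default"
# entry and to every per-interface entry under the neighbor sysctl tree.
# The second argument is optional and defaults to the live procfs tree.
set_base_reachable_time() {
    value=$1
    dir=${2:-/proc/sys/net/ipv4/neigh}
    for f in "$dir"/*/base_reachable_time_ms; do
        [ -e "$f" ] || continue   # skip if the glob matched nothing
        echo "$value" > "$f"
    done
}

# On a switch, from a root shell (not executed here):
#   set_base_reachable_time 1200000
```

Values written this way take effect immediately but are lost when the switch reboots.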

A persistent change to the system default does not affect active interfaces until the system has been restarted, so combine it with the temporary change above if rebooting the Cumulus Linux switch is not possible.

* While "forward progress" is another method, it is more applicable to clients maintaining the state of a router. Because a Cumulus Linux node normally is the router, its applicability here is extremely limited.

Modifying the Garbage Collection Threshold

During testing you may need to modify the garbage collection threshold. The kernel does not advance the state of an ARP entry unless the table holds at least gc_thresh1 entries. The default value of gc_thresh1 is 128, so set it lower than the number of ARP entries; otherwise, ARP entries will not move from REACHABLE to STALE.

For example, if your ARP table has 5 entries, set gc_thresh1 to 4:

echo "4" > /proc/sys/net/ipv4/neigh/default/gc_thresh1

The meanings of the gc_threshX sysctls are:

gc_thresh1: The minimum number of entries to keep in the ARP cache. The garbage collector will not run if there are fewer than this number of entries in the cache.

gc_thresh2: The soft maximum number of entries to keep in the ARP cache. The garbage collector will allow the number of entries to exceed this for 5 seconds before collection will be performed.

gc_thresh3: The hard maximum number of entries to keep in the ARP cache. The garbage collector will always run if there are more than this number of entries in the cache.
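The three descriptions above can be read as a simple decision table. The sketch below encodes them as written; the function name gc_status is hypothetical, the wording of the middle case (between gc_thresh1 and gc_thresh2) is an interpretation, and the gc_thresh2 and gc_thresh3 values in the calls are illustrative assumptions rather than values taken from this article.

```shell
#!/bin/sh
# Classify an ARP cache size against the three thresholds.
# Arguments: entry count, gc_thresh1, gc_thresh2, gc_thresh3.
gc_status() {
    count=$1; t1=$2; t2=$3; t3=$4
    if [ "$count" -lt "$t1" ]; then
        echo "below gc_thresh1: collector does not run"
    elif [ "$count" -gt "$t3" ]; then
        echo "above gc_thresh3: collector always runs"
    elif [ "$count" -gt "$t2" ]; then
        echo "above gc_thresh2: collected after 5 seconds"
    else
        echo "between gc_thresh1 and gc_thresh2: collector may run"
    fi
}

gc_status 5 128 512 1024   # 5 entries with the default gc_thresh1 of 128
gc_status 5 4 512 1024     # same table after lowering gc_thresh1 to 4
```

The first call reports that the collector does not run, which is why a 5-entry table never ages out with the default threshold; the second shows the effect of lowering gc_thresh1 to 4. On a live system the current values can be read with `sysctl -n net.ipv4.neigh.default.gc_thresh1` (and likewise for gc_thresh2 and gc_thresh3).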