Detecting and Mitigating a BGP Slow Peer

The BGP Slow Peer feature allows a network administrator to detect a BGP slow peer and to mark a peer as a slow peer, either statically or dynamically.

BGP slow peer detection identifies a BGP peer that is not transmitting update messages within a configured amount of time. It is helpful to know if there is a slow peer, which indicates there is a network issue, such as network congestion or a receiver not processing updates in time, that the network administrator can address.

BGP slow peer configuration moves or splits the peer from its normal update group to a slow update group, thus allowing the normal update group to function without being slowed down and to converge quickly.

Finding Feature Information

Your software release may not support all the features documented in this module. For the latest caveats and feature information, see Bug Search Tool and the release notes for your platform and software release. To find information about the features documented in this module, and to see a list of the releases in which each feature is supported, see the feature information table at the end of this module.

Use Cisco Feature Navigator to find information about platform support and Cisco software image support. To access Cisco Feature Navigator, go to www.cisco.com/go/cfn. An account on Cisco.com is not required.

Information About Detecting and Mitigating a BGP Slow Peer

BGP Slow Peer Problem

BGP update generation uses the concept of update groups to optimize performance. An update group is a collection of peers that share an identical outbound policy. When generating updates, the group policy is used to format messages that are then transmitted to the members of the group.

In order to maintain fairness in resource utilization, each update group is allocated a quota of formatted messages that it keeps in its cache. Messages are added to the cache when they are formatted by the group, and they are removed when they are transmitted to all the members of the group.

A slow peer is a peer that cannot keep up with the rate at which the Cisco IOS software generates update messages, and that fails to keep up over a prolonged period (on the order of a few minutes). Common causes of a peer being slow include the following:

There is packet loss or high traffic on the link to the peer, and the throughput of the BGP TCP connection is very low.

The peer has a heavy CPU load and cannot service the TCP connection at the required frequency.

When a slow peer is present in an update group, the number of formatted updates pending transmission builds up. When the cache limit is reached, the group has no quota left to format new messages. Before a new message can be formatted, some of the existing messages must be transmitted to the slow peer and then removed from the cache. The other members of the group, which are faster than the slow peer and have already received all the formatted messages, have nothing new to send, even though there may be newly modified BGP networks waiting to be advertised or withdrawn. This blocking of message formatting for all the peers in a group when one peer is slow in consuming updates is the "slow peer" problem.
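The blocking effect described above can be illustrated with a small model. The following Python sketch is not how the Cisco IOS software is implemented; the class name, the peer names, and the cache limit of 4 are assumptions chosen only to show how one slow member stalls formatting for the whole group.

```python
from collections import deque


class UpdateGroup:
    """Toy model of an update group with a bounded cache of formatted messages.

    Hypothetical illustration only; names and numbers are not from Cisco IOS.
    """

    def __init__(self, peers, cache_limit):
        self.cache_limit = cache_limit      # assumed quota of formatted messages
        self.cache = deque()                # ids of messages still held in the cache
        self.sent = {p: 0 for p in peers}   # how many messages each peer has been sent
        self.next_id = 0                    # id of the next message to format

    def format_message(self):
        """Format one new message; return False when the group has no quota left."""
        if len(self.cache) >= self.cache_limit:
            return False
        self.cache.append(self.next_id)
        self.next_id += 1
        return True

    def transmit(self, peer, count):
        """Send up to `count` messages to `peer`, then purge fully transmitted ones."""
        self.sent[peer] = min(self.sent[peer] + count, self.next_id)
        slowest = min(self.sent.values())
        # A message leaves the cache only after every member has received it.
        while self.cache and self.cache[0] < slowest:
            self.cache.popleft()

    def pending(self, peer):
        """Number of formatted messages not yet sent to `peer`."""
        return self.next_id - self.sent[peer]


group = UpdateGroup(["fast1", "fast2", "slow"], cache_limit=4)
formatted = sum(group.format_message() for _ in range(10))  # only the quota fits
group.transmit("fast1", 4)
group.transmit("fast2", 4)
blocked = not group.format_message()   # cache is still full: "slow" has received nothing
idle_fast = group.pending("fast1")     # the fast peer has nothing left to send
group.transmit("slow", 1)              # the slow peer finally consumes one message
unblocked = group.format_message()     # one slot is freed, so one message can be formatted
```

In this model the fast peers sit idle until the slow peer drains a message, which is the situation that moving the slow peer to its own update group avoids.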

Temporary Slowness Does Not Constitute a Slow Peer

Events that cause large churn in the BGP table (such as connection resets) can cause a brief spike in the rate of update generation. A peer that temporarily falls behind during such events, but quickly recovers after the event, is not considered a slow peer. For a peer to be marked as slow, it must be incapable of keeping up with the average rate of generated updates over a longer period (on the order of a few minutes).

BGP Slow Peer Feature

The BGP Slow Peer feature provides you, the network administrator, with three options:

You can configure BGP slow peer detection only, which will simply detect a slow peer and provide you with information about it. Such detection is a key feature, especially in a large network of BGP peers, because you can then address the network problem that is causing the slow peer.

You can configure a dynamic BGP slow peer. When such slow peer protection is configured, slow peer detection is enabled by default. The slow peer is moved or "split" from its normal update group to a slow update group, thus allowing the normal update group to function without being slowed down, and to converge more quickly than it would with the slow peer. You have the choice of whether to keep the slow peer in that slow update group until you clear the slow peer (by specifying the permanent keyword), or allow the slow peer to dynamically move back to its regular update group as conditions improve. We recommend that you use the permanent keyword and resolve the network issue before you clear the slow peer status.

You can configure a static BGP slow peer if you already know which peer is slow, perhaps due to a link issue or low CPU processing power. No detection is necessary, and because such a peer is likely to remain slow, a static configuration is appropriate.

BGP Slow Peer Detection

You can choose to detect a BGP slow peer, whether or not you also configure the slow peer to be moved to a slow peer update group. Simply detecting a BGP slow peer provides you with useful information about the slow peer without splitting the update group. You should then address the network problem causing the slow peer.

Timestamp on an Update Message

BGP slow peer detection relies on the timestamp on the update messages in an update group. Update messages are timestamped when they are formatted. When BGP slow peer detection is configured, the timestamp of the oldest message in a peer's queue is compared to the current time to determine if the peer is lagging more than the configured slow peer time threshold.

For example, if the oldest message in a peer's queue was formatted more than 3 minutes ago and the BGP slow peer detection threshold is configured at 3 minutes (180 seconds), then the peer whose queue holds that message is determined to be a slow peer.
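The comparison can be sketched in a few lines of Python. This is only an illustration of the timestamp check, not the Cisco IOS implementation; the function name and its parameters are assumptions, while the 300-second default and the 120 to 3600 range are the values documented for the detection command in this module.

```python
DEFAULT_THRESHOLD = 300         # seconds; default documented for the detection command
THRESHOLD_RANGE = (120, 3600)   # seconds; configurable range documented for the command


def is_slow_peer(oldest_message_ts, now, threshold=DEFAULT_THRESHOLD):
    """Return True when the oldest queued message lags more than the threshold.

    oldest_message_ts: time at which the oldest message in the peer's queue was
                       formatted, or None if the queue is empty.
    now:               current time, in the same units (seconds).
    Hypothetical helper for illustration only.
    """
    low, high = THRESHOLD_RANGE
    if not low <= threshold <= high:
        raise ValueError(f"threshold must be between {low} and {high} seconds")
    if oldest_message_ts is None:
        return False  # nothing is queued, so nothing is lagging
    return (now - oldest_message_ts) > threshold


now = 1000.0
lagging = is_slow_peer(now - 181, now, threshold=180)   # formatted more than 3 minutes ago
keeping_up = is_slow_peer(now - 60, now, threshold=180) # formatted 1 minute ago
empty_queue = is_slow_peer(None, now)
```

The strict comparison mirrors the wording "lagging more than the configured threshold"; a message exactly as old as the threshold does not mark the peer as slow in this sketch.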

The Cisco IOS software generates a syslog event when a slow peer is detected or recovered (when its update group has converged and it has no messages formatted before the threshold time).

Benefit of BGP Slow Peer Detection

Slow peer detection provides you with information about the slow peer, and you can resolve the root cause without moving the peer to a different update group. Therefore, slow peer detection requires just one command that helps you identify something in your network that could be improved.

Benefits of Configuring a Dynamic or Static BGP Slow Peer

When a slow peer is present in an update group, the number of formatted updates pending transmission builds up. New messages cannot be formatted and transmitted until the backlog is reduced. That scenario delays BGP update packets and therefore delays BGP networks from being advertised. The problem can be resolved or prevented by configuring a dynamic slow peer or a static slow peer. Such configuration causes a slow peer to be put into a new, slow peer update group and thus prevents the slow peer from delaying the BGP peers that are not slow.

Static Slow Peer

If you believe that a peer is slow, you can statically configure the peer to be a slow peer. A static slow peer is recommended for a peer that is known to be slow, perhaps having a slow link or low processing power.

Static slow peer configuration causes the Cisco IOS software to create a separate update group for the peer. If you configure two peers belonging to the same update group as slow, these two peers will be moved into a single slow peer update group because their policy will match. The slow update group will function at the pace of the slowest of the slow peers.
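The grouping rule can be pictured as keying update groups on both the outbound policy and the slow marking. The following Python sketch is an assumption-laden illustration, not the Cisco IOS data structure; the function name, the policy name, and the 192.0.2.x addresses are hypothetical.

```python
from collections import defaultdict


def build_update_groups(peers):
    """Group peers by (outbound policy, static slow flag).

    peers: mapping of peer address -> (policy name, is_static_slow).
    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for address, (policy, is_slow) in peers.items():
        groups[(policy, is_slow)].append(address)
    return dict(groups)


peers = {
    "192.0.2.1": ("policy-a", False),
    "192.0.2.2": ("policy-a", True),   # statically marked slow
    "192.0.2.3": ("policy-a", True),   # statically marked slow
    "192.0.2.4": ("policy-a", False),
}
groups = build_update_groups(peers)
normal_group = groups[("policy-a", False)]
slow_group = groups[("policy-a", True)]
```

Because both slow peers share the same policy, they land in a single slow group, which then advances at the pace of its slowest member while the normal group is unaffected.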

A static slow peer can be configured in either of two ways:

At the BGP neighbor (address family) level

Via a peer policy template

You probably want to determine the root cause of the peer being slow, such as network congestion or a receiver not processing updates in time. A static slow peer is not automatically restored to its original update group. You can restore a static slow peer to its original update group by using the no neighbor slow-peer split-update-group static command or the no slow-peer split-update-group static command.

Dynamic Slow Peer

An alternative to marking a static slow peer is to configure slow peers dynamically, based on the amount of time that the timestamp of the oldest message in a peer's queue lags behind the current time. The default threshold is 300 seconds and is configurable. We recommend that you specify the optional permanent keyword, which causes the peer to remain in the slow peer group while you resolve the root cause of the slow peer. You can then use a clear bgp slow command to move the peer back to its original group.

If you do not configure the permanent keyword, the peer moves back to its original group if and when it regains its non-slow functioning.

When a dynamic slow peer is configured, detection is enabled automatically.
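The difference between the two behaviors can be modeled as a small state machine. The Python sketch below is a simplification under stated assumptions: the class and method names are hypothetical, and recovery is reduced to "the lag is no longer above the threshold", whereas the software also requires the update group to have converged.

```python
class DynamicSlowPeer:
    """Toy model of a dynamically marked slow peer (illustration only)."""

    def __init__(self, permanent=False):
        self.permanent = permanent     # models the optional permanent keyword
        self.in_slow_group = False

    def evaluate(self, lag_seconds, threshold=300):
        """Re-evaluate membership from the lag of the oldest queued message."""
        if lag_seconds > threshold:
            self.in_slow_group = True      # detected: split into the slow update group
        elif not self.permanent:
            self.in_slow_group = False     # recovered: returns to the original group
        return self.in_slow_group

    def clear(self):
        """Models the administrator clearing the slow status manually."""
        self.in_slow_group = False


auto = DynamicSlowPeer()
auto_detected = auto.evaluate(400)    # lag exceeds the 300-second default
auto_recovered = auto.evaluate(10)    # moves back on its own

perm = DynamicSlowPeer(permanent=True)
perm.evaluate(400)
perm_still_slow = perm.evaluate(10)   # stays in the slow group despite recovering
perm.clear()                          # administrator restores the peer
perm_after_clear = perm.in_slow_group
```

Keeping the peer in the slow group until it is cleared avoids a peer flapping between groups while the underlying problem is still being investigated, which is the reason the permanent keyword is recommended.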

You can configure dynamic slow peers in three ways:

At the address family view level

At the neighbor topology (that is, neighbor address-family) level

Via a peer policy template

How to Detect and Mitigate a BGP Slow Peer

Detecting a Slow Peer

You might want to just detect a slow peer, but not move the slow peer out of its update group. Such detection notifies you by way of a syslog message that a BGP peer is not transmitting update messages within a configurable amount of time. The peer remains in its update group; the update group is not split. The syslog message level is notice level for both detection and recovery.

(Optional) Adds an entry to the BGP or multiprotocol BGP neighbor table.

This step is required if you intend to disable dynamic slow peer protection for a specific peer as shown in Step 7 below.

Step 5

address-family ipv4

Example:

Router(config-router)# address-family ipv4

Enters address family configuration mode.

Step 6

bgp slow-peer detection [threshold seconds]

Example:

Router(config-router-af)# bgp slow-peer detection threshold 600

Configures global slow peer detection and specifies the time in seconds that the timestamp of the oldest update message in a peer's queue can lag behind the current time before the peer is determined to be a slow peer.

The range of the threshold is from 120 to 3600 seconds. If the command is configured without the threshold keyword, the default is 300 seconds.

Configuring Dynamic Slow Peers at the Address-Family Level

Configuring dynamic slow peers at the address-family level applies to all peers in the address family specified. (If you want to configure specific slow peers, perform this task at the neighbor level or by using a peer policy template.)

The last step is optional; perform it only if you want to disable slow peer protection for a specific peer.

If a static slow peer update group exists (because of a static slow peer), the dynamic slow peer will be moved to the static slow peer update group.

If no static slow peer update group exists, a new slow peer update group will be created and the peer will be moved to that.

We recommend using the permanent keyword. If the permanent keyword is used, the peer will not be moved to its original update group automatically. After you determine the root cause of the slowness, such as network congestion, you can use a clear bgp slow command to move the peer to its original update group. See the Restoring Dynamic Slow Peers as Normal Peers section to move a dynamically slow peer back to its original update group.

If the permanent keyword is not used, the slow peer will be moved back to its original update group after it becomes a normal peer (converges).

Configuring Dynamic Slow Peers Using a Peer Policy Template

Perform this task to configure a BGP slow peer using a peer policy template.

If a static slow peer update group exists (because of a static slow peer), the dynamic slow peer will be moved to the static slow peer update group.

If no static slow peer update group exists, a new slow peer update group will be created and the peer will be moved to that.

We recommend using the permanent keyword. If the permanent keyword is used, the peer will not be moved to its original update group automatically. After you determine the root cause of the slowness, such as network congestion, you can use a clear bgp slow command to move the peer to its original update group. See the Restoring Dynamic Slow Peers as Normal Peers section to move a dynamically slow peer back to its original update group.

If the permanent keyword is not used, the slow peer will be moved back to its original update group after it becomes a normal peer (converges).

Restoring Dynamic Slow Peers as Normal Peers

Once you, the network administrator, resolve the root cause of a slow peer (such as network congestion or a receiver not processing updates in time), use the clear commands in the following task to move the peer back to its original group. Both commands perform the same function.

Note

Statically configured slow peers are not affected by these clear commands. To restore a statically configured slow peer to its original update group, use the no form of the command shown in one of the tasks in the Marking a Peer as a Static Slow Peer section.

SUMMARY STEPS

1. enable

2. clear ip bgp {[af] * | neighbor-address | peer-group group-name} slow

3. clear bgp af {* | neighbor-address | peer-group group-name} slow

DETAILED STEPS

Command or Action

Purpose

Step 1

enable

Example:

Router> enable

Enables privileged EXEC mode.

Enter your password if prompted.

Step 2

clear ip bgp {[af] * | neighbor-address | peer-group group-name} slow

Example:

Router# clear ip bgp * slow

(Optional) Restores neighbor(s) from a slow update peer group to their original update peer group.

af is one of the following address families: ipv4, vpnv4, or vpnv6. Moves all peers in the IPv4, VPNv4 or VPNv6 address family back to their original update groups.

* moves all peers back to their original update groups.

Step 3

clear bgp af {* | neighbor-address | peer-group group-name} slow

Example:

Router# clear bgp ipv4 * slow

(Optional) Restores neighbor(s) from slow update peer group to their original update peer group.

af is one of the following address families: ipv4, vpnv4, or vpnv6. Moves peers in the IPv4, VPNv4, or VPNv6 address family back to their original update groups.

* moves all peers in the address family back to their original update groups.

Configuration Examples for Detecting and Mitigating a BGP Slow Peer

Example: Static Slow Peer

The following example marks the neighbor at 192.168.12.10 as a static slow peer.

Example: Dynamic Slow Peers Using Peer Policy Template

In the following example, Router A uses a peer policy template named ipv4_ucast_pp1 and sets a detection threshold of 120 seconds. The permanent keyword causes slow peers to remain in the slow update group until the network administrator uses the clear ip bgp slow command to move the peer to its original update group. The neighbor at 10.0.101.2 inherits the peer policy, which means that if that neighbor is determined to be slow, it is moved to a slow update group.

Example: Dynamic Slow Peers Using Peer Group

The following example configures two peer groups: ipv4_ucast_pg1 and ipv4_ucast_pg2. The neighbor at 10.0.101.1 belongs to ipv4_ucast_pg1, where slow peer detection is configured for 120 seconds. The neighbor at 10.0.101.5 belongs to ipv4_ucast_pg2, where slow peer detection is configured at 140 seconds.

Feature Information for Detecting and Mitigating a BGP Slow Peer

The following table provides release information about the feature or features described in this module. This table lists only the software release that introduced support for a given feature in a given software release train. Unless noted otherwise, subsequent releases of that software release train also support that feature.

The BGP Slow Peer feature allows a network administrator to detect a BGP slow peer and to mark a peer as a slow peer, either statically or dynamically.

BGP slow peer detection identifies a BGP peer that is not transmitting update messages within a configured amount of time. It is helpful to know if there is a slow peer, which indicates there is a network issue that the network administrator can address.

BGP slow peer configuration causes the peer to be moved from its normal update group to a slow update group, thus allowing the normal update group to function without being slowed down and to converge quickly.