Thursday, April 28, 2011

The BGP Best Path Selection Algorithm

BGP routers often receive multiple paths to the same destination. The BGP best path selection algorithm uses rules that extend far beyond simply choosing the route with the lowest metric when deciding which path to install in the IP routing table for packet forwarding.

A path must first meet the following criteria before it is considered a valid candidate for the best path:

If BGP synchronization is enabled, an IBGP-learned path is considered as a candidate only when the IP routing table contains the route. If the matching route is learned from an OSPF neighbor, its OSPF Router ID must match the BGP Router ID of the IBGP peer. The show ip bgp {prefix [mask]} EXEC command displays paths that are marked as "not synchronized". BGP synchronization is disabled by default in Cisco IOS Software Release 12.2(8)T and later.

The next-hop address associated with the path must be reachable. Typically, an IGP route to the next-hop address exists in the IP routing table. Next-hop addresses that are only reachable via a default route or another BGP route are not considered valid. Invalid paths due to unreachable next-hop addresses are marked as inaccessible in the output of the show ip bgp {prefix [mask]} EXEC command.

An EBGP-learned path is denied and not even installed in the BGP table – Routing Information Base (RIB) if the local ASN appears in the AS_PATH attribute for the path.
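
The loop check described above can be illustrated with a minimal Python sketch. The function name and the ASNs (taken from the documentation range 64496 to 64511) are assumptions made for illustration only; this is not how IOS implements the check internally.

```python
def has_as_loop(as_path, local_asn):
    """Return True if the local ASN already appears in the received AS_PATH."""
    return local_asn in as_path


# Hypothetical values for illustration.
LOCAL_ASN = 64500
received = [
    [64501, 64502],         # no loop: eligible for the BGP table
    [64501, 64500, 64503],  # contains the local ASN: denied
]
accepted = [path for path in received if not has_as_loop(path, LOCAL_ASN)]
print(accepted)
```

Only the first path survives, because the second one has already passed through the local AS.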

An EBGP-learned path can be denied by routing policies that are implemented using access, prefix, AS_PATH, and COMMUNITY lists. Rejected paths are still stored in the BGP table and marked as receive-only in the output of the show ip bgp {prefix [mask]} command if the soft-reconfiguration inbound keyword is configured for the EBGP peer.

If the BGP Enforce the First Autonomous System Path feature is enabled using the bgp enforce-first-as BGP router subcommand, BGP discards a received EBGP route that does not list the ASN of the EBGP peer as the first segment in the AS_SEQUENCE-type AS_PATH attribute for the path. This feature prevents a misconfigured or unauthorized peer from misdirecting or spoofing traffic by advertising a route as if it was sourced from the remote autonomous system. Take note that this command is enabled by default on most recent IOS releases and does not affect the operation of the BGP Local-AS feature, in which another ASN (the local ASN) that is different from the remote ASN is actually prepended to the AS_SEQUENCE. Although the output of the show ip bgp EXEC command shows the manipulation of the AS_PATH, the EBGP peer does not actually advertise the local ASN in the AS_PATH.
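
As a rough illustration of the enforce-first-as check, the following Python sketch tests whether the first ASN in the AS_SEQUENCE matches the peer's ASN. The function name and ASNs are hypothetical, and the sketch ignores the Local-AS interaction mentioned above.

```python
def passes_enforce_first_as(as_sequence, peer_asn):
    """Return True if the first ASN in the AS_SEQUENCE is the EBGP peer's ASN."""
    return bool(as_sequence) and as_sequence[0] == peer_asn


# Hypothetical peer in AS 64501.
print(passes_enforce_first_as([64501, 64510], 64501))  # peer listed first
print(passes_enforce_first_as([64510, 64501], 64501))  # peer not listed first
```

A route that fails this check would be discarded when the feature is enabled.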

BGP is not designed to perform load balancing; BGP chooses only a single best path to reach a specific destination. The best paths are chosen because of policies, not based on bandwidth. The BGP best path selection process evaluates multiple paths until a single best path is left. The best path is then submitted to the routing table manager process, where it is evaluated against routes from other routing protocols that can also reach that network, using the administrative distance rule.
The routes that reside in the IP routing table (best paths) can then be advertised to other BGP peers.

The remaining paths to reach a specific destination are still kept in the BGP table in case the best path becomes inaccessible.
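
The administrative distance comparison can be sketched in Python as follows. The distance values are the usual Cisco IOS defaults, which the text above does not list, so treat them as an assumption; the function and data layout are hypothetical.

```python
# Usual Cisco IOS default administrative distances (assumed, not from the text).
ADMIN_DISTANCE = {
    "connected": 0,
    "static": 1,
    "ebgp": 20,
    "eigrp": 90,
    "ospf": 110,
    "isis": 115,
    "rip": 120,
    "ibgp": 200,
}


def route_to_install(candidates):
    """Pick the (protocol, next_hop) candidate with the lowest administrative distance."""
    return min(candidates, key=lambda c: ADMIN_DISTANCE[c[0]])


# The BGP best path competes with an OSPF route for the same prefix.
print(route_to_install([("ibgp", "10.0.0.1"), ("ospf", "10.0.0.2")]))
print(route_to_install([("ebgp", "192.0.2.1"), ("ospf", "10.0.0.2")]))
```

With these defaults, an IBGP best path loses to an OSPF route for the same prefix, while an EBGP best path wins.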

The Cisco IOS BGP implementation goes through the following process to choose the best route:

Prefer the route with the highest weight (Cisco-proprietary).

If multiple routes have the same weight, prefer the route with the highest local preference. Note: A path without the local preference is considered to have had the value set with the bgp default local-preference command, or to have a value of 100 by default, when being advertised to IBGP peers.

If multiple routes have the same local preference, prefer the route that was locally originated via a network or aggregate-address command, or through redistribution from an IGP. Local paths sourced by the network and redistribute commands are preferred over local aggregates sourced by the aggregate-address command. A locally originated route has a next-hop of 0.0.0.0 in the BGP table.
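
The first three steps can be expressed as a sort key, as in this Python sketch. The dictionary fields (weight, local_pref, local_source) are hypothetical names chosen for illustration, and a missing weight is assumed to be 0.

```python
DEFAULT_LOCAL_PREF = 100  # default unless changed with bgp default local-preference


def early_key(path):
    """Sort key for weight, local preference, and local origination; smallest key wins."""
    local_pref = path.get("local_pref", DEFAULT_LOCAL_PREF)
    # 0 = network/redistribute, 1 = aggregate-address, 2 = learned from a peer
    source_rank = {"network": 0, "redistribute": 0, "aggregate": 1}.get(
        path.get("local_source"), 2
    )
    # Negate weight and local preference because higher values are preferred.
    return (-path.get("weight", 0), -local_pref, source_rank)


paths = [
    {"name": "A", "weight": 0, "local_pref": 200},
    {"name": "B", "weight": 0},                      # local preference defaults to 100
    {"name": "C", "weight": 100, "local_pref": 50},  # highest weight wins outright
]
best = min(paths, key=early_key)
print(best["name"])
```

Path C is chosen despite its low local preference, because weight is compared first.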

If the routes were not originated by the local router, prefer the route with the shortest AS path. An AS_SET counts as 1, regardless of the number of ASes in the set; AS_CONFED_SEQUENCE and AS_CONFED_SET do not count toward the AS path length. Note: This step is skipped if the bgp bestpath as-path ignore hidden BGP router subcommand is configured.
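
The AS path length rule can be sketched in Python as follows. Representing the AS_PATH as a list of (segment type, ASN list) tuples is an assumption made for this example.

```python
def as_path_length(segments):
    """Compute the AS path length used for best path comparison.

    segments is a list of (segment_type, [asns]) tuples.
    """
    length = 0
    for seg_type, asns in segments:
        if seg_type == "AS_SEQUENCE":
            length += len(asns)  # every ASN in a sequence counts
        elif seg_type == "AS_SET":
            length += 1          # a set counts as 1 regardless of its size
        # AS_CONFED_SEQUENCE and AS_CONFED_SET contribute nothing
    return length


path = [
    ("AS_CONFED_SEQUENCE", [65001, 65002]),
    ("AS_SEQUENCE", [64501, 64502]),
    ("AS_SET", [64509, 64510, 64511]),
]
print(as_path_length(path))  # 2 for the sequence + 1 for the set = 3
```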

If the routes have the same AS path length, prefer the route with the lowest ORIGIN code (IGP is lower than EGP, and EGP is lower than INCOMPLETE).

If the routes have the same ORIGIN code, prefer the route with the lowest MED.
> The MED comparison occurs only if the routes were received from the same AS, unless the bgp always-compare-med BGP router subcommand is enabled.
> Any confederation sub-ASes are ignored – MEDs are compared only if the first segment in the AS_SEQUENCE is the same for multiple routes; any preceding AS_CONFED_SEQUENCE is ignored.
> If the bgp bestpath med confed BGP router subcommand is enabled, MEDs are compared for all paths that consist only of AS_CONFED_SEQUENCE – paths that are originated within the local confederation.
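
The conditions under which two MEDs are compared can be sketched in Python. The function names and the (segment type, ASN list) representation are hypothetical; the sketch follows only the three rules listed above.

```python
CONFED_TYPES = ("AS_CONFED_SEQUENCE", "AS_CONFED_SET")


def first_external_as(segments):
    """Return the first ASN of the AS_SEQUENCE, ignoring preceding confed segments."""
    for seg_type, asns in segments:
        if seg_type in CONFED_TYPES:
            continue
        if seg_type == "AS_SEQUENCE" and asns:
            return asns[0]
        return None
    return None


def confed_only(segments):
    """True if the path consists only of AS_CONFED_SEQUENCE segments."""
    return bool(segments) and all(t == "AS_CONFED_SEQUENCE" for t, _ in segments)


def meds_comparable(a, b, always_compare_med=False, med_confed=False):
    """Decide whether the MEDs of paths a and b should be compared."""
    if always_compare_med:
        return True
    if med_confed and confed_only(a) and confed_only(b):
        return True
    first_a, first_b = first_external_as(a), first_external_as(b)
    return first_a is not None and first_a == first_b


a = [("AS_CONFED_SEQUENCE", [65001]), ("AS_SEQUENCE", [64501, 64502])]
b = [("AS_SEQUENCE", [64501, 64510])]
c = [("AS_SEQUENCE", [64503])]
print(meds_comparable(a, b))  # same neighboring AS 64501
print(meds_comparable(a, c))  # different neighboring AS
```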

If the routes have the same MED value, prefer EBGP paths over IBGP paths. Note: Paths that contain AS_CONFED_SEQUENCE and AS_CONFED_SET are local to the confederation, and therefore are treated as internal (IBGP) paths. Additionally, there is no difference between Confederation External and Confederation Internal routes. Reference: RFC 5065 – Autonomous System Confederations for BGP – Section 5.3 – 4.

If there are only IBGP neighbors and no EBGP neighbor, and synchronization is disabled, prefer the path through the closest IBGP neighbor, that is, the path with the lowest IGP metric to the BGP next-hop address for the route (the shortest path within the AS).
Continue, even if the best path is already selected.

The BGP Cost Extended Community attribute is an optional non-transitive attribute that is distributed to IBGP and confederation peers but not to EBGP peers. It provides a way to customize the local route preference and influence the best path selection process. This additional step, which compares BGP Cost Communities, was added to the best path selection algorithm after the RFC was ratified. The path with the lowest cost value is preferred.
> This step is skipped if the bgp bestpath cost-community ignore BGP router subcommand (available in Cisco IOS Release 12.3(2)T and later) is configured.
> The set extcommunity route map clause configures Cost Community with a Cost Community ID number (0 to 255) and Cost value (0 to 4294967295). The Cost value determines the preference for a path. The path with the lowest Cost value is preferred. Paths that are not specifically configured with the Cost value are assigned the default Cost value of 2147483647, which is the midpoint between 0 and 4294967295. These paths are then evaluated accordingly by the best path selection process. If two paths have the same Cost value, the path with the lower Cost Community ID is preferred.
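
A minimal Python sketch of the Cost Community comparison follows. The field names are hypothetical, and the example assumes every path carries a Cost Community ID; how a path with no Cost Community at all is ranked on a tie is not covered here.

```python
DEFAULT_COST = 2147483647  # midpoint between 0 and 4294967295


def cost_key(path):
    """Sort key: lowest Cost value first, then lowest Cost Community ID."""
    cost = path.get("cost")
    if cost is None:  # a Cost value of 0 is valid, so test for None explicitly
        cost = DEFAULT_COST
    return (cost, path["community_id"])


paths = [
    {"name": "A", "community_id": 1, "cost": 300},
    {"name": "B", "community_id": 2},               # falls back to the default Cost value
    {"name": "C", "community_id": 0, "cost": 300},  # ties with A on cost, lower ID
]
best = min(paths, key=cost_key)
print(best["name"])
```

Paths A and C tie on the Cost value, so C is preferred for its lower Cost Community ID.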

BGP chooses only a single best path for each destination. If the maximum-paths {num} BGP router subcommand is configured, and there are multiple external or confederation-external parallel paths through the same neighboring AS or sub-AS, BGP inserts the most recently received parallel paths into the IP routing table for EBGP multipath load sharing, up to the configured number (maximum of 6).
The default value when this command is not configured is 1 – no EBGP multipath load sharing.
Continue, if the best path is not yet selected.

When the routes are learned via EBGP, select the route that was received first (the oldest), as it is considered more stable and able to minimize the effect of route flapping. A newer path does not displace an older path, even if it is the preferred path based on the additional decision criteria below. It is recommended to apply the additional decision criteria below to IBGP paths only, in order to ensure a consistent best path selection within an AS and thereby avoid routing loops. Note: This step is skipped if any of the following is true:
> The bgp bestpath compare-routerid BGP router subcommand is enabled, in which case the route received from the EBGP peer with the lowest BGP Router ID is selected as the best path for identical EBGP paths.
> The BGP Router ID is the same for the routes – received from the same router.
> There is no current best path. The current best path can be lost when the neighbor offering the path goes down.
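
The oldest-path rule and its three exceptions can be sketched in Python. The function name and dictionary fields are hypothetical; returning None here simply means that the decision falls through to the later tie-breakers.

```python
def apply_oldest_rule(current_best, challenger, compare_routerid=False):
    """Return the path kept by the oldest-EBGP-path rule, or None if the rule is skipped."""
    if compare_routerid:
        return None  # bgp bestpath compare-routerid is enabled
    if current_best is None:
        return None  # there is no current best path
    if current_best["router_id"] == challenger["router_id"]:
        return None  # both paths were received from the same router
    return current_best  # the existing (older) path is not displaced


best = {"peer": "A", "router_id": "10.0.0.2"}
new = {"peer": "B", "router_id": "10.0.0.1"}
print(apply_oldest_rule(best, new))
print(apply_oldest_rule(best, new, compare_routerid=True))
```

In the first call the older path from peer A is kept even though peer B has the lower Router ID; in the second call the rule is skipped.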

Prefer the path received from the IBGP peer with the lowest BGP Router ID. Note: If a path contains route reflection attributes, the ORIGINATOR_ID is used instead of the BGP Router ID.

If the ORIGINATOR_ID is the same for both paths, prefer the path with the shortest route reflection CLUSTER_LIST. The length of the cluster list for routes without the CLUSTER_LIST attribute is 0.

If the BGP Router ID is the same for both paths, prefer the path received from the BGP peer with the lowest neighbor address – the IP address of the remote peer as configured in the neighbor BGP router subcommand.
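
The last three tie-breakers can be combined into a single sort key, as in this Python sketch. The field names are hypothetical. Addresses and Router IDs are converted to integers so that, for example, 10.0.0.9 sorts below 10.0.0.10, which a plain string comparison would get wrong.

```python
import ipaddress


def final_key(path):
    """Sort key for Router ID, CLUSTER_LIST length, and neighbor address; smallest wins."""
    # ORIGINATOR_ID replaces the Router ID when route reflection attributes are present.
    router_id = path.get("originator_id") or path["router_id"]
    return (
        int(ipaddress.IPv4Address(router_id)),
        len(path.get("cluster_list", [])),  # no CLUSTER_LIST means length 0
        int(ipaddress.ip_address(path["neighbor"])),
    )


p1 = {"router_id": "10.0.0.9", "originator_id": "10.0.0.1",
      "cluster_list": ["1.1.1.1", "2.2.2.2"], "neighbor": "192.0.2.2"}
p2 = {"router_id": "10.0.0.5", "originator_id": "10.0.0.1",
      "cluster_list": ["1.1.1.1"], "neighbor": "192.0.2.9"}
p3 = {"router_id": "10.0.0.2", "neighbor": "192.0.2.1"}
best = min([p1, p2, p3], key=final_key)
print(best["neighbor"])
```

Here p1 and p2 share the ORIGINATOR_ID 10.0.0.1, which is lower than the Router ID of p3, and p2 wins between them because its CLUSTER_LIST is shorter.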