DMVPN-Tunnel Health Monitoring and Recovery Backup NHS

The DMVPN-Tunnel Health Monitoring and Recovery (Backup NHS) feature allows you to control the number of connections to the Dynamic Multipoint Virtual Private Network (DMVPN) hub and allows you to switch to alternate hubs in case of a connection failure to the primary hubs.

The recovery mechanism provided by the DMVPN-Tunnel Health Monitoring and Recovery (Backup NHS) feature allows spokes to recover from a failed spoke-to-hub tunnel path by replacing the tunnel with another active spoke-to-hub tunnel. Spokes can select the next hop server (NHS) [hub] from a list of NHSs configured on the spoke. You can assign priority values to the NHSs to control the order in which spokes select the NHS.

Finding Feature Information

Your software release may not support all the features documented in this module. For the latest caveats and feature information, see Bug Search Tool and the release notes for your platform and software release. To find information about the features documented in this module, and to see a list of the releases in which each feature is supported, see the feature information table at the end of this module.

Use Cisco Feature Navigator to find information about platform support and Cisco software image support. To access Cisco Feature Navigator, go to www.cisco.com/go/cfn. An account on Cisco.com is not required.

NHS States

An NHS attains different states while associating with a spoke to form a spoke-to-hub tunnel. The table below describes the NHS states.

Table 1 NHS States

State   Description
DOWN    NHS is waiting to get scheduled.
PROBE   NHS is declared as “DOWN” but it is still actively probed by the spoke to bring it “UP”.
UP      NHS is associated with a spoke to establish a tunnel.

NHS Priorities

NHS priority is a numerical value assigned to a hub that controls the order in which spokes select hubs to establish a spoke-to-hub tunnel. The priority value ranges from 0 to 255, where 0 is the highest and 255 is the lowest priority.

You can assign hub priorities in the following ways:

Assign a unique priority to each NHS.

Assign the same priority level to a group of NHSs.

Leave the priority unspecified (value 0) for an NHS, a group of NHSs, or all NHSs.
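The ordering rule above can be modelled in a few lines. The following is a minimal Python sketch, not Cisco's implementation: the `Nhs` type, the `selection_order` function, and the hub names are hypothetical, and the tie-break for NHSs that share a priority (configured order is kept) is an assumption.

```python
from typing import NamedTuple


class Nhs(NamedTuple):
    name: str
    priority: int  # 0..255; a lower value means a higher priority


def selection_order(nhss):
    """Return NHS names in the order a spoke would prefer them."""
    for n in nhss:
        if not 0 <= n.priority <= 255:
            raise ValueError(f"priority out of range for {n.name}: {n.priority}")
    # sorted() is stable, so NHSs that share a priority keep their
    # configured order (an assumption made for this sketch).
    return [n.name for n in sorted(nhss, key=lambda n: n.priority)]


# Hypothetical hubs: one unspecified (0), two sharing priority 1, one at 2.
hubs = [Nhs("hub-c", 2), Nhs("hub-a", 0), Nhs("hub-b", 1), Nhs("hub-d", 1)]
print(selection_order(hubs))  # ['hub-a', 'hub-b', 'hub-d', 'hub-c']
```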

NHS Clusterless Model

The NHS clusterless model is one in which you assign priority values to the NHSs but do not place the NHSs into any group. The clusterless model places all NHSs in a default group and maintains redundant connections based on the configured maximum NHS connections. The maximum NHS connections value is the number of NHS connections in a cluster that must be active at any point in time. The valid range for maximum NHS connections is from 0 to 255.

Priority values are assigned to the hubs to control the order in which the spokes select hubs to establish the spoke-to-hub tunnel. However, assigning these priorities in a clusterless model has certain limitations.

The table below provides an example of limitations for assigning priorities in a clusterless model.

Table 2 Limitations of Clusterless Model

Maximum Number of Connections = 3

NHS      NHS Priority   Scenario 1   Scenario 2
NHS A1   1              UP           UP
NHS B1   1              UP           PROBE
NHS C1   1              UP           UP
NHS A2   2              DOWN         UP
NHS B2   2              DOWN         DOWN
NHS C2   2              DOWN         DOWN

Consider a scenario with three data centers A, B, and C. Each data center consists of two NHSs: NHS A1 and NHS A2 comprise one data center, NHS B1 and NHS B2 another, and NHS C1 and NHS C2 another.

Although two NHSs are available for each data center, the spoke is connected to only one NHS of each data center at any point in time. Hence, the maximum connection value is set to 3. That is, three spoke-to-hub tunnels are established. If any one NHS, for example, NHS B1, becomes inactive, the spoke-to-hub tunnel associated with NHS B1 goes down. Based on the priority model, NHS A2 has the next priority value and is the next available NHS in the queue, so it forms the spoke-to-hub tunnel and goes up. However, this does not meet the requirement that a hub from data center B be associated with the spoke to form a tunnel. Hence, no connection is made to data center B.

This problem can be addressed by placing NHSs into different groups. Each group can be configured with a group-specific maximum connection value. NHSs that are not assigned to any group belong to the default group.
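The limitation described above can be reproduced with a small model. This is a hedged Python sketch under stated assumptions: the `pick_active` function and the `reachable` set are hypothetical, and NHSs with equal priority are assumed to be tried in configured order.

```python
def pick_active(nhss, reachable, max_connections):
    """Choose up to max_connections reachable NHSs from one group.

    nhss: (name, priority) pairs in configured order.
    reachable: set of NHS names that currently answer the spoke.
    """
    ordered = sorted(nhss, key=lambda n: n[1])  # stable: ties keep configured order
    chosen = [name for name, _ in ordered if name in reachable]
    return chosen[:max_connections]


# All six NHSs sit in the default group, as in Table 2.
default_group = [("A1", 1), ("B1", 1), ("C1", 1), ("A2", 2), ("B2", 2), ("C2", 2)]
everything = {name for name, _ in default_group}

scenario1 = pick_active(default_group, everything, 3)
scenario2 = pick_active(default_group, everything - {"B1"}, 3)

print(scenario1)  # ['A1', 'B1', 'C1']
print(scenario2)  # ['A1', 'C1', 'A2']
# The first letter of each name is the data center; B is missing in scenario 2.
print(sorted({name[0] for name in scenario2}))  # ['A', 'C']
```

With a single group, nothing ties a connection slot to a data center, so the slot freed by NHS B1 goes to NHS A2.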

NHS Clusters

The table below presents an example of cluster functionality. NHSs corresponding to different data centers are grouped to form clusters. NHS A1 and NHS A2 with priority 1 and 2, respectively, are grouped as cluster1, NHS B1 and NHS B2 with priority 1 and 2, respectively, are grouped as cluster2, and NHS C1 and NHS C2 with priority 1 and 2, respectively, are grouped as cluster3. NHS 7, NHS 8, and NHS 9 are part of the default cluster. The maximum number of connections is set to 1 for cluster1, cluster2, and cluster3, and to 2 for the default cluster, so that at least one spoke-to-hub tunnel is continuously established with each of the four clusters.

In scenario 1, NHS A1, NHS B1, and NHS C1 with the highest priority in each cluster are in the UP state. In scenario 2, the connection between the spoke and NHS A1 breaks, and a connection is established between the spoke and NHS A2 (hub from the same cluster). NHS A1 with the highest priority attains the PROBE state. In this way, at any point in time a connection is established to all three data centers.

Table 3 Cluster Functionality

NHS      NHS Priority   Cluster   Maximum Number of Connections   Scenario 1   Scenario 2
NHS A1   1              1         1                               UP           PROBE
NHS A2   2              1         1                               DOWN         UP
NHS B1   1              2         1                               UP           UP
NHS B2   2              2         1                               DOWN         DOWN
NHS C1   1              3         1                               UP           UP
NHS C2   2              3         1                               DOWN         DOWN
NHS 7    1              Default   2                               UP           DOWN
NHS 8    2              Default   2                               UP           UP
NHS 9    0              Default   2                               PROBE        UP
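The per-cluster selection in Table 3 can be sketched as follows. This is an illustrative Python model, not the router's algorithm: the `pick_per_cluster` function is hypothetical, only the three data center clusters are modelled (the default cluster is left out), and equal priorities are assumed to be resolved by configured order.

```python
from collections import defaultdict


def pick_per_cluster(nhss, reachable, max_connections):
    """Choose the active NHSs independently in every cluster.

    nhss: (name, priority, cluster) triples in configured order.
    reachable: set of NHS names that currently answer the spoke.
    max_connections: dict mapping cluster -> connection limit.
    """
    by_cluster = defaultdict(list)
    for name, priority, cluster in nhss:
        by_cluster[cluster].append((priority, name))
    active = {}
    for cluster, members in by_cluster.items():
        members.sort(key=lambda m: m[0])  # stable: ties keep configured order
        up = [name for _, name in members if name in reachable]
        active[cluster] = up[: max_connections[cluster]]
    return active


nhss = [("A1", 1, 1), ("A2", 2, 1), ("B1", 1, 2), ("B2", 2, 2), ("C1", 1, 3), ("C2", 2, 3)]
limits = {1: 1, 2: 1, 3: 1}
everything = {name for name, _, _ in nhss}

print(pick_per_cluster(nhss, everything, limits))           # {1: ['A1'], 2: ['B1'], 3: ['C1']}
print(pick_per_cluster(nhss, everything - {"A1"}, limits))  # {1: ['A2'], 2: ['B1'], 3: ['C1']}
```

Because each cluster has its own limit, the loss of NHS A1 is repaired from cluster 1 and every data center keeps one tunnel.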

NHS Fallback Time

Fallback time is the time that the spoke waits for the NHS to become active before detaching itself from an NHS with a lower priority and connecting to the NHS with the highest priority to form a spoke-to-hub tunnel. Fallback time helps in avoiding excessive flaps.

The table below shows how the spoke flaps excessively from one NHS to another when the fallback time is not configured on the spoke. Five NHSs having different priorities are available to connect to the spoke to form a spoke-to-hub tunnel. All these NHSs belong to the default cluster. The maximum number of connections is one.

Table 4 NHS Behavior when Fallback Time is not Configured

NHS     NHS Priority   Cluster   Scenario 1   Scenario 2   Scenario 3   Scenario 4   Scenario 5
NHS 1   1              Default   PROBE        PROBE        PROBE        PROBE        UP
NHS 2   2              Default   PROBE        PROBE        PROBE        UP           DOWN
NHS 3   3              Default   PROBE        PROBE        UP           DOWN         DOWN
NHS 4   4              Default   PROBE        UP           DOWN         DOWN         DOWN
NHS 5   5              Default   UP           DOWN         DOWN         DOWN         DOWN

In scenario 1, NHS 5 with the lowest priority value is connected to the spoke to form a tunnel. All the other NHSs having higher priorities than NHS 5 are in the PROBE state.

In scenario 2, when NHS 4 becomes active, the spoke breaks connection with the existing tunnel and establishes a new connection with NHS 4. In scenario 3 and scenario 4, the spoke breaks the existing connections as soon as an NHS with a higher priority becomes active and establishes a new tunnel. In scenario 5, as the NHS with the highest priority (NHS 1) becomes active, the spoke connects to it to form a tunnel and continues with it until the NHS becomes inactive. Because NHS 1 has the highest priority, no other NHS is in the PROBE state.

The table below shows how to avoid the excessive flapping by configuring the fallback time. The maximum number of connections is one. A fallback time period of 30 seconds is configured on the spoke. In scenario 2, when an NHS with a higher priority than the NHS associated with the spoke becomes active, the spoke does not break the existing tunnel connection until the fallback time elapses. Hence, although NHS 4 becomes active, it does not form a tunnel and attain the UP state. NHS 4 remains active but does not form a tunnel until the fallback time elapses. Once the fallback time elapses, the spoke connects to the NHS having the highest priority among the active NHSs.

This way, the flaps that occur as soon as an NHS of higher priority becomes active are avoided.

Table 5 NHS Behavior when Fallback Time is Configured

NHS     NHS Priority   Cluster   Scenario 1   Scenario 2   Scenario 3   Scenario 4   Scenario 5
NHS 1   1              Default   PROBE        PROBE        PROBE        UP-hold      UP
NHS 2   2              Default   PROBE        PROBE        UP-hold      UP-hold      DOWN
NHS 3   3              Default   PROBE        UP-hold      UP-hold      UP-hold      DOWN
NHS 4   4              Default   UP-hold      UP-hold      UP-hold      UP-hold      DOWN
NHS 5   5              Default   UP           UP           UP           UP           DOWN
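The difference between Table 4 and Table 5 can be illustrated by counting tunnel switches. The following Python sketch is a simplified model under stated assumptions: the `simulate` function and the event times are hypothetical, and the fallback timer is assumed to start when the first higher-priority NHS becomes active and not to restart when further NHSs come up.

```python
def simulate(events, priorities, start, fallback):
    """Return the (time, nhs) tunnel switches that the spoke makes.

    events: (time_in_seconds, nhs_name) pairs, one per NHS that becomes active.
    priorities: dict mapping nhs_name -> priority (lower value is preferred).
    start: the NHS the spoke is connected to at time 0.
    fallback: fallback time in seconds; 0 models "not configured".
    """
    current, active = start, {start}
    switches, deadline = [], None

    def switch(at):
        # Move to the best active NHS and clear the fallback timer.
        nonlocal current, deadline
        best = min(active, key=lambda n: priorities[n])
        if best != current:
            current = best
            switches.append((at, best))
        deadline = None

    for t, name in sorted(events):
        if deadline is not None and deadline <= t:
            switch(deadline)  # the timer ran out before this event
        active.add(name)
        if deadline is None and priorities[name] < priorities[current]:
            deadline = t + fallback  # assumption: the timer is started once
    if deadline is not None:
        switch(deadline)  # let the last timer run out
    return switches


priorities = {"NHS 1": 1, "NHS 2": 2, "NHS 3": 3, "NHS 4": 4, "NHS 5": 5}
# Hypothetical recovery times, 5 seconds apart, lowest priority first.
events = [(5, "NHS 4"), (10, "NHS 3"), (15, "NHS 2"), (20, "NHS 1")]

print(simulate(events, priorities, "NHS 5", fallback=0))
# [(5, 'NHS 4'), (10, 'NHS 3'), (15, 'NHS 2'), (20, 'NHS 1')]
print(simulate(events, priorities, "NHS 5", fallback=30))
# [(35, 'NHS 1')]
```

In this model, the 30-second fallback time turns four tunnel changes into one, which is the behavior that Table 5 describes.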

NHS Recovery Process

NHS recovery is a process of establishing an alternative spoke-to-hub tunnel when the existing tunnel becomes inactive, and connecting to the preferred hub upon recovery.

Alternative Spoke to Hub NHS Tunnel

When a spoke-to-hub tunnel fails, it must be backed up with a new spoke-to-hub tunnel. The new NHS is picked from the same cluster to which the failed hub belonged. This ensures that the required number of spoke-to-hub tunnels is always present even when one or more tunnel paths are unavailable.

The table below presents an example of NHS backup functionality.

Table 6 NHS Backup Functionality

NHS      NHS Priority   Cluster   Maximum Number of Connections   Scenario 1   Scenario 2   Scenario 3
NHS A1   1              1         1                               UP           PROBE        PROBE
NHS A2   2              1         1                               DOWN         UP           DOWN
NHS A3   2              1         1                               DOWN         DOWN         UP
NHS A4   2              1         1                               DOWN         DOWN         DOWN
NHS B1   1              3         1                               UP           PROBE        PROBE
NHS B2   2              3         1                               DOWN         UP           DOWN
NHS B3   2              3         1                               DOWN         DOWN         UP
NHS B4   2              3         1                               DOWN         DOWN         DOWN
NHS 9    Default        Default   1                               UP           UP           DOWN
NHS 10   Default        Default   1                               DOWN         DOWN         UP

Four NHSs belonging to cluster 1 and cluster 3 and two NHSs belonging to the default cluster are available for setting up spoke-to-hub tunnels. All NHSs have different priorities. The maximum number of connections is set to 1 for all three clusters. That is, at any point in time, at least one NHS from each cluster must be connected to the spoke to form a tunnel.

In scenario 1, NHS A1 from cluster 1, NHS B1 from cluster 3, and NHS 9 from the default cluster are UP. They establish a contact with the spoke to form different spoke-to-hub tunnels. In scenario 2, NHS A1 and NHS B1 with the highest priority in their respective clusters become inactive. Hence a tunnel is established from the spoke to NHS A2 and NHS B2, which have the next highest priority values. However, the spoke continues to probe NHS A1 and NHS B1 because they have the highest priority. Hence, NHS A1 and NHS B1 remain in the PROBE state.

In scenario 3, NHS A2, NHS B2, and NHS 9 become inactive. The spoke checks if the NHSs in PROBE state have turned active. If yes, then the spoke establishes a connection to the NHS that has turned active. However, as shown in scenario 3, because none of the NHSs in the PROBE state is active, the spoke connects to NHS A3 of cluster 1 and NHS B3 of cluster 3. NHS A1 and NHS B1 continue to be in the PROBE state until they associate themselves with the spoke to form a tunnel and attain the UP state.

Returning to Preferred NHS Tunnel upon Recovery

When a spoke-to-hub tunnel fails, a backup tunnel is established using an NHS having the next higher priority value. Even though the tunnel is established with an NHS of lower priority, the spoke continuously probes the NHS having the highest priority value. Once the NHS having the highest priority value becomes active, the spoke establishes a tunnel with the NHS and hence the NHS attains the UP state.

The table below presents NHS recovery functionality. Four NHSs belonging to cluster 1 and cluster 3 and two NHSs belonging to the default cluster are available for setting up spoke-to-hub tunnels. All NHSs have different priorities. The maximum connection value is set to 1. In scenario 1, NHS A4, NHS B4, and NHS 10 with the lowest priority in their respective clusters associate with the spoke in establishing a tunnel. The spoke continues to probe NHSs of higher priority to establish a connection with the NHS having the highest priority value. Hence, in scenario 1, NHSs having the highest priority value in their respective clusters are in the PROBE state. In scenario 2, NHS A1 becomes active, forms a tunnel with the spoke, and attains the UP state. Because NHS A1 has the highest priority, the spoke does not probe any other NHS in the cluster. Hence, all the other NHSs in cluster 1 are in the DOWN state.

When the connection with NHS B4 breaks, the spoke connects to NHS B3, which has the next higher priority value, because NHS B1 of cluster 3 is not active. In scenario 3, NHS A1 continues to be in the UP state and NHS B1 with the highest priority in cluster 3 becomes active, forms a tunnel, and attains the UP state. Hence, no other NHSs in cluster 3 are in the PROBE state. However, because NHS 10 having the lowest priority value in the default cluster is in the UP state, the spoke continues to probe NHS 9 having the highest priority in the cluster.

In scenario 4, NHS A1 and NHS B1 continue to be in the UP state and NHS 9 having the highest priority in the default cluster attains the UP state. Hence, because the spoke is associated with the NHSs having the highest priority in all the clusters, none of the NHSs are in the PROBE state.

Table 7 NHS Recovery Functionality

NHS      NHS Priority   Cluster   Maximum Number of Connections   Scenario 1   Scenario 2   Scenario 3   Scenario 4
NHS A1   1              1         1                               PROBE        UP           UP           UP
NHS A2   2              1         1                               DOWN         DOWN         DOWN         DOWN
NHS A3   2              1         1                               DOWN         DOWN         DOWN         DOWN
NHS A4   2              1         1                               UP           DOWN         DOWN         DOWN
NHS B1   1              3         1                               PROBE        PROBE        UP           UP
NHS B2   10             3         1                               PROBE        DOWN         DOWN         DOWN
NHS B3   10             3         1                               PROBE        UP           DOWN         DOWN
NHS B4   30             3         1                               UP           DOWN         DOWN         DOWN
NHS 9    Default        Default   1                               PROBE        PROBE        PROBE        UP
NHS 10   100            Default   1                               UP           UP           UP           DOWN
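The state pattern in Table 7 can be summarized by one rule per cluster, sketched here in Python. The rule is inferred from the tables and is not a documented algorithm: with a maximum of one connection, the first reachable NHS in priority order is UP, every NHS with a strictly higher priority than the UP one is PROBE, and the rest are DOWN. The `cluster_states` function and the `reachable` sets are hypothetical, and marking every NHS as PROBE when none is reachable is an assumption.

```python
def cluster_states(members, reachable):
    """Derive NHS states for one cluster with a maximum of one connection.

    members: (name, priority) pairs in configured order.
    reachable: set of NHS names that currently answer the spoke.
    """
    ordered = sorted(members, key=lambda m: m[1])  # stable: ties keep configured order
    up = next((m for m in ordered if m[0] in reachable), None)
    states = {}
    for name, priority in members:
        if up is not None and name == up[0]:
            states[name] = "UP"
        elif up is None or priority < up[1]:
            # Preferred over the current tunnel, so the spoke keeps probing it.
            # With no tunnel at all, every NHS is probed (an assumption).
            states[name] = "PROBE"
        else:
            states[name] = "DOWN"
    return states


# Cluster 3 of Table 7; the reachable sets are read off the scenarios.
cluster3 = [("B1", 1), ("B2", 10), ("B3", 10), ("B4", 30)]

print(cluster_states(cluster3, {"B4"}))        # scenario 1
# {'B1': 'PROBE', 'B2': 'PROBE', 'B3': 'PROBE', 'B4': 'UP'}
print(cluster_states(cluster3, {"B3"}))        # scenario 2
# {'B1': 'PROBE', 'B2': 'DOWN', 'B3': 'UP', 'B4': 'DOWN'}
print(cluster_states(cluster3, {"B1", "B3"}))  # scenario 3
# {'B1': 'UP', 'B2': 'DOWN', 'B3': 'DOWN', 'B4': 'DOWN'}
```

This sketch ignores the fallback time, so the return to the preferred NHS is immediate.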

How to Configure DMVPN-Tunnel Health Monitoring and Recovery Backup NHS

RFCs

RFC                                                      Title
No new or modified RFCs are supported by this feature.   --

Technical Assistance

Description

Link

The Cisco Support and Documentation website provides online resources to download documentation, software, and tools. Use these resources to install and configure the software and to troubleshoot and resolve technical issues with Cisco products and technologies. Access to most tools on the Cisco Support and Documentation website requires a Cisco.com user ID and password.

The following table provides release information about the feature or features described in this module. This table lists only the software release that introduced support for a given feature in a given software release train. Unless noted otherwise, subsequent releases of that software release train also support that feature.

The DMVPN-Tunnel Health Monitoring and Recovery (Backup NHS) feature allows you to control the number of connections to the DMVPN hub and allows you to switch to alternate hubs in case of connection failure to primary hubs.

The following commands were introduced or modified: ip nhrp nhs, ipv6 nhrp nhs, show ip nhrp nhs, show ipv6 nhrp nhs.