Limiting Failovers Caused by Route Monitors in non-INC mode

Sep 01, 2016

In an HA configuration in non-INC mode, if route monitors fail on both nodes, failover happens every 180 seconds until one of the nodes is able to reach all of the routes monitored by the respective route monitors.

However, you can limit the number of failovers in a given interval by setting the Maximum Number of Flips and Maximum Flip Time parameters on each node. When either limit is reached, no more failovers occur, and the node is assigned as primary, with its node state shown as NOT UP, even if a route monitor fails on that node. This combination of HA state Primary and node state NOT UP is called the Sticky Primary state.

If the node is then able to reach all of the monitored routes, the next route monitor failure resets the Maximum Number of Flips and Maximum Flip Time parameters on the node and restarts the interval specified by the Maximum Flip Time parameter.

These parameters are set independently on each node and therefore are neither propagated nor synchronized.

Parameters for limiting the number of failovers

Maximum Number of Flips (maxFlips)

Maximum number of failovers allowed within the Maximum Flip Time interval for a node in an HA configuration in non-INC mode, if the failovers are caused by route monitor failure.

Maximum Flip Time (maxFlipTime)

Amount of time, in seconds, during which failovers resulting from route monitor failure are allowed for a node in an HA configuration in non-INC mode.
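The interaction between the two parameters can be illustrated with a small model. The following Python sketch is only an illustration, not NetScaler code: the class name FlipLimiter, its method, and the event times are hypothetical, and it assumes that a route monitor failure seen after the Maximum Flip Time interval has elapsed resets the flip counter and starts a new interval, as the FAQ answers in this article describe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlipLimiter:
    """Toy model of one node's route-monitor failover limit (hypothetical)."""
    max_flips: int        # Maximum Number of Flips (maxFlips)
    max_flip_time: int    # Maximum Flip Time (maxFlipTime), in seconds
    flips: int = 0        # failovers counted in the current interval
    window_start: Optional[float] = None

    def on_route_monitor_failure(self, now: float) -> str:
        """Return 'failover' or 'sticky-primary' for a failure at time `now`."""
        window_over = (
            self.window_start is None
            or now - self.window_start >= self.max_flip_time
        )
        if window_over:
            # Assumption: a failure after the interval has elapsed resets the
            # counter and starts a new Maximum Flip Time interval.
            self.flips = 0
            self.window_start = now
        if self.flips < self.max_flips:
            self.flips += 1
            return "failover"
        # Limit reached inside the interval: no further failover is allowed.
        return "sticky-primary"


# Values from the example in this article: 2 flips, 200 seconds.
limiter = FlipLimiter(max_flips=2, max_flip_time=200)
events = [limiter.on_route_monitor_failure(t) for t in (0, 180, 190, 400)]
print(events)
# ['failover', 'failover', 'sticky-primary', 'failover']
```

In this model, the third failure falls inside the 200-second interval after two flips have already been counted, so the node stays primary. The failure at 400 seconds arrives after the interval has elapsed, so the counter is reset and failovers resume.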

SNMP Alarm for Sticky Primary State

Enable the HA-STICKY-PRIMARY SNMP alarm on a node of a high availability setup if you want to be alerted when the node becomes sticky primary. When the node becomes sticky primary, it generates a trap message (stickyPrimary (1.3.6.1.4.1.5951.1.1.0.138)) and sends it to all the configured SNMP trap destinations. For more information about configuring SNMP alarms and trap destinations, see Configuring the NetScaler to Generate SNMPv1 and SNMPv2 Traps.

Frequently Asked Questions

Consider an example of a high availability setup of two NetScaler appliances, NS-1 and NS-2, in non-INC mode. The maximum number of flips and the maximum flip time are set to the same values on both nodes.

The following table lists the settings used in this example:

Entity                        Detail
IP address of NS-1            10.102.173.211
IP address of NS-2            10.102.173.212
Maximum number of flips       2
Maximum flip time (seconds)   200

The following table lists some FAQs and answers about maximum number of flips and maximum flip time settings:

Question

Answer

What should you do after one of the nodes becomes sticky primary?

Fix the routes that are being monitored.

After the maximum flip time has elapsed, any route monitor failure resets the maximum number of flips and the maximum flip time, and then restarts the interval specified by the maximum flip time.

The following example shows that NS-1 (10.102.173.211) becomes sticky primary.

What happens if a node recovers from sticky primary state before the maximum flip time has elapsed?

Nothing happens. Maximum number of flips and maximum flip time are not reset.

What happens if a node recovers from sticky primary state after the maximum flip time has elapsed?

Nothing happens. Maximum number of flips and maximum flip time are not reset.

What happens if a node recovers from sticky primary state and the monitored route then goes down again before the maximum flip time has elapsed?

The node becomes sticky primary again without a failover. Maximum number of flips and maximum flip time are not reset.

The following example shows that NS-1 (10.102.173.211) recovers from sticky primary state. NS-1 becomes sticky primary again when the monitored route goes down again before the maximum flip time has elapsed.

What happens when the maximum number of flips and the maximum flip time are unset?

After the maximum number of flips and the maximum flip time are unset, the setup falls back to the failover cycle of 180 seconds until the route monitor state becomes UP.

What happens when the maximum flip time has elapsed but the maximum number of flips has not been reached, and a route goes down?

The setup goes into a continuous flip cycle. If the maximum flip time elapses before the maximum number of flips is completed, both parameters are reset to their configured values. As a result, the flip cycle continues indefinitely. Configure the maximum flip time so that the maximum number of flips can be completed within it.
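The sizing advice in the last answer can be checked with simple arithmetic. The following Python sketch is a rough helper under stated assumptions, not a NetScaler tool: the function name flips_fit_in_window is hypothetical, and it assumes that the first counted failover happens at the start of the Maximum Flip Time interval and that each later failover follows 180 seconds after the previous one.

```python
FAILOVER_INTERVAL_S = 180  # failover period stated in this article


def flips_fit_in_window(max_flips: int, max_flip_time: int,
                        interval: int = FAILOVER_INTERVAL_S) -> bool:
    """Return True if `max_flips` failovers can complete in `max_flip_time`.

    Assumption: the first counted failover happens at the start of the
    interval, and each later one follows `interval` seconds after the
    previous one.
    """
    if max_flips < 1:
        raise ValueError("max_flips must be at least 1")
    time_of_last_flip = (max_flips - 1) * interval
    return time_of_last_flip <= max_flip_time


print(flips_fit_in_window(2, 200))  # example values from this article: True
print(flips_fit_in_window(3, 200))  # third flip would land at 360 s: False
```

Under these assumptions, the example values (2 flips in 200 seconds) allow the limit to be reached, whereas 3 flips in 200 seconds would let the interval expire first and leave the setup in the continuous flip cycle described above.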