Tips for implementing PowerHA in a virtual I/O environment

Chris Gibson
Published on February 16, 2010

In this article, I'll share a few of my tips for building a PowerHA™ cluster within a
virtual I/O (VIO) environment. I'll briefly describe an LPAR and VIO server (VIOS)
design and layout for a simple two-node PowerHA cluster. However, I won't go into specific
PowerHA configuration, as that topic is too large to cover in detail here. For
in-depth information, I'll refer you to the official IBM PowerHA documentation (see Related topics). This article also assumes that you have experience with
AIX®, VIO, and PowerHA.

Overview

The example environment covered by this article consists of two POWER6® 595
servers. Each 595 is configured with dual
VIO servers for redundancy, and a two-node cluster has been built across the two physical
frames, that is, one PowerHA node resides on each Power 595 server. The LPARs are running
AIX 5.3 TL7 SP5 with PowerHA 5.4.1.3. Every VIOS in the environment runs version
1.5.2.1-FP11.1. Figure 1 shows this configuration.

Figure 1. PowerHA cluster overview

In the following sections, I will briefly touch on the virtual network and virtual
(shared) storage configuration for the cluster nodes. In particular, I will highlight these areas:

PowerHA boot and service network and addresses

Shared Ethernet Adapter (SEA) configuration for the PowerHA network

Shared volume group considerations

Virtual network

The virtual network configuration is an important aspect of the PowerHA configuration.
Figure 2 shows how the VIOS network is configured on one 595 frame.
The VIOS network configuration is duplicated on the second frame.

Figure 2. VIOS network overview

As shown in Figure 2, there are PowerHA and non-HA LPARs as clients of the
same VIOS pair. You'll also notice multiple SEAs, that is, one per VLAN and usage type:
PUBLIC, BACKUP, and PowerHA. Each VLAN has a unique IP
range: PUBLIC 10.2.2, BACKUP 10.3.3 and PowerHA 10.1.1.
Each LPAR also has an interface on the 10.4.4 network, which is used for internal (private) communication between the LPARs over the POWER Hypervisor virtual network.

The HA nodes communicate with the outside world through VLAN40 (PVID40/41), which is
the PowerHA network. The non-HA LPARs communicate through VLAN10 (PVID10), over the PUBLIC network. There's also another SEA in each VIOS, on VLAN20, which is used as a dedicated VLAN for backups over the network, hence the network name BACKUP.

Shared Ethernet Adapter failover (SEA FO) is configured for both the PUBLIC and BACKUP
networks. There is no SEA FO for the PowerHA network. If the PowerHA network SEA on a VIOS fails, the service IP moves to the other boot adapter, which is served by the redundant VIOS.

There's no VLAN tagging in use for any of the SEAs. There's no need, as there is only a
handful of VLANs to deal with in this network. However, your requirements may differ.

When viewing the PowerHA cluster network, with the cltopinfo command, the Network
definitions on each node are as follows:

As you can see, the service and boot adapters are all in the same subnetted (segmented) IP
network, where b1v1 defines the first boot adapter (b1) associated with the first VIOS
(v1) and so on. The service address is the hostname without adm appended to it.

Typically, when configuring a SEA on a VIOS, you would deploy SEA failover to ensure
that network connectivity is protected in the event of a VIOS failure. However, in this
PowerHA environment, the approach is different. SEA FO is not used for the PowerHA network, so PowerHA is aware of, and controls, network failure and failover. There is one SEA for the PowerHA network in each VIOS. If a VIOS fails, the service address moves to the boot adapter served by the redundant VIOS.

The main driver for this approach is the way the PowerHA cluster communicates in a virtual network environment. If SEA FO were configured and a failure occurred, PowerHA would have no way of detecting the failure. Likewise, if all communication at the physical layer were lost, PowerHA would continue to think the network was OK, because it would still be able to route traffic across the virtual LAN on the Hypervisor.

This is why it is important to configure the netmon.cf file on all nodes in the cluster.

This file instructs HA on how to determine when it has lost connectivity with the network or its partner HA nodes. If this file is not configured appropriately, network failures could go undetected by PowerHA.

The netmon.cf file and VIO

There are two APARs that I recommend you review in relation to configuring the netmon.cf file in a VIO environment. You'll soon understand why this file is important and when it should be implemented.

APAR IZ01331 describes the scenarios of using VIO with PowerHA clusters and the
challenges faced in detecting network failures. For example, if an "entire CEC is unplugged from the network, the PowerHA node on that Frame does not detect a local adapter down event, because traffic being passed between the VIO clients (on the same frame) looks like normal external traffic from the perspective of the LPAR's OS."

To get around this problem, the netmon.cf file is used to allow customers to declare that a given adapter should only be considered up if it can ping a set of specified targets.

If the VIOS has multiple physical interfaces on the same network or if there are two or more PowerHA nodes using one or more VIOS in the same frame, PowerHA will not be informed of (and hence will not react to) individual physical interface failures.

In the extreme case where all physical interfaces managed by VIO Servers have failed,
the VIOS will continue to route traffic from one LPAR to another in the same frame,
the virtual Ethernet interface used by PowerHA will not be reported as having failed, and PowerHA will not react.

Each node in the cluster has a custom netmon.cf file that lists the IP addresses it
must be able to ping before it considers an interface up. For
example, aix01adm resides on Frame 1 (595-1) and aix02adm resides on Frame 2 (595-2). If
connectivity were lost on all physical interfaces on all VIOS on 595-1, aix01adm would continue functioning, because it could still route packets over the virtual network. For this node (and others) to detect the problem, you populate the netmon.cf file with addresses it should be able to reach on specific interfaces. If it can't reach them, those interfaces are marked as down and PowerHA can react accordingly.

APAR IZ01874 clarifies how to choose IP addresses for the netmon.cf file. This file
should contain remote IP addresses and host names that are not in the cluster
configuration and that can be reached through the PowerHA network interfaces. These addresses must be preceded by !REQD.

Some good choices for targets are name servers (DNS servers) and gateways (routers), or
reliable external IP addresses (such as NTP servers) that will respond to a ping. You can
use the following ping command to verify that a ping will be answered on a specific interface:

# ping -S <Boot IP address> <IP addr in netmon.cf>

Where <Boot IP address> is the IP address configured on the boot interface. For example,

The !REQD tag specifies that the adapter (aix01b1v1)
will only be considered up if it can ping the target (aix02b1v3). The aix01b1v1 entry specifies
which interface to use for the test, that is, aix01b1v1 resolves to 10.1.1.76, which
is the address on the en2 interface.

en2 will be used to connect to aix02b1v3, which is an interface on its partner node on
595-2. If it cannot communicate, the interface en2 (aix01b1v1) will be marked as down.
Do not include any nodes in this file that exist on the same frame. All entries should
be for systems that reside outside of the physical frame to ensure the detection of real, physical network failures to the outside world on the physical (not virtual) network.
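
To make the entry format concrete, here is a minimal sketch that reads a netmon.cf-style file and prints the ping -S check described earlier for each !REQD entry. Only the !REQD layout (adapter, then target) and the ping -S usage come from the discussion above; the helper name print_ping_checks, the sample file, and the -c 3 packet count are assumptions added for illustration.

```shell
#!/bin/sh
# Sketch: print one "ping -S" check per !REQD entry in a netmon.cf-style file.
# Assumed entry layout (taken from the text): !REQD <boot address or host name> <target>

print_ping_checks() {
    # $1 = path to the netmon.cf-style file.
    # Lines that do not start with !REQD are ignored.
    awk '$1 == "!REQD" && NF >= 3 { printf "ping -c 3 -S %s %s\n", $2, $3 }' "$1"
}

# Hypothetical sample file; the host names follow the article's naming scheme.
sample=${TMPDIR:-/tmp}/netmon.cf.sample.$$
cat > "$sample" <<'EOF'
!REQD aix01b1v1 aix02b1v3
EOF

# Prints the commands rather than running them, so they can be reviewed first.
print_ping_checks "$sample"
rm -f "$sample"
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; on a cluster node, you could review the output and then run each command by hand.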

Be careful not to specify an interface name in the netmon.cf file, such as:

!REQD en2 10.1.1.10

Including the interface name will not work in a VIO environment. The last time I
checked, there was a Design Change Request (DCR) open with the HA development team to
overcome this issue. Some customers have experienced a slow takeover due to the way
RSCT (netmon) determines whether the second field in netmon.cf is an IP address or host name, or the
name of an interface. In some cases, netmon will attempt to resolve the interface name
as a host name (for example, $ host en2), which will fail. IBM
development is working on a new algorithm to prevent interface names from being
treated as host names, especially for obvious formats such as enX. For now, it's best
to avoid using an interface name, such as enX, in the netmon.cf file.
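
One way to spot this problem is to scan the file for !REQD entries whose second field looks like an interface name. The following is a minimal sketch, assuming interface names match en followed by digits (as in the enX example); the helper name flag_interface_names and the sample file are hypothetical.

```shell
#!/bin/sh
# Sketch: flag !REQD entries whose second field looks like an interface name.
# The pattern en<digits> is an assumption based on the enX example in the text.

flag_interface_names() {
    # $1 = path to the netmon.cf-style file.
    # Prints "line N: <entry>" for each suspicious entry.
    awk '$1 == "!REQD" && $2 ~ /^en[0-9]+$/ { printf "line %d: %s\n", NR, $0 }' "$1"
}

# Hypothetical sample file: one host-name entry and one interface-name entry.
sample=${TMPDIR:-/tmp}/netmon.cf.check.$$
cat > "$sample" <<'EOF'
!REQD aix01b1v1 aix02b1v3
!REQD en2 10.1.1.10
EOF

flag_interface_names "$sample"
rm -f "$sample"
```

No output means no entry matched the assumed pattern; it does not prove the file is otherwise correct.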

Only use the netmon.cf method if it is appropriate in your VIO
environment. Using this method changes the definition of a so-called good adapter from "Am I able to receive any network traffic?" to "Can I successfully ping certain addresses (regardless of how much traffic I can see)?"

This can make it more likely for an adapter to be falsely considered down. If you must
use this new function, I recommend that you include as many targets as possible for each interface you need to monitor.

Virtual (shared) storage

The IBM technical documentation relating to PowerHA and Virtual SCSI (VSCSI) clearly
defines the supported storage configuration in a VIO environment. The shared volume
group (VG) must be defined as "Enhanced Concurrent Mode." In general, Enhanced
Concurrent Mode is the recommended mode for sharing volume groups in PowerHA clusters. In this mode, the shared volume groups are accessible by multiple PowerHA nodes, which results in faster failover (disk takeover) in the event of a node failure. All volume group administration on these shared disks is done from the PowerHA nodes, not from the VIOS.

In the example environment, running lspv on the primary node confirms the shared volume group is in concurrent mode.
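
As a rough sketch of that check, the filter below picks out the disks whose volume group state is reported as concurrent. It assumes lspv prints the disk name, PVID, volume group, and state as four whitespace-separated columns; the helper name list_concurrent_disks is hypothetical.

```shell
#!/bin/sh
# Sketch: list disks that lspv reports in concurrent mode.
# Assumption: lspv columns are <disk> <PVID> <volume group> <state>.

list_concurrent_disks() {
    # Reads lspv output on stdin; prints "<disk> <volume group>" per match.
    awk '$4 == "concurrent" { print $1, $3 }'
}

# Run the real command only where it exists (that is, on an AIX node).
if command -v lspv >/dev/null 2>&1; then
    lspv | list_concurrent_disks
fi
```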

Figure 3. VIOS VSCSI overview

The primary node has ownership of the shared volume group, as it is varied on and
active. I can confirm this by running the lsvg command on
the primary and taking note of some of its characteristics. The VG
STATE is active, VG Mode is Concurrent, Concurrent is set to Enhanced-Capable, and VG PERMISSION is read/write. The logical volumes in the shared volume group are open.

File systems on the standby nodes are not mounted until the point of failover, so
accidental use of data on standby nodes is not possible. The standby node has
access to the shared enhanced-concurrent volume group, but only in a passive,
read-only mode. The VG PERMISSION is set to passive-only. The logical volumes in the shared volume group are closed.
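
The comparison between the two nodes can also be scripted. The sketch below pulls the VG PERMISSION value out of lsvg output, so you can confirm read/write on the primary and passive-only on the standby. It assumes the field appears on a single line as VG PERMISSION: followed by the value; the helper name vg_permission and the volume group name sharedvg are hypothetical.

```shell
#!/bin/sh
# Sketch: extract the VG PERMISSION value from "lsvg <vgname>" output.
# Assumption: the field is printed as "VG PERMISSION:  <value>" on one line.

vg_permission() {
    # Reads lsvg output on stdin; prints only the permission value.
    sed -n 's/.*VG PERMISSION:[[:space:]]*\([^[:space:]]*\).*/\1/p'
}

VG=${1:-sharedvg}   # hypothetical volume group name; pass your own as $1

# Run the real command only where it exists (that is, on an AIX node).
if command -v lsvg >/dev/null 2>&1; then
    lsvg "$VG" | vg_permission
fi
```

Running this on each node and comparing the two values gives a quick view of which node currently holds the volume group in active mode.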

The bos.clvm.enh fileset must be installed (on all nodes in the cluster) to support enhanced concurrent volume groups. A new subsystem (gsclvmd) is started with enhanced concurrent volume groups. You can query this subsystem to determine the active enhanced concurrent volume groups.
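
A minimal sketch of both checks follows. It assumes that lslpp -l reports the fileset and that lssrc -ls returns the long status of the subsystem on your AIX level; the exact output varies. The helper name check_ecm_prereqs and the LSLPP and LSSRC override variables are hypothetical, added only so the function can be dry-run away from the cluster.

```shell
#!/bin/sh
# Sketch: verify the enhanced concurrent mode prerequisites on one node.
# LSLPP and LSSRC are hypothetical overrides for dry runs; by default the
# real AIX commands lslpp and lssrc are used.

check_ecm_prereqs() {
    # Step 1: the bos.clvm.enh fileset must be installed.
    if "${LSLPP:-lslpp}" -l bos.clvm.enh >/dev/null 2>&1; then
        echo "bos.clvm.enh: installed"
    else
        echo "bos.clvm.enh: not found"
        return 1
    fi
    # Step 2: query the gsclvmd subsystem (long status).
    "${LSSRC:-lssrc}" -ls gsclvmd
}

# Run only where the AIX commands exist.
if command -v lslpp >/dev/null 2>&1 && command -v lssrc >/dev/null 2>&1; then
    check_ecm_prereqs
fi
```

Run the check on every node in the cluster, because the fileset is required on all of them.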

To enable a shared volume group for enhanced concurrent mode (Fast Disk Takeover), you can use CSPOC.

Listing 11. Enabling enhanced concurrent mode

# smit cl_vg
Shared Volume Groups
Move cursor to desired item and press Enter.
List All Shared Volume Groups
Create a Shared Volume Group
Create a Shared Volume Group with Data Path Devices
Enable a Shared Volume Group for Fast Disk Takeover
Set Characteristics of a Shared Volume Group
Import a Shared Volume Group
Mirror a Shared Volume Group
Unmirror a Shared Volume Group

Refer to the IBM technical documentation and PowerHA documentation for more information relating to PowerHA and virtual storage support.

Summary

This is just one approach to this type of configuration. I hope these brief tips provide you with some ideas on how to approach PowerHA in a VIO environment.