This appendix provides an alternative to using Sun Cluster Geographic Edition for
host-based replication. Sun recommends that you use Sun Cluster Geographic Edition,
which simplifies the configuration and operation of host-based replication within
a cluster. See Understanding Data Replication.

The example in this appendix shows how to configure host-based data
replication between clusters by using Sun StorEdge Availability Suite 3.1 or 3.2 software or Sun StorageTek Availability Suite 4.0
software. The example illustrates a complete cluster configuration for an
NFS application and provides detailed information about how individual tasks
are performed. All tasks should be performed in the global-cluster voting
node. The example does not include all of the steps that are required by other
applications or other cluster configurations.

If you use role-based access control (RBAC) instead of superuser to
access the cluster nodes, ensure that you can assume an RBAC role that provides
authorization for all Sun Cluster commands. This series of data replication
procedures requires the following Sun Cluster RBAC authorizations if the
user is not superuser:

solaris.cluster.modify

solaris.cluster.admin

solaris.cluster.read

See Chapter 2, Sun Cluster and RBAC for more information about using RBAC roles. See
the Sun Cluster man pages for the RBAC authorization that each Sun Cluster subcommand
requires.

Understanding Sun StorageTek Availability Suite Software in
a Cluster

Disaster tolerance is the ability of a system to restore an application
on an alternate cluster when the primary cluster fails. Disaster tolerance
is based on data replication and failover.
Failover is the automatic relocation of a resource group or device group from
a primary cluster to a secondary cluster. If the primary cluster fails, the
application and the data are immediately available on the secondary cluster.

This section describes the remote mirror replication method and the
point-in-time snapshot method used by Sun StorageTek Availability Suite software. This software
uses the sndradm(1RPC) and iiadm(1II) commands to replicate data.

Remote Mirror Replication

Figure A–1 shows remote mirror replication. Data from the
master volume of the primary disk is replicated to the master volume of the
secondary disk through a TCP/IP connection. A remote mirror bitmap tracks
differences between the master volume on the primary disk and the master volume
on the secondary disk.

Figure A–1 Remote Mirror Replication

Remote mirror replication can be performed synchronously in real time,
or asynchronously. Each volume set in each cluster can be configured individually
for synchronous or asynchronous replication.

In synchronous data replication, a write operation is not confirmed
as complete until the remote volume has been updated.

In asynchronous data replication, a write operation
is confirmed as complete before the remote volume is updated. Asynchronous
data replication provides greater flexibility over long distances and with
limited bandwidth.
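The replication mode is chosen when a volume set is enabled. As a sketch only (the host names follow the example configuration described later in this appendix, and the volume paths are hypothetical), the Availability Suite sndradm command takes the mode as its final argument:

```shell
# Enable a remote mirror volume set in synchronous mode ("sync").
# Substitute "async" as the final argument for asynchronous replication.
# Host names and volume paths below are illustrative.
nodeA# sndradm -n -e lhost-reprg-prim \
    /dev/vx/rdsk/devgrp/vol01 /dev/vx/rdsk/devgrp/vol04 \
    lhost-reprg-sec \
    /dev/vx/rdsk/devgrp/vol01 /dev/vx/rdsk/devgrp/vol04 \
    ip sync
```

The -n option suppresses the confirmation prompt, and -e enables the volume set together with its remote mirror bitmap.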

Point-in-Time Snapshot

Figure A–2 shows point-in-time snapshot. Data from the master
volume of each disk is copied to the shadow volume on the same disk. The point-in-time
bitmap tracks differences between the master volume and the shadow volume.
When data is copied to the shadow volume, the point-in-time bitmap is reset.

Figure A–2 Point-in-Time Snapshot
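A point-in-time set is enabled with the iiadm command. The following is a sketch using the example's volume naming; the device paths are hypothetical:

```shell
# Enable an independent point-in-time set: master volume vol01,
# shadow volume vol02, point-in-time bitmap volume vol03.
# Use "dep" instead of "ind" for a dependent shadow volume.
nodeA# iiadm -e ind /dev/vx/rdsk/devgrp/vol01 \
    /dev/vx/rdsk/devgrp/vol02 /dev/vx/rdsk/devgrp/vol03
```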

Replication in the Example Configuration

Figure A–3 illustrates
how remote mirror replication and point-in-time snapshot are used in this
example configuration.

Figure A–3 Replication in the Example Configuration

This section provides guidelines for
configuring data replication between clusters. This section also contains
tips for configuring replication resource groups and application resource
groups. Use these guidelines when you are configuring data replication for
your cluster.

Configuring Replication Resource Groups

Replication
resource groups collocate the device group under Sun StorageTek Availability Suite software control
with the logical hostname resource. A replication resource group must have
the following characteristics:

Be a failover resource group

A failover resource
can run on only one node at a time. When a failover occurs, failover resources
move with their resource group to the new primary node.

Have a
logical hostname resource

The logical hostname must be hosted
by the primary cluster. After a failover, the logical hostname must be hosted
by the secondary cluster. The Domain Name System (DNS) is used to associate
the logical hostname with a cluster.

Have an HAStoragePlus resource

The HAStoragePlus resource
enforces the failover of the device group when the replication resource group
is switched over or failed over. Sun Cluster software also enforces the
failover of the replication resource group when the device group is switched
over. In this way, the replication resource group and the device group are
always colocated, or mastered by the same node.

The following
extension properties must be defined in the HAStoragePlus resource:

GlobalDevicePaths.
This extension property defines the device group to which a volume belongs.

AffinityOn property = True.
This extension property causes the device group to switch over or fail over
when the replication resource group switches over or fails over. This feature
is called an affinity switchover.

ZPoolsSearchDir. This extension property is required when using
a ZFS file system.

Be named after the device group with which it is colocated, followed
by -stor-rg

For example, devgrp-stor-rg.

Be online on both the primary cluster and the secondary cluster
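The characteristics above can be combined into a creation sequence. The following is a sketch only, assuming the Sun Cluster 3.2 command syntax and the example names used in this appendix:

```shell
# Register the HAStoragePlus resource type (once per cluster).
nodeA# clresourcetype register SUNW.HAStoragePlus

# Create the failover replication resource group, named after the
# device group devgrp.
nodeA# clresourcegroup create devgrp-stor-rg

# Add the HAStoragePlus resource with the required extension properties.
nodeA# clresource create -g devgrp-stor-rg -t SUNW.HAStoragePlus \
    -p GlobalDevicePaths=devgrp -p AffinityOn=True devgrp-stor

# Add the logical hostname resource hosted by the primary cluster.
nodeA# clreslogicalhostname create -g devgrp-stor-rg \
    -h lhost-reprg-prim lhost-reprg-prim

# Bring the group to a managed state and online.
nodeA# clresourcegroup manage devgrp-stor-rg
nodeA# clresourcegroup online devgrp-stor-rg
```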

Configuring Application Resource Groups

To be
highly available, an application must be managed as a resource in an application
resource group. An application resource group can be configured for a failover
application or a scalable application.

Application resources and application resource groups configured on
the primary cluster must also be configured on the secondary cluster. Also,
the data accessed by the application resource must be replicated to the secondary
cluster.

This section provides guidelines for configuring the following application
resource groups:

Configuring Resource Groups for a Failover
Application

In a failover application, an application runs on one node at a time.
If that node fails, the application fails over to another node in the same
cluster. A resource group for a failover application must have the following
characteristics:

Have an HAStoragePlus resource to enforce the failover of
the device group when the application resource group is switched over or failed
over

The device group is colocated with the replication resource
group and the application resource group. Therefore, the failover of the application
resource group enforces the failover of the device group and replication resource
group. The application resource group, the replication resource group, and
the device group are mastered by the same node.

Note, however,
that a failover of the device group or the replication resource group does
not cause a failover of the application resource group.

If the application data is globally mounted, the presence
of an HAStoragePlus resource in the application resource group is not required
but is advised.

If the application data is mounted locally, the presence of
an HAStoragePlus resource in the application resource group is required.

Without an HAStoragePlus resource, the failover of the application
resource group would not trigger the failover of the replication resource
group and device group. After a failover, the application resource group,
replication resource group, and device group would not be mastered by the
same node.

Must be online on the primary cluster and offline on the secondary
cluster

The application resource group must be brought online
on the secondary cluster when the secondary cluster takes over as the primary
cluster.
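The guidelines above can be sketched as a creation sequence for the example NFS application resource group. This is a sketch, not the verbatim procedure; the Pathprefix directory and file system mount point are hypothetical:

```shell
# Create the application resource group. It is not started
# automatically and depends on the replication resource group.
nodeA# clresourcegroup create -p Pathprefix=/global/etc \
    -p Auto_start_on_new_cluster=False \
    -p RG_dependencies=devgrp-stor-rg nfs-rg

# HAStoragePlus resource so that failover of the application resource
# group enforces failover of the device group (mount point illustrative).
nodeA# clresource create -g nfs-rg -t SUNW.HAStoragePlus \
    -p FileSystemMountPoints=/global/mountpoint \
    -p AffinityOn=True nfs-dg-rs

# The NFS resource itself, dependent on the storage resource.
nodeA# clresourcetype register SUNW.nfs
nodeA# clresource create -g nfs-rg -t SUNW.nfs \
    -p Resource_dependencies=nfs-dg-rs nfs-rs
```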

Figure A–4 illustrates
the configuration of an application resource group and a replication resource
group in a failover application.

Figure A–4 Configuration of Resource Groups
in a Failover Application

Configuring Resource Groups for a Scalable
Application

In a scalable application, an application runs on several nodes to create
a single, logical service. If a node that is running a scalable application
fails, failover does not occur. The application continues to run on the other
nodes.

When a scalable application is managed as a resource in an application
resource group, it is not necessary to collocate the application resource
group with the device group. Therefore, it is not necessary to create an HAStoragePlus resource
for the application resource group.

A resource group for a scalable application must have the following
characteristics:

Have a dependency on the shared address resource group

The
nodes that are running the scalable application use the shared address to
distribute incoming data.

Be online on the primary cluster and offline on the secondary
cluster

Figure A–5 illustrates
the configuration of resource groups in a scalable application.

Figure A–5 Configuration of Resource Groups
in a Scalable Application

Guidelines for Managing a Failover

If the primary cluster
fails, the application must be switched over to the secondary cluster as soon
as possible. To enable the secondary cluster to take over, the DNS must be
updated.

The DNS associates a client with the logical
hostname of an application. After a failover, the DNS mapping to the primary
cluster must be removed, and a DNS mapping to the secondary cluster must be
created. Figure A–6 shows
how the DNS maps a client to a cluster.
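One way to update the mapping is with the BIND nsupdate utility. The zone name and address below are hypothetical; this is a sketch of the idea rather than a verbatim procedure:

```shell
# Remove the DNS record that points at the primary cluster, then add a
# record for the secondary cluster. Names and addresses are examples.
nsupdate <<'EOF'
update delete lhost-nfsrg-prim.example.com A
update add lhost-nfsrg-sec.example.com 300 A 192.168.2.10
send
EOF
```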

Connecting and Installing the Clusters

Figure A–7 illustrates
the cluster configuration used in the example. The secondary cluster
in the example configuration contains one node, but other cluster configurations
can be used.

Figure A–7 Example Cluster Configuration

Table A–2 summarizes the hardware and software that the example configuration
requires. The Solaris OS, Sun Cluster software, and volume manager software
must be installed on the cluster nodes before Sun StorageTek Availability Suite software
and patches are installed.

Table A–2 Required Hardware and
Software

Node hardware: Sun StorageTek Availability Suite software is supported on all servers that run the Solaris OS.

The following table lists the names of the groups and resources that
are created for the example configuration.

Table A–3 Summary of the Groups
and Resources in the Example Configuration

Device group

devgrp: The device group

Replication resource group and resources

devgrp-stor-rg: The replication resource group

lhost-reprg-prim, lhost-reprg-sec: The logical host names for the replication
resource group on the primary cluster and the secondary cluster

devgrp-stor: The HAStoragePlus resource for the replication resource group

Application resource group and resources

nfs-rg: The application resource group

lhost-nfsrg-prim, lhost-nfsrg-sec: The logical host names for the application
resource group on the primary cluster and the secondary cluster

nfs-dg-rs: The HAStoragePlus resource for the application

nfs-rs: The NFS resource

With the exception of devgrp-stor-rg, the names of
the groups and resources are example names that can be changed as required.
The replication resource group must have a name with the format devicegroupname-stor-rg.

Specifies that the SUNW.HAStoragePlus resource
must perform an affinity switchover for the global devices and cluster file
systems defined by -x GlobalDevicePaths=. Therefore, when
the replication resource group fails over or is switched over, the associated
device group is switched over.

Specifies the directory into which the resources in the group
can write administrative files.

Auto_start_on_new_cluster=False

Specifies that the application resource group is not started
automatically.

RG_dependencies=devgrp-stor-rg

Specifies the resource group that the application resource
group depends on. In this example, the application resource group depends
on the replication resource group devgrp-stor-rg.

If the application resource group is switched over to a new primary
node, the replication resource group is automatically switched over. However,
if the replication resource group is switched over to a new primary node,
the application resource group must be manually switched over.

Specifies that the application resource must perform an affinity
switchover for the global devices and cluster file systems defined by -p
GlobalDevicePaths=. Therefore, when the application resource group
fails over or is switched over, the associated device group is switched over.

Next Steps

Example of How to Enable Data Replication

This section describes how data replication is enabled for the example
configuration. This section uses the Sun StorageTek Availability Suite software commands sndradm and iiadm. For more information about these
commands, see the Sun StorageTek Availability Suite documentation.

How to Enable Replication on the Primary
Cluster

Access nodeA as superuser or assume a role that provides solaris.cluster.read RBAC authorization.

Flush all transactions.

nodeA# lockfs -a -f

Confirm that the logical host names lhost-reprg-prim and lhost-reprg-sec are online.

nodeA# clresourcegroup status
nodeC# clresourcegroup status

Examine the state field of the resource group.

Enable remote mirror replication from the
primary cluster to the secondary cluster.

This step enables replication
from the master volume on the primary cluster to the master volume on the
secondary cluster. In addition, this step enables replication to the remote
mirror bitmap on vol04.

If the primary cluster and secondary cluster are unsynchronized,
run this command:

This step enables the master volume on the primary cluster to be copied
to the shadow volume on the same cluster. The master volume, shadow volume,
and point-in-time bitmap volume must be in the same device group. In this
example, the master volume is vol01, the shadow volume
is vol02, and the point-in-time bitmap volume is vol03.

This step associates the point-in-time snapshot with the remote mirror
volume set. Sun StorageTek Availability Suite software ensures that a point-in-time snapshot is
taken before remote mirror replication can occur.

The primary cluster detects the presence of the secondary cluster and
starts synchronization. For information about the status of the clusters, refer
to the system log file: /var/opt/SUNWesm/ds.log for Sun StorEdge Availability Suite,
or /var/adm for Sun StorageTek Availability Suite.
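The steps in this procedure can be sketched as the following command sequence. This is a sketch assuming VxVM volume paths under the devgrp device group; your volume paths may differ:

```shell
# Flush file system transactions before enabling replication.
nodeA# lockfs -a -f

# Enable remote mirror replication from the primary to the secondary,
# with the remote mirror bitmap on vol04 (synchronous mode shown).
nodeA# sndradm -n -e lhost-reprg-prim \
    /dev/vx/rdsk/devgrp/vol01 /dev/vx/rdsk/devgrp/vol04 \
    lhost-reprg-sec \
    /dev/vx/rdsk/devgrp/vol01 /dev/vx/rdsk/devgrp/vol04 \
    ip sync

# Enable the point-in-time set on the primary: master vol01,
# shadow vol02, point-in-time bitmap vol03.
nodeA# iiadm -e ind /dev/vx/rdsk/devgrp/vol01 \
    /dev/vx/rdsk/devgrp/vol02 /dev/vx/rdsk/devgrp/vol03

# Associate the point-in-time set with the remote mirror volume set,
# so a snapshot is taken before remote mirror replication occurs.
nodeA# sndradm -I a /dev/vx/rdsk/devgrp/vol01 \
    /dev/vx/rdsk/devgrp/vol02 /dev/vx/rdsk/devgrp/vol03
```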

Next Steps

Example of How to Perform Data Replication

This section describes how data replication is performed for the example
configuration. This section uses the Sun StorageTek Availability Suite software commands sndradm and iiadm. For more information about these
commands, see the Sun StorageTek Availability Suite documentation.

In replicating mode, the state is replicating, and
the active state of autosynchronization is on. When the
primary volume is written to, the secondary volume is updated by Sun StorageTek Availability Suite software.
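The replication state can be inspected with the sndradm status option; a sketch:

```shell
# Print the status of all configured remote mirror volume sets.
# Look for a state of "replicating" and autosynchronization "on".
nodeA# sndradm -P
```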

Next Steps

How to Perform a Point-in-Time Snapshot

In this procedure, point-in-time snapshot is used to synchronize the
shadow volume of the primary cluster to the master volume of the primary cluster.
The master volume is vol01, the bitmap volume is vol04, and the shadow volume is vol02.

In replicating mode, the state is replicating, and
the active state of autosynchronization is on. When the
primary volume is written to, the secondary volume is updated by Sun StorageTek Availability Suite software.

If the primary cluster is not in replicating mode, put it into
replicating mode.
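Under the same assumptions as the earlier sketches (the volume paths are illustrative), returning to replicating mode and taking the snapshot can be sketched as:

```shell
# Resume replication if the volume set is not in replicating mode
# (update synchronization resends only the changed blocks).
nodeA# sndradm -n -u lhost-reprg-prim \
    /dev/vx/rdsk/devgrp/vol01 /dev/vx/rdsk/devgrp/vol04 \
    lhost-reprg-sec \
    /dev/vx/rdsk/devgrp/vol01 /dev/vx/rdsk/devgrp/vol04 \
    ip sync

# Update the shadow volume vol02 from the master volume vol01, then
# wait for the copy to complete.
nodeA# iiadm -u s /dev/vx/rdsk/devgrp/vol02
nodeA# iiadm -w /dev/vx/rdsk/devgrp/vol02
```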

See Also

Example of How to Manage a Failover

This section describes how to provoke a failover and how the application
is transferred to the secondary cluster. After a failover, update the DNS
entries. For additional information, see Guidelines for Managing a Failover.