The Solaris Cluster Manager GUI package is included in the ha-cluster-full package group. It can also be installed separately with
# pkg install --accept ha-cluster/system/manager
if a different package group was used for the Solaris Cluster installation.
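To check whether the GUI package is already present, a quick verification using the package name from above:
# pkg info ha-cluster/system/manager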

This procedure is especially for 'live' migration of guest LDoms, which means no shutdown of the OS in the LDom during the failover. In earlier OVM releases this was called 'warm' migration; however, the word 'live' is used in this example. A 'cold' migration means that the OS in the guest LDom is stopped before the migration.

Let's start:
The necessary services must be identical on all the potential control domains (primary domains) which run as Solaris Cluster 4.1 nodes. It is expected that Oracle VM Server for SPARC is already installed.

6) Select boot device for failover guest domain fgd0.
Possible options for the root file system of a domain with 'live' migration are: Solaris Cluster global filesystem (UFS/SVM), NFS, iSCSI, and SAN LUNs, because all of these are accessible at the same time from both nodes. The recommendation is to use a full raw disk because 'live' migration is expected; the full raw disk can be provided via SAN or iSCSI to all primary domains.
Remember: zfs as root filesystem can ONLY be used for 'cold' migration, because 'live' migration requires both nodes to access the root file system at the same time, which is not possible with zfs.
Using a Solaris Cluster global filesystem is an alternative, but the performance is not as good as root on a raw disk.
Details are available in DocID 1366967.1 "Solaris Cluster Root Filesystem Configurations for a Guest LDom Controlled by a SUNW.ldom Resource".
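Before configuring fgd0 it's worth checking that the chosen raw LUN is really visible on every primary domain. A minimal, hedged check (no specific device name assumed; the same LUN/WWN must show up on all primary domains):
all_primaries# echo | format
all_primaries# ldm list-services    (the virtual disk service that will export the LUN should exist on each primary domain)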

17) Setup the password file for non-interactive 'live' migration on all primary nodes
all_primaries# vi /var/cluster/.pwfgd0
Add the root password to this file.
all_primaries# chmod 400 /var/cluster/.pwfgd0
Requirements:
* The first line of the file must contain the password
* The password must be plain text
* The password must not exceed 256 characters in length
A newline character at the end of the password and all lines that follow the first line are ignored.
These details are from "Performing Non-Interactive Migrations" in the Oracle VM Server for SPARC 3.0 Administration Guide.
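A minimal non-interactive way to create the same file (the password "fu_bar" is just the example value used in step 17a below; the trailing newline added by echo is ignored per the rules above):
all_primaries# echo "fu_bar" > /var/cluster/.pwfgd0
all_primaries# chmod 400 /var/cluster/.pwfgd0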

17a) Alternative: Setup an encrypted password file for non-interactive 'live' migration on all primary nodes
all_primaries# echo "encrypted" > /var/cluster/.pwfgd0
all_primaries# dd if=/dev/urandom of=/var/cluster/ldom_key bs=16 count=1
all_primaries# chmod 400 /var/cluster/ldom_key
all_primaries# echo fu_bar | /usr/sfw/bin/openssl enc -aes128 -e -pass file:/var/cluster/ldom_key -out /opt/SUNWscxvm/.fgd0_passwd
all_primaries# chmod 400 /opt/SUNWscxvm/.fgd0_passwd
The root password for the failover LDom is "fu_bar", which is encrypted here. All files must be secured using "chmod 400". Neither /var/cluster/ldom_key nor the /opt/SUNWscxvm/.{DOMAIN}_passwd file can be placed in a different location or given a different name.
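To verify that the encrypted file can be read back, a hedged check using the same openssl syntax in decrypt mode (the output should be the clear-text root password):
all_primaries# /usr/sfw/bin/openssl enc -aes128 -d -pass file:/var/cluster/ldom_key -in /opt/SUNWscxvm/.fgd0_passwd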

Notice: The domain configuration is retrieved by the "ldm list-constraints -x ldom" command from Solaris Cluster and stored in the CCR. This info is used to create or destroy the domain on the node where the resource group is brought online or offline.
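To see what gets stored for the example domain, the same command can be run manually (fgd0 is the domain name used in this example):
primaryA# ldm list-constraints -x fgd0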

19) Check Migration_type property. It should be MIGRATE for 'live' migration:
primaryA# clrs show -v fgd0-rs | grep Migration_type
If not MIGRATE then set it:
primaryA# clrs set -p Migration_type=MIGRATE fgd0-rs

Check if a zone cluster can be created:
# cluster show-netprops
To change the number of zone clusters use:
# cluster set-netprops -p num_zoneclusters=12
Note: 12 zone clusters is the default, values can be customized!

Create a config file (zc1config) for the zone cluster setup, e.g.:
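A minimal sketch of what such a config file might contain (zonepath, node names, hostnames and addresses are placeholders; check the clzc(1CL) man page of your release for the full syntax):
create
set zonepath=/zones/zc1
add node
set physical-host=node0
set hostname=zc1-host0
add net
set address=192.168.10.10
set physical=e1000g0
end
end
add node
set physical-host=node1
set hostname=zc1-host1
add net
set address=192.168.10.11
set physical=e1000g0
end
end
commit
exit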

Configure the zone cluster:
# clzc configure -f zc1config zc1
Note: If not using the config file, the configuration can also be done manually:
# clzc configure zc1

Check the zone configuration:
# clzc export zc1

Verify the zone cluster:
# clzc verify zc1
Note: The following message is a notice and comes up on several clzc commands
Waiting for zone verify commands to complete on all the nodes of the zone cluster "zc1"...

Install the zone cluster:
# clzc install zc1
Note: Monitor the consoles of the global zones to see how the install proceeds! (The output is different on the nodes.) It's very important that all global cluster nodes have the same set of ha-cluster packages installed!

Appendix: To delete a zone cluster do:
# clrg delete -Z zc1 -F +
Note: The zone cluster can only be uninstalled if all resource groups are removed from the zone cluster. The command 'clrg delete -F +' can be used within the zone cluster to delete the resource groups recursively.
# clzc halt zc1
# clzc uninstall zc1
Note: If the clzc command does not successfully uninstall the zone, then run 'zoneadm -z zc1 uninstall -F' on the nodes where zc1 is configured.
# clzc delete zc1

I'd like to mention this because zfs is used more and more in Solaris Cluster environments. Therefore I highly recommend installing the following patches to get a more reliable Solaris Cluster environment in combination with zpools on SC 3.3 and SC 3.2. So, if you are already running such a setup, start planning NOW to install the following patch revisions (or higher) for your environment...

Wednesday Aug 10, 2011

The package versions of the Oracle Solaris Cluster 3.3 05/11 Update1 core framework and agents are the same as for Oracle Solaris Cluster 3.3. Therefore it's possible to patch up an existing Oracle Solaris Cluster 3.3 installation.

The package versions of Oracle Solaris Cluster Geographic Edition 3.3 05/11 Update1 are NOT the same as for Oracle Solaris Cluster Geographic Edition 3.3. But it's possible to upgrade Geographic Edition 3.3 without interruption of the service. See the documentation for details.

The following patches (with the mentioned revisions) are included/updated in Oracle Solaris Cluster 3.3 05/11 Update1. If these patches are installed on an Oracle Solaris Cluster 3.3 release, then the framework & agent features are identical to Oracle Solaris Cluster 3.3 05/11 Update1. It's always necessary to read the "Special Install Instructions" of a patch, but I made a note behind some patches where it's very important to read them (using the shortcut SIIOTP).

The quorum server is an alternative to the traditional quorum disk. The quorum server sits outside of the Oracle Solaris Cluster and is accessed through the public network; therefore the quorum server can be of a different architecture.
Note: The quorum server software is only required on the quorum server itself and NOT on the Solaris Cluster nodes which use the quorum server.

Tuesday Mar 15, 2011

Maybe there is a need (for whatever reason) to configure a local zpool in a Solaris Cluster environment. By 'local zpool' I mean a zpool that should only be available on one Solaris Cluster node WITHOUT using SUNW.HAStoragePlus. Such a local zpool can be configured with local devices (connected to only one node) or shared devices (accessible from all nodes in the cluster via SAN). However, in the case of a shared device it would be better to set up a zone on the SAN switch to make the device available to only one host.

The following procedure is necessary to use local devices in a local zpool:

In this example I use the local device c1t3d0 to create a local zpool.
a) Look for the did device of the device which should be used by the zpool:
# scdidadm -l c1t3d0
49 node0:/dev/rdsk/c1t3d0 /dev/did/rdsk/d49
b) Check the settings of the used did device:
# cldg show dsk/d49
Note: Only one node should be in the node list
c) Set the localonly flag for the did device. Optional: set the autogen flag:
# cldg set -p localonly=true -p autogen=true dsk/d49
or disable fencing for the did device:
# cldev set -p default_fencing=nofencing d49
d) Verify the settings:
# cldg show dsk/d49
e) Create the zpool:
# zpool create localpool c1t3d0
# zfs create localpool/data

The following procedure is necessary to use shared devices in a local zpool:

In this example I use the shared device c6t600C0FF00000000007BA1F1023AE1711d0 to create a local zpool.
a) Look for the did device of the device which should be used by the zpool:
# scdidadm -L c6t600C0FF00000000007BA1F1023AE1711d0
11 node0:/dev/rdsk/c6t600C0FF00000000007BA1F1023AE1710d0 /dev/did/rdsk/d11
11 node1:/dev/rdsk/c6t600C0FF00000000007BA1F1023AE1710d0 /dev/did/rdsk/d11
b) Check the settings of the used did device:
# cldg show dsk/d11
c) Remove the node which should not access the did device:
# cldg remove-node -n node1 dsk/d11
d) Set the localonly flag for the did device. Optional: set the autogen flag:
# cldg set -p localonly=true -p autogen=true dsk/d11
or disable fencing for the did device:
# cldev set -p default_fencing=nofencing d11
e) Verify the settings:
# cldg show dsk/d11
f) Create the zpool:
# zpool create localpool c6t600C0FF00000000007BA1F1023AE1711d0
# zfs create localpool/data

If you forgot to do this for a local zpool then there is a possibility that the zpool will be in FAULTED state after a boot.

Tuesday Aug 17, 2010

There has been a rejuvenation of the Solaris Cluster 3.2 core patch. The new patches are:

144220 Solaris Cluster 3.2: CORE patch for Solaris 9
144221 Solaris Cluster 3.2: CORE patch for Solaris 10
144222 Solaris Cluster 3.2: CORE patch for Solaris 10_x86
At this time these patches do NOT have the requirement to be installed in non-cluster-single-user-mode. They can be installed while the cluster is running, but a reboot is required.

Beware: the new patches require the previous version -42 of the SC 3.2 core patch:
126105-42 Sun Cluster 3.2: CORE patch for Solaris 9
126106-42 Sun Cluster 3.2: CORE patch for Solaris 10
126107-42 Sun Cluster 3.2: CORE patch for Solaris 10_x86
And the -42 revision still has the requirement to be installed in non-cluster-single-user-mode. Furthermore, carefully study the special install instructions and some entries of this blog.

The advantage is that once -42 is applied, the patching of Solaris Cluster 3.2 becomes easier.

Certainly, it's possible to apply the new SC core patch at the same time as the -42 core patch in non-cluster-single-user-mode.

Friday Mar 26, 2010

My last blog described some issues around these patches (please read):
126106-40 Sun Cluster 3.2: CORE patch for Solaris 10
126107-40 Sun Cluster 3.2: CORE patch for Solaris 10_x86
This is a follow-up with a summary of best practices on how to install these patches. There is a difference between new installations, 'normal' patching and live upgrade patching.
Important: The mentioned instructions work if Solaris Cluster 3.2 1/09 update2 (or Solaris Cluster 3.2 core patch revision -27 (sparc) / -28 (x86)) or higher is already installed. If running a lower version of the Solaris Cluster 3.2 core patch, additional steps are necessary. Please refer to the special install instructions of the patches for these additional steps.
Update: 28.Apr 2010
This also applies to the already released -41 and -42 SC core patches, when -40 is not already active.

A) In case of new installations:

Install the SC core patch -40 immediately after the installation of the Solaris Cluster 3.2 software.
In brief:
1.) Install Solaris Cluster 3.2 via JES installer
2.) Install the SC core patch -40
3.) Run scinstall
4.) Do the reboot
Note: Do NOT reboot between 1.) and 2.). Follow the EIS Solaris Cluster 3.2 checklist, which also has a note about this issue. If it is not available, follow the standard installation process of Sun Cluster 3.2.

B) In case of 'normal' patching

It is vital to use the following approach when patching, because if you do not, Solaris Cluster 3.2 may not boot anymore:
0.) Only if using AVS 4.0
# patchadd 12324[67]-05 (Follow Special Install Instructions)
1.) # boot in non-cluster mode
2.) # svcadm disable svc:/system/cluster/loaddid
3.) # svccfg delete svc:/system/cluster/loaddid
4.) # patchadd 12610[67]-40
5.) # init 6
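For step 1.), a short sketch of booting into non-cluster mode (SPARC shown; on x86 add -x to the kernel boot line in GRUB):
# reboot -- -x        (from the running node)
ok boot -x            (or from the OBP)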

My personal recommendation to minimize the risk of installing the SC core patch -40 is:
Step 1) Upgrade the Solaris Cluster to
a) Solaris 10 10/09 update8 and Solaris Cluster 3.2 11/09 update3,
or
b) EIS Baseline 26JAN10, which includes the Solaris kernel update 14144[45]-09 and SC core patch -39. If the EIS baseline is not available, use another patchset which includes the mentioned patches.
Step 2) After the successful upgrade, do a single patch install of the SC core patch -40 using installation instruction B) mentioned above. In this software state the -40 can be applied 'rolling' to the cluster.

Wednesday Mar 03, 2010

This is a notification because there is some trouble with the following Sun Cluster 3.2 -40 core patches:
126106-40 Sun Cluster 3.2: CORE patch for Solaris 10
126107-40 Sun Cluster 3.2: CORE patch for Solaris 10_x86
Before installing the patch, carefully read the Special Install Instructions.
Update: 28.Apr 2010
This also applies to the already released -41 and -42 SC core patches, when -40 is not already active.

Two new notes were added to these patches:

NOTE 16: Remove the loaddid SMF service by running the following
commands before installing this patch, if current patch level
(before installing this patch) is less than -40:
svcadm disable svc:/system/cluster/loaddid
svccfg delete svc:/system/cluster/loaddid

NOTE 17: Installing this patch on a machine with Availability Suite
software installed will cause the machine to fail to boot with
dependency errors due to BugId 6896134 (AVS does not wait for
did devices to startup in a cluster). Please contact your Sun
Service Representative for relief before installing this patch.

Important to know: These 2 issues only come up if using Solaris 10 10/09 Update8 or the kernel patch 141444-09 or higher. There are changes in the startup of the iSCSI initiator (it is now an SMF service) - please refer to 6888193 for details.

But this should NOT be a problem, because the patch 126106-40 is installed in non-cluster mode. This means that after the next boot into cluster mode the error should disappear. This is reported in Bug 6911030.

But to be sure that the system is booting correctly, do:
- check the log file /var/svc/log/system-cluster-loaddid:default.log
- check that one of the last lines is: [ Mar 16 10:31:15 Rereading configuration. ]
- if not, go to the recovery procedure below

2) What happens if the loaddid delete is done after the patch installation?

1) boot in non-cluster-mode if not able to login
2) bring the files loaddid and loaddid.xml in place (normally using the files from SC core patch -40)
ONLY in case of trouble with the files from SC core patch -40 use the old files!
Note: If the old files are restored without the dependency on the iSCSI initiator, then there can be problems when trying to use iSCSI storage within Sun Cluster.
3) Repair loaddid service
# svcadm disable svc:/system/cluster/loaddid
# svccfg delete svc:/system/cluster/loaddid
# svccfg import /var/svc/manifest/system/cluster/loaddid.xml
# svcadm restart svc:/system/cluster/loaddid:default
4) Check the log file:
# tail /var/svc/log/system-cluster-loaddid:default.log
Look for the following line (which should be at the end of the log file):
[ Mar 16 11:43:06 Rereading configuration. ]
Note: Rereading the configuration is necessary before booting!
5) Reboot the system:
# init 6

Advantages:
-- The 3rd mediator host is used to get the majority of the Solaris Volume Manager configuration when one room is lost due to an error.
-- The 3rd mediator host only needs a public network connection. (It does not have to be part of the cluster and needs no connection to the shared storage.)
Consider:
-- One more room is necessary
-- One more host to administer; but if using a Sun Cluster quorum server, then the 3rd mediator can be on the same host.

This example shows how to add a 3rd mediator host to an existing Solaris Volume Manager diskset

Note: A good name for the dummyds can be a combination of the used set name and the campus cluster name, e.g. 'setnameofcampuscluster_campusclustername'. If using more than one set it could be e.g. 'sets_of_campusclustername'. Or, if using it for more than one cluster, it's possible to create one set with a specific name for each cluster. This could be helpful for monitoring/configuration purposes. But keep in mind this is not required; one set is enough for all clusters which use this 3rd mediator host.
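A hedged sketch of the commands involved (host names and set names are placeholders; verify the exact syntax against metaset(1M) and medstat(1M) for your release):
medhost# metaset -s dummyds_campusclusterA -a -h medhost (on the 3rd mediator host: create a dummy diskset so the host can act as a mediator)
node0# metaset -s datads -a -m node0 node1 medhost (on one cluster node: add both cluster nodes and the 3rd host as mediators to the existing diskset)
node0# medstat -s datads (check the mediator status)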

Saturday Dec 19, 2009

This is an early notification because there are some troubles with the following Sun Cluster 3.2 quorum server patches:
127404-03 Sun Cluster 3.2: Quorum Server Patch for Solaris 9
127405-04 Sun Cluster 3.2: Quorum Server patch for Solaris 10
127406-04 Sun Cluster 3.2: Quorum Server patch for Solaris 10_x86
All these patches are part of Sun Cluster 3.2 11/09 Update3 release, but also available on My Oracle Support.

These patches deliver new features which require attention in case of an upgrade or patching. The installation of the mentioned patches on a Sun Cluster 3.2 quorum server can lead to a panic of all Sun Cluster 3.2 nodes which use this quorum server. The panic of the Sun Cluster 3.2 nodes looks as follows:
...
Dec 4 16:43:57 node1 \^Mpanic[cpu18]/thread=300041f0700:
Dec 4 16:43:57 node1 unix: [ID 265925 kern.notice] CMM: Cluster lost operational quorum; aborting.
...

As stated in my last blog, the following note from the Special Install Instructions of Sun Cluster 3.2 core patch -38 and higher is very important:
NOTE 17: Quorum server patch 127406-04 (or greater) needs to be installed on quorum server host first, before installing 126107-37 (or greater) Core Patch on cluster nodes.
This means that if using a Sun Cluster 3.2 quorum server, it's necessary to upgrade the quorum server before upgrading the Sun Cluster 3.2 nodes which use it to Sun Cluster 3.2 11/09 update3. AND the same applies in case of patching: if installing the Sun Cluster core patch -38 or higher (-38 is part of Sun Cluster 3.2 11/09 update3)
126105-38 Sun Cluster 3.2: CORE patch for Solaris 9
126106-38 Sun Cluster 3.2: CORE patch for Solaris 10
126107-38 Sun Cluster 3.2: CORE patch for Solaris 10_x86
then the same rule applies: first update the quorum server and then the Sun Cluster nodes. Please refer to the details above on how to do it...
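A minimal sketch of the required order (Solaris 10 x86 cluster nodes shown as an example; pick the patch matching each host's architecture and follow the Special Install Instructions of each patch):
qs-host# patchadd 127406-04 (1. quorum server host first)
node0# patchadd 126107-38 (2. then the Sun Cluster nodes)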
For upgrade also refer to the document: How to upgrade quorum server software

Keep in mind: Fresh installations with Sun Cluster 3.2 11/09 update3 on the Sun Cluster nodes and on the quorum server are NOT affected!

Sunday Dec 06, 2009

The package versions of the Sun Cluster 3.2 11/09 Update3 core framework and agents are the same as for Sun Cluster 3.2, Sun Cluster 3.2 2/08 Update1 and Sun Cluster 3.2 1/09 Update2. Therefore it's possible to patch up an existing Sun Cluster 3.2, Sun Cluster 3.2 2/08 Update1 or Sun Cluster 3.2 1/09 Update2 installation.

The package versions of Sun Cluster Geographic Edition 3.2 11/09 Update3 are NOT the same as for Sun Cluster Geographic Edition 3.2. But it's possible to upgrade Geographic Edition 3.2 without interruption of the service. See the documentation for details.

The following patches (with the mentioned revisions) are included/updated in Sun Cluster 3.2 11/09 Update3. If these patches are installed on a Sun Cluster 3.2, Sun Cluster 3.2 2/08 Update1 or Sun Cluster 3.2 1/09 Update2 release, then the framework & agent features are identical to Sun Cluster 3.2 11/09 Update3. It's always necessary to read the "Special Install Instructions" of a patch, but I made a note behind some patches where it's very important to read them (using the shortcut SIIOTP).

The quorum server is an alternative to the traditional quorum disk. The quorum server sits outside of the Sun Cluster and is accessed through the public network; therefore the quorum server can be of a different architecture.

If some patches must be applied when the node is in noncluster mode, you can apply them in a rolling fashion, one node at a time, unless a patch's instructions require that you shut down the entire cluster. Follow procedures in How to Apply a Rebooting Patch (Node) in Sun Cluster System Administration Guide for Solaris OS to prepare the node and boot it into noncluster mode. For ease of installation, consider applying all patches at once to a node that you place in noncluster mode.

2.) The patch breaks probe-based IPMP if more than one interface is in the same IPMP group

After installing the already mentioned kernel patch
141444-09 SunOS 5.10: kernel patch or
141445-09 SunOS 5.10_x86: kernel patch
the probe-based IPMP feature is broken if the system uses more than one interface in the same IPMP group. This means all Solaris 10 systems which use more than one interface in the same probe-based IPMP group are affected!

After installing this kernel patch the following errors will be sent to the system console after a reboot:
...
nodeA console login: Oct 26 19:34:41 in.mpathd[210]: NIC failure detected on bge0 of group ipmp0
Oct 26 19:34:41 in.mpathd[210]: Successfully failed over from NIC bge0 to NIC e1000g0
...

Workarounds:
a) Use link-based IPMP instead of probe-based IPMP (see the sketch below)
b) Use only one interface in the same IPMP group if using probe-based IPMP.
See the blog "Tips to configure IPMP with Sun Cluster 3.x" for more details if you would like to change the configuration.
c) Do not install the listed kernel patch above. Note: A fix is already in progress and can be obtained via a service request. I will update this blog when the general fix is available.
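A hedged sketch of workaround a) for the interfaces from the console example above (link-based IPMP uses no test addresses; the data address and netmask are placeholders):
# /etc/hostname.bge0
192.168.10.20 netmask + broadcast + group ipmp0 up
# /etc/hostname.e1000g0
group ipmp0 up
With this setup in.mpathd monitors only the link state, so no probe traffic is needed.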

After installing the already mentioned kernel patch
141444-09 SunOS 5.10: kernel patch or
141511-05 SunOS 5.10_x86: ehci, ohci, uhci patch
the Sun Cluster nodes can hang during boot because the node has exhausted the default number of autopush structures. When the clhbsndr module is loaded, it causes many more autopushes to occur than would otherwise happen on a non-clustered system. By default, only nautopush=32 of these structures are allocated.

Workarounds:
a) Do not use the mentioned kernel patch with Sun Cluster
b) Boot in non-cluster-mode and add the following to /etc/system
set nautopush=64

Monday Sep 07, 2009

In some cases it's necessary to add a tagged VLAN id to the cluster interconnect. This example shows the difference in the cluster interconnect configuration with and without a tagged VLAN id. The interface e1000g2 has a "normal" setup (no VLAN id) and the interface e1000g1 got the VLAN id 2. The ethernet switch used must be configured with the tagged VLAN id before the cluster interconnect can be configured. Use "clsetup" to assign a VLAN id to the cluster interconnect.

The tagged VLAN interface name is a combination of the VLAN id and the used network interface: the instance number is VLAN_id * 1000 + driver_instance. In this example, 2 * 1000 + 1 = 2001, so the interface is e1000g2001: the 2 after the e1000g is the VLAN id and the 1 at the end is the instance of the e1000g driver. Normally this would be the e1000g1 interface, but with the VLAN id it becomes the interface e1000g2001.

Thursday Jul 23, 2009

I put together a quick reference guide for Sun Cluster 3.x. The guide includes the "old" command line which is used for Sun Cluster 3.0, 3.1 and 3.2, as well as the new object-based command line known from Sun Cluster 3.2. Please do not expect the whole command line in these two pages; it is meant as a reminder of the most-used commands within Sun Cluster 3.x. I added the pictures to this blog, but the pdf file is also available for download.

Friday May 08, 2009

Carefully configure zpools in Sun Cluster 3.2, because it's possible to use the same physical device in different zpools on different nodes at the same time. This means the zpool command does NOT care whether the physical device is already in use by another zpool on another node. E.g. if node1 has an active zpool with the device c3t3d0, then it's possible to create a new zpool with c3t3d0 on another node (assumption: c3t3d0 is the same shared device on all cluster nodes).

Output of testing...

When problems occurred due to administration mistakes, the following errors have been seen:

NODE1# zpool import tank
cannot import 'tank': I/O error

NODE2# zpool import tankothernode
cannot import 'tankothernode': one or more devices is currently unavailable

NODE1# zpool import tank
cannot import 'tank': pool may be in use from other system, it was last accessed by NODE2 (hostid: 0x83083465) on Fri May 8 13:34:41 2009
use '-f' to import anyway
NODE1# zpool import -f tank
cannot import 'tank': one or more devices is currently unavailable

Furthermore, the zpool command also uses the disk without any warning if it is already used by a Solaris Volume Manager diskset or a Symantec (Veritas) Volume Manager diskgroup.

Summary for Sun Cluster environment:
ALWAYS MANUALLY CHECK THAT THE DEVICE WHICH USING FOR ZPOOL IS FREE!!!
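A hedged pre-check sketch (c3t3d0 is the example device from above; run the checks on EVERY cluster node, and note that VxVM may display enclosure-based names instead):
allnodes# zpool status | grep c3t3d0 (already part of an imported zpool?)
allnodes# zpool import | grep c3t3d0 (part of an exported/importable zpool?)
allnodes# metaset | grep c3t3d0 (drive in a Solaris Volume Manager diskset?)
allnodes# vxdisk list | grep c3t3d0 (disk in a Veritas Volume Manager diskgroup?)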

How does the problem occur?
After the installation of the Sun Cluster 3.2 1/09 Update2 product with the java installer it's necessary to run the #scinstall command. If choosing the "Custom" installation instead of the "Typical" installation, it's possible to change the default netmask of the cluster interconnect. The following questions come up within the installation procedure if the default netmask question is answered with 'no'.

Example scinstall:
Is it okay to accept the default netmask (yes/no) [yes]? no
Maximum number of nodes anticipated for future growth [64]? 4
Maximum number of private networks anticipated for future growth [10]?
Maximum number of virtual clusters expected [12]? 0
What netmask do you want to use [255.255.255.128]?
Prevent the issue by answering the virtual clusters question with '1', or give serious consideration to future growth potential if necessary. Do NOT answer the virtual clusters question with '0'!

Example of the whole scinstall log when the corrupted CCR occurs:

In the /etc/cluster/ccr/global/infrastructure file the error shows up as an empty entry for cluster.properties.private_netmask. Furthermore, some other lines do not reflect the correct netmask values as chosen within scinstall.
Wrong infrastructure file:
cluster.state enabled
cluster.properties.cluster_id 0x49F82635
cluster.properties.installmode disabled
cluster.properties.private_net_number 172.16.0.0
cluster.properties.cluster_netmask 255.255.248.0
cluster.properties.private_netmask
cluster.properties.private_subnet_netmask 255.255.255.248
cluster.properties.private_user_net_number 172.16.4.0
cluster.properties.private_user_netmask 255.255.254.0
cluster.properties.private_maxnodes 6
cluster.properties.private_maxprivnets 10
cluster.properties.zoneclusters 0
cluster.properties.auth_joinlist_type sys

Workaround if the problem has already occurred:
1.) Boot all nodes in non-cluster-mode with 'boot -x'
2.) Change the wrong values of /etc/cluster/ccr/global/infrastructure on all nodes. See example above.
3.) Write a new checksum for all infrastructure files on all nodes. Use -o (master file) on the node which is booting up first.
scnode1 # /usr/cluster/lib/sc/ccradm -i /etc/cluster/ccr/global/infrastructure -o
scnode2 # /usr/cluster/lib/sc/ccradm -i /etc/cluster/ccr/global/infrastructure
scnode1 # /usr/cluster/lib/sc/ccradm -i /etc/cluster/ccr/global/infrastructure
scnode2 # /usr/cluster/lib/sc/ccradm -i /etc/cluster/ccr/global/infrastructure
4.) First reboot scnode1 (the node with the master infrastructure file) into the cluster, then the other nodes.
This is reported in bug 6825948.

Update 17.Jun.2009:
The -33 revision of the Sun Cluster core patch is the first released version which fixes this issue at installation time:
126106-33 Sun Cluster 3.2: CORE patch for Solaris 10
126107-33 Sun Cluster 3.2: CORE patch for Solaris 10_x86