This document explains the step-by-step process of adding a node to an 11g R1 RAC cluster. In this process, I am going to add a single node (node2-pub) to a single-node RAC cluster online, without affecting the availability of the existing RAC database.

Oracle 11g R1 on CentOS EL 4 Update 5 requires the extra packages below to be installed, at the listed version or higher.

binutils-2.15.92.0.2-18
compat-libstdc++-33-3.2.3-47.3
elfutils-libelf-0.97-5
elfutils-libelf-devel-0.97-5
glibc-2.3.4-2.19
glibc-common-2.3.4-2.19
glibc-devel-2.3.4-2.19
gcc-3.4.5-2
gcc-c++-3.4.5-2
libaio-devel-0.3.105-2
libaio-0.3.105-2
libgcc-3.4.5
libstdc++-3.4.5-2
libstdc++-devel-3.4.5-2
make-3.80-5
sysstat-5.0.5
unixODBC-2.2.11
unixODBC-devel-2.2.11
iscsi-initiator-utils-4.0.3.0-5
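To quickly check which of these are already installed (and at what version), a query like the one below can be run as root. This is just a convenience sketch; the package names mirror the list above.

rpm -q --qf '%{NAME}-%{VERSION}-%{RELEASE} (%{ARCH})\n' \
    binutils compat-libstdc++-33 elfutils-libelf elfutils-libelf-devel \
    glibc glibc-common glibc-devel gcc gcc-c++ libaio libaio-devel \
    libgcc libstdc++ libstdc++-devel make sysstat unixODBC \
    unixODBC-devel iscsi-initiator-utils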

Configuring Public and Private Networks:

Each new node in the cluster must have three network adapters (eth0, eth1, and eth2): one for the public network, a second for the private network interface (inter-node communication, the interconnect), and a third for the network storage system (private).

Follow the steps below to configure these networks:

(1) Set the hostname the same way as on the existing node, using the command below:

hostname node2-pub.hingu.net

(2) Edit the /etc/hosts file as shown below:

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1        localhost.localdomain    localhost

## Public Node names
192.168.10.11    node1-pub.hingu.net      node1-pub
192.168.10.22    node2-pub.hingu.net      node2-pub

## Private Network (Interconnect)
192.168.0.11     node1-prv                node1-prv
192.168.0.22     node2-prv                node2-prv

## Private Network (Network storage)
192.168.1.11     node1-nas                node1-nas
192.168.1.22     node2-nas                node2-nas
192.168.1.33     nas-server               nas-server

## Virtual IPs
192.168.10.111   node1-vip.hingu.net      node1-vip
192.168.10.222   node2-vip.hingu.net      node2-vip

(3) Edit /etc/sysconfig/network-scripts/ifcfg-eth0 as shown below:

DEVICE=eth0
BOOTPROTO=none
IPADDR=192.168.10.22
HWADDR=00:06:5B:AE:AE:7F
ONBOOT=yes
TYPE=Ethernet

(4) Edit /etc/sysconfig/network-scripts/ifcfg-eth1 as shown below: <-- for the cluster interconnect

DEVICE=eth1
BOOTPROTO=static
HWADDR=00:13:46:6A:FC:6D
ONBOOT=yes
IPADDR=192.168.0.22
NETMASK=255.255.255.0
TYPE=Ethernet

(5) Edit /etc/sysconfig/network-scripts/ifcfg-eth2 on the RAC nodes as shown below: <-- for the iSCSI SAN storage network

DEVICE=eth2
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.1.22
NETMASK=255.255.255.0
HWADDR=00:18:F8:0F:0D:C1

(6) Edit the /etc/sysconfig/network file with the contents below:

NETWORKING=yes
HOSTNAME=node2-pub.hingu.net

(7) Restart the network service:

service network restart
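Once the network service is back up, it is worth confirming that the new node can reach the existing node over each of the three networks (host names as defined in /etc/hosts above):

ping -c 2 node1-pub
ping -c 2 node1-prv
ping -c 2 node1-nas
ping -c 2 nas-server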

Memory and Swap Space:

Oracle 11g R1 RAC requires at least 1 GB of RAM available on each node (for systems in this range, Oracle recommends swap space of roughly 1.5 times the RAM).
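The available memory and swap on the new node can be checked as shown below:

grep MemTotal /proc/meminfo
grep SwapTotal /proc/meminfo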

Kernel Parameters:

Oracle recommends that you set the shared memory segment attributes as well as the semaphores to the following values; if they are not set, database instance creation may fail. I added the following lines to the /etc/sysctl.conf file. Every OS process needs a semaphore on which it waits for resources.

NOTE: If the current value for any parameter is higher than the value listed below, do not change the value of that parameter.

The commands below retrieve the current values of these kernel parameters on the system:

/sbin/sysctl -a | grep sem                   <-- for semmsl, semmns, semopm, semmni
/sbin/sysctl -a | grep shm                   <-- for shmall, shmmax, shmmni
/sbin/sysctl -a | grep file-max
/sbin/sysctl -a | grep ip_local_port_range
/sbin/sysctl -a | grep rmem_default

Add or change the appropriate variable values in the /etc/sysctl.conf file as shown below.

# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Extra parameters for 11g RAC installation
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.shmall = 2097152
kernel.sem = 250 32000 100 128
fs.file-max = 6553600
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 262144
net.core.rmem_max = 4194304

After adding these lines to /etc/sysctl.conf, run the command below as root to bring them into effect:

sysctl -p
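Individual parameters can also be queried by name to spot-check that the new values took effect, for example:

/sbin/sysctl kernel.sem kernel.shmmax fs.file-max net.ipv4.ip_local_port_range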

Creating the oracle OS User Account:

Get the user ID and group IDs of the oracle user on the existing node by executing the id command, and supply the same IDs to the set of commands below when creating the oracle user and groups on the new node.

groupadd -g 900 dba
groupadd -g 901 oinstall
useradd -u 900 -g oinstall -G dba oracle
passwd oracle
id oracle
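With the IDs used above, the final id command should report something like the line below (assuming the existing node returned these same numeric IDs):

uid=900(oracle) gid=901(oinstall) groups=901(oinstall),900(dba)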

Creating Oracle Software Directories:

Create the same directories, with the same ownership and permissions, for all the ORACLE_HOMEs on the new node as on the existing node.

mkdir -p /u01/app/crs
mkdir -p /u01/app/asm
mkdir -p /u01/app/oracle
mkdir -p /u02/ocfs2
chown -R oracle:oinstall /u01
chown -R oracle:oinstall /u02
chmod -R 775 /u01/app/oracle
chmod -R 775 /u01

Setting Shell Limits for the Oracle User:

Add the following lines to the /etc/security/limits.conf file:

oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536

Add or edit the following line in the /etc/pam.d/login file, if it does not already exist:

session required /lib/security/pam_limits.so

For the Bourne, Bash, or Korn shell, add the following lines to /etc/profile:

if [ $USER = "oracle" ]; then
    if [ $SHELL = "/bin/ksh" ]; then
        ulimit -p 16384
        ulimit -n 65536
    else
        ulimit -u 16384 -n 65536
    fi
fi

For the C shell (csh or tcsh), add the following lines to /etc/csh.login:

if ( $USER == "oracle" ) then
    limit maxproc 16384
    limit descriptors 65536
endif
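After logging back in as the oracle user, the effective limits can be spot-checked from a bash session as shown below:

ulimit -Hu    # hard limit on user processes (expect 16384)
ulimit -Hn    # hard limit on open files (expect 65536)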

Enable SSH oracle User Equivalency on Both the Cluster Nodes:

On the New Node:

su - oracle
mkdir ~/.ssh
chmod 700 ~/.ssh

Generate the RSA and DSA keys:

/usr/bin/ssh-keygen -t rsa
/usr/bin/ssh-keygen -t dsa

On node1:

cd ~/.ssh
scp config node2-pub:.ssh/
scp authorized_keys node2-pub:.ssh/

On node2:

(a) Add the keys generated above to the authorized_keys file:

cd ~/.ssh
cat id_rsa.pub >> authorized_keys
cat id_dsa.pub >> authorized_keys

(b) Send this file back to node1:

scp authorized_keys node1-pub:.ssh/
chmod 600 authorized_keys

On both the Nodes:

chmod 600 ~/.ssh/authorized_keys

ssh node1-pub date
ssh node2-pub date
ssh node1-pub.hingu.net date
ssh node2-pub.hingu.net date
ssh node1-prv date
ssh node2-prv date

Enter 'yes' and continue when prompted.

If you get an error message when trying to connect to a remote node, make sure that the firewall is disabled on the remote node.
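On CentOS EL 4, the firewall status can be checked and the firewall disabled with the commands below (run as root on the remote node):

service iptables status
service iptables stop
chkconfig iptables off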

(5) Map the volumes on the iscsi-target (nas-server) to the disks discovered on the local RAC nodes.

Host ID    Target ID                                     Discovered as
0          iqn.2006-01.com.openfiler:rac11g.ocfs-dsk
1          iqn.2006-01.com.openfiler:rac11g.asm-dsk4
2          iqn.2006-01.com.openfiler:rac11g.asm-dsk3
3          iqn.2006-01.com.openfiler:rac11g.asm-dsk2
4          iqn.2006-01.com.openfiler:rac11g.asm-dsk1

Now run the command below to find the devices "Attached" to the Host IDs. The scsi ID in this output maps to the Host ID in the "iscsi-ls" output.

[root@node2-pub rpms]# dmesg | grep Attached
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
Attached scsi disk sdc at scsi2, channel 0, id 0, lun 0
Attached scsi disk sdd at scsi3, channel 0, id 0, lun 0
Attached scsi disk sde at scsi4, channel 0, id 0, lun 0

In the first line, scsi0 (Host ID 0) has the device "sda" attached to it. Filling in the table above with this information gives the mapping of the disks discovered on the client to their actual volumes on the iscsi-target, as shown below.

Host ID    Target ID                                     Discovered as
0          iqn.2006-01.com.openfiler:rac11g.ocfs-dsk     sda
1          iqn.2006-01.com.openfiler:rac11g.asm-dsk4     sdb
2          iqn.2006-01.com.openfiler:rac11g.asm-dsk3     sdc
3          iqn.2006-01.com.openfiler:rac11g.asm-dsk2     sdd
4          iqn.2006-01.com.openfiler:rac11g.asm-dsk1     sde

There is no need to partition the shared disks on the new node. After successful discovery of the shared volumes, as shown above, all the existing partitions on these volumes are available on node2-pub. Verify with fdisk -l.
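For example, the partition tables of all the newly discovered devices can be listed in one pass (device names as per the mapping table above):

for dev in sda sdb sdc sdd sde; do
    /sbin/fdisk -l /dev/$dev
done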

Make sure to add the node name exactly as it is returned by the `hostname` command. The hostname for the new node is node2-pub.hingu.net.

Propagate these changes to all the nodes in the cluster as shown below.

Setting "Name" to node2-nas in the ocfs2 configuration caused the error below when trying to enable the o2cb service.

[root@node2-pub rpms]# /etc/init.d/o2cb enable
Writing O2CB configuration: OK
Starting O2CB cluster ocfs2: Failed
Cluster ocfs2 created
Node node1-nas added
Node node2-nas added
o2cb_ctl: Configuration error discovered while populating cluster ocfs2. None of its nodes were considered local. A node is considered local when its node name in the configuration matches this machine's host name.
Stopping O2CB cluster ocfs2: OK

Stop the o2cb service, open the /etc/ocfs2/cluster.conf file, and update the name value to the one returned by the `hostname` command. Then start the service and load it again; this time the error should go away.
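The sequence looks like the sketch below (the edit itself can be made with any text editor):

/etc/init.d/o2cb stop
vi /etc/ocfs2/cluster.conf    # change name = node2-nas to name = node2-pub.hingu.net
/etc/init.d/o2cb load
/etc/init.d/o2cb online ocfs2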

/etc/ocfs2/cluster.conf:

node:
        ip_port = 7777
        ip_address = 192.168.0.11
        number = 0
        name = node1-pub.hingu.net
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.0.22
        number = 1
        name = node2-pub.hingu.net
        cluster = ocfs2

cluster:
        node_count = 2
        name = ocfs2

Load o2cb and start configuring OCFS2:

/etc/init.d/o2cb load
/etc/init.d/o2cb status
/etc/init.d/o2cb configure
chkconfig --add ocfs2
chkconfig --add o2cb
mkdir -p /u02/ocfs2

[root@node2-pub rpms]# /etc/init.d/o2cb configure
Configuring the O2CB driver.

This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot. The current values will be shown in brackets ('[]'). Hitting
<ENTER> without typing an answer will keep that current value. Ctrl-C
will abort.

Load O2CB driver on boot (y/n) [n]: y
Cluster to start on boot (Enter "none" to clear) [ocfs2]:
Specify heartbeat dead threshold (>=7) [7]:
Specify network idle timeout in ms (>=5000) [10000]:
Specify network keepalive delay in ms (>=1000) [5000]:
Specify network reconnect delay in ms (>=2000) [2000]:
Writing O2CB configuration: OK
Starting O2CB cluster ocfs2: OK

Mount the filesystem:

mount -t ocfs2 -o datavolume,nointr /dev/sda1 /u02/ocfs2

The error below may be seen at this point:

mount.ocfs2: Transport endpoint is not connected while mounting /dev/sda1 on /u02/ocfs2. Check 'dmesg' for more information on this error.

The possible solution is to disable SELinux and the firewall on the new node (which has already been done above).

Update the /etc/fstab:

# This file is edited by fstab-sync - see 'man fstab-sync' for details
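The entry for the OCFS2 mount would look something like the line below (a sketch, assuming /dev/sda1 holds the OCFS2 filesystem as mounted above; _netdev defers mounting until the network, and hence the iSCSI disks, are available):

/dev/sda1    /u02/ocfs2    ocfs2    _netdev,datavolume,nointr    0 0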