
RAC Attack is a free curriculum and platform for hands-on learning labs related to Oracle RAC (cluster database). We believe that the best way to learn about RAC is with a lot of hands-on experience. This curriculum has been used by individuals at home and by instructors in classes since 2008.

The original contributors were Jeremy Schneider, Dan Norris and Parto Jalili. The handbook was published at http://www.ardentperf.com for several years before its migration to this wikibook. All RAC Attack content was released under the CC-BY-SA license in May 2011 when this project was initiated.

To learn about upcoming RAC Attack events or to organize one yourself, visit the Events page. You can use the shortcut http://racattack.org/events to access this page at any time.

The goal of this workbook is to help students learn about Oracle RAC cluster databases through guided examples. (Specifically, 11gR2 RAC on VMware Server with ASM or Shared Filesystem and Oracle Enterprise Linux 5.) It can be used by organizers of events, by instructors in classes or by individuals at home.

RAC Attack differs in depth from other tutorials currently available.

Every keystroke and mouse click is carefully documented here.

The process is covered from the very beginning to the very end - from the very first installation of VMware on your laptop to various experiments on your running cluster database... with everything in between.

The labs in the main workbook have been tested thoroughly and repeatedly.

For the most benefit, you must plan your time carefully. There will not be enough time to complete all of the labs - so choose the ones which most interest you.

If you are using your own computer at home or at an event, then you always need to complete the first lab (Hardware and Windows Preparation) before you can jumpstart to any following labs. If you are in a class then the instructor has probably completed the first lab for you, and you can begin with a jumpstart.

These times were gathered with a laptop just meeting the recommended minimum requirements. In addition to the wait times listed below, we suggest that you reserve about 40 minutes of work time to complete any given lab.

Downloads only apply to home users. If you are at an event or a class then the organizers have already downloaded the software for you.

If your laptop or desktop does not meet these minimum requirements then it is not recommended to try completing the RAC Attack labs. Although it is possible to complete these labs with smaller configurations, there are many potential problems.

Although we recommend against trying, RAC Attack has been done with: single-core, 3GB memory, one physical hard drive, certain USB flash memory sticks, and less than 60GB of free space.

Scroll down to Memory in the right pane. Verify that Installed Physical Memory is at least 4GB. Also, verify that Available Memory is at least 1.4GB. You can terminate programs which run in the foreground and background to increase the Available Memory.

A single hard disk can max out as low as 45 MB/s. (This has been observed during RAC Attack testing.) Typical USB Flash Thumb Drives get very, very poor performance and should not be used. Some USB Flash Thumb Drives are marketed for performance; these typically get a maximum around 30 MB/s. In tests for RAC Attack, USB drives worked well for storing ISO images but somewhat poorly for storing virtual machine files. For a detailed comparison of different connection types, refer to: http://www.pixelbeat.org/speeds.html

RAC Attack is carefully designed to use three directories and spread out I/O for the best possible responsiveness during labs. You can choose how to spread the directories across your hard disks, and the best configuration may vary depending on your connection and storage type.

| Directory Name | Description | Free Space | Suggested Location |
|---|---|---|---|
| RAC11g | Operating System, Oracle RAC Software | 50 GB | Second Hard Disk (not flash) |
| RAC11g-shared | Oracle RAC Data | 7.5 GB | Windows Hard Disk* |
| RAC11g-iso | OEL Installation DVD (read-only) | 3 GB | Windows Hard Disk* |

*page file is usually on Windows Hard Disk

Note: do not create the RAC11g directory (with OS and Oracle Software) on a Flash Thumb Drive.
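As a quick sanity check, the free-space figures for the three directories can be added up. The sketch below only does that arithmetic; the function name is hypothetical and the numbers are the ones listed for RAC11g, RAC11g-shared and RAC11g-iso above.

```shell
# Sum the free space (in GB) needed for the three RAC Attack directories.
# Values from the directory list: RAC11g=50, RAC11g-shared=7.5, RAC11g-iso=3.
# total_free_space_gb is a hypothetical helper name, not part of RAC Attack.
total_free_space_gb() {
    awk 'BEGIN { print 50 + 7.5 + 3 }'
}

echo "Total free space needed: $(total_free_space_gb) GB"
```

The total comes to 60.5 GB, spread over whichever disks you choose for the three directories.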

We worked hard to reduce the footprint of RAC Attack; however, with 11gR2 it is very difficult to reduce it beyond this.

RAC Attack requires a local Windows user account with a password and with administrative privileges. You may log in using a network or password-free account only if the login account has admin privileges and you know the password for a local account which also has admin privileges (and not an empty password).

If your account is not local, or if your account does not have local admin privileges then you can create an admin account by following the directions here.

Type net user %username% (if you're using a network or password-free login account, replace %username% with the name of the local, password-protected admin account). VERIFY the username, VERIFY that password required is yes, and VERIFY that local group memberships include Administrators.

If you are at home, then download VMware Server. If you are at a RAC Attack event then the instructor-provided Jumpstart Drive contains a copy of VMware Server, so that you don't need to download it. (However you still need a license number from the VMware website.)

These labs have been tested with version 2.0.1 of VMware Server.

Run the VMware Installer

Accept the license agreement and all default options during the installation process.

Enter your license information, which is visible at the VMware website on the same page where you downloaded the software.

Choose Manage Virtual Networks from the start menu. After the program starts, make sure that you see an "Apply" button at the bottom. If you do not see an "Apply" button then close the program and re-start it by right-clicking and choosing to "run as administrator" (this is normally required on Windows 7).

Click the Host Virtual Network Mapping Tab and then click the Right Arrow Button next to VMnet1. Choose Subnet from the submenu.

Set the IP address to 172.16.100.0 and click OK.

Click the Right Arrow Button next to VMnet8 and choose Subnet from the submenu.

Set the IP address to 192.168.78.0 and click OK.

Click the APPLY button.

Return to the Summary tab and VALIDATE: VMnet1 has subnet 172.16.100.0 and VMnet8 has subnet 192.168.78.0.

Go to the NAT tab and VALIDATE that the VMnet host is VMnet8 and Gateway IP is 192.168.78.2
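The relationship between the subnets and the gateway address can be expressed as a small check. This is a minimal sketch that assumes a /24 netmask (255.255.255.0), which the handbook does not state explicitly; the function name in_subnet24 is made up for illustration.

```shell
# in_subnet24 IP SUBNET
# Succeeds when IP shares its first three octets with SUBNET.
# Assumption: a /24 netmask (255.255.255.0); the handbook does not state it.
in_subnet24() {
    [ "${1%.*}" = "${2%.*}" ]
}

# The NAT gateway should sit on the VMnet8 subnet, not on VMnet1.
in_subnet24 192.168.78.2 192.168.78.0 && echo "gateway is on VMnet8"
in_subnet24 192.168.78.2 172.16.100.0 || echo "gateway is not on VMnet1"
```

The same helper can be reused later to confirm that a node's private address belongs to the 172.16.100.0 subnet.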

If you are at an event, then the event organizers might provide a special DEMO option - where you can run a pre-configured RAC cluster on your own laptop. In order to use this DEMO option, follow this lab but use the directories on the event-provided external hard drive.

RAC Attack is carefully designed to use three directories and spread out I/O for the best possible responsiveness during labs. Create these three directories in the destinations that you chose in Hardware and Windows Minimum Requirements, taking the guidelines into consideration.

mkdir C:\RAC11g
mkdir D:\RAC11g-shared
mkdir D:\RAC11g-iso

In the RAC11g directory, make sure that collabn1 and collabn2 subdirectories don't exist.

rmdir C:\RAC11g\collabn1
rmdir C:\RAC11g\collabn2

The VMware Server management interface is web-based, and some new web browsers are not compatible with it. There are two ways to open this management interface:

If you are at an event, then the event organizers might have provided Firefox 2.0.0.20 which has been tested with RAC Attack. You can run this browser directly from the Jumpstart Drive without installing it on your PC. This version of Firefox can also be downloaded from the internet.

Launch VMware Server Home Page from the start menu. This will use your default web browser.

Depending on what web browser you use, you might receive security-related warnings. Proceed through all of these warnings and choose to view the web page.

The warning in Mozilla Firefox

The same alert in Internet Explorer 6

Log in to the VMware console with the local Windows admin account username and password.

On the main screen (Summary tab), find the Commands box and choose Add Datastore.

Repeat this step three times. Set the datastore names to RAC11g, RAC11g-shared and RAC11g-iso. Choose Local Datastore and use the directory path which you previously chose and created.

VERIFY that the three new datastores exist in the Summary screen – named RAC11g and RAC11g-iso and RAC11g-shared. Also VERIFY that the two networks vmnet1 and vmnet8 are available as HostOnly and NAT respectively.

Enter your Name, Company, Email and Country and review/accept the license and export restrictions before clicking Continue. If you have visited Oracle EDelivery before then make sure to enter your information exactly the same.

If this is the first time you've downloaded software from Oracle, then you might have to wait a few days until you receive an email from Oracle granting you permission to continue.

Open the section called Virtual Device Node and choose IDE 0:0. Then click Next.

Carefully follow this step because it's easy to miss.

Click Finish to add the device. Don't power on the virtual machine yet.

If you are in a class, then the instructor may have provided a second virtual DVD named RAC11gR2.iso to save some class time. It contains all additional software downloads.

Repeat all previous steps from this lab to add the second DVD using RAC11gR2.iso image and choosing IDE 0:1.

If you are not in a class, then you will later download all needed software and build the second DVD yourself.

Continue below.

Scroll down to the Hardware box and confirm the Virtual Machine settings. They should match this picture (except that you should only see the second DVD if you are in a class and it was provided by the instructor):

Click the Console tab. You might see a message saying that the Remote Console Plug-in is not installed. If you see this message then click Install plug-in and follow the directions before continuing. (Note: you may be asked to restart your computer during this process.)

When the plugin is installed, you should see a large “play” button in the center of the console. Click on the play button to start the VM.

When you see the square boxes, click anywhere to open a console window.

A new window will now open - outside of your web browser. If you opened this window soon after starting the Virtual Machine, then you will see the boot screen of the Oracle Enterprise Linux installer.

At first, this new console window will ignore your keyboard and mouse. Click inside the new console window and it will begin accepting your keyboard and mouse.

Anytime your keyboard and mouse are stuck in the VMware Virtual Machine, you can press CTRL and ALT together to move them outside the VM.

If you still see the boot screen then you may press enter to continue, or just wait for it to automatically continue.

Choose to SKIP the media test.

Choose NEXT when the first installer screen comes up.

Accept the default English language and choose Next.

Choose US English keyboard layout and click Next.

Select YES to initialize the drive.

Accept the default layout (with no encryption) and choose NEXT.

Choose YES to remove all partitions.

Set the hostname to collabn1.vm.ardentperf.com and leave DHCP enabled before choosing NEXT.

Choose the timezone where you are located! Let the system clock run on UTC though.

Set the root password to racattack

Choose Customize Now – but don't choose any "additional tasks". Then click NEXT.

Select only these package groups, then click NEXT to continue:

| Category | Selections |
|---|---|
| Desktop Environments | Gnome Desktop Environment |
| Applications | Editors, Graphical Internet, Text-based Internet |
| Development | Development Libraries, Development Tools |
| Servers | Server Configuration Tools |
| Base System | Administration Tools, Base, System Tools, X Window System |

Do not choose Cluster Storage or Clustering.

Choose NEXT to start the installation.

Choose REBOOT when the installation is complete.

After the machine reboots – when you see the Welcome screen – choose FORWARD.

Tip: If you are familiar with the Unix command line, then we recommend connecting through SSH in addition to using the VMware console. You can then copy-and-paste many commands from this handbook! Until we configure networking, VMware will assign the address 192.168.78.128.

Log in as the user root with password racattack.

GNOME is the graphical window environment installed by default in OEL. First, disable GNOME CD automount. Go to the menu System >> Preferences >> Removable Drives and Media.

Uncheck all of the options under Removable Storage and click Close.

Open a terminal window from the menu Applications >> Accessories >> Terminal.

From the menus, open Edit >> Current Profile.

In the Title and Command tab, check the box for Run command as a login shell, then close the dialog.

The editor "gedit" is a simple graphical editor – similar to notepad – and it can be used to edit files on Linux. If you are going to use gedit, then it is helpful if you open Edit > Preferences to disable text wrapping and enable line numbers.

In a terminal window as the root user, shut down and disable anacron, then run it manually with no delay.
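One way to carry out this step is sketched below. It assumes the standard Enterprise Linux 5 utilities service, chkconfig and anacron; the run wrapper, the DRY_RUN variable and the function name are illustrative additions, not part of the handbook.

```shell
# Sketch only: assumes the standard EL5 tools service, chkconfig and anacron.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

disable_and_run_anacron() {
    run service anacron stop      # stop the running daemon
    run chkconfig anacron off     # do not start it at boot
    run anacron -n                # run pending jobs now, ignoring delays
}

# Preview the commands without changing anything:
DRY_RUN=1 disable_and_run_anacron
```

As root, calling disable_and_run_anacron without DRY_RUN set would execute the three commands for real.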

It should not cause any problems for you, but be aware that several CPU and I/O intensive jobs will run in the background for about 10 minutes while you continue with this lab (e.g. updatedb and makewhatis). You might notice some slight system performance degradation. You can always use the program top to see what is currently running.

In a terminal window as the root user, shut down and disable the automounter.

If any of the small CD images in the status bar do not have a green dot, then click on the CD image and choose "Connect to [RAC11g] iso/... on Server". If a window opens showing the CD contents then make sure to close the window.

Create two CDROM directories named cdrom and cdrom5.

Make sure to use these names because many later steps in this handbook will reference them!
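A minimal sketch of creating the two directories follows. It assumes they belong under /mnt, which is inferred from the mount /mnt/cdrom transcript in this handbook; the function name is hypothetical.

```shell
# make_cdrom_dirs [BASE]
# Create the cdrom and cdrom5 mount points under BASE.
# Assumption: BASE defaults to /mnt, inferred from the "mount /mnt/cdrom"
# transcript in this handbook; the function name is hypothetical.
make_cdrom_dirs() {
    base="${1:-/mnt}"
    mkdir -p "$base/cdrom" "$base/cdrom5"
}

# Try it against a scratch directory first; as root, call it with no argument.
scratch=$(mktemp -d)
make_cdrom_dirs "$scratch" && ls "$scratch"
```

Using mkdir -p means the function can be re-run safely if one of the directories already exists.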

Return to the Summary tab in the VMware console. From the Status box, choose to Install VMware Tools. Click the Install button to begin.

Install VMware client tools and run configuration tool.

You must perform this step in the VMware Console; do not use PuTTY or any other terminal program.

[root@collabn1 mnt]# mount /mnt/cdrom
mount: block device /dev/cdrom-hda is write-protected, mounting read-only
[root@collabn1 mnt]# rpm -ivh /mnt/cdrom/VMwareTools-7.7.5-156745.i386.rpm
Preparing... ########################################### [100%]
1:VMwareTools ########################################### [100%]
The installation of VMware Tools 7.7.5 for Linux completed successfully.
You can decide to remove this software from your system at any time by
invoking the following command: "rpm -e VMwareTools".
Before running VMware Tools for the first time, you need to
configure it for your running kernel by invoking the
following command: "/usr/bin/vmware-config-tools.pl".
Enjoy,
--the VMware team
[root@collabn1 cdrom]# vmware-config-tools.pl

...

Choose NO to skip the VMware FileSystem Sync Driver (vmsync)

Choose display size [12] – 1024x768

Mounting HGFS shares will probably FAIL, but this is ok.

Run the network commands. (You can cut and paste the commands into the terminal.) Next, run vmware-toolbox and enable clock synchronization.

If you have already downloaded any of these files, you may optionally copy them to the /tmp directory in your virtual machine. When you create the DVD, any remaining files will be automatically downloaded.

Create the DVD by running the automatic build script. You will be prompted for your Oracle SSO login and password.

If your account is not authorized for Oracle Support then patch downloads will fail.

Close and re-open your terminal sessions so that the new profiles take effect.

Install fix_cssd script.

In VMware test environments you usually have a very small amount of memory. Oracle CSS processes can take up a *LOT* of the memory (over 50% in this lab) because they lock several hundred MB in physical memory. In VMware (for both ASM and RAC environments) this may be undesirable. This low-level hack will make the memory swappable at runtime.

NEVER, EVER, EVER EVEN IN YOUR WILDEST DREAMS THINK ABOUT TRYING THIS ON ANYTHING CLOSE TO A PRODUCTION SYSTEM.

In the Inventory tab at the left, select collabn1 (the virtual machine we just created).

From the Commands box, click Add Hardware. In the window that appears, click Hard Disk.

Choose to Create a New Virtual Disk and click Next.

Enter a capacity of 3.25 GB and type the name “[RAC11g-shared] data.vmdk”.

Choose File Options → Allocate all disk space now.

Choose Disk Mode → Independent and Persistent.

Choose Virtual Device Node → SCSI 1:0. Click Next to continue.

Click Finish to create the disk.

It may take a moment for the disk to appear to the VMware console. Wait until the new disk appears before you continue with the lab. Furthermore, the web browser may display an error which requires you to reload the page and login to VMware again.

Repeat steps 1-5 for the second disk (it is listed at the beginning of this lab).

CONFIRM that your list of hard disks and network devices matches this screenshot.

From the Commands box, click Configure VM.

Click the Advanced tab and scroll down to the Configuration Parameters. Use the Add New Entry button to add the entries listed here. Click OK to save the configuration changes.

| Name | Value |
|---|---|
| disk.locking | false |
| diskLib.dataCacheMaxSize | 0 |
| diskLib.maxUnsyncedWrites | 0 |
| mainMem.useNamedFile | false |
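For reference, the four entries above can be written out in the key = "value" form used by VMware .vmx files. This is only a sketch: it assumes the Add New Entry dialog stores equivalent lines in the virtual machine's .vmx file, and the function name is invented for illustration.

```shell
# Print the four configuration parameters in the key = "value" form used by
# VMware .vmx files. Assumption: the Add New Entry dialog ends up writing
# equivalent lines to the virtual machine's .vmx file.
print_vmx_entries() {
    while read -r name value; do
        printf '%s = "%s"\n' "$name" "$value"
    done <<'EOF'
disk.locking false
diskLib.dataCacheMaxSize 0
diskLib.maxUnsyncedWrites 0
mainMem.useNamedFile false
EOF
}

print_vmx_entries
```

Comparing this output with the entries shown in the Advanced tab is an easy way to spot a mistyped name.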

I have found the following three websites among the most useful while creating custom VMware configurations. They show how powerful and versatile VMware is – even the free VMware Server product.

As root, restart the network services by typing service network restart. Then confirm the new IP addresses with ifconfig. Also confirm the search domain by inspecting /etc/resolv.conf – if the file has reverted then edit it again. (When I wrote this lab, the change stuck after the second time I edited the file.)

You must perform this step in VMware; do not use PuTTY.
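The search-domain check described above can be scripted so that it is easy to repeat after each network restart. The sketch below assumes the search domain is vm.ardentperf.com, which is inferred from the hostname collabn1.vm.ardentperf.com rather than stated in this step; the function name is hypothetical.

```shell
# has_search_domain DOMAIN [FILE]
# Succeeds when FILE (default /etc/resolv.conf) has a "search" line listing DOMAIN.
has_search_domain() {
    file="${2:-/etc/resolv.conf}"
    [ -r "$file" ] || return 1
    awk -v d="$1" '
        $1 == "search" { for (i = 2; i <= NF; i++) if ($i == d) found = 1 }
        END { exit !found }
    ' "$file"
}

# Assumption: the lab's search domain is vm.ardentperf.com, inferred from the
# hostname collabn1.vm.ardentperf.com used earlier in this handbook.
if has_search_domain vm.ardentperf.com; then
    echo "search domain is set"
else
    echo "search domain is missing; edit /etc/resolv.conf again"
fi
```

If the check reports that the domain is missing after a restart, edit the file again as described and re-run it.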

Edit /etc/hosts. EDIT the line with 127.0.0.1 and then ADD all of the other lines below:

As root, restart the network services by typing service network restart. Then confirm the new IP addresses with ifconfig. Confirm the search domain by inspecting /etc/resolv.conf – if the file has reverted then edit it again. (The change stuck after the second time I edited the file while walking through this lab.) Also confirm the new hostname with hostname.

You must perform this step in VMware; do not use PuTTY.

Exit your terminal session and start a new one so that you can see the updated hostname in the prompt.

Edit /etc/hosts. EDIT the line with 127.0.0.1 and then ADD all of the other lines below:

As the oracle user, launch the grid installer. At the first screen, choose Install and Configure Grid Infrastructure for a Cluster and click NEXT.

[oracle@collabn1 ~]$ /mnt/cdrom*/grid/runInstaller

Choose Advanced Installation and click NEXT.

Accept the default language (English) and choose NEXT.

Name the cluster collab and make sure that the SCAN name is collab-scan with port 1521, then click NEXT.

Add node collabn2 with VIP collabn2-vip and choose NEXT to validate the cluster configuration.

Verify that eth0 on subnet 192.168.78.0 is PUBLIC and that eth1 on subnet 172.16.100.0 is PRIVATE, then click NEXT.

Choose to store the Clusterware Files in ASM and choose NEXT.

Create a diskgroup called DATA with External Redundancy using only the disk ORCL:DATA and click NEXT.

Choose to use the same passwords for all accounts and enter the password racattack, then click NEXT. (Ignore the message that Oracle doesn't like this password.)

Do not use IPMI. Click NEXT.

Set the OSDBA group to asmdba, the OSOPER group to asmoper and the OSASM group to asmadmin. Then click NEXT.

Accept the ORACLE_BASE location of /u01/app/oracle and use the ORACLE_HOME location of /u01/grid/oracle/product/11.2.0/grid_1. Then click NEXT.

Accept the default inventory location of /u01/app/oraInventory and choose NEXT.

The prerequisite checks will execute. A warning will be issued saying that three checks failed: physical memory, swap size and network time protocol. Click the CHECK BOX to Ignore All, then click NEXT.

SAVE a response file called grid.rsp in the oracle user home directory. Then click FINISH to install grid infrastructure.

When prompted, open a terminal as the root user and run the two root.sh scripts. Make sure to run BOTH SCRIPTS on BOTH NODES!

Before you run any scripts on the second node, check the CPU utilization on the first node - where you just finished running scripts. If %idle is 0 then something is still running in the background and you should wait until %idle increases. You can monitor the CPU with any of these three commands:
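The handbook's three commands are not reproduced here. As one possibility, assuming the standard vmstat utility is installed, the idle percentage can be pulled out of its output with a small helper; vmstat_idle is a hypothetical name, and the helper simply reads the "id" column.

```shell
# vmstat_idle: read vmstat output on stdin and print the CPU idle percentage
# (the "id" column) from the last sample.
vmstat_idle() {
    awk '
        { for (i = 1; i <= NF; i++) if ($i == "id") { col = i; next } }
        col && NF >= col { idle = $col }
        END { if (idle == "") exit 1; print idle }
    '
}

# Take two samples one second apart; the first sample is the average since
# boot, so only the last one is reported. Skipped when vmstat is not installed.
if command -v vmstat >/dev/null 2>&1; then
    echo "idle: $(vmstat 1 2 | vmstat_idle)%"
fi
```

Run it on collabn1 a few times; once the reported value rises well above 0, it should be safe to move on to the second node.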

After running both scripts, return to the installer window and click OK to continue running configuration assistants.

The Cluster Verification Utility will fail because NTP is not running. If you want to, check the error message at the very end of the logfile. Then click OK to close the messagebox and click NEXT to continue.

You should now see the final screen! Click CLOSE to exit the installer.

These steps are not necessary for a test or production environment. However they might make your VMware test cluster just a little more stable and they will provide a good learning opportunity about Grid Infrastructure.

Grid Infrastructure must be running on only one node to change these settings. Shut down the clusterware on collabn2 as user root.

Choose CONFIGURE NODES… from the CLUSTER menu. If you see a notification that the cluster has been started, then acknowledge it by clicking the Close button.

Click ADD and enter collabn1 and the private IP 172.16.100.51. Accept the default port. Click OK to save.

Click ADD a second time and enter collabn2 and 172.16.100.52. Then choose APPLY and click CLOSE to close the window.

Choose PROPAGATE CONFIGURATION… from the CLUSTER menu. If you are prompted to accept host keys then type YES. Type the root password racattack at both prompts. When you see the message “Finished!” then press <ALT-C> to close the window.

From the TASKS menu, choose FORMAT to create the OCFS2 filesystem. Select /dev/sdb1 and type the volume label u51-data. Leave the rest of the options at their defaults and click OK to format the volume. Confirm by clicking YES.

Repeat the previous step for volume /dev/sdc1 and name it u52-backup.

Exit the OCFS2 console by selecting QUIT from the FILE menu.

Configure OCFS2 on both nodes. We will use a conservative disk heartbeat timeout (300 seconds) because VMware is slow on some laptops.
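The O2CB cluster stack expresses the disk heartbeat timeout as a threshold counted in 2-second iterations rather than in seconds. The sketch below converts the 300-second timeout into that threshold, using the relationship documented for OCFS2 (timeout in seconds = (threshold - 1) * 2); the helper name is hypothetical, and you should confirm the value against the prompt shown by your OCFS2 version.

```shell
# O2CB measures the disk heartbeat timeout in 2-second iterations:
#   timeout_seconds = (O2CB_HEARTBEAT_THRESHOLD - 1) * 2
# heartbeat_threshold is a hypothetical helper that inverts this formula.
heartbeat_threshold() {
    echo $(( $1 / 2 + 1 ))
}

echo "O2CB_HEARTBEAT_THRESHOLD for 300 seconds: $(heartbeat_threshold 300)"
```

For the 300-second timeout used in this lab, the helper yields a threshold of 151.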

Several NFS appliances and big-iron cluster filesystems are very common in large cluster database deployments. We will use OCFS2 here to practice 11gR2 RAC with a filesystem.

Note: 11gR2 clusterware has a bug – it does not allow cluster files on OCFS2 (even though this is a supported configuration). To work around this bug, we will present the OCFS2 directory to clusterware with a local "loopback" NFS mount.

As the root user, follow the steps below to set up the local NFS mount on node collabn1.


As the oracle user, launch the grid installer. At the first screen, choose Install and Configure Grid Infrastructure for a Cluster and click NEXT.

[oracle@collabn1 ~]$ /mnt/cdrom*/grid/runInstaller

Choose Advanced Installation and click NEXT.

Accept the default language (English) and choose NEXT.

Name the cluster collab and make sure that the SCAN name is collab-scan with port 1521, then click NEXT.

Add node collabn2 with VIP collabn2-vip and choose NEXT to validate the cluster configuration.

Verify that eth0 on subnet 192.168.78.0 is PUBLIC and that eth1 on subnet 172.16.100.0 is PRIVATE, then click NEXT.

Choose to store the Clusterware Files in Shared File System and choose NEXT.

For the OCR, choose External Redundancy and type the path /u61/cluster/ocr. (This is the NFS location from the BUG WORKAROUND.) Click NEXT to continue.

For the Voting Disk, do the same – choose External Redundancy and type the path /u61/cluster/vdsk. (Again, this is the NFS location from the BUG WORKAROUND.) Click NEXT to continue.

Do not use IPMI. Click NEXT.

Set the OSDBA group to asmdba, the OSOPER group to asmoper and the OSASM group to asmadmin. Then click NEXT.

Accept the ORACLE_BASE location of /u01/app/oracle and use the ORACLE_HOME location of /u01/grid/oracle/product/11.2.0/grid_1. Then click NEXT.

Accept the default inventory location and choose NEXT.

The prerequisite checks will execute. A warning will be issued saying that three checks failed: physical memory, swap size and network time protocol. Click the CHECK BOX to Ignore All, then click NEXT.

SAVE a response file called grid.rsp in the oracle user home directory. Then click FINISH to install grid infrastructure.

When prompted, open a terminal as the root user and run the two root.sh scripts. Make sure to run BOTH SCRIPTS on BOTH NODES!

[root@collabn1 ~]# ssh collabn2
root@collabn2's password: racattack
-bash: oraenv: No such file or directory
[root@collabn2 ~]# /u01/app/oraInventory/orainstRoot.sh
Changing permissions of /u01/app/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.
Changing groupname of /u01/app/oraInventory to oinstall.
The execution of the script is complete.
[root@collabn2 ~]# /u01/grid/oracle/product/11.2.0/grid_1/root.sh
Running Oracle 11g root.sh script...
The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME= /u01/grid/oracle/product/11.2.0/grid_1
Enter the full pathname of the local bin directory: [/usr/local/bin]: /usr/bin
Copying dbhome to /usr/bin ...
Copying oraenv to /usr/bin ...
Copying coraenv to /usr/bin ...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2011-03-30 17:04:26: Parsing the host name
2011-03-30 17:04:26: Checking for super user privileges
2011-03-30 17:04:26: User has super user privileges
Using configuration parameter file: /u01/grid/oracle/product/11.2.0/grid_1/crs/install/crsconfig_params
Creating trace directory
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node collabn1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
CRS-2672: Attempting to start 'ora.mdnsd' on 'collabn2'
CRS-2676: Start of 'ora.mdnsd' on 'collabn2' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'collabn2'
CRS-2676: Start of 'ora.gipcd' on 'collabn2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'collabn2'
CRS-2676: Start of 'ora.gpnpd' on 'collabn2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'collabn2'
CRS-2676: Start of 'ora.cssdmonitor' on 'collabn2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'collabn2'
CRS-2672: Attempting to start 'ora.diskmon' on 'collabn2'
CRS-2676: Start of 'ora.diskmon' on 'collabn2' succeeded
CRS-2676: Start of 'ora.cssd' on 'collabn2' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'collabn2'
CRS-2676: Start of 'ora.ctssd' on 'collabn2' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'collabn2'
CRS-2676: Start of 'ora.drivers.acfs' on 'collabn2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'collabn2'
CRS-2676: Start of 'ora.asm' on 'collabn2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'collabn2'
CRS-2676: Start of 'ora.crsd' on 'collabn2' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'collabn2'
CRS-2676: Start of 'ora.evmd' on 'collabn2' succeeded
collabn2 2011/03/30 17:12:32 /u01/grid/oracle/product/11.2.0/grid_1/cdata/collabn2/backup_20110330_171232.olr
Preparing packages for installation...
cvuqdisk-1.0.7-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Updating inventory properties for clusterware
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB. Actual 1205 MB Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
'UpdateNodeList' was successful.

After running both scripts, return to the installer window and click OK to continue running configuration assistants.

The Cluster Verification Utility will fail because NTP is not running. If you want to, check the error message at the very end of the logfile. Then click OK to close the messagebox and click NEXT to continue.

You should now see the final screen! Click CLOSE to exit the installer.

These steps are not necessary for a test or production environment. However they might make your VMware test cluster just a little more stable and they will provide a good learning opportunity about Grid Infrastructure.

Grid Infrastructure must be running on only one node to change these settings. Shut down the clusterware on collabn2 as user root.

Log in to collabn1 as the oracle user and open a terminal. Run CLUVFY to check that you're ready to start the DB install. The memory, swap and NTP/time checks may fail but everything else should succeed.
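A sketch of the CLUVFY pre-check is shown below. It assumes the cluvfy utility from the Grid Infrastructure home is on the oracle user's PATH and uses the pre-database-installation stage for both lab nodes; the run wrapper, the DRY_RUN variable and the function name are illustrative only.

```shell
# Sketch only: assumes the cluvfy utility from the Grid Infrastructure home is
# on the oracle user's PATH. With DRY_RUN=1 the command is printed, not run.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Pre-check for a database installation on both lab nodes.
precheck_db_install() {
    run cluvfy stage -pre dbinst -n collabn1,collabn2
}

# Preview the command without running it:
DRY_RUN=1 precheck_db_install
```

Calling precheck_db_install without DRY_RUN set runs the check itself, which can take several minutes on a laptop.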

Confirm that the ORACLE_BASE is /u01/app/oracle and change the ORACLE_HOME to /u01/app/oracle/product/11.2.0/db_1. Click NEXT to continue.

Verify that the OSDBA group is dba and the OSOPER group is oper. Click NEXT to continue.

The prerequisite checks will execute. A warning will be issued saying that three checks failed: physical memory, swap size and network time protocol. Click the CHECK BOX to Ignore All, then click NEXT.

SAVE a response file called db.rsp in the oracle user home directory. Then click FINISH to install the oracle database software.

When prompted, open a terminal as the root user and run the root.sh script. Enter /usr/bin as the local bin directory and overwrite the files which were previously installed by grid infrastructure. Make sure to run it on BOTH NODES!

ASM Databases Only: Verify that both diskgroups are mounted. If you have jumpstarted or rebooted, then the BACKUP diskgroup may be dismounted. To mount it, right click then choose Mount on All Nodes. Click EXIT to close the ASM Configuration Assistant.

Type ". oraenv" to setup the environment. Leave the default SID and enter /u01/app/oracle/product/11.2.0/db_1 for the ORACLE_HOME. Then type dbca to launch the Database Configuration Assistant.

At the first prompt, choose Real Application Clusters Database and click NEXT.

Choose to CREATE A DATABASE then click NEXT to continue.

Select GENERAL PURPOSE OR TRANSACTION PROCESSING then click NEXT to continue.

Choose Admin-Managed Database, set the global database name to RAC.vm.ardentperf.com and select all cluster nodes. Then click NEXT to continue.

Do not configure Enterprise Manager (there's probably not enough memory here). Uncheck it and click the Automatic Maintenance Tasks tab.

Disable the automatic maintenance tasks (they can really tax the CPU on your laptop...) After unchecking the box, click NEXT to continue.

Set all passwords to racattack and click NEXT to continue. Choose YES to continue even though Oracle doesn't like the password.

Choose a Storage Type depending on which track of the RAC Attack lab you're doing.

| Oracle ASM | Shared Filesystem |
|---|---|
| Choose a Storage Type of Automatic Storage Management (ASM). | Choose a Storage Type of Cluster File System. |

Configure Oracle Managed Files.

Oracle ASM: Choose ORACLE MANAGED FILES and type +DATA for the database area. Then click NEXT to continue.

Shared Filesystem: Choose ORACLE MANAGED FILES and type /u51/oradata for the database area. Then click NEXT to continue.

Configure a Flash Recovery Area.

Oracle ASM: Choose to SPECIFY FLASH RECOVERY AREA and type +BACKUP as the destination. Increase the size to 3000 MBytes. Do not enable archiving and choose NEXT to continue.

Shared Filesystem: Choose to SPECIFY FLASH RECOVERY AREA and type /u52/oradata as the destination. Increase the size to 3000 MBytes. Do not enable archiving and choose NEXT to continue.

Oracle will automatically create a directory tree in the specified location and it will separate files by type and by database.

Choose to install the sample schemas. After checking the box, click NEXT to continue.

Bump the memory target up to 400MB and do not check Automatic Memory Management. Skip the other tabs and click NEXT to continue.

Upgrades to the "base version" are very complicated and always use the full Oracle installer (runInstaller). Major new features are only introduced in new base versions.

Patch Sets are also installed with the full Oracle installer. Historically, each Patch Set was installed on top of the base version (top row in the illustration) by using runInstaller. However, starting with 11.2.0.2 the Patch Sets can be installed as a new installation without the base version. It is now recommended to perform Patch Set upgrades "out-of-place" in this manner. Sometimes new features are also included with Patch Sets (for example RAT data collection).

PSUs are installed with opatch. They include security updates and important bug fixes. They are released quarterly and always include the latest CPU.

CPUs are installed with opatch. CPUs include only security updates, and are also released quarterly. Once you have applied any PSU, CPUs can no longer be applied until you upgrade to a new patch set or base version.

Before performing any installation or upgrade of Oracle, you should always check the Support Status and Known Issues for the release. Metalink note 161818.1 is always the starting point – open this note and review it. Next, follow the link for 11.2.0.X to metalink note 880782.1 and review that note. Finally, follow the link to note 880707.1 and review the known issues with Oracle 11.2.0.1 which is the version we will be using for this lab.

These notes have been saved as HTML files on the virtual DVD provided by the instructor. It is available in your Virtual RAC Nodes at /mnt/cdrom5.

For this lab, the instructor has provided recent PSUs. PSUs and CPUs are collections of one-off patches. One-off patches can only be applied to an Oracle database in a rolling manner if they have been certified for rolling upgrades.

Review the installation instructions. We're going to install three patches and you can find the README files at these locations:

/mnt/cdrom5/patch/psu6-db-12419378/12419378/README.html

/mnt/cdrom5/patch/psu2-gi-9655006/README.html

/mnt/cdrom5/patch/opatch-6880880/README.txt

First we need to update the OPatch utility. Find patch 6880880 on the instructor-provided CDROM and unzip it directly into both the grid home and the database home. Before unzipping the file, back up the existing OPatch programs.
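The backup step can be sketched as a small shell function. This is only one way to do it, under the assumption that OPatch lives in an OPatch directory directly under each Oracle home; the function name and the timestamped backup name are illustrative and not part of the lab.

```shell
# Move the existing OPatch directory aside inside one Oracle home.
# $1 = Oracle home; prints the backup path on success.
backup_opatch() {
  home=$1
  backup="$home/OPatch.bak.$(date +%Y%m%d%H%M%S)"
  if [ ! -d "$home/OPatch" ]; then
    echo "no OPatch directory under $home" >&2
    return 1
  fi
  mv "$home/OPatch" "$backup" && echo "$backup"
}

# Usage on a lab node, once per home (the zip file name is a placeholder):
#   backup_opatch /u01/grid/oracle/product/11.2.0/grid_1
#   unzip -q /mnt/cdrom5/patch/opatch-6880880/<zipfile> -d /u01/grid/oracle/product/11.2.0/grid_1
#   backup_opatch /u01/app/oracle/product/11.2.0/db_1
#   unzip -q /mnt/cdrom5/patch/opatch-6880880/<zipfile> -d /u01/app/oracle/product/11.2.0/db_1
```

Moving the directory (rather than copying it) means the unzip starts from a clean location and no stale files from the old OPatch version are left behind.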

This new version of OPatch requires an "OCM response file" for certain operations. Use the OCM utility to generate this file. We don't want to configure OCM; leave your username blank and confirm that "YES" you don't want to enter any account information.

The automated patch application process will automatically shut down and restart all database processes on the node. However, we don't want the automatic restart because we are applying two PSUs (one for grid and one for database). Disable the instance auto-start for node collabn1 and manually shut down the instance for patch application.

On a production system, all active connections would need to be migrated to the other instance before doing this (for example, with services).

Enable and start the Oracle database instance on node collabn1. After the instance is running, stop and disable the instance on node collabn2. There should be no point at which the database is not running.
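The instance handling around the patching can be sketched with srvctl. The function below only prints the commands so that you can review them first; it assumes the database is registered as RAC with instance RAC1 on collabn1 and RAC2 on collabn2, which you should confirm on your own cluster. The function name is illustrative.

```shell
# Print the srvctl commands that take one instance out of service or bring it back.
# $1 = offline|online, $2 = database name, $3 = instance name
instance_cmds() {
  case $1 in
    offline)
      echo "srvctl disable instance -d $2 -i $3"
      echo "srvctl stop instance -d $2 -i $3"
      ;;
    online)
      echo "srvctl enable instance -d $2 -i $3"
      echo "srvctl start instance -d $2 -i $3"
      ;;
    *)
      echo "usage: instance_cmds offline|online <db> <instance>" >&2
      return 1
      ;;
  esac
}

# Example, run as the oracle user once the output looks right:
#   instance_cmds offline RAC RAC1 | sh
```

When switching nodes, bring the patched instance online before taking the other one offline, so that at least one instance is always running.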

Optional: if you want more practice working with patches, then try rolling back the database PSU and then try applying it in automated rolling mode (without local flag) or in the “minimum downtime” mode.

The goal of this lab is to demonstrate Oracle Clusterware’s fencing ability by forcing a configuration that will trigger Oracle Clusterware’s built-in fencing features. With Oracle Clusterware, fencing is handled at the node level by rebooting the non-responsive or failed node. This is similar to the Shoot The Other Machine In The Head (STOMITH) algorithm, but it’s really a suicide instead of affecting the other machine. There are many good sources for more information online.

Start with a normal, running cluster with the database instances up and running.

Monitor the logfiles for clusterware on each node. On each node, start a new window and run the following command:

We will simulate “unplugging” the network interface by taking one of the private network interfaces down. On the collabn2 node, take the private network interface down by running the following command (as the root user):

[root@collabn2 ~]# ifconfig eth1 down

Alternatively, you can also simulate this by physically taking the HostOnly network adapter offline in VMware.

Following this command, watch the logfiles you began monitoring in step 2 above. You should see errors in those logfiles and eventually (could take a minute or two, literally) you will observe one node reboot itself.

If you used ifconfig to trigger a failure, then the node will rejoin the cluster and the instance should start automatically.

If you used VMware to trigger a failure then the node will not rejoin the cluster.

Which file has the error messages that indicate why the node is not rejoining the cluster?

Is the node that reboots always the same as the node with the failure? Why or why not?

The goal of this lab is to demonstrate Oracle Fast Application Notification (FAN) Callouts. In versions prior to 11g, these were also known as Oracle Clusterware Callouts.

This feature is a relatively little-known capability for Oracle Clusterware to fire a script (or a whole directory full of them) to perform whatever tasks you may want performed when a cluster-wide event happens.

For this exercise, we’ll configure some FAN callout scripts on each node and then trigger various cluster events to see how each one triggers the callout script.

Start with a normal, running cluster with both nodes up and running.

From a shell prompt (logged in as oracle) on each server, navigate to /u01/grid/oracle/product/11.2.0/grid_1/racg/usrco. Create a file there called callout1.sh using vi (or your favorite editor). The contents of the file should be this:

Following this command, watch the logfiles you began monitoring in step 2 above. Because we set long timeouts on our test cluster, you might have to wait for a few minutes before you see anything.

You should eventually observe entries noting that the node has failed and shortly following that, you should observe an entry placed in the /tmp/<hostname>_uptime.log file indicating that the node is down.

Note which members run the clusterware callout script. (A surviving member could run commands to notify clients and/or application servers that one of the cluster nodes has died.)

One popular use for clusterware callouts is to notify administrators (possibly via email) that a cluster event has occurred. You may use the arguments to the script (you’ll see the arguments in the logfile we’ve created) to conditionally perform notification as well. For example, you may not want to notify anyone unless a node crashes unexpectedly. By testing some of these arguments, you may be able to send notifications only when desired.
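As a rough illustration of testing the arguments, the function below decides whether an event is worth a notification. It is a sketch under assumptions: it expects the event to arrive as key=value words such as status=nodedown or status=down reason=failure, so compare that against what you actually see in your own logfile before relying on it. The function name and the alert file in the usage comment are hypothetical.

```shell
# Decide whether a FAN event, passed as callout arguments, deserves a notification.
# Prints "notify" for a node-down event or an instance failure, otherwise "ignore".
should_notify() {
  status=""
  reason=""
  # $* is left unquoted on purpose so a payload passed as one string is still split into words.
  for arg in $*; do
    case $arg in
      status=*) status=${arg#status=} ;;
      reason=*) reason=${arg#reason=} ;;
    esac
  done
  if [ "$status" = "nodedown" ] || { [ "$status" = "down" ] && [ "$reason" = "failure" ]; }; then
    echo notify
  else
    echo ignore
  fi
}

# Possible use inside a callout script:
#   if [ "$(should_notify "$@")" = notify ]; then
#     echo "$(date) $*" >> /tmp/$(hostname)_alerts.log
#   fi
```

Replacing the echo in the usage comment with a mail command would turn this into the email notification described above.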

In order to test failover it would be best to connect from a client outside the cluster, so we'll start by downloading and installing Oracle's Basic Instant Client (English-only) and the Instant Client SQLPlus package.

Log in to the node collabn1 as user oracle, open a connection to the database as SYSDBA and unlock the SH user account. Also grant DBA access.

Download Oracle's Basic (English-only) Instant Client and Oracle's Instant Client SQLPlus package. The lab instructor may have made them available, or they can also be downloaded from Oracle's website here:

Each archive contains a folder named "instantclient_11_2". Extract this folder (from both archives) into C:\. (In Explorer you can drag-and-drop or you can choose "Extract All" from the File menu.)

Edit c:\windows\system32\drivers\etc\hosts and add IP addresses for the RAC nodes.

Your database connections won't work without this - you can't just create a tnsnames that uses IP addresses. Try it out by doing step 4 a few times in a row before this step. Does step 4 sometimes just hang? Do you know why? We'll explore it more later...

Log in to collabn1 as the oracle user. Create a new service svctest with RAC1 as a preferred instance and RAC2 as an available instance. This means that it will normally run on the RAC1 instance but will fail over to the RAC2 instance if RAC1 becomes unavailable.
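One way to create such a service is with srvctl, where -r lists preferred instances and -a lists available instances. The sketch below only prints the commands for review; the database name RAC is an assumption based on the global database name chosen earlier, and the function name is illustrative.

```shell
# Print the srvctl commands that create, start and check the svctest service.
# $1 = database name, $2 = preferred instance, $3 = available instance
svctest_cmds() {
  db=$1
  preferred=$2
  available=$3
  echo "srvctl add service -d $db -s svctest -r $preferred -a $available"
  echo "srvctl start service -d $db -s svctest"
  echo "srvctl status service -d $db -s svctest"
}

# Example, run as the oracle user on collabn1 once the output looks right:
#   svctest_cmds RAC RAC1 RAC2 | sh
```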

IMPORTANT NOTE: This lab was written for Oracle 11gR1 and the information here is crucial when working with this and older versions. It will demonstrate how failover works and the importance of using proper addresses in TNSNAMES. However, starting with 11gR2 the node VIPs should not be used to connect to the database – the SCAN VIP should always be used instead. The 11gR2 client has this same failover functionality built-in for multiple SCAN VIPs returned on a single DNS entry.

On your local computer edit the TNSNAMES.ORA file used by the Instant Client. Add two entries called CFTEST and CFTEST-NOVIP which connect to the RAC service with no load balancing. Explicitly enable connection failover even though it is already enabled by default anyway. Don't use the VIPs for the second entry (this is wrong but we'll test it to see what happens).
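The two entries might look roughly like the sketch below, which writes them to a scratch file so you can review them before copying them into your real TNSNAMES.ORA. The VIP host names collabn1-vip and collabn2-vip, port 1521 and the scratch file path are assumptions; substitute the names from your own hosts file. The service name is the global database name chosen earlier.

```shell
# Write candidate CFTEST entries to a scratch file (the path is an arbitrary choice).
OUT=${OUT:-/tmp/tnsnames.cftest.ora}

cat > "$OUT" <<'EOF'
CFTEST =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = OFF)
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = collabn1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = collabn2-vip)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = RAC.vm.ardentperf.com))
  )

CFTEST-NOVIP =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = OFF)
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = collabn1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = collabn2)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = RAC.vm.ardentperf.com))
  )
EOF

echo "wrote $OUT"
```

With LOAD_BALANCE off, the client tries the addresses in the order listed, which is why every connection in the following steps first reaches the listener on collabn1.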

On collabn1 check the number of established connections from the listener to the RAC service. Connect from Windows to CFTEST and CFTEST-NOVIP several times and then check the lsnrctl statistics again. All connections from the Windows machine are attaching to the listener on collabn1 but this listener is spreading the connections between both instances.

First look at the number of established connections on node 1. It's ok if they're not all zero.

Second, connect to the database several times in a row and use both service names. You can exit each session after you check how long it takes to connect. All of the sessions should connect quickly. Count the number of times you connect.

Third, check the listener connections on node 1 again. Make sure that the total number of established connections shows an increase by at least the same number of sessions that you connected. (That is, confirm that all of your sessions connected to this node.) There might be more connections; that's ok.

Also, notice how the listener is distributing connections to both instances - even though our client is only connecting to the listener on one node. It doesn't matter how many connections go to each instance; it's ok if you don't see 3 and 3.
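To compare the before and after counts without adding them up by hand, a small filter can total the established counters. This is a sketch that assumes the listener output contains handler lines with fields of the form established:N; the function name is illustrative.

```shell
# Sum every "established:N" counter found on standard input.
sum_established() {
  grep -o 'established:[0-9]*' | cut -d: -f2 | awk '{ s += $1 } END { print s + 0 }'
}

# Example on collabn1, before and after your test connections:
#   lsnrctl services | sum_established
```

The difference between the two totals should be at least the number of sessions you opened. Note that this totals every handler the listener reports, not only those for the RAC service.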

On your local computer edit the TNSNAMES.ORA file used by the Instant Client. Add a new entry called SVCTEST which connects to the svctest service and make sure that the connection works. Also check your TAF settings after connecting. (Side note: we did not configure this service with a domain name, but you can't connect to it unless you specify one in the TNSNAMES entry. Try it. Where did this domain name come from?)

Open a SQLPlus session on the database and confirm that there are no sessions for the SH user.

SQL> select inst_id, count(*) from gv$session where username='SH' group by inst_id;
no rows selected

Disable server-side load balancing on both instances by clearing the REMOTE_LISTENER init param and re-registering. Before registering with the listeners, restart them to reset the connection statistics.

Re-enable server-side load balancing on both instances by setting the REMOTE_LISTENER init parameter back to its default (collab-scan:1521) and re-registering. Before registering with the listeners, restart them to reset the connection statistics.
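The parameter change in these two steps can be sketched as follows. The function only prints the SQL so that you can review it before piping it into SQLPlus; the function name is illustrative, and the value collab-scan:1521 is the default quoted above.

```shell
# Print the statement that sets REMOTE_LISTENER on all instances.
# $1 = new value; pass an empty string to clear the parameter.
remote_listener_sql() {
  echo "alter system set remote_listener='$1' scope=both sid='*';"
}

# Disable server-side load balancing:
#   remote_listener_sql '' | sqlplus -s "/ as sysdba"
# Re-enable it:
#   remote_listener_sql 'collab-scan:1521' | sqlplus -s "/ as sysdba"
# In both cases restart the listeners next, then run "alter system register;" on each instance.
```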

In your other connected SQLPlus session, keep an eye on the balance of connections. At the same time, open a new shell session and run this script which will open 160 connections to the database - but this time it will use the LBTEST connection.

a=160; while [ $a -gt 0 ]; do sqlplus sh/sh@LBTEST & a=$((a-1)); done

How were the connections distributed between the database instances during server-side load balancing?

Terminate all of the sqlplus sessions by running these two commands. After you run the second command, press <Ctrl-C> after you start seeing the message "no more job".

On node collabn1 measure the differences between various methods. Run this two or three times to warm up the machines. (Note: subtract 500 from the runtimes reported (in hsecs) to account for time in DBMS_LOCK.SLEEP.)
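The correction in the note is simple arithmetic; as a sketch, the helper below subtracts the 500 hsecs of sleep time and also shows the result in milliseconds (1 hsec = 10 ms). The function name and the sample value in the usage comment are hypothetical.

```shell
# Subtract the 500 hsecs spent in DBMS_LOCK.SLEEP from a reported runtime.
# $1 = reported runtime in hsecs; prints the net runtime in hsecs and in milliseconds.
net_runtime() {
  net=$(( $1 - 500 ))
  echo "$net hsecs ($(( net * 10 )) ms)"
}

# Example with a made-up reported runtime of 512 hsecs:
#   net_runtime 512
```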

Start up the PXTEST service and check the status of the job again. Make sure to query the user_scheduler_jobs view a few times in a row. (Be patient for at least one minute.) Did the job execute? If so, then on which node?

Our second PL/SQL test will look at the UTL_FILE package. With any PL/SQL operations on RAC you must be aware that the code could execute on any node where its service lives. This could also impact packages like DBMS_PIPE, UTL_MAIL, UTL_HTTP (proxy server source IP rules for example), or even DBMS_RLS (refreshing policies).

Exit SQLPLUS. At the prompt, copy this command to connect to the RAC service as sh again and attempt to read the file you just wrote. Run this command 10-20 times in a row. (Cut-and-paste is recommended.) What happens? Why?

The error occurred on the remote node, but was reported here. It was also recorded on the remote node – do you know where it is recorded? What kind of monitoring would need to be in place to be proactively alerted by messages like this?

Now create the directory on the remote node and re-run the operation. This should succeed but it is still a poor configuration; we will investigate the reasons later in this lab.