Build an 11gR2 RAC cluster in VirtualBox in 1 Hour using OVM templates

[I originally posted this over at the Pythian blog. If you're not following it, you should! Way more content, by far smarter people than lil ol' me.]

After reviewing my blog post about running EBS OVM templates in VirtualBox, two of my teammates suggested that I work on something with potentially broader appeal. Their basic message was, "This is really cool for us EBS nerds, but what about the Core DBAs?"

So how does "11gR2 RAC in an hour" sound? In this post, I'll demonstrate how to deploy the pre-built Oracle VM templates to create a two-node 11gR2 RAC cluster in Oracle VirtualBox.

Why do this?

There are already several high-quality "How to run RAC on your workstation" HOW-TO's out there, including the well-known RAC Attack (by Pythian's own Jeremy Schneider, and others) and Tim Hall's super-straightforward article on ORACLE-BASE. Does the internet really need another screenshot-heavy blog post about installing Oracle RAC? Maybe not, but I'm doing it anyway, because:

The OVM templates come with the software pre-installed/patched, and scripts that configure the networking, Grid Infrastructure, and database for you. Less fiddling around reduces the possibility of error, and you still have a RAC cluster at the end!

I claimed in my earlier blog post that it should be possible to convert other OVM templates, so it seemed like a good idea to actually test that claim.

I wanted an excuse to play around a bit more with the command-line interface to VirtualBox.

Some readers might point out that installing and configuring the software is a good way to learn how things work, and that breaking and fixing things along the way helps one learn even more. I actually agree with that sentiment in general, since I'm a "learn by doing (and failing)" kind of guy. On the other hand, Oracle is selling a line of high-end products that are supposed to take all of the hard work out of configuring RAC, so why shouldn't we have a bit of fun?

Ingredients

You will need:

RAM. Lots of RAM. The OVM template docs specify 2GB *per RAC node*, and that is probably on the small side for any serious work. If you want to do anything else with your workstation while this is running, you'll want at least 6GB of RAM on the host machine. This is less resource-intensive than building your own OVM server, but it is not a lightweight endeavor.

80-100GB of disk space, depending on how you size your ASM disks

A recent version of VirtualBox. An old one might do, but I didn't test on an old version.

DNS service for the SCAN interface. You might be able to get away without it, but I can't guarantee that the Oracle-supplied cluster build scripts will work if you try to fake it. Tim Hall has a great post on a minimal DNS setup for SCAN, or you can use dnsmasq to convert your local hosts file into a DNS service. I opted for dnsmasq; it's pretty cool.

A Linux install ISO image (or physical CD, if you're into that sort of thing). I used Oracle Enterprise Linux 5, Update 6, but any relatively recent OEL or RHEL install image should do the job here.

An understanding of some basic Linux systems administration tasks

Familiarity with configuring storage and network options in VirtualBox

Important notes and thank-you's

Nothing you're about to read in this post is supported by anyone. Not me, not Pythian, and certainly not Oracle. If you're thinking about using the techniques described here for any sort of production or QA deployment, please stop and question your sanity. Then call a few colleagues over to your desk and ask them to question your sanity.

Please be mindful of your licensing and support status before working with these templates. Content from Oracle's Software Delivery Cloud is subject to far more restrictive licensing than the more familiar OTN development license. [Thanks to Don Seiler (@dtseiler) for reminding me of this.]

So far, this is just a proof-of-concept. I haven't done extensive work to validate the RAC cluster I built from these instructions. There may be resource limitations that I have not yet discovered in this system, or more artifacts specific to the Oracle VM template that could be removed. "Do not be too proud of this technological terror you've constructed."

As always, I'm "standing on the shoulders of giants" to make this post happen. Huge thanks to Tim Hall (aka ORACLE-BASE) for his concise how-to documents that served as a springboard for this project; to the creators of dnsmasq for the easy local DNS option; to the clever folks at Oracle who built the VM cluster deployment script; and to my Pythian teammates and a handful of Twitter followers for encouraging me to blog about this.

HOWTO: The short version

The basic steps are as follows, with details in the next section.

Set up your local DNS with IP addresses for both nodes in your future RAC cluster.

Download the OVM RAC template files from the Oracle Software Delivery Cloud, and unpack them.

Convert the OVM disk images to VDI format and attach them to a new VirtualBox VM.

Boot the new VM in rescue mode, and update its configuration to run under a non-Xen kernel.

Reboot and install the VirtualBox Guest Additions.

Clone the first VM to a second, and attach the shared ASM disks to both.

Run the Oracle-supplied netconfig.sh and buildcluster.sh scripts to configure the network and build the cluster.

HOWTO: The long version

Set up DNS entries for your RAC cluster

Complete details on DNS setup are beyond the scope of this post; instead, I've provided external references above to point you in a good direction. Here are the IPs and hostnames that I will be using in my example deployment. I'm using two separate host-only networks (vboxnet0 and vboxnet1) for the public and private interfaces, and the subnets (192.168.56.x and 192.168.57.x) were chosen automatically for me by VirtualBox. I try to keep things simple.
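As a concrete illustration, a dnsmasq-served /etc/hosts along these lines would do the job. All names and addresses here are hypothetical examples on the two subnets above; the important part is that the SCAN name resolves to three addresses:

```
# Example /etc/hosts for dnsmasq to serve (hypothetical names/addresses).
# Public, VIP, and SCAN addresses live on vboxnet0 (192.168.56.x);
# the private interconnect lives on vboxnet1 (192.168.57.x).
192.168.56.101   thing1.localdomain       thing1
192.168.56.102   thing2.localdomain       thing2
192.168.56.111   thing1-vip.localdomain   thing1-vip
192.168.56.112   thing2-vip.localdomain   thing2-vip
192.168.56.121   thing-scan.localdomain   thing-scan
192.168.56.122   thing-scan.localdomain   thing-scan
192.168.56.123   thing-scan.localdomain   thing-scan
192.168.57.101   thing1-priv.localdomain  thing1-priv
192.168.57.102   thing2-priv.localdomain  thing2-priv
```

dnsmasq reads /etc/hosts by default and will happily serve all three A records for the SCAN name, which is exactly what the cluster build scripts expect.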

Unzip the two files you just downloaded from the Oracle Software Delivery Cloud (V25916-01.zip and V25917-01.zip). You'll get two .tgz files, OVM_EL5U5_X86_64_11201RAC_PVM-1of2.tgz and OVM_EL5U5_X86_64_11201RAC_PVM-2of2.tgz

Unpack the two zipped tar files (tar zxpf OVM_EL5U5_X86_64_11201RAC_PVM*.tgz). This will create a directory called OVM_EL5U5_X86_64_11201RAC_PVM, and that's where we'll be doing all of our work.

Convert the OVM disk images to VDI format

Open a command/terminal window and use the VBoxManage utility to convert the raw disk images (.img) in OVM_EL5U5_X86_64_11201RAC_PVM to .vdi files. This utility is installed with VirtualBox; you may need to find it first and add it to your path (location varies by host platform). Expect the conversion to take several minutes per image, depending on the speed of your disks.

Note: I'm running VirtualBox on OS X, and the installer dropped VBoxManage into /usr/bin for me, so it's already in my path. Presumably you'll find a similar situation in Linux. If you're on Windows, and haven't customized your install, you should be able to find VBoxManage.exe in Program Files/Oracle/VirtualBox.
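Here's one way to script the conversion. Rather than hard-coding paths, this little helper just prints the VBoxManage command for each image; pipe the output to sh when you're ready to run it for real. The .img file names below are examples — substitute whatever the template actually unpacked:

```shell
#!/bin/sh
# Print the VBoxManage command that converts a raw .img into a .vdi.
# Printing (rather than executing) keeps this snippet safe to run
# anywhere; pipe the output to `sh` to do the actual conversion.
img_to_vdi_cmd() {
  img="$1"
  echo "VBoxManage convertfromraw '$img' '${img%.img}.vdi' --format VDI"
}

# Example image names -- use the .img files from the template directory.
img_to_vdi_cmd System.img
img_to_vdi_cmd Oracle11gR2RAC_x86_64.img
```

For example, `img_to_vdi_cmd System.img | sh` would convert just the system image.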

Boot the new VM (Thing1) in rescue mode from the install CD
Enter "linux rescue" at the boot: prompt to enter rescue mode:

Select the keyboard and language preferences that suit you, and enable two network interfaces: eth0 and eth2 (for now, just select "use IPv4" and "DHCP" when configuring). There is no need to enable eth1, since only one of the host-only interfaces needs to be active for this exercise:

(repeat the steps above for the NAT interface, eth2)
After setting up the network interfaces, progress through the menus ("Continue" and "OK" in my case) until you get to a Linux prompt, and switch to the root volume as instructed:

# chroot /mnt/sysimage

Optional step: start the sshd service and connect to the VM from your host via ssh, instead of performing the next few steps from the console of the VM. Use 'ifconfig eth0' to find the IP address to use. (Note: the root password for both VMs is 'ovsroot')

# service sshd start

Update a few configuration files

The kernel modules that are loaded to support the Xen kernel are not going to work with the non-Xen kernel, so we need to update modprobe.conf to match our target kernel version:
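The template's modprobe.conf aliases the Xen paravirtual drivers (named along the lines of xennet and xenblk); something like the following replaces them. The module names here are assumptions based on VirtualBox's default virtual hardware, so adjust them to match what you actually configured:

```
# /etc/modprobe.conf -- after replacing the Xen aliases.
# e1000 matches VirtualBox's default Intel PRO/1000 NIC; ata_piix
# matches the PIIX IDE controller. Use ahci instead if your system
# disk is attached to a SATA controller.
alias eth0 e1000
alias eth1 e1000
alias eth2 e1000
alias scsi_hostadapter ata_piix
```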

This VM is configured with a Xen version of the Oracle Linux 5.5 kernel, so we need to grab a "vanilla" version of that kernel. We'll use the Oracle public yum server to accomplish this; that's why we've configured and activated the NAT interface. Since you've set up your host to act as a DNS server already, you should not need to add a nameserver entry to resolv.conf. In my case, the VM was able to resolve the address for public-yum.oracle.com without any further configuration changes. If you have issues, try replacing the "nameserver" line in /etc/resolv.conf with "nameserver 8.8.8.8".
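The kernel swap itself is only a few commands on the guest. The repo URL and channel name are assumptions based on the public-yum layout at the time of writing, so double-check them against the repo file you download:

```shell
# As root on the guest: register Oracle's public yum repo and install
# the stock (non-Xen) EL5 kernel alongside the Xen one.
cd /etc/yum.repos.d
wget http://public-yum.oracle.com/public-yum-el5.repo
# Edit the downloaded file and set enabled=1 for the el5_u5_base
# channel (or the latest EL5 channel), then:
yum -y install kernel
```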

Finally, add "divider=10" to the boot parameters in grub.conf to improve VM performance. This is often recommended as a way to reduce host CPU utilization when a VM is idle, but it also improves overall guest performance. When I tried my first run-through of this process without this parameter enabled, the cluster configuration script bogged down terribly, and failed midway through creating the database.
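For reference, the kernel line in /boot/grub/grub.conf ends up looking something like this. The kernel version and volume names below are placeholders from a stock OEL 5.5 install, so match them against your own grub.conf rather than copying verbatim:

```
title Oracle Linux Server (2.6.18-194.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-194.el5 ro root=/dev/VolGroup00/LogVol00 divider=10
        initrd /initrd-2.6.18-194.el5.img
```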

Reboot and install VirtualBox Guest Additions
This is another recommended item to improve performance in your running VMs.

From the Devices menu, remove the Linux boot iso from the DVD drive

Reboot the VM by typing 'exit' twice in the console

Log in to the VM as root (password ovsroot)

Select "Install Guest Additions" from the Devices menu

Mount the Guest Additions media (attached to /dev/cdrom) from the console of the guest

Execute the VBoxLinuxAdditions.run script (add the --nox11 option, since we're not using X at this point)

After the Guest Additions are installed, shut down the guest (shutdown -h now)
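The guest-side portion of the list above boils down to a few commands. The mount point is arbitrary, and --nox11 simply skips the X11 components we don't need on a headless-style install:

```shell
# As root on the guest console, after selecting "Install Guest Additions"
# from the Devices menu:
mkdir -p /media/cdrom
mount /dev/cdrom /media/cdrom
sh /media/cdrom/VBoxLinuxAdditions.run --nox11
umount /media/cdrom
shutdown -h now
```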

Clone the first VM to a second, and add the shared disks

Big shout-out to Tim Hall here; apart from some minor variations, these are essentially his steps for adding shared disks to the cluster VMs. The primary difference is that we can use the clonevm functionality of VBoxManage, because we aren't attaching the (unformatted) shared disks until afterward. I've partially scripted these steps, to reduce the amount of copy/paste required.

Please note that these commands assume a Unix-like shell. If you're running Windows and using a normal command shell (instead of something like Cygwin), you are no doubt a) exceedingly brave, b) very smart, and c) able to translate these commands to fit your environment.
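Here's a sketch of the clone-and-shared-disk steps. The VM names, disk sizes, and storage controller name ("SATA Controller") are assumptions about my setup, so adjust them to yours:

```shell
#!/bin/sh
# Clone the prepared VM and create/attach the shared ASM disks.
VMNAME1=Thing1
VMNAME2=Thing2

# Clone Thing1 (it must be shut down first).
VBoxManage clonevm "$VMNAME1" --name "$VMNAME2" --register

# Create three fixed-size 10GB disks and mark them shareable.
for i in 1 2 3; do
  VBoxManage createhd --filename "asm${i}.vdi" --size 10240 --variant Fixed
  VBoxManage modifyhd "asm${i}.vdi" --type shareable
done

# Attach the shared disks to the SATA controller of both VMs
# (port 0 is assumed to hold the system disk).
for vm in "$VMNAME1" "$VMNAME2"; do
  for i in 1 2 3; do
    VBoxManage storageattach "$vm" --storagectl "SATA Controller" \
      --port "$i" --device 0 --type hdd --medium "asm${i}.vdi" \
      --mtype shareable
  done
done
```

The --variant Fixed and --mtype shareable options matter: VirtualBox will refuse to share a dynamically-allocated disk between running VMs.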

You can run VBoxManage showvminfo $VMNAME1 and VBoxManage showvminfo $VMNAME2 to display configurations of both machines; they should match, and you should see new disks attached to the SATA controller:

Finally, we get to build the RAC cluster!
Start both VMs, and log in as root. Right now, neither machine has an "identity" on the network, so we need to configure them. On each node, execute the script /u01/racovm/netconfig.sh. These need to be run at the same time; the first node will wait until you've started the script on the second node. Answer 'YES' to the "Is this the first node" question on node 1, and 'NO' on node 2:
Complete the network configuration "interview" on node 1, and wait for the changes to be propagated to node 2:

At this point, both machines should be reachable from your host, via the hostnames and IPs you configured earlier. Now we can run the script that builds the RAC cluster and creates a small database called (what else?) ORCL. Connect to Thing1 (your first node) as root and run the script /u01/racovm/buildcluster.sh. I prefer to do this from an ssh session so I can keep track of the output, but you can just as easily run the script from the console. The build script took about 30 minutes on my laptop; your experience will probably be different.
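Once buildcluster.sh finishes, a couple of quick checks confirm that the cluster is alive. The Grid Infrastructure home path here is an assumption about the template's layout, so substitute your own:

```shell
# As root (or the Grid Infrastructure owner) on either node:
/u01/app/11.2.0/grid/bin/crsctl check cluster -all
/u01/app/11.2.0/grid/bin/srvctl status database -d ORCL
```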

...And we're done!
Since this is still in the proof-of-concept phase for me, I haven't taken this cluster for a long test drive yet, just brief tours through asmcmd and the ORCL database. As far as I can see, I have a healthy, happy RAC cluster on my workstation, but I welcome hearing about your experiences and any tweaks or optimizations. Please start a conversation in the comments!