Virtualization with load-balancing and hot-failover: Done.

February 19, 2008 by Michael

This is really going to be the last post of my series on Oracle VM Server / VM Manager on inexpensive hardware.

Last week a second Dell PowerEdge arrived, followed by a little NAS/iSCSI system, the ES-2100 from Eurostor, which is a rebranded Thecus N5200 Pro. I link to Eurostor because I had some very nice contact with their tech support.

After running the Oracle VM Server on a 2.4 GHz Core 2 Duo Xeon with 4 GB RAM for about 70 days non-stop, we decided to take the next step: incarnating a second server with shared storage.

That first server runs a paravirtualized OEL5 guest with 2 GB RAM, which in turn runs an Oracle 11g test instance under medium load; a hardware-virtualized Windows XP with 512 MB RAM that runs Jetty with a few services; and, for the last two weeks, an HVM Debian that serves as a mail relay for that Exchange of ours… which now sees much less load, since SpamAssassin takes care of it all.

Setting up the second Dell was flawless, nothing new.

The iSCSI was another thing… First I deleted the RAID6, as we decided to go for RAID5. Stupid me set disk usage to 100%, left for the weekend, came back on Monday and saw: wow, no space left for the iSCSI target. Damn it, all timeplans went bazoo… So I deleted the RAID once again and went back to start, this time with 20% for disk usage (you never know) and 80% for one iSCSI target (if this were my machine, I'd really have a use for 1.5 TB of storage… but here… *sigh*).

So, another 8 hours later, I bought a cheap 8-port Gigabit switch, set up the ES-2100 for link aggregation and connected it to both Oracle VM Servers.
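Attaching each VM Server to the target is the usual open-iscsi dance; a rough sketch (the portal IP and the IQN below are made-up placeholders, yours will differ):

```shell
# Ask the ES-2100 which targets it exports (portal IP is an assumption)
iscsiadm -m discovery -t sendtargets -p 192.168.1.50

# Log in to the discovered target (IQN is an assumption, copy it from
# the discovery output)
iscsiadm -m node -T iqn.2007-06.example:es2100.target0 \
         -p 192.168.1.50 --login

# The LUN then appears as a new block device, e.g. /dev/sdb
```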

I roughly followed the steps described here, but since I changed some of them, let me describe what I did:

The OCFS2 cluster configuration is as simple as described in the linked Oracle document. I recommend adding the names and IP addresses from /etc/ocfs2/cluster.conf to /etc/hosts, as the o2cb services won't start otherwise. One thing Oracle forgot to mention is that you have to open port 7777 on both machines in the iptables configuration.
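For reference, a minimal two-node /etc/ocfs2/cluster.conf looks roughly like this (hostnames and addresses are made up, use your own):

```
cluster:
        node_count = 2
        name = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.1.10
        number = 0
        name = ovs1
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.1.11
        number = 1
        name = ovs2
        cluster = ocfs2
```

On OEL5 the firewall hole can be punched with something like `iptables -I RH-Firewall-1-INPUT -p tcp --dport 7777 -j ACCEPT` (and made permanent in /etc/sysconfig/iptables).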

At first I made the mistake of running mkfs.ocfs2 directly on the device, forgetting to create a partition first. That worked for whatever reason, but I destroyed the filesystem anyway and created a partition with fdisk (new partition, primary, the whole disk).
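So the right order, sketched as commands (the device name is an assumption — double-check it before wiping anything):

```shell
# Partition the iSCSI LUN: one primary partition spanning the whole disk
# (in fdisk: n, p, 1, accept the defaults, then w to write)
fdisk /dev/sdb

# Create the OCFS2 filesystem on the partition, not on the raw device
mkfs.ocfs2 -L ovs /dev/sdb1
```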

Next, I didn't follow Oracle but decided on the following:

Unmounted /OVS on the first server (the one with all the VMs).

Added the following stanza to /etc/fstab:

/dev/sdb1 /OVS ocfs2 defaults 1 0

mount -a

Mounted the old /OVS somewhere else and rsynced it to the new location. I reached transfer rates around 20 MB/s, with concurrent writes coming from the other server. Not bad for an inexpensive device like that little iSCSI thingy.
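The copy step, sketched (the old partition's device and the mount point are assumptions):

```shell
# Remount the former local /OVS partition out of the way
mkdir -p /mnt/old-ovs
mount /dev/sda3 /mnt/old-ovs

# Copy everything onto the shared OCFS2 volume now mounted at /OVS:
# -a preserves permissions/ownership/timestamps, -H keeps hard links,
# --sparse avoids inflating sparse VM disk images during the copy
rsync -aH --sparse /mnt/old-ovs/ /OVS/

umount /mnt/old-ovs
```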

Added the new server to the pool.

Rebooted both servers, just to be sure they come back healthy.

Restarted all VMs, which worked great over iSCSI.

Tested load-balancing and live migration, and what can I say: wow, it works. Fast and flawless. Great thing.

So in the end we have a safe setup with hardware costs under 5k € and a setup time of about 6 or 7 days, which brought us some good knowledge and know-how. I don't think we would have achieved this with a VMware solution brought in by external consultants. Maybe that would have ended like the last Dilbert in that series ;).

To bring some variety to this blog, here’s a picture of the current setup:

If one of the Google visitors or the two readers have any questions, feel free to ask 🙂