Monday, September 26, 2005

Solaris 10 x64 FCS with Sun Fire V40z and 4 3510s (2+2)

Recently I spent some time configuring a Sun Fire V40z (with single-core Opterons) and four Sun StorEdge 3510s. The configuration was "custom made": I had specifically ordered two dual-RAID-controller 3510s and two JBOD 3510s, along with two PCI-X Dual 2Gbit FC Network Adapters (X6768A).

The actual configuration is as follows. The PCI-X Dual 2Gbit FC Network Adapter HBA cards went into slots 6 and 7 of the V40z. The 3510 configuration was a bit tricky in my case. I wanted two sets, each consisting of one dual-RAID-controller 3510 with one expansion JBOD 3510. Hence I attached two (diagonal) ports of the JBOD to ports 2 and 3 (similarly diagonal) of the dual-controller 3510. Then I connected port 0, which is essentially my primary controller port, to one port of one HBA, and port 5 to a port of the other HBA (I purposely used both HBAs). The other set was configured the same way and likewise attached to both HBAs. (My logic for connecting ports to both HBAs: even if one HBA dies, I still have access to both sets, albeit through a different controller.)

I had already installed Solaris 10 03/05 (FCS) on the V40z. After creating one huge RAID-5 LUN assigned to each controller, I rebooted Solaris 10 x64 with the "touch /reconfigure" option, that is, a reconfiguration reboot. This is where I ran into various bugs in Solaris 10 FCS. I have already done the hard work of identifying the bugs and tracking down the corresponding patches, so if you have a similar configuration, just apply the following patches. I will also try to mention the symptom each one solves. (If you don't have these patches installed, working with the Sun StorEdge 3510 FC on Solaris x64 will be a pain.)

119255-06 SunOS 5.10_x86: Install and Patch Utilities Patch

119375-04 SunOS 5.10_x86: sd patch

118997-02 SunOS 5.10_x86: patch usr/sbin/format

119131-09 SunOS 5.10_x86: Sun Fibre Channel Device Drivers

You can download these patches from SunSolve. (HINT: These patches are all included in the Solaris 10 x86 Recommended and Security patch cluster, so it makes sense to apply the whole cluster, which installs all the patches in the right order.)
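If you want to confirm that a box already has these patches at the revisions listed above, something along these lines may help. It is only a rough sketch: the function names are made up for illustration, it assumes bash (not the old Solaris /bin/sh), and it assumes a saved "showrev -p" listing whose lines start with "Patch: <id>-<rev>".

```shell
#!/bin/bash
# Hypothetical helpers, not part of Solaris. They only parse text, so the
# patch listing has to be captured first, e.g. with "showrev -p".

# Minimum revisions taken from the list in this post.
REQUIRED_PATCHES="119255-06 119375-04 118997-02 119131-09"

# Reads a showrev-style listing on stdin; succeeds if patch id $1 is
# present at revision $2 or newer.
patch_at_least() {
    want_id=$1
    want_rev=$2
    while read -r tag have _rest; do
        [ "$tag" = "Patch:" ] || continue
        have_id=${have%-*}     # part before the dash, e.g. 119255
        have_rev=${have#*-}    # part after the dash, e.g. 06
        if [ "$have_id" = "$want_id" ] && [ "$have_rev" -ge "$want_rev" ]; then
            return 0
        fi
    done
    return 1
}

# Prints one "ok" or "MISSING" line per required patch for listing file $1.
check_required_patches() {
    listing=$1
    for req in $REQUIRED_PATCHES; do
        if patch_at_least "${req%-*}" "${req#*-}" < "$listing"; then
            echo "ok      $req"
        else
            echo "MISSING $req (not installed, or an older revision)"
        fi
    done
}

# Usage on the Solaris host (not run here):
#   showrev -p > /tmp/patches.txt
#   check_required_patches /tmp/patches.txt
```

Anything reported as MISSING can then be installed with patchadd, or picked up by applying the whole patch cluster.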

Just to be more complete, here are the symptoms you will see without these patches. If you don't install 119255, other patch installs may fail, so it should be the first patch to install. Next, 119375 fixes the reported geometry of the RAID LUNs (if you ever saw RAID LUNs reported as much smaller than they really are, this patch solves that problem). 118997 fixes format, especially if you see it dumping core when LUNs larger than 1TB are created on the 3510s. And finally my favorite, 119131-09: apart from all the other bugs mentioned in its bug report list, it fixes bug 6261601, "high xcall rates from fcp driver kill v40z performance". The symptom of this bug is that your application's I/O performance differs when it runs on any CPU other than cpu-p0, while mpstat shows high interrupts in the system.
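To spot the xcall symptom from bug 6261601 without staring at raw mpstat output, a small filter like the one below might be handy. Treat it as a sketch: the function name and the threshold are my own placeholders, and it assumes the Solaris mpstat layout with a header line containing the "xcal" and "intr" columns, plus an awk that supports -v (nawk or /usr/xpg4/bin/awk on Solaris 10).

```shell
#!/bin/bash
# Hypothetical filter: reads mpstat output on stdin and prints
# "cpu xcal intr" for every row whose cross-call count is at or above
# the threshold given as $1. The threshold is an arbitrary choice.
high_xcall_cpus() {
    threshold=$1
    awk -v t="$threshold" '
        # Header line (repeated for every interval): find the columns.
        $1 == "CPU" {
            for (i = 1; i <= NF; i++) {
                if ($i == "xcal") x = i
                if ($i == "intr") n = i
            }
            next
        }
        # Data line: report CPUs at or above the threshold.
        x && $x + 0 >= t + 0 { print $1, $x, $n }
    '
}

# Usage on the Solaris host (not run here), three 5-second samples:
#   mpstat 5 3 | high_xcall_cpus 1000
```

Running it before and after applying 119131-09 while the I/O load is active should make the difference easier to compare.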

Now I am ready for a real database load on this Sun system with a huge amount of storage. (The commented parameters are useful when I use PostgreSQL on this system; they optimize the file system buffer cache for PostgreSQL.)

More on this Sun system setup later in another blog entry. Stay tuned.