Oracle Blog

Steve Tunstall's Blog - Steve.Tunstall@Oracle.com

Tuesday Feb 07, 2012

I haven’t given out a real tip for a while now, but
this issue popped up on me last week, so I thought I would pass it along. I had a
horrible time setting up a new 7320 cluster, for the sole reason that I screwed
it up by not doing it in the right order. This caused my install,
which should have been done in 1 hour, to take me over 3 hours to complete.

So let me tell you what I did wrong, and then I'll
tell you the way I should have done it.

Out of the box, my client's two new 7320 controller
heads were one software revision behind, at 2010.Q3.4.2, so I wanted to upgrade
them to the newest version of 2011.Q1.1.1. So far, so good, right? Well here
was my mistake. I configured controller A via the serial interface, gave it IP
numbers, went into the BUI, and did the upgrade to 2011.Q1.1.1. No problem.
Now, I wanted to bring the other one up and do the same thing. However, I knew
that controller B in a cluster must be in the initial, factory-reset state
in order to be joined to a cluster. You can't configure it first; if
you do, you must factory-reset it before it can join a cluster. So I bring
controller B up, but I don't configure it, and I go to controller A to start
the cluster setup process. Big mistake. The process starts, but because the two
controllers are on two different software versions, the cluster process cannot
continue. This hoses me (that's southern California slang for "messes me
up"), because now controller B has started the cluster setup process, and
going to the serial connection just has it hung up in a "configuring
cluster" state. Rebooting it does not help, as it's still in the
"configuring cluster" state once it comes back up.

So.... now I have 2 choices. I can downgrade controller
A back to 2010.Q3.4.2, or I can factory-reset controller B, bring it up as a
single controller, upgrade it to 2011.Q1.1.1, and then factory reset again, and
then finally be able to add it to the cluster via controller A's cluster setup
process. I opt for the second choice, as I do not want to downgrade controller
A, which is working just fine. Remember, controller B is currently hosed,
messed up, or wanked, depending on how you want to say it.
It's stuck. So to get it back to a state I can work with, I need to do the
trick I talked about way back in this blog on May 31, 2011 (http://blogs.oracle.com/7000tips/entry/how_to_reset_passwords_on).
I had to use the GRUB menu, use the -c trick on the kernel line, and reset the
machine and erase all configuration on it. Now I could bring it up as a single
controller, upgrade it, factory reset it, and then have it join the cluster.
That all worked fine; it just took me two hours to do it all.

Here's what I should have done.

Bring up controller A, configure it, and log into the
BUI. Now bring up controller B. Do NOT configure it in any way. Using controller
A, set up clustering in the cluster menu.

Once the two controllers are clustered and all is
well, NOW go ahead and upgrade controller A to the latest code. Once it
reboots, go ahead and upgrade controller B. Everything's fine. You see, if the
cluster has already been made, it's perfectly fine to upgrade one controller at
a time. The software lets you do that. The software does NOT let you set up a
NEW cluster if the controllers are not on the same software level.
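
If you like to script your install checklists, here is a rough sketch of that rule in Python. To be clear, this is not an appliance tool or API. The function names (parse_release, can_form_new_cluster, setup_plan) are made up for illustration, and the version format is only assumed from the two release strings mentioned above. You would still read the versions off each controller yourself and type them in.

```python
import re
from typing import List, Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn a release string such as '2011.Q1.1.1' into (2011, 1, 1, 1).

    The pattern is an assumption based on the two releases named in this post.
    """
    match = re.fullmatch(r"(\d{4})\.Q(\d)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {version!r}")
    return tuple(int(part) for part in match.groups())


def can_form_new_cluster(version_a: str, version_b: str) -> bool:
    """A NEW cluster needs both controllers on the same software level."""
    return parse_release(version_a) == parse_release(version_b)


def setup_plan(version_a: str, version_b: str, target: str) -> List[str]:
    """Return the steps in the order described above: cluster first, upgrade second."""
    if not can_form_new_cluster(version_a, version_b):
        raise ValueError(
            "controllers are on different releases; do not start cluster setup"
        )
    steps = [
        "configure controller A and log into the BUI",
        "bring up controller B, do NOT configure it",
        "set up clustering from controller A",
    ]
    # Once the cluster exists, upgrading one controller at a time is fine.
    if parse_release(version_a) < parse_release(target):
        steps.append(f"upgrade controller A to {target}")
        steps.append(f"upgrade controller B to {target}")
    return steps


if __name__ == "__main__":
    for step in setup_plan("2010.Q3.4.2", "2010.Q3.4.2", "2011.Q1.1.1"):
        print(step)
```

Feeding it my mistake, setup_plan("2011.Q1.1.1", "2010.Q3.4.2", "2011.Q1.1.1"), raises the error before you ever touch the cluster menu, which is exactly the warning I wish I had gotten.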

About

This blog is a way for Steve to send out his tips, ideas, links, and general sarcasm, almost all of it related to the Oracle 7000, code-named ZFSSA, or Amber Road, or Open Storage, or Unified Storage.
You are welcome to contact Steve.Tunstall@Oracle.com with any comments or questions.