Oracle Blog

Steve Peng's Weblog

Tuesday Jun 14, 2005

Disk relocation in SVM (Solaris Volume Manager)

A blog is probably the best place for me to share my experience with SVM and
what I know about SVM internals. My name is Steve Peng. I have been with Sun
since 1994, having come from Big Blue when they downsized the AIX development
division. During my 11 years at Sun, I have spent most of my time working on the
Solaris Volume Manager (called Solstice DiskSuite before its integration into
Solaris in Solaris 9) and have been a key developer on most of the cool projects,
such as 64-bit Solaris support, disk relocation support, 64-bit SVM, and import
of named disksets. In this debut post, I would like to talk a bit about the disk
relocation support.

So what happens if a user runs an old SDS (Solstice DiskSuite) release and moves
disks around, for example by recabling them? The best they can do is pray that
the devt and ctds names remain the same; if they do not, the configuration is
dead. Why? In the old SDS releases, it is the driver name and minor number,
along with the device name, that are stored in the private configuration
database and used to bring up the configuration at system reboot. When a disk
is moved and ends up with a different devt and name, SDS simply cannot locate
it and fails to bring up the existing configuration. Worse, this inability to
follow a relocated disk can be catastrophic if the stored devt and name lead to
the wrong disk and that disk is configured.

When SDS was integrated into Solaris 9, disk relocation support was added by
storing each disk's unique device id, such as its WWN, in the private database.
Now, when a configuration is brought up at boot, the stored device ids are used
to locate the disks instead of the stored devt/name tuples. This cool feature
boosts the flexibility of SVM and makes the upgrade story even greater. If you
know the upgrade story on the old SDS releases, then you probably know what I
mean when I say 'great'.
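
To make the difference concrete, here is a minimal sketch of the two lookup
strategies. It is an illustration only, not SVM code: the class names, the
functions, and every value (names, major/minor numbers, device id strings) are
made up, and a devt is simplified to a (major, minor) pair.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Disk:
    """A disk as the system sees it right now (all values are hypothetical)."""
    name: str               # ctds-style name
    devt: Tuple[int, int]   # simplified (major, minor) pair
    devid: str              # placeholder standing in for the unique device id


@dataclass(frozen=True)
class StoredComponent:
    """What was recorded in the database when the metadevice was created."""
    name: str
    devt: Tuple[int, int]
    devid: str


def locate_by_devt_name(stored: StoredComponent,
                        disks: List[Disk]) -> Optional[Disk]:
    """Old-style lookup: trust the stored devt/name tuple."""
    for disk in disks:
        if disk.devt == stored.devt and disk.name == stored.name:
            return disk
    return None


def locate_by_devid(stored: StoredComponent,
                    disks: List[Disk]) -> Optional[Disk]:
    """Relocation-aware lookup: match on the unique device id only."""
    for disk in disks:
        if disk.devid == stored.devid:
            return disk
    return None


# Recorded at creation time: the component lived at c1t2d0.
stored = StoredComponent(name="c1t2d0", devt=(32, 16), devid="devid-A")

# After recabling, the original disk shows up under a new name and devt,
# and a different disk now answers at the old location.
disks_after_move = [
    Disk(name="c2t2d0", devt=(32, 80), devid="devid-A"),
    Disk(name="c1t2d0", devt=(32, 16), devid="devid-B"),
]

old_style = locate_by_devt_name(stored, disks_after_move)
new_style = locate_by_devid(stored, disks_after_move)
```

In this made-up scenario the devt/name lookup does not fail outright. It
returns the disk with a different device id, which is the "wrong disk"
situation described earlier, while the device id lookup follows the original
disk to its new name.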

So one may wonder how the devt and ctds name of a disk can change when the disk
is moved around. When a disk is moved from one controller to another, its
device instance number can change, and since the disk is now attached to the
new controller, its device name changes as well. The one thing that does not
change is the disk's unique device id, such as the WWN. So how exactly does SVM
address this disk relocation issue? Let's use a simple stripe as an example to
see how SVM attacks the issue internally. Say a user creates a simple stripe d1
on top of /dev/dsk/c1t2d0s0. When d1 is created, database records are created
to store its configuration. A dump of the database shows the following
configuration information:

As you can see, RecId 7 describes the stripe d1 that was just created. Stripe
d1 has only one row and one component, and the un_key is used to locate the
underlying device. In this case the value 1 is used to find the component by
scanning the "NM" record for an entry whose key matches 1, which here is
c1t2d0s0. Once the entry is located, its stored minor number is used to
construct the devt for the component, and that devt is then used to access the
device.
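
The key-to-device lookup described above can be sketched roughly as follows.
This is only a simplified model: NmEntry, resolve_component, the driver name
"sd", and the major and minor numbers are assumptions made for illustration,
not the real SVM record layout, and the devt is again a plain (major, minor)
pair.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class NmEntry:
    """One hypothetical entry in the "NM" (namespace) record."""
    key: int      # the value a component's un_key refers to
    driver: str   # driver name stored alongside the minor number
    minor: int    # stored minor number
    name: str     # stored device name


def resolve_component(un_key: int,
                      nm_record: List[NmEntry],
                      driver_to_major: Dict[str, int]) -> Tuple[str, Tuple[int, int]]:
    """Scan the NM record for un_key and build a devt from what is stored."""
    for entry in nm_record:
        if entry.key == un_key:
            major = driver_to_major[entry.driver]
            return entry.name, (major, entry.minor)
    raise KeyError(f"no NM entry with key {un_key}")


# Illustrative data only: the numbers are invented for this example.
nm_record = [NmEntry(key=1, driver="sd", minor=16, name="/dev/dsk/c1t2d0s0")]
driver_to_major = {"sd": 32}

name, devt = resolve_component(1, nm_record, driver_to_major)
```

Note that everything the lookup returns comes from values recorded at creation
time, so the result is only as good as the stored minor number.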

The above information is sufficient to bring up stripe d1 as long as the minor
number does not change. As mentioned above, though, the minor number can change
whenever the underlying disk is moved around. So you can see that some kind of
persistent information needs to be stored to solve the disk relocation
problem. The approach taken by SVM is to use the disk's unique device id (WWN).
Whenever an SVM device is created, the device ids of all the underlying disk
components are stored in its database as well. When a metadevice is snarf'd,
the stored device id is used instead of the traditional devt/name to locate all
the component devices, which is what lets the snarf operation succeed even
after a disk has moved. The stored unique device id information looks like this: