Using a simple web calculator of Hex-to-Text (example: Use this), we can see that 7S1PW?Bym7B is translated to
37533150573f42796d3742 , which represents the last 22 characters of the reported Word 83. I assume that the leftmost nine hex characters represent the storage device. So, easy to identify.

An additional nice trick is to ask the NetApp to represent the LUN Serial in hex:

which represents the same Word 83 we’ve seen before. However, NetApp will not allow you to set (under priv mode) the LUN Serial directly to a hex value. There comes the importance of the Hex-to-Text calculation tools.

Following some unknown problems, I had recently several XenServer machines (different clusters, different sites and customers, and even different versions) with a VDI-END-of-File issues. It means that while you can start the VM correctly, perform XenMotion to another server you are unable to do any storage-migration task – neither Storage XenMotion, nor VDI copy or VM-move commands. In some cases, snapshots taken from the “ill” disks were misbehaving just the same. This is rather frustrating, because the way to solve it is by cloning the disk into a new one, and your hands are bound.

A method I have devised for the task is rather simple – Create a new VDI (on the target storage), map the original VDI and the new VDI to a domain0 machine, and copy using the ‘dd’ command, block-by-block. This is slow, thick, but it’s working.

How to do it? The steps, in general are:

Create a new VDI of the same size or larger than the original VDI.

Map the old and new VDIs UUID

Map the UUID of the control domain you intend to use for this task (it has to be which has access to both VDIs)

Turn off the ‘ill’ VM, mark the ‘ill’ VDI in a way you will be able to identify it easily (unique name label, for example), and unmap it from the VM

Create VBD for the VDI devices for the control domain, and plug them

Create Linux device file for the VBDs on the control domain

Perform ‘dd’ between the old and new disks (do not get confused with the direction, or you will overwrite your data!)

Unmap VBDs, destroy VBDs

Map the new VDI to the VM

Start the VM

I won’t go over the how to create a VDI. Use the XenCenter GUI to do it. Place it on the desired SR. Give it a noticeable name, so you would be able to recognise it

Get the UUID of the new VDI: xe vdi-list name-label=”The name label I used” | grep ^uuid | awk ‘{print $NF}’
Do the same to the source VDI. Use it’s name label, or use xe vbd-list to obtain its VDI UUID

Get the UUID of the control domain you want to use: xe vm-list is-control-domain=true

Unmap the VM’s VDI from it (after setting some very noticeable name for it, and noting the disk number/ID it had on the VM)

On the control domain, run:xe vbd-create vdi-uuid=<’Ill’ VDI UUID> vm-uuid=<Control domain UUID> device=xvdaThis command will result in a UUID. Note this UUID, as the source device UUID.

Run again for the target VDI. This time, use device=xvdb

Note this UUID as well. This is the target UUID.

We need to connect the VBDs and create a device node for them:xe vbd-plug uuid=<UUID of source VBD created above>

There is a new block device available to the XenServer host’s control domain. To identify the new device, we need to run now:tail -1 /proc/partitions
The resulting line would look something like this:

253 10 40960000 tdk

The interesting fields are the first, the 2nd and the last. We will use them to create a block device using the command ‘mknod’:

mknod /dev/tdk b 253 10

The result will be a block device file called /dev/tdk with the major 253 and minor 10.

We will repeat the process for the target VBD, and here we have two additional disks on the control domain.

We can (and should) copy using dd from the source to the target (don’t mix it!). Assuming /dev/tdk is the source, and /dev/tdl is the target, it would look like this:

dd if=/dev/tdk of=/dev/tdl bs=1M oflag=direct

We are using oflag=direct to enforce direct writes and not to saturate the control domain’s caches.

Following the operation, to release the disks and get back to business, we do:

xe vbd-unplug uuid=<SOURCE VBD UUID>

xe vbd-destroy uuid=<SOURCE VBD UUID>

xe vbd-unplug uuid=<TARGET VBD UUID>

xe vbd-destroy uuid=<TARGET VBD UUID>

Map the new disk to the VM, to the correct device number

Start the VM

If it starts OK, we can destroy the old VDI and have a bowl. If it doesn’t, we can always map the previous (source) VDI to the VM, and start it anew.

This is a very nice project I have been working on. The hardware at hand – two servers, with a shared SAS bus containing several SAS disks. Since it’s a shared bus, no RAID solution would cut it, and as I don’t want to waste disks with ASM (“normal” redundancy meaning half the size…), I went to ZFS storage.

ZFS is a wonderful technology, with many advantages, but with some dangerous pitfalls. As I prefer Linux, I did not bother with any Sloaris solutions, and went directly to Centos 6. I will describe my cluster setup below.

I will disclose the entire setup, including hardware layout, Linux platform, ZFS module parameters, the Redhat Cluster Suite ZFS agent I wrote and the cluster.conf configuration file. I will also share my considerations regarding some of the choices I made. In addition, this system was designed to act as NFS storage for Citrix XenServer pool, so I will have to describe the changed I had to perform on the XenServer itself (which might make it unsupported, but I will have to live with it), to allow it to handle the timeouts resulting by server failover.

So first – the servers – each having a single CPU (quad core), 24GB RAM, and dual 1Gb/s NICs. Also – a tiny internal SATA disk is used for the OS. The shared disks – at the moment, 10 SAS disks, dual port (notice – older HP disks might mark in a very small letters that they are only a single-port SAS disks…), 72GB, 10K RPM. Zpool called ‘share’ with two 5 disks RaidZ1 vdevs. As I mentioned before – ZFS seemed like the best possible option allowing me to achieve my goals at minimal cost.

When I came to this project, I wanted to be able to use a native ZFS cluster agent, and not a ‘script’ agent, which takes a very long time to respond (30 seconds). Also – I wanted to be able to handle multiple storage pools concurrently – each floating on its own. While I have only one at the moment, I wanted the ability to have a fine-grained control over multiple pools. In addition – I am unable (or unwilling?) to handle the multiple filesystems introduced with each pool. I wanted to be able to import or export the pool silently, and with a clear head, thus I had to verify that the multiple filesystems are not in use as part of the export process.

As an agent, I wanted to comply with Redhat Cluster Suite (RHCS from now on) OCF syntax. I used the supplied fs.sh script as an inspiration for my agent script, so some of it might look familiar. All credit goes to the original authors, of course.

The operating system I selected was Centos 6. Centos is based on Redhat Linux, and I find it mature and stable, which is exactly what I want when I plan a production-ready, enterprise-class storage solution. The version had to be x86_64, due to ZFS requirements, and due to the amount of RAM in the server.

To handle ZFS options, I added a file called /etc/modprobe.d/zfs.conf, with the following content

I had to verify there is no zpool.cache file. Since my pool was rather small (planned for 24 disks max), I was not concerned by the longer import process caused by not having the zpool.cache file. I was more concerned with automatic import process which might happen, and had to prevent it at almost any cost. In addition, I learned from other systems that the arc memory should never exceed half the RAM, and it should be given just a little under that.

Of course, when changing such module settings, you need to recreate initrd (dracut -f) to be on the safe side later on.

The zfs.sh agent script was placed in /usr/share/cluster directory. You must have rgmanager installed for this directory to exist, and anyhow, without rgmanager, you will have no cluster whatsoever.

This is the contents of the zfs.sh file. Notice that it is not compatible with Luci, so if you’re using it – them kids won’t play well together.

The reason I placed the IP address as the last to start and the first to stop was that the other way around, the NFS client would receive an ordered disconnection command, and would not bother to establish a connection with the remaining server. Abruptly taking away the clustered IP address causes the NFS clients to initiate a reconnection process, of which the systems are supposed to recover

I have left this article incomplete for a while now. It has some stuff I do like to share, so I am sharing it as-is. I will (some day) complete it.

Well, tricks is not the right word to describe advanced shell scripting usage, however, it does make some sense. These two topics are relevant to Bash version 4.0 and above, which is common for all modern-enough Linux distributions. Yours probably.

These ‘tricks’ are for advanced Bash scripting, and will assume you know how to handle the other advanced Bash topics. I will not instruct the basics here.

Trick #1 – redirected variable

What it means is the following.

Let’s assume that I have a list of objects, say: ‘LIST=”a b c d”‘, and you want to create a set of new variables by these names, holding data. For example:

a=1
b=abc
c=3
d=$a

How can you iterate through the contents of $LIST, and do it right? If you’re having only four objects, you can live with stating them manually, however, for a dynamic list (example: the results of /dev/sd*1 in your system), you might find it a bit problematic.

A solution is to use redirected variables. Up until recently, the method involved a very complex ‘expr’ command which was unpleasant at best, and hard to figure at its worst. Now we can use normal redirected variables, using the exclamation mark. See here:

for OBJECT in $LIST
do
# Place data into the list
export $OBJECT=$RANDOM
done

for OBJECT in $LIST
do
# Read it!
echo ${!OBJECT}
done

Firstly – to assign value to the redirected variable, we must use ‘export’ prefix. $OBJECT=$RANDOM will not work.
Secondly – to show the content, we need to use exclamation mark inside the variable curly brackets, meaning we cannot call it $!OBJECT, but ${!OBJECT}.
We cannot dynamically create the variable name inside the curly brackets either, so ${!abc_$SUFFIX} won’t work either. We can create the name beforehand, and then use it, like this: DynName=abc_$SUFFIX ; echo ${!DynName}

Trick #2 – Using strings as an array index

It was impossible in the past, but now, one of the most useful features of having smart list is accessible in shell. We can now call an array with a label. For example:

In this example we create array cells with the label being the name of the file, and populating them with the size (this is the result of ls -la 7th field) of this file.

This will work only if the array was declared beforehand using the following command (using the array name ‘array’ here):

declare -A array

Later on, it is easier to query data out of the array, as long as you know its index name. For example

FILE=ez-aton.txt
echo ${array[$FILE]}

Of course – assuming there is an entry for ez-aton.txt in this array.

The best use I found for this feature so far was for comparing large lists, without the need to reorder the objects in the array. I find it to boost the capabilities of arrays in Bash, and arrays, in general, are very powerful tools to handle long and complex lists, when you need to note the position.

That’s all fox. Note that the blog editor might change quites (single and double) and dashes to the UTF-8 versions, which will not go well in a copy/paste attempt to experiment with the code examples placed here. You might need to edit the contents and fix the quotes/dashes manually.

If you have any questions, comment here, I will be happy to elaborate. I hope to be able to add more complex Bash stuff I get into once a while

This post will describe the process of placing SSH keys using the internal ‘systemshell’ command of NetApp. As always – when doing something which the vendor did not intend you to do, do it very carefully. This data was obtained from NetApp forums, and while I do not have the original post to link (I usually link to the original, as a courtesy to the original author), this is the content, as is.

If you want to do it for any other user, just replace the word ‘root’ with the said user.

An additional note – I had to create a user to perform ‘df’ operations only. The purpose was to be able to obtain data using ‘ssh’ without disclosing the keys used for root SSH access, by having a very limited user, designed to do that.

I love XenServer. I love the product, I believe it to be a very good answer for SMBs, and enterprises. It lacks on external support, true, but the price tag for many of the ‘external capabilities’ on VMware, for instance, are very high, so many SMBs, especially, learn to live without them. XenServer gives a nice pack of features, at a very reasonable price.

One of the missing features is the management packs of hardware vendors, such as HP, Dell and IBM. Well, HP does have something, and its installation is always some sort of a challenge, but they do, so scratch that. Others, however, do not supply management packs. The bright side is that with Domain0 being a full featured i386 Centos 5 distribution, I can install the Centos/RHEL management packs, and have a ball. This brings us to another challenge there – the size of the system disk (root partition) by default is too small – 4GB, and while it works quite well without any external components, it tends to get filled very fast with external packages installed, like Dell tools, etc. Not only that, but on a system with many patches the patches backups take their toll, and consume valuable space. While my solution will not work for those who aim at the smallest possible space, such as SD or Disk-on-Key for the XenServer OS, it aims for the most of us all, where the system resides on several tenths of gigabytes at least, and is capable of sustaining the ‘loss’ of additional 4GB. This process modifies the install.img file, and authors the CD as a new one, your own privately-modified instance of XenServer installation. Mind you that this change will be effective only for new installations. I have not tested this as the upgrade path for existing systems, although I believe no harm will be done to those who upgrade. Also – it was performed and tested on XenServer 6.2, and not 6.2 SP1, or prior versions, although I believe that the process should look pretty similar in nature.

You will need a Linux machine to perform this operation, end to end. You could probably use some Windows applications on the way, but I have no idea as to which or what.

Step one: Open the ISO, and copy it to somewhere useful (assume /tmp is useful):

Change the value of ‘root_size’ to something to your taste. Mind you that with 4GB it was tight, but still usable, even with additional 3rd party tools, so don’t become greedy. I defined it to be 6GB (6144)

I did not test it, however there is a boot flag called stage2= which should lead to a new install.img file, other than the hardcoded one. I don’t if it will accept /images/install-new.img as its flag, but it can be a good start there.

One more thing:

Make sure that the vmlinuz and initrd used for any custom properties, in $CDROOT/isolinux do not exceed 8.3 format. Longer names didn’t work for me. I assume (without any further checks) that this is isolinux limitation.

The goal – connecting multiple Oracle ASM snapshots (same source LUNs, of course) to the same machine. The next process will demonstrate how to do it.

Problem: ASM disks use a disk label called ASMLib to maintain access even when the logical disk path might change (like adding a LUN with a lower ID and rebooting the server). This solves a major problem which was experienced with RAW devices, when order changed, and the ‘wrong’ disks took the place of others. ASM labels are a vital part in managing ASM disks and ASM DiskGroups. Also – the ASM DiskGroup name should be unique. You cannot have multiple DiskGroups with the same name.

Limitations – you cannot connect the snapshot LUNs to the same server which has access to the source LUNs.

Process:

Take a snapshot of the source LUN. If the ASM DiskGroup spans across several LUNs, you must create a consistency group (each storage device has its own lingo for the task).

Run ‘service oracleasm scandisks‘ to scan for the new ASM disk labels. We will need to change them now, so that the additional snapshot will not use duplicate ASM labels.

For each of the new ASM disks, run ‘service oracleasm force-renamedisk SRC_NAME TGT_NAME‘. You will want to rename the source name (SRC_NAME) to a unique target name, with some correlation to the snapshot name/purpose. This is the reasonable way of making some sense of a possibly very messy setup.

As the Oracle user, with the correct PATH variables ($ORACLE_HOME should point to the CRS_HOME) and the right ORACLE_SID (for example – +ASM1), run: ‘renamedg phase=both dgname=SRC_DG_NAME newdgname=NEW_DG_NAME verbose=true‘. The value ‘SRC_DG_NAME’ represents the original (on the source) DiskGroup name, and the NEW_DG_NAME represents the new name. Much like when renaming the disks – the name should have some relationship with either the snapshot name, so you can find your hands and legs in this mess (again – imagine having six snapshots, each of a DiskGroup with four LUNs. Now – this is a mess).

You can now mount the DiskGroup (named NEW_DG_NAME in my example) on both nodes

Assumptions:

Oracle GI is up and running all through this process

I tested it with Oracle 11.2.0.3. Other versions of 11.2.0.x might work, I have no clue about previous 11.1.x versions, or any earlier versions.

It was tested on Linux. My primary work platform. It was, to be exact, on RHEL 6.4, but it should work just the same on any RHEL-like platform. I believe it will work on other Linux platforms. I have no clue about running it on any other Unix/Windows platform.

The DiskGroup should not be mounted (no reason for it to be mounted right on discovery). Do not manually mount it prior to performing this procedure.

Good luck, and post a comment if you find this explanation either unclear, or if you encounter any problem.

The following procedure was tested by me, and was found to be working. The version of the XenServer I am using in this particular case is 6.1, however, I belive that this method is generic enough so that it could work for every version of XS, assuming you're using iSCSI and LVM (aka - not NetApp, CSLG, NFS and the likes). It might act as a general guideline for fiber channel communication, but this was not tested by me, and thus - I have no idea how it will work. It should work with some modifications when using Multipath, however, regarding multipath, you can find in this particular blog some notes on increasing multipath disks. Check the comments too - they might offer some better and simplified way of doing it.

So - let's begin.

First - increase the size of the LUN through the storage. For NetApp, it involves something like:

lun resize /vol/XenServer/luns/SR1.lun +1t

You should always make sure your storage volume, aggregate, raid group, pool or whatever is capable of holding the data, or - if using thin provisioning - that a well tested monitoring system is available to alert you when running low on storage disk space.

Now, we should identify the LUN. From now on - every action should be performed on all XS pool nodes, one after the other.

cat /proc/partitions

We should keep the output of this command somewhere. We will use it later on to identify the expanded LUN.

Now - let's scan for storage changes:

iscsiadm -m node -R

Now, running the previous command again will have a slightly different output. We can not identify the modified LUN

cat /proc/partitions

We should increase it in size. XenServer uses LVM, so we should harness it to our needs. Let's assume that the modified disk is /dev/sdd.

pvresize /dev/sdd

After completing this task on all pool hosts, we should run sr-scan command. Either by CLI, or through the GUI. When the scan operation completes, the new size would show.