It’s been a while since I last blogged, but here is something that might interest you. To spice things up I even recorded a quick video blog of the same information you will find written here. So skip the video and keep reading, or watch the video and skip the reading; the choice is yours!

For many years I have used the ‘Supported Hardware List’ websites from IBM to qualify SVC support. If you want to know if an Infinidat is supported behind IBM SVC, which version and what code level, it’s all there.

So traditionally I would go here. From there you get a great list of code levels; choose your code level and then look up your product:

However, I have always had some tiny misgivings about these sites. After all, they obey no law of sorting I have ever seen. Alphabetical order, anyone? It’s like the Web Admin is a worshipper of Cthulhu and has managed to translate non-Euclidean geometry into a list of vendor names. Or maybe the site is just TOO HARD TO MAINTAIN.

Take a look and suggest a logic to this list of vendors (please I beg you):

But there is a bigger problem: these sites are just slightly out of date.

Let’s use the example I first raised: if I look here, I find Infinidat version 2.0 is supported with SVC 7.8:

But if I then go to the SSIC here, I get told version 3.0.x is also supported:

This was not a one-off, I found multiple products where the SSIC seemed to reflect newer information than the Supported Hardware websites.

Moral of the story? Always use the SSIC to confirm support, not the Supported Hardware pages.

In this blog post I want to inform you about Actifio’s presentation to Tech Field Day 11.

Tech Field Days are one of my favourite technical information sources. They involve a group of prominent bloggers and industry personalities being given a briefing and demonstration (normally somewhere between 2-4 hours) by an IT company about their products and viewpoint. It is a chance for both new IT entrants and established IT companies to tell their story, explain the why, have their ideas challenged by some smart people and get some relatively free publicity. The bloggers, meanwhile, get the chance to gather material to write blogs, learn about our rapidly changing world and sometimes show how clever and insightful they are at the same time. It is a genuine win-win for everybody.

Actifio were last at Tech Field Day 4 in 2010, which explains why competitive information about Actifio is often so laughably wrong. I think other vendors watch these (in IT terms) ancient videos and presume nothing has changed since! The good news for Actifio’s competitors and prospective and existing customers is they can now update their knowledge of Actifio by watching Actifio present at Tech Field Day 11 in 2016.

The really nice thing is that the presentations have been split into five easily consumed videos, each about 20 minutes long. So please drop by the Tech Field Day page, take a look at the presented subjects and learn about Copy Data Management and how Actifio’s products bring a new and unique way for our customers to move to the hybrid cloud, dramatically improve their agility and modernise their business resiliency.

To make it easy, I have reposted all the Actifio video links below, but you can also get to them from here where you can also check out the other vendors who presented.

Using SSH keys to perform password-free login is quite common in Unix hosts and in Appliances that have embedded Unix (like Storwize products).

You effectively have a public key which is shared and a private key (usually with a PPK extension) that is not shared. Think of the public key like the lock in your front door, that everyone can see. Think of the private key like the door key in your pocket or hand-bag. If you keep your private key secure, your door is relatively secure. If you lose your keys, your door is most likely no longer secure (unless they are down the back of the couch).

Sticking with the door analogy, the risk with a door lock is that someone could still just try to kick your door in (brute force attack) or pick your lock. The bit length of the key can make this harder to achieve: the longer the bit length the harder it is to crack.

It is not unusual to see instructions that suggest you use a command like this to generate keys, where a bit length of 1024 is specified with an RSA key:

ssh-keygen -b 1024 -t rsa -f ~/.ssh/id_rsa

Or, if using PuTTYgen to create the keys, to see instructions like this:

The problem is that these instructions are all old. In fact, using the ssh-keygen command syntax shown above would now represent a downgrade from the default setting. The wiki and man pages for ssh-keygen both confirm that for RSA, the default length is now 2048 bits (not 1024 bits).

To confirm what key length you get by default, simply make a test key and then read it back. In this example I create a new public/private key pair called testkey without specifying a bit length (there is no -b 1024):
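If you want to reproduce that test yourself, here is a minimal sketch (the testkey filename is just for the demo; note that the very newest OpenSSH releases may default even higher than 2048 bits):

```shell
# Generate a test key pair WITHOUT specifying -b, using an empty passphrase (-N "")
ssh-keygen -t rsa -N "" -q -f ./testkey
# Read the key back: the first field of the -l fingerprint output is the bit length
ssh-keygen -lf ./testkey.pub
```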

When using PuTTYgen, if you use a recent version you will note that the default bit length is now 2048 (as indicated by the red circle). If you load a key you should see the bit length of the loaded key, as indicated by the orange circle.

So if you see instructions specifying the creation of a 1024 bit key, I suggest you ignore them and use 2048 bits, or at the least question this with your vendor. Equally, if you are using older keys, it is well worth checking their bit length and generating new keys. Not only will this give you the now-default bit length of 2048, it will also renew them, reducing the risk of someone using an older (and potentially leaked) key inappropriately.

I want to draw my Australian friends’ attention to an app called “Emergency +”, available for your smart device (Apple, Android and Windows).

The scenario is simple. You see something terrible: a fire; a car crash; a natural disaster. The standard response is simple: You should dial 000 (the 911 equivalent in Australia).

One of the first things you are asked is usually:

“What is your location? Where exactly are you?”

Now that’s easy if you are at home…. but what if you are on the road, or at a store, or walking the dog?

The idea is to eliminate confusion over your location.

First you open the App and see this:

Select the map and determine your location:

Then dial 000 using the App (you will get a pop-up like this):

It will start a phone call, at which point you should switch to speaker mode (hands free) and jump back to the app. You now have your address and your exact location (to within a few metres) to share with the responder on the phone.

I have blogged in the past about the classic IT story, The Cuckoo’s Egg by Clifford Stoll: a true story that details how Clifford discovered a hacker while trying to account for 9 seconds of mainframe processing time.

I was reminded of this recently while doing an MSP space accounting project. MSPs (Managed Service Providers) are understandably cost focused as they try to compete with low-cost IaaS (Infrastructure as a Service) providers like Amazon. To control costs, shared resources are normally employed, as well as thin-provisioning and its cousin over-provisioning. Don’t confuse the two: thin-provisioning just means using only the exact resources needed for an objective, whereas over-provisioning means promising or committing more resources than you actually have, in the hope that no one calls your bluff. You can always use thin-provisioning without using over-provisioning.

A Storwize pool can use both thin and over-provisioning. As an MSP if you are looking at pool usage you may want to be clear exactly how much space each client in the shared pool is using. Now I don’t want to burn time explaining the exact workings of thin provisioning (something that Andrew Martin explains very well here), but I wanted to point out a quirk that may confuse you while trying to do space accounting.

In this example I have a Storwize pool that is 32.55 TiB in size and is showing 22.93 TiB Used. You can clearly see we have over-allocated the 32.55 TiB of disk space by having created 75.50 TiB of virtual volumes!

Now this is significant because if I wanted to do space accounting I would expect the Used capacity of all volumes in the pool to sum to 22.93 TiB. In other words, if five end clients are sharing this space and I know which volumes relate to which client, I would expect the sum total of all volumes used by all clients to equal 22.93 TiB.

If I bring up the properties panel for the pool I can clearly see metrics for the pool including the extent size (in this example 2.00 GiB, remember that, it is significant later).

Now for each thin provisioned volume I get three size properties:

Used: 768.00 KiB

Real: 1.02 GiB

Total: 100.00 GiB

To explain what these are:

Used capacity is effectively how much data has been written to the volume (which includes the B-Tree to track thin space allocation).

Real capacity is how much space in grains has been pre-allocated to the volume from extents allocated from the pool.

Total capacity is the size advertised to the hosts that can access this volume.

This means I could sum either Used capacity or Real capacity. Since Real capacity is always larger than Used capacity, it makes more sense to sum Real capacity, especially if this is the number I am using to determine usage by clients inside a shared pool.

To get the used space size of all volumes we need to differentiate between fully provisioned (Generic) volumes and Thin-Provisioned volumes.

This command will grab all the Generic volumes in a specific pool (in this example called InternalPool1):

So for the generic volumes we can sum the capacity field. In this example pool, I used a spreadsheet and found it sums to 19,404,662,243,328 bytes.

So for the thin volumes we can sum the real capacity field. In this example pool, I used a spreadsheet and found it sums to 5,260,831,053,824 bytes.
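If you’d rather not use a spreadsheet, the same summing can be scripted. Here is a hedged sketch using awk (the ssh target, the CSV column positions and the se_copy field are my assumptions; check them against the lsvdisk output at your own code level):

```shell
# On a real system the rows would come from something like:
#   ssh superuser@cluster 'lsvdisk -bytes -delim , -filtervalue mdisk_grp_name=InternalPool1'
# Simulated (hypothetical) rows here: name,capacity_bytes,real_capacity_bytes,se_copy
cat <<'EOF' > vols.csv
vol1,1099511627776,1099511627776,no
vol2,2199023255552,2199023255552,no
thin1,107374182400,1095216660,yes
EOF
# Generic volumes (se_copy=no): sum the capacity column
awk -F, '$4 == "no"  { g += $2 } END { printf "generic: %d bytes\n", g }' vols.csv
# Thin volumes (se_copy=yes): sum the real capacity column
awk -F, '$4 == "yes" { t += $3 } END { printf "thin: %d bytes\n", t }' vols.csv
```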

This brings us to a combined total of 24,665,493,297,152 bytes which is 22.43 TiB.

The problem here is obvious. I expected to account for 22.93 TiB of space, but summing the combined total of actual capacity for full-fat volumes and real-capacity for thin volumes doesn’t add up to what I expect. In fact in this example I am short by around 0.5 TiB of used capacity. How do I allocate this space to a specific client if no volume owns up to using it?

I can actually spot this in the CLI as well using just the lsmdiskgrp command. If I subtract real capacity 24,665,493,297,152 from total capacity 35,787,814,993,920 I get 11,122,321,696,768 bytes, which is nowhere near reported free capacity of 10,578,504,450,048 bytes. This again reveals 543,817,246,720 bytes (0.494 TiB) of allocated space that is not showing against volumes.
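For anyone who wants to check that arithmetic, here it is in plain shell, using the numbers from my pool above:

```shell
total=35787814993920   # pool capacity from lsmdiskgrp (bytes)
real=24665493297152    # summed real/used capacity of all volumes (bytes)
free=10578504450048    # free capacity reported by lsmdiskgrp (bytes)
# What SHOULD be free if every allocated byte showed up against a volume
expected_free=$((total - real))
# The gap between that and what the pool actually reports free
shortfall=$((expected_free - free))
echo "$shortfall bytes unaccounted for"   # 543817246720 bytes, roughly 0.49 TiB
```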

The answer is that the space is actually allocated to volumes, but is not being accounted for at a volume level. If you scroll up to the second screen shot showing the Pool overview you can see the Extent Size is 2 GiB. That means the minimum amount of space that gets allocated to a volume is actually 2 GiB. But if we look at the volume properties of a single volume, there is no indication that this volume is actually holding down 2 GiB of pool space. In this example I can see only 1.02 GiB of space being claimed. So for this example volume there is actually 0.98 GiB of space allocated to the volume which is not actually being acknowledged as being dedicated to that volume.

So how do I cleanly allocate this 0.5 TiB?

I see two choices. The first is to simply determine the shortfall, divide it by the number of thin-provisioned volumes and then add that usage to each thin volume. In this example I have 519 thin volumes, so if I divide 543,817,246,720 by 519, that’s pretty well 1 GiB per volume I could simply add to each volume’s space allocation.
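A quick sanity check of that division, again in shell:

```shell
shortfall=543817246720   # unaccounted bytes from the pool
volumes=519              # number of thin volumes in the pool
per_volume=$((shortfall / volumes))
echo "$per_volume bytes per thin volume"                            # 1047817431 bytes
awk -v b="$per_volume" 'BEGIN { printf "%.2f GiB\n", b / 1073741824 }'   # 0.98 GiB
```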

The second is to accept it as a space tax and simply plan for it. The issue is far less pronounced if the volume quantity is small and the volume size is large. The issue is also far less pronounced with smaller extent sizes; at very small extent sizes it will most likely not occur at all, or be truly trivial in size (like Clifford’s 9 seconds). In this example, simply using 1 GiB extents would have pretty well masked the issue. But remember that the smaller your extent size, the smaller your maximum cluster size can be: a 2 GiB extent size means the maximum cluster size is 8 PiB.

In my previous post about MPIO software and RDMs, I suggested SDDDSM could help you map Windows volumes to Storwize volumes. This led to the obvious question: what about Linux VMs?

In a distant time there was a version of IBM SDD for Linux (in fact you can still download it). But because it was closed source and used compiled binaries, users could only use specific Linux distributions and kernel versions. This was rather painful (especially if you upgraded your Linux version due to some other bug and then found SDD no longer worked). Fortunately, native multipathing for Linux rapidly matured and now offers a simple option that is definitely the way to go. And please don’t listen to the vendors pushing proprietary MPIO software: multipathing native to the Operating System, using vendor plug-ins, is in my opinion the only acceptable MPIO solution.

Either way, it turns out you don’t even need multipath software to map a Storwize volume to an Operating System device.

In this example I have created a volume on a Storwize V3700 with a UID that ends in 0043.

It is mapped as a pRDM to a VM; under the Manage Paths window you can see the same UID at the top of the window (ending with 0043).

On the Linux VM that is using this volume, I want to confirm whether the device /dev/sdb matches the pRDM. In this example we use the smartctl command. We can clearly see the matching Logical Unit ID (ending in 0043), so we know that /dev/sdb is indeed our pRDM.
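The check itself is just matching the UID suffix. A hedged sketch (the smartctl output line and the UID are invented for illustration; on the real VM you would run `smartctl -i /dev/sdb` and look for the Logical Unit id line):

```shell
# Simulated line of 'smartctl -i /dev/sdb' output for a SCSI device (UID is hypothetical)
line="Logical Unit id:      0x60050763008101ac3800000000000043"
# Does the UID end in 0043, matching our Storwize volume?
case "$line" in
  *0043) echo "match: /dev/sdb is the Storwize volume ending in 0043" ;;
  *)     echo "no match" ;;
esac
```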

If you find smartctl is not installed, then install the smartmontools package:

yum install smartmontools

If we have Linux multipath configured, we can also use the multipath -l (or -ll) command to find the UID and determine which Storwize volume is which Linux device. Again I can easily spot that mpathb (sdb) is my Storwize volume with the UID ending in 0043.
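The multipath output can be matched the same way. Another hedged sketch (the WWID line is simulated from memory; on real systems the device-mapper WWID for a Storwize volume is, to my knowledge, the vdisk UID with a leading '3'):

```shell
# Simulated first line of 'multipath -ll' output (WWID is hypothetical)
echo "mpathb (360050763008101ac3800000000000043) dm-1 IBM,2145" |
  grep -q '0043)' && echo "mpathb is the volume with the UID ending in 0043"
```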

I got a great question the other day regarding VMware Raw Device Mappings:

If an RDM is a direct pass-through of a volume from Storage Device to VM, does the VM need MPIO software like a physical machine does?

The short answer is NO, it doesn’t. But I thought I would show why this is so, and in fact why adding MPIO software may help.

First up, to test this, I created two volumes on my Storwize V3700.

I mapped them to an ESXi server as LUN ID 2 and LUN ID 3. Note the serials of the volumes end in 0040 and 0041:

On ESX I did a Rescan All and discovered two new volumes, which we know match the two I just made on my V3700, as the serial numbers end in 40 and 41 and the LUN IDs are 2 and 3:

I confirmed that the new devices had multiple paths, in this example only two (one to each Node Canister in the Storwize V3700):

I then mapped them to a VM as RDMs, the first one as a Virtual RDM (vRDM), the second as a Physical (pRDM):

Finally on the Windows VM I Scanned for New Devices and brought up the properties of the two new disks. Firstly you note that the first disk (Disk 1) is a VMware Virtual disk while the second disk (Disk 2) is an IBM 2145 Multi-Path disk. This is because the first one was mapped as a vRDM, while the second was mapped as a pRDM.

So here is the question: if the Physical RDM is a multi-path device, does it have one path or many? The first hint is that we only got one disk for each RDM. But what do I see if I actually install MPIO software? So I installed SDDDSM and displayed path status using the datapath query device command:

What the output above shows is that there is only one path being presented to the VM, even though we know the ESXi HyperVisor can see two paths.

So this proves we didn’t actually need to install SDDDSM to manage pathing, as there is only one path being presented to the disk (the HyperVisor is handling the multiple paths using its own MPIO capability, VMW-SATP-ALUA, which we can see in the ESXi pathing screen capture further up above).

Having said all that, there is one advantage from the Windows VM perspective to having SDDDSM installed, which is that I can see that Disk 2 maps to the V3700 volume with a serial that ends in 40 (rather than 41). So if I wanted to remove the vRDM volume (Disk 1), I know with safety that the volume ending in ’41’ is the correct one to target.