Wednesday, October 18, 2017

It's possible to move all the physical disks involved in an S2D deployment to different servers and bring the data back online! I tested this by installing S2D on a three-node cluster. I shut down all the nodes and pulled the disks. I then reinstalled the OS on all three nodes and put the disks back in. I set up the new cluster with different server names and a different cluster name and then ran "Enable-ClusterS2D". The old storage pool and disks could be seen, but the storage pool was in a read-only state and the virtual disks were detached. To get the data online I did the following:

Get-StoragePool *s2d* | Set-StoragePool -IsReadOnly $false

Get-VirtualDisk | Connect-VirtualDisk
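
To confirm the two commands took effect, here's a quick check with the standard storage cmdlets (IsReadOnly should now be False and the virtual disks should no longer report Detached):

Get-StoragePool *s2d* | Select-Object FriendlyName, IsReadOnly, OperationalStatus
Get-VirtualDisk | Select-Object FriendlyName, OperationalStatus, HealthStatus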

Then I went to Failover Cluster Manager, right-clicked on Pools and chose "Add Storage Pool", then right-clicked on Disks and chose "Add Disk". At that point I added the disks to CSV and was able to access all the data.
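
If you'd rather script the disk and CSV steps, something like this should be close (a sketch using the failover clustering cmdlets; it assumes the pool has already been added to the cluster and that the only "Physical Disk" cluster resources are the recovered S2D virtual disks):

# make the recovered virtual disks visible to the cluster
Get-ClusterAvailableDisk | Add-ClusterDisk

# convert the new cluster disks into Cluster Shared Volumes
Get-ClusterResource | Where-Object { $_.ResourceType -eq "Physical Disk" } | Add-ClusterSharedVolume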

Tuesday, September 12, 2017

After applying the September 2017 patch for Windows Server 2016 (KB4038782) to a node in an S2D cluster, the disks on that node would not come out of maintenance mode after the node re-joined the cluster and after Resume -> Fail Roles Back. The VMs would move back, but the disks would stay in maintenance mode, causing the virtual disks to show a status of degraded. I had to manually take the disks out of maintenance mode after the node joined the cluster and after I failed the roles back.

To see if the disks are in maintenance mode, run Get-StorageFaultDomain. Under the OperationalStatus column it will say "In Maintenance Mode" next to the disks for the node that was just restarted. I don't know if this issue was/is specific to me or if it may happen to everyone that applies the patch. To take the disks out of maintenance mode, use the Disable-StorageMaintenanceMode cmdlet. I have a sample below that gets all the disks in maintenance mode and disables maintenance mode for those disks.
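
Here is that sample (the same filter as in the June update below):

# find every physical disk still in maintenance mode and bring it back online
Get-PhysicalDisk | Where-Object { $_.OperationalStatus -eq "In Maintenance Mode" } | Disable-StorageMaintenanceMode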

Tuesday, June 20, 2017

Storage Spaces Direct is great, but every once in a while an S2D storage job will get stuck and just sit there in a suspended state. This usually happens after a reboot of one of the nodes in the cluster.

What you don't want to do is take a different node out of the cluster while a storage job is stuck and while there are degraded virtual disks; the degraded disks already have a copy of the data out of sync, and losing another node on top of that risks taking the virtual disks offline.

So what does one do if a storage job is stuck? There are two cmdlets that I've found will fix this. The first is Optimize-StoragePool. The second is Repair-VirtualDisk. Start with Optimize-StoragePool, and if that doesn't work, move on to Repair-VirtualDisk. Here is how you use them:

Get-StoragePool <storage pool friendly name> | Optimize-StoragePool

Example: Get-StoragePool s2d* | Optimize-StoragePool

Get-VirtualDisk <virtual disk friendly name> | Repair-VirtualDisk

Example: Get-VirtualDisk vd01 | Repair-VirtualDisk

Usually optimizing the storage pool takes care of the hung storage job and fixes the degraded virtual disks, but if not, target the virtual disk directly.
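
If you want to do both in one pass, here's a short sketch that repairs only the virtual disks that still report as unhealthy:

# optimize the pool first, then repair anything that is still not healthy
Get-StoragePool s2d* | Optimize-StoragePool
Get-VirtualDisk | Where-Object { $_.HealthStatus -ne "Healthy" } | Repair-VirtualDisk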

Update: I tried Repair-ClusterS2D. It does not appear to help with this scenario. There is limited documentation on it, but it looks like it's something you use if a virtual disk gets disconnected.

Update: Run Get-PhysicalDisk. If any of the disks say they're in maintenance mode, this could be the cause of your degraded disks and your stuck jobs. This seems to happen when you pause and resume a node too close together. To take the disks out of maintenance mode, run the following:

Get-PhysicalDisk | Where-Object { $_.OperationalStatus -eq "In Maintenance Mode" } | Disable-StorageMaintenanceMode

Thursday, February 23, 2017

In Storage Spaces Direct you can run Get-StorageJob to see the progress of rebuilds/resyncs. The following PowerShell snippet continually refreshes the status of the rebuild operation so that you know when things are back to normal.

function RefreshStorageJobStatus {
    while ($true) {
        Get-VirtualDisk | Format-Table
        Write-Host "-----------"
        Get-StorageJob
        Start-Sleep -Seconds 1
        Clear-Host
    }
}

Paste the function into PowerShell, then enter "RefreshStorageJobStatus" to start it. The virtual disk and storage job status will refresh every second.

Monday, February 13, 2017

Is it a supported scenario to run an AD domain controller in a VM on a hyper-converged S2D cluster? We're looking to deploy a 4-node hyper-converged S2D cluster at a remote site, and we would like to run the domain controller for the site on the cluster so we don't need to purchase a 5th server. Will the S2D cluster be able to boot if the network links to the site are down (meaning other domain controllers are not accessible)? I know WS2012 allowed for AD-less cluster bootstrapping, but will the underlying mechanics used for storage access in S2D in WS2016 work without AD? Is this a supported scenario? AD-less S2D cluster bootstrapping?

I asked this question in the Microsoft forums and did not get a definitive answer from anyone. So I set it up and tested it, and it appears to work. I don't know if it's officially supported or not, but it does work: the S2D virtual disks and volumes come up without a domain controller, at which point you can start the domain controller VM if it did not start automatically. I didn't dig into things, but I have a feeling it's using NTLM authentication and would likely fail if your domain requires Kerberos.
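
One way to check this yourself after a cold boot with no reachable domain controller (a sketch; the VM name "DC01" is hypothetical):

# confirm the cluster storage came online without AD, then start the DC guest
Get-VirtualDisk | Select-Object FriendlyName, OperationalStatus, HealthStatus
Get-ClusterSharedVolume | Select-Object Name, State
Start-VM -Name "DC01"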