The IT TeQnician

Azure Stack Secret/Certificate Rotation
Mon, 24 Sep 2018

Whoa! It has been a very long time since I wrote a blog on my site. My last posts were from Ignite, and the 2018 edition is about to start, so it is time to end the blog silence.

Azure Stack Certificates

Certificates all have their lifetimes, fortunately, otherwise they would miss their goal entirely, so it's inevitable that we have to rotate certificates on Azure Stack at some point. I recently had to rotate the public certificates on a multi-node Azure Stack POC environment, where the public certificate is a multi-SAN wildcard certificate. Best practice is to have a separate wildcard certificate for the different roles of Azure Stack, but since this is a POC, all names are in one certificate. A lot about the certificate requirements is described here.

In this blog I am not explaining how to create the CSR or request the certificate; this post is just about testing and rotating the public certificate. More details about generating a CSR can be found here.

Prepare Certificate folder

Azure Stack expects a certain folder structure for all certificates and certain properties on the .pfx files, and the test tool will check for this. There is a PowerShell script on GitHub called CertDirectoryMaker.ps1 that you can use to create the folder structure. Then add your certificates to the right folders; in this case it was simple: one certificate in all the folders.

Make sure you create a .pfx file with the correct options enabled.
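For reference, a sketch of an export that keeps the private key and the full chain, assuming the certificate is already in the local machine store; the thumbprint, folder, and file name are placeholders:

```powershell
# Export the certificate including the private key and the full chain,
# which is what the readiness check expects (thumbprint/paths are placeholders)
$pfxPassword = Read-Host -AsSecureString -Prompt "PFX password"
Get-Item "Cert:\LocalMachine\My\<thumbprint>" |
    Export-PfxCertificate -FilePath ".\Certificates\wildcard.pfx" `
        -Password $pfxPassword -ChainOption BuildChain
```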

The CertDirectoryMaker tool also creates two folders for the upcoming Host Extension feature. The current Azure Stack Readiness Checker tool does not expect them yet, so you need to remove these folders; otherwise the tool will complain that there are two folders that are not supposed to be there. Below is an example of a certificate that was not exported the correct way.

Now share this folder and make sure the ERCS server is able to reach it.

Test Certificate

With the folder in place and the certificate exported with the right properties, the .pfx files can be copied to the folders created with the CertDirectoryMaker tool. Then run the following command. In my case we use AAD as the identity system; if you use ADFS you need to change the IdentitySystem parameter.
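A sketch of that validation with AAD, assuming the Microsoft.AzureStack.ReadinessChecker module; the region name, FQDN, and share path are placeholders, and parameter names may differ per module version:

```powershell
# Validate the certificate folder with the Azure Stack Readiness Checker
# (region, FQDN and share path are placeholders for this environment)
Install-Module Microsoft.AzureStack.ReadinessChecker -Force
$pfxPassword = Read-Host -AsSecureString -Prompt "PFX password"
Start-AzsReadinessChecker -CertificatePath "\\server\share\Certificates" `
    -pfxPassword $pfxPassword -RegionName "east" -FQDN "azurestack.contoso.com" `
    -IdentitySystem AAD
```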

When everything is as it should be, you will receive several OKs for all certificates and you are ready for the next step. In the example below I used a UNC path to test the certificate, because I used the same path in the certificate rotation later on. But you can also test the certificate locally and copy it to a share after the test.

Test Azure Stack

I recommend you also run a Test-AzureStack to validate that all the Azure Stack roles are healthy and there are no issues. When there are issues, resolve those first before running the certificate rotation. Running the Test-AzureStack cmdlet requires you to log in to a PEP session. Below is an example of how to log in and start the test.
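A sketch of that login and test; the ERCS IP and the CloudAdmin account are specific to your deployment:

```powershell
# Open a session to the privileged endpoint (ERCS VM) and run the validation;
# the IP address and account below are placeholders for your deployment
$cred = Get-Credential -UserName "AzureStack\CloudAdmin" -Message "PEP credentials"
Enter-PSSession -ComputerName "10.0.0.100" -ConfigurationName PrivilegedEndpoint -Credential $cred
Test-AzureStack
```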

After the test it should be all green "PASS" lines and you're good to go!

Update!

When I created this blog I had just finished the cert rotation and in the meanwhile updated to 1808. Unfortunately the update failed, and after a good week of troubleshooting with MS support it turned out there are some issues with the certificate/secret rotation process. There are some processes that do not run, which will bite you when you update to 1808. So make sure you run this update first, before updating to 1808!

Rotate Secret

After all tests are green we can proceed with the certificate rotation. In this case I am replacing both the external and internal secrets. Rotating only the internal secrets is described here. It's recommended to start this process during off-work hours. Normally all workloads will stay online during the rotation process, but you might want to freeze changes to the environment: no deployments or altering things from the portal, PowerShell, or the APIs.

It's important to use the PEP session as a variable and not enter the session directly. Also beware: it takes a lot of time! In my case almost 10 hours, and I hear the same from other cases. So if you start it, be very patient; even when the constantly refreshing output shows the same status for hours, keep it running and let it do its thing!
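A sketch of how that looks with the session kept in a variable; the ERCS IP, share path, and credentials are placeholders for this environment:

```powershell
# Keep the PEP session in a variable and invoke the rotation inside it
# (ERCS IP, share path and accounts are placeholders)
$cloudAdminCred = Get-Credential -UserName "AzureStack\CloudAdmin" -Message "PEP credentials"
$shareCred      = Get-Credential -Message "Account that can read the certificate share"
$pfxPassword    = Read-Host -AsSecureString -Prompt "PFX password"

$pep = New-PSSession -ComputerName "10.0.0.100" -ConfigurationName PrivilegedEndpoint -Credential $cloudAdminCred
Invoke-Command -Session $pep -ScriptBlock {
    Start-SecretRotation -PfxFilesPath "\\server\share\Certificates" `
        -PathAccessCredential $using:shareCred -CertificatePassword $using:pfxPassword
}
```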

When the command is executed it verifies the certificates first. It seems that 1807 (the Azure Stack version I had at the moment) already has some Host Extension bits in there that it can't find, but it can continue without issues.

When the secret rotation is completed, a lot of logging is written to the PowerShell console. You could skim over it, but it's basically a large log of everything that has been done. If it completed, you have successfully rotated your secrets! You could run another Test-AzureStack to make sure everything is online.

If it went bad, don't retry it; contact support to figure out what is wrong instead of retrying over and over.

If you have any questions or comments, leave them below or reach out by mail or Twitter.

Ignite 2017 @ Orlando Day 4

Azure High Performance Networking

This was a very interesting session with lots of good info. It started off with VNet integration for Azure Container Service and the ability to give an IP to a single container instead of sharing the IP between several containers.

VNet service endpoints are also new, giving you the ability to deny internet access to VMs while still allowing specific Azure services as endpoints. So your VMs can talk to Azure services or PaaS services without you having to figure out behind which IPs the endpoints are located, and without them talking to the rest of the internet.

Then NSGs got a bit less dumb than they were: they added service tags to NSGs. What this means is that you can, for example, set a tag for SQL servers or IIS servers and have all those IIS or SQL servers covered by the policy. So you set up one rule with a SQL tag and all your SQL servers will be bound to that NSG rule, instead of creating several rules based on the source IPs of those SQL servers.
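As a sketch of the idea in the AzureRM PowerShell of that era, here is one outbound rule bound to the "Sql" service tag instead of a pile of per-server IP rules; the NSG and resource group names are made up:

```powershell
# One outbound rule using the "Sql" service tag instead of per-server IP rules
# (NSG name and resource group are placeholders)
$nsg = Get-AzureRmNetworkSecurityGroup -Name "app-nsg" -ResourceGroupName "demo-rg"
$nsg | Add-AzureRmNetworkSecurityRuleConfig -Name "Allow-Sql-Out" `
    -Access Allow -Protocol Tcp -Direction Outbound -Priority 200 `
    -SourceAddressPrefix VirtualNetwork -SourcePortRange * `
    -DestinationAddressPrefix Sql -DestinationPortRange 1433 |
    Set-AzureRmNetworkSecurityGroup
```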

Then VNet peering across regions came along, with the announcement that it's in public preview. You can now peer VNets from different regions, which was not possible before. I found out that it was in private preview and was announced as public preview on the same day.

Azure Site Recovery

Earlier this week I got the chance to talk with some Azure Site Recovery product managers about feedback and use cases. Near the end I was told that my frustration over the fact that some features were available for VMware to Azure, physical host to Azure, and Azure to Azure DR/migration, but not for Hyper-V to Azure DR/migration, will come to an end. It turns out that it will end near the end of next month: features will finally be the same for all platforms! Another big complaint from my side was the fact that there is no compression available for the Hyper-V MARS agent. That one is also coming, thank you!

As for this session, they did a talk about application-level DR, which was not new to me. But it was also about monitoring abilities, which are going to improve massively. They've got a very cool dashboard! (Sorry for the bad pictures, I was not sitting in the front.)

Now you can get tons of info about replication, RPO, latency, bandwidth, churn rate (how much change there is per replication interval), error logging, and even a topology view! I like it!

Software Defined in version 1709: what's new

This session with Jeff Woolsey and Claus Joergensen was not just about what's already there in 2016, we all know that; it's about what's new in 1709, going a bit deeper!

As I blogged before on Day 2, with 1709 we can use HDD, SSD, NVMe, and NVDIMM for Storage Spaces Direct. They also mentioned that NVMe uses fewer CPU cycles, which is really valuable with hyper-converged setups because you need those cycles for the workloads. With NVDIMM, or SCM, the access times and latency are not in milliseconds but in nanoseconds.

You can use the SCM as cache or capacity, although the devices are small for now, as the largest NVDIMM is 16GB. And since you are filling up your regular memory slots, you have to find a sweet spot between RAM and SCM. But if you go for a disaggregated approach with SOFS, I would stuff the box full of NVDIMMs and it will be as fast as you can imagine! As for S2D, there are also improvements in drive error handling and MVR disks. As I blogged here, MVR disks are really slow; they claim they're 4 times faster now. Currently I don't have the systems available to test it.

That’s it for Day 4!

Greetings,
Pascal Slijkerman

Ignite 2017 @ Orlando Day 3
Wed, 04 Oct 2017

The third day at Ignite was kind of hard to start; it has been long days and fun long nights, but two double espressos pushed me out of my morning dip. Ready to start the day!

Azure High Performance Networking:

This session was initially not about new stuff; it was more about making Azure networking clearer. Near the end there was a lot of new stuff about ExpressRoute, though!

Public and Microsoft Peering

Earlier I heard some noise from several people that the Office 365 peering, or public peering, was to be canceled. But now we know that it's not cancelled: the two peerings have merged. That makes things simpler, but also more complex, because one of the biggest issues I hear customers talk about is that they don't want to peer with all Azure or Office 365 services, and now there is no choice there either; it's either none or all in! But Microsoft must have heard this complaint, because they came up with a new feature for ExpressRoute called route filters. With the filters you can choose which routes you want advertised, so you use only the services you want over the ExpressRoute connection. Nicely done!
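A sketch of such a filter in the AzureRM PowerShell of that era; the rule only lets one BGP community through, the names are placeholders, and the community value is just an example:

```powershell
# A route filter with a single rule that only allows one BGP community
# (names are placeholders; the community value is an example)
$rule = New-AzureRmRouteFilterRuleConfig -Name "Allow-One-Service" -Access Allow `
    -RouteFilterType Community -CommunityList "12076:5040"
New-AzureRmRouteFilter -Name "er-filter" -ResourceGroupName "demo-rg" `
    -Location "westeurope" -Rule $rule
```

The filter is then attached to the Microsoft peering of the ExpressRoute circuit so only the advertised routes you selected come through.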

Finally monitoring on ExpressRoute!

Monitoring is the next thing announced for ExpressRoute. Most of the devices are not managed by the companies themselves, and not all ISPs offer monitoring services. Now, with the new ExpressRoute Network Performance Monitor (NPM), you can get a lot more insight into your ExpressRoute connection. You can monitor latency and throughput from within the Azure portal. They also included a topology dashboard to view your topology from on-prem to Azure and all the hops it's taking. You can run tests from certain endpoints too. Really making progress here!

But there is more: a new view was created to show circuit history, so you can see the stability of your connection. If you want all this you have to take an OMS license; that is the only part I am disappointed about. It should have been included in the ExpressRoute package if you ask me.

Azure Stack Security and Compliance

Ever since it launched I wanted to figure out how it all works together in Azure Stack. Turns out that was a waste of time according to Jeffrey Snover, and I quote: "Internals are internals! We will change a lot of stuff in the future and we aren't going to tell or document the internals." Okay, understood. As an ops guy I don't like it, but hey, it's a new age.

The session had some overlap with a session from the day before; it was more about compliance and why they locked it down as much as possible. Overall, most of the Azure Stack sessions had a lot of redundant information in them. I think they also tried to spread the load a bit, but most of the sessions were not that crowded, as it is still really new to a lot of people.

Azure Stack and DR

So you have your goodies delivered by HPE, Dell, Lenovo, or Cisco and are fully in production; things could still go ugly. As for Azure Stack itself, there are some DR procedures you have to take care of so you are ready to get it up and running again when shit hits the fan!

Let's start with what the Azure Stack DR backup is not doing: it does not back up your IaaS or PaaS data; you are in charge of arranging something for that. The same goes for switch config and OEM data; those are the responsibility of the OEM. It does give you the ability to create backup sets that are encrypted, because they contain sensitive data. You have to enable the DR service through PowerShell. The data is placed on an external file share. They are all full backups, with data sets of approx. 10GB. You need about 1TB of storage for one Azure Stack system, and the recommendation is to keep the data for 7 days. After 7 days it's a manual deletion.
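Enabling it looked roughly like this with the backup admin cmdlets of that era; cmdlet and parameter details are from memory and may differ per build, so treat this purely as a sketch (share, account, and key are placeholders):

```powershell
# Point the infrastructure backup at an external share with an encryption key
# (cmdlet/parameter names per the 2017-era admin module; values are placeholders)
$password      = Read-Host -AsSecureString -Prompt "Share account password"
$encryptionKey = Read-Host -AsSecureString -Prompt "Backup encryption key"
Set-AzSBackupShare -BackupShare "\\fileserver\AzSBackup" `
    -Username "contoso\backupsvc" -Password $password -EncryptionKey $encryptionKey
```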

Once backup is configured, you can view the status in the portal under the backup resource provider, or in the file share.

There are several vendors who will offer IaaS and/or PaaS protection, like Veeam, Commvault, Carbonite, and more. Beware, it's all very new and a work in progress.

That’s it for Day 3 then!

Cheers!
Pascal Slijkerman

Ignite 2017 @ Orlando Day 2
Wed, 04 Oct 2017

The second day of Ignite I started off with a session on:

Azure Stack servicing and updating.

Updates for Azure Stack consist of two packages, or actually three, but the third is different and it's not really clear how that takes place, because it will be the OEM vendor package and all vendors can take care of that in their own way. The first package contains the OS updates for all the VMs and hosts in the Azure Stack. The second package is about updating the resource providers in Azure Stack. Azure Stack can be updated in a disconnected scenario, as long as the bits are downloaded and uploaded to the blob storage through the admin portal.

Both are pretty big and not yet cumulative, meaning that you have to run all the updates to get to the latest and you can't skip one. Updates will come every month, and you are not supposed to fall behind more than 3 months; otherwise you will lose support and have to get current first.

Since the entire stack is locked down, you cannot log in with RDP, go to Windows Update, and click install updates. To take care of that, Azure Stack has an Update resource provider. The resource provider gives you a wizard in a set of blades to provide a destination for the update packages and to install the update or schedule it.

Downtime:

During updating, the admin portal goes down for about 10 minutes, since it is only one VM. The customer portal, however, stays online because it's redundant in the stack. The rest of the stack will also remain online as much as possible. The update process has a ton of checks and validations to update as gracefully as possible; however, if a VM does not live migrate, the update process takes a step down in gracefulness. If live migration doesn't work, it will try a quick migration, which means the VM is in saved state for several seconds to a minute, meaning no communication with that VM.

OEM Updates:

Every Azure Stack setup has a minimum of 4 hosts, up to a maximum of 12 at the moment of writing. Every vendor adds a 1U host that acts as the OEM support server. The OEM support server is also called the HLH (Hardware Lifecycle Host). With this server the OEM vendor takes care of the firmware and driver updates for the hosts, but also the switch updates.

What more:

The PoC Azure Stack software has no update feature, because there is nowhere to place the VMs if the host is restarting. There will also be a maintenance window to schedule the update during off-work hours. Based on the size of the stack and the amount of activity on the system, the update process can last from 1.5 hours to 6 hours or more.

Azure Stack delivery and operations:

Planning and sizing:

Since Azure Stack is RTM and customers can order the systems, there are two documents available to help with sizing and planning. The sizing sheet is more about how much storage you need and of what type, and how many CPU-intensive and/or memory-intensive VMs you have. This will become the Azure Stack Resource Calculator, which will be launched somewhere at the end of October to help you out with sizing. The planning sheet is more about how you want your current network infrastructure to interact with your own stack: for example, what IP subnets you want to use for certain networks, what your public IP space will be, and what your current network subnets look like. But also how to take care of identity, whether you are in a connected or a disconnected scenario, and much more.

Operating:

Like I mentioned before, the Azure Stack system cannot be managed like every other VM or host. It's like an appliance: you have a portal and a CLI interface (the admin portal and PowerShell) to manage it. This is because Microsoft and the OEM vendors are fully committed to making it as reliable and secure as possible, and this is their approach to make it happen. "Hardened by default" and "assume breach" were key words during the session. You are able to do maintenance and troubleshooting through the portal and with PowerShell and Just Enough Administration, but that's mainly to troubleshoot issues with Azure Stack and the integration with your own environment. For everything else you have to call Azure Stack support.

Backup:

Backup of the entire stack was not part of this session; this was more about the stack itself. There are some valuable assets that you need to back up, like the certificates, some vault keys, and more. Backups will be placed on a remote file share outside of the stack itself. Not too much info on that currently, but there is some more in the Day 3 blog.

During Ignite 2017 I have seen a lot of sessions talking about or touching on Project Honolulu. This session was mostly about Storage Spaces Direct and Honolulu. Since I already talked about Honolulu on this blog in "Ignite 2017 @ Orlando Day 1", I am not repeating it and will stick to the Storage Spaces possibilities with Honolulu.

So if you add your Storage Spaces Direct cluster to Honolulu, you get a really rich set of monitoring features.

With the dashboard you can view things like IOPS, latency, and throughput in real time, over the past hour, day, week, month, and even years! For the entire server or per disk. You can get info about the volumes, and error messages for disks, volumes, nodes, or the cluster.

You can also use Honolulu for management, like creating new drives, expanding them, or removing them. The wizard is full of options, and the goal is to make it an almost-all-you-need management tool. I like it!

New features:

One of the key storage features of Windows Server 2016 Redstone 3, or 1709, however you want to call it, is ReFS support for deduplication. It's one of the features that other SAN vendors do offer but that Windows, since the arrival of deduplication on NTFS, never supported. You were allowed to run DPM in a VM, and VDI, but production workload VMs were not supported. Until today it's still not clear to me; in the session of Jeff Woolsey and Claus Joergensen they didn't mention it. I still need to get that clear. For DPM workloads it's a great new feature and removes the need to create a virtual DPM server just to use deduplication.

The other epic new feature for Storage Spaces is the support for Storage Class Memory, or SCM, or NVDIMM, or persistent memory, or… well, I think there are enough names for it. It comes down to this: it's epic fast! More on this in a later blog!

Beware: the memory size shown for the NVDIMM is the total, not the size per NVDIMM; that is a bug in Honolulu that is part of the preview experience.

Well, that's about it for my Day 2 of Ignite. Sorry the blog posts are a bit later than I wanted them to be online.

Greetings
Pascal Slijkerman

Ignite 2017 @ Orlando Day 1
Tue, 26 Sep 2017

Today was the first day of Ignite 2017, which kicked off with a keynote from Satya Nadella. Unfortunately it contained a lot of the same slides and info as Inspire 2017, so it was a bit of a waste of time. And since we had a lot of drinks at some very nice places in Orlando, and a sprinkler fight with some InSpark colleagues the night before, it would have been nice to get a couple more hours of sleep.

Empower IT and Developer Productivity with Azure
After the keynote I started with the session from Scott Guthrie. It was packed with info, but besides the session from Corey with massive VM sizes, with 128 cores and multiple terabytes of memory, a couple of things were interesting to me:

Update management:
Update Management is in preview now and, as I noticed in my own subscription, not available for all machines; I don't know the prerequisites for that yet. But you can enable Update Management to scan VMs for the updates they need, on Windows and Linux. You can also include on-prem machines. It's then displayed in a nice dashboard.

Change Tracking
With Azure Change Tracking in the OMS suite you can track changes in a VM through Log Analytics on a big number of resources, for example at the file, registry, process, and service level. Here too, a slick dashboard gives a good overview of what happened.

After a horrible lunch experience the real sessions would start. Here is a quick overview with some valuable takeaways for myself within my focus areas.

Virtual Machine Diagnostics on Microsoft Azure
This was a short 20-minute session in the OCCC South Hall Expo Theater #10. A new PowerShell script has been released to get the health of a VM and output it as a JSON-formatted overview. With Get-AzureRmVmHealth.ps1 you can get a quick overview of several details, like: is my NIC up, what's the IP, what port is used for RDP, is the admin account disabled, what's the username, are all vital services for remote access running, and lots more! Give it a try with the following command:

.\Get-AzureRmVmHealth.ps1 -ResourceGroupName "RSGname" -Name "VMName"

Another new command, available in AzureRM.Compute 3.4.0, is Invoke-AzureRmVMRunCommand. With this command you can correct errors that you noticed in the output from Get-AzureRmVmHealth.ps1. For example, if you discover that the admin account is disabled, you can enable it. Or if you find out that the RDP port has for some reason been changed to 3390 instead of 3389, you can change it back with Invoke-AzureRmVMRunCommand to restore access to the VM. Really cool stuff.
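For example, something like this; Fix-RdpAccess.ps1 is a hypothetical script that re-enables the admin account and moves RDP back to 3389:

```powershell
# Run a script inside the VM through the Azure fabric, no RDP needed
# (Fix-RdpAccess.ps1 is a hypothetical repair script)
Invoke-AzureRmVMRunCommand -ResourceGroupName "RSGname" -VMName "VMName" `
    -CommandId "RunPowerShellScript" -ScriptPath ".\Fix-RdpAccess.ps1"
```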

There was also a little part about the serial output option in Azure for Linux VMs. Instead of a screenshot that updates every minute, you get real-time output now! Really helpful.

Cloud Infrastructure: Enabling New Possibilities Together

Azure Data Box:
I have used the Azure Import/Export service a couple of times now, and it is not that user friendly; also, most of the time you need a lot of USB drives to get the job done. With the Azure Data Box you can hire a 100TB storage system and back up all the data to it over SMB/CIFS, with several supported backup or storage solutions like Veeam, Commvault, Veritas, NetApp, and more. It's encrypted and a very robust storage device, which reduces the chance of busted/broken disks.

Azure Policies:
With Azure policies you can enable just-in-time access to Azure IaaS VMs. For example, you can set a policy that RDP is closed by default on a number of VMs, or on all of them, so you cannot connect to the server under normal conditions. I picked RDP, but it could also be SSH or any other port. In case you do need access, you can request it through Azure Security Center; based on a set of rules, that access is granted for a specific time.

Discover What’s new with Windows Server Management Experience
This session was all about Project Honolulu, the new server management experience. I did not have the chance yet to try it out myself, but what was demonstrated (video, not live) looked very promising in regard to speed, looks, and usability. There is also an option to write your own extensions to add more features to the interface. Right now there are extensions for Windows Update, Failover Cluster Manager, Storage Spaces Direct, and Virtual Machine Manager, where for VMM the new Honolulu interface is added to the VMM context menu to open specific parts in the Honolulu interface. In the coming months they are working on adding the new remote desktop interface (RDMi), but also integrating PowerShell, Remote Desktop, Azure Backup, and more.

Honolulu can be deployed in several topologies, where "manage from anywhere" is one of them. There you are able to access your entire environment from the internet too.

Some other quick features:

No AD dependency

No agents

No IIS; it has its own lightweight HTTPS interface

No SQL required

But it's not clear yet whether this is just for SMB or also for enterprises, because with no agents and no caching it could get very nasty performance-wise if you add 1000 or more servers to it. If it ends up a decent management tool that you can use for almost all your management tasks, this might work; otherwise it's just another management interface on top of all the others, only smoother looking.

That's it for day 1. Not sure if I'll have the time to create a blog every day, but I will try and update if possible.

Greetings
Pascal

Third party storage replication software and Server 2016 issues with S2D
Wed, 12 Jul 2017

Hi all, recently I encountered some issues with third-party software used to replicate VMs from an old Windows Server 2012 cluster to a brand new Windows Server 2016 S2D cluster, where it turned out that the third-party software was not fully supported on Windows Server 2016, although they claimed it was. I know you could use shared-nothing live migration, but in this case that was not possible, so we had to look for third-party software.

In this case I encountered two issues when using Storage Spaces Direct and the ReFS file system (which is a hard requirement with S2D) together with the replication software.

Issue 1

My first issue was with the agent that you install on a Windows Server 2016 Hyper-V host to be able to migrate servers from a source VM to a Hyper-V Storage Spaces Direct cluster as destination. After a push from the console, or a manual installation, the agent service would not start; after starting, it crashes with a .NET error. Well, that seems pretty simple and straightforward; do you really need a blog for that? That's true, but the next issue is not directly noticeable. In the end it turned out that the service could not start because it could not work with ReFS.

Issue 2

After starting the service failed, the vendor's tech support provided a workaround, and since a deadline was pushing, we took the workaround, left the agent running, and migrated the VMs through it. But that's not what this blog is about. After a while I noticed that one of the volumes of the cluster was in redirection mode.

I also checked with PowerShell, because the Failover Cluster Manager does not always give the actual picture of the situation. But it's the same there.

Fortunately this volume was still empty, so I did some tests. When I failed the volume over to other nodes it was not in redirected mode, but when it came back on node 13 it went straight into redirection mode. I took it offline and online again, but after a couple of seconds it ended up in redirection mode again. I removed it as a CSV and added it again, but still the same error. The other volumes were in use, so I was not able to test with those, but it looked like an issue with this host and not the volume.

When running the more extensive command Get-ClusterSharedVolumeState, there was some more info.
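A quick way to pull that info on any S2D node (run elevated); fltmc is added here as an extra check to see which file system filter drivers are actually loaded:

```powershell
# Show the CSV state per node, including the reason for any redirected I/O
Get-ClusterSharedVolumeState |
    Format-List Name, Node, StateInfo, FileSystemRedirectedIOReason, BlockRedirectedIOReason

# List the file system filter drivers that are loaded on this host
fltmc filters
```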

So incompatible file system filter drivers were causing the issue! File system filter drivers run at the kernel level and are used to monitor, modify, or act on the file system. In the case of the software from the third-party vendor, it uses the filters to monitor for changes to replicate. The file system filter that comes with this software is incompatible with ReFS and was therefore causing the volume to stay in redirected mode. See the blog by Elden Christensen for more details.

After establishing that the agent was messing with the volume, I removed the agent (because it was not working anyway) and rebooted the server. When requesting the state of the volumes now with Get-ClusterSharedVolumeState, we see no incompatible file system filter error anymore.

The volumes also report no redirection mode:

So beware when installing third-party software or agents on your S2D hosts: make sure they really have ReFS support and that their file system filter drivers are not reported as legacy. I am specifically not putting the software vendor or product name here, because I don't want to bash them; they deliver a really good solution, they just have an issue with ReFS. Hopefully they come up with a fix for it.