<p><em>HPC, Virtualization and Random Thoughts</em> (http://blogs.technet.com/b/gmarchetti/default.aspx)</p>
<h3>DFS in Azure</h3>
<p>gmarchetti, Tue, 17 Feb 2015 (http://blogs.technet.com/b/gmarchetti/archive/2015/02/17/dfs-in-azure.aspx)</p>
<p style="padding:0;margin:0;">Often we want to use Azure&#39;s scalable storage to deploy file servers. For geographically distributed infrastructures, it makes sense to establish a distributed file system (DFS) that spans regions, for both availability and reach.</p>
<p style="padding:0;margin:0;">Here is a summary of the steps required to do so:</p>
<p></p>
<ol>
<li style="padding-top:0px;padding-right:0px;padding-bottom:0px;margin-top:0px;margin-right:0px;margin-bottom:0px;">Create a vnet in each desired datacenter, ideally with at least two subnets (one for the domain controllers, one for the file servers), plus one for the gateway.</li>
<li>Establish cross-premise connectivity in Azure, for instance as explained in&nbsp;<a href="https://msdn.microsoft.com/en-us/library/azure/dn690122.aspx">https://msdn.microsoft.com/en-us/library/azure/dn690122.aspx</a>.</li>
<li>Deploy domain controllers on the relevant subnets in both datacenters. Make sure that the DNS option in the vnet configurations points to both domain controllers in the appropriate order (dc1, dc2 on vnet1; dc2, dc1 on vnet2).<ol>
<li>It is recommended that you assign a fixed IP to the domain controllers / DNS servers. You will need PowerShell to do that, as explained in <a href="https://msdn.microsoft.com/en-us/library/azure/dn630228.aspx">https://msdn.microsoft.com/en-us/library/azure/dn630228.aspx</a>.</li>
</ol></li>
<li>Deploy servers on the relevant subnets and join them to the domain.</li>
<li>Add at least one data volume to each server to host the file shares. If you are planning to replicate their content using DFS-R, it may make sense to put the volumes on a locally redundant storage account.</li>
<li>Log into each server and, in &ldquo;configure local server&rdquo;, add the DFS Namespaces and DFS Replication features.<ol>
<li>Alternatively, you could use PowerShell Desired State Configuration (DSC) to deploy servers with those features enabled. See&nbsp;<a href="http://blogs.msdn.com/b/powershell/archive/2014/08/07/introducing-the-azure-powershell-dsc-desired-state-configuration-extension.aspx">http://blogs.msdn.com/b/powershell/archive/2014/08/07/introducing-the-azure-powershell-dsc-desired-state-configuration-extension.aspx</a>.</li>
</ol></li>
<li>In the dfs management console on server 1, create a namespace.<ol>
<li>Select &ldquo;edit settings&rdquo; and assign full access to administrators and read/write to other users.</li>
<li>Make sure that it is a domain-based namespace and &ldquo;2008 mode&rdquo; is enabled. Information about namespace roots is then replicated on namespace servers in the domain.</li>
</ol>
<p style="padding:0;margin:0;"></p>
</li>
<li>For availability, add server 2 as a namespace server to the namespace you&#39;ve just created.</li>
<li>Select &ldquo;new folder&rdquo; in the namespace.</li>
<li>Create a new file share hosted on the local data volume (e.g. \\server1\share1). Assign permissions as desired (e.g. full access for admins, read/write for everyone else).</li>
<li>Click on the share you&#39;ve just created and select &ldquo;replicate folder&rdquo;.</li>
<li>Add a replication target on server2, e.g. \\server2\share1.</li>
<li>Create the replication group as required by following the prompts.</li>
<li>You may also want to verify that replication works both ways by creating a share on server2 that is replicated to a target on server1.</li>
</ol>
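<p>The namespace you end up with gives clients one logical path that resolves to a site-local folder target, with the replicated copy on the other server as failover. A toy Python sketch of that referral idea (all names here are made up for illustration; real referrals are handled by the DFS service):</p>

```python
# Toy model of a domain-based namespace: one logical path, resolved to an
# ordered referral list per client site (closest target first, replica second).
# Paths and site names are hypothetical.
targets = {
    r"\\contoso\files\projects": {
        "site-eu": [r"\\server1\share1", r"\\server2\share1"],
        "site-us": [r"\\server2\share1", r"\\server1\share1"],
    }
}

def resolve(path, client_site):
    # Return the referral list for this client: local target first,
    # the DFS-R replica as failover.
    return targets[path][client_site]

print(resolve(r"\\contoso\files\projects", "site-us")[0])  # → \\server2\share1
```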
<p>&nbsp;<img height="634" style="margin:5px;" width="1129" alt=" " src="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03-metablogapi/Slide1.PNG" /></p>
<h3>How Many Cores for the Job?</h3>
<p>gmarchetti, Tue, 30 Sep 2014 (http://blogs.technet.com/b/gmarchetti/archive/2014/09/29/how-many-cores-for-the-job.aspx)</p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">That is quite a common question. Experienced systems engineers have accumulated knowledge over the years that they distill into a few rules of thumb, e.g.:</span></p>
<ul>
<li><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Given a certain hardware configuration, for software package A with an input size of X, on average you&#39;ll need Y cores.</span></li>
</ul>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">On Azure, the hardware configuration is known and you can deploy as many instances as your subscription allows. It would be useful to capture that knowledge in order to optimize our resource utilization. For instance, we can pass that estimated size to our node manager / autoscaler (e.g. in GeRes, HPC Pack or a similar batch scheduler service), let it deploy the cluster and then automatically resize as required if the job turns out to be larger or smaller than estimated.</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Azure Machine Learning can be used to provide that initial estimate on the basis of historical utilization data. We&#39;ll build an</span><span style="font-family:arial, helvetica, sans-serif;font-size:small;">&nbsp;Azure ML service, submit a cue to it (e.g. the type of job to run) and get an estimate in return. The process to set it up can be summarized as follows:</span></p>
<ol>
<li><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Collect a representative data sample and save it in CSV format, e.g. {Type of Job, Input Size, N. of Cores required}.</span></li>
<li><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Build a workflow in Azure ML to train and score an appropriate regression model (several are on offer, e.g. linear, Bayesian, neural networks).</span></li>
<li><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Evaluate the regression models.</span></li>
<li><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Publish one (or more) as Azure ML service.</span></li>
<li><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Use the ML API to pass the cue and retrieve the estimate.</span></li>
<li><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Use the estimate to deploy the suggested number of nodes, then let the autoscaler correct as required until the job ends.</span></li>
<li><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Collect the usage data and feed back into the training workflow at the next opportunity.</span></li>
</ol>
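<p>The training sample in step 1 can be as simple as a CSV of past runs. A minimal sketch of producing one (the column names match the text; the figures are invented for illustration):</p>

```python
import csv
import io

# Hypothetical historical runs: (job type, input size, cores used).
history = [
    (1, 10, 4),
    (1, 40, 16),
    (2, 10, 8),
    (2, 40, 32),
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["Type of Job", "Input Size", "Cores Required"])
writer.writerows(history)
csv_text = buf.getvalue()
print(csv_text.splitlines()[0])  # → Type of Job,Input Size,Cores Required
```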
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Here is an example of such a workflow in ML Studio:</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;"><a href="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/evaluation-workflow.png"><img src="http://blogs.technet.com/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/evaluation-workflow.png" border="0" alt=" " /></a></span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">It starts with a simple CSV file containing {Type of Job, Size of Input, N. of Cores Required}.&nbsp;</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">The &quot;project columns&quot; and &quot;metadata editor&quot; blocks are used to select the columns of interest (or remove empty ones), then change the data description. The job type in our case is a numeric value, but we want to treat it as a category rather than a quantity to compute with, hence the change in metadata.</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">We split the data into two parts: one will be used to train the regression algorithm, another to score it, i.e. evaluate the efficacy of its predictions against a set of observed inputs / outputs.</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">We then initialize two regression models: a linear one and a neural network. This is not strictly necessary in this case. It is just a way to illustrate how you can test several models and then choose the one that provides better predictions. In order to train the models, we supply them with the same portion of the data.</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">The next step is to score the models, i.e. use them to predict the required n. of cores for the remaining input set, then compare the results against the sampled data. </span></p>
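<p>The train / score / evaluate loop can be illustrated offline in plain Python (this is only an analogue of what the ML Studio blocks do, with made-up numbers; it is not the Azure ML API):</p>

```python
# "Train": fit a one-variable linear model (least squares) on half the sample.
# Data points (input size, cores) are invented for illustration.
train = [(10, 4), (20, 8), (30, 12), (40, 16)]
test = [(15, 6), (35, 14)]

n = len(train)
mx = sum(x for x, _ in train) / n
my = sum(y for _, y in train) / n
a = sum((x - mx) * (y - my) for x, y in train) / sum((x - mx) ** 2 for x, _ in train)
b = my - a * mx

# "Score" the held-out inputs, then "evaluate" with mean absolute error,
# the same kind of metric the evaluation module reports.
mae = sum(abs((a * x + b) - y) for x, y in test) / len(test)
print(round(a, 2), round(b, 2), round(mae, 2))  # → 0.4 0.0 0.0
```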
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Here is an example of the scored results for linear regression:</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;"><a href="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/scored-_2D00_-linear-regression.png"><img src="http://blogs.technet.com/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/scored-_2D00_-linear-regression.png" border="0" alt=" " /></a></span></p>
<p></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">... and for neural networks:</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;"><a href="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/scored-_2D00_-neural-networks.png"><img src="http://blogs.technet.com/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/scored-_2D00_-neural-networks.png" border="0" alt=" " /></a></span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Just by looking at the scatter plot we can see that the neural network is more accurate than simple linear regression.</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">The evaluation module confirms our finding (the first row in the table is for linear regression, the second one for neural networks):</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;"><a href="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/evaluation-results.png"><img src="http://blogs.technet.com/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/evaluation-results.png" border="0" alt=" " /></a></span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Now that we have established which technique fits the problem better, we&#39;ll save the trained model into our library and publish it as a service.</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Click on the &quot;Train Model&quot; block output and select &quot;Save as Trained Model&quot;:</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;"><a href="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/save-as-trained-model.png"><img src="http://blogs.technet.com/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/save-as-trained-model.png" border="0" alt=" " /></a></span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">With our trained model, we build a simple new workflow.</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;"><a href="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/trained-model.png"><img src="http://blogs.technet.com/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/trained-model.png" border="0" alt=" " /></a></span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">The &quot;Score Model&quot; block is the one that will provide us with estimates given previous history, so we publish its input and output ports as i/o ports for the prediction service.</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;"><a href="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/input-and-output.png"><img src="http://blogs.technet.com/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03/input-and-output.png" border="0" alt=" " /></a></span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Finally we run the model again, check its results and if satisfied we publish it as a service.&nbsp;</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">In the process of publishing, an API key will be generated that we&#39;ll use to access the Azure ML REST API.&nbsp;</span></p>
<p><span style="font-family:arial, helvetica, sans-serif;font-size:small;">Here&#39;s some simple Python code to submit a cue and receive an estimate - in our case we submit a job type and receive an estimated number of cores:</span></p>
<p></p>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">import urllib2</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">import json&nbsp;</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">import ast</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">import sys</span></div>
<div></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">jobtype = sys.argv[1]</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">print &quot;Job Type: &quot;,jobtype</span></div>
<div></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">data = &nbsp;{</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &quot;Id&quot;: &quot;score00001&quot;,</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &quot;Instance&quot;: {</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &quot;FeatureVector&quot;: {</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &quot;Type of Job&quot;: jobtype,</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &quot;Cores Required&quot;: &quot;0&quot;,</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; },</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &quot;GlobalParameters&quot;: {</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">&nbsp; &nbsp; &nbsp; &nbsp; }</span></div>
<div></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">body = str.encode(json.dumps(data))</span></div>
<div></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">url = &#39;https://ussouthcentral.services.azureml.net/workspaces/be4da4afb11b4eeeac49eb4067e7a8e8/services/68bf2617df6c473595dcc87f66f62346/score&#39;</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">api_key = &#39;your api key&#39; # Replace this with the API key for the web service</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">headers = {&#39;Content-Type&#39;:&#39;application/json&#39;, &#39;Authorization&#39;:(&#39;Bearer &#39;+ api_key)}</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">req = urllib2.Request(url, body, headers)&nbsp;</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">response = urllib2.urlopen(req)</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">result = response.read()</span></div>
<div></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">s1=ast.literal_eval(result)</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">predicted = int(round(float(s1[2])))</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">print &quot;Predicted n. of cores: &quot;,predicted</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;">&nbsp;</span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;"><span style="font-family:arial, helvetica, sans-serif;">&nbsp;</span></span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;"><span style="font-family:arial, helvetica, sans-serif;">Add your own code liberally to interface your autoscaler (e.g. GeReS).</span><br /></span></div>
<div><span style="font-family:&#39;courier new&#39;, courier;font-size:small;"><span style="font-family:arial, helvetica, sans-serif;">&nbsp;</span></span></div><div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3638428&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">AzureHPCGeneric Resource Scheduler for Azurehttp://blogs.technet.com/b/gmarchetti/archive/2014/01/14/generic-resource-scheduler-for-azure.aspxWed, 15 Jan 2014 04:27:22 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:e4784569-0cbf-468b-9988-91c288696afcgmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3620578http://blogs.technet.com/b/gmarchetti/archive/2014/01/14/generic-resource-scheduler-for-azure.aspx#comments<p>GeReS (Generic Resource Scheduler) for Windows Azure is now available as a beta release on <a href="http://geres.codeplex.com" title="GeReS">Codeplex</a>.</p>
<p>It is a simple batch job manager written in C# (or Python for the older version).</p>
<p>Geres provides:</p>
<ul>
<li>Command line utilities (e.g. qsub, qlist, jobcancel, joblist) to queue tasks for computation, check their status and cancel them.</li>
<li>Three task queues (highq, mediumq, lowq), in order of priority.</li>
<li>An agent to install on Azure VMs. It will pick tasks off the queues in that order and spawn the required processes.&nbsp;</li>
<li>A simple notifier application that monitors the status of tasks as reported by the agents.</li>
<li>An autoscaler, running as an extra-small PaaS worker role, which will deploy or remove worker VMs of the desired size based on waiting time for jobs in the queues.</li>
</ul>
<p>The agent will run as many tasks on a node as there are cores. If a task is marked &quot;exclusive&quot; at submission, it will be the only one to run. This is useful for those applications that consume most of the VM resources.</p>
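<p>That admission rule can be sketched as follows (a toy model, not the actual GeReS agent code; names are illustrative):</p>

```python
def admit(running, cores, task_exclusive):
    """Decide whether the agent can start another task on this node.

    running: list of (task_id, exclusive) tuples currently executing.
    A node runs at most `cores` concurrent tasks; an exclusive task runs alone.
    """
    if any(excl for _, excl in running):
        return False                # an exclusive task already owns the node
    if task_exclusive:
        return len(running) == 0    # exclusive tasks need an idle node
    return len(running) < cores     # otherwise, one task per free core

print(admit([], 4, False))                # → True
print(admit([(1, False)] * 4, 4, False))  # → False (all cores busy)
print(admit([(1, False)], 4, True))       # → False (exclusive needs idle node)
```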
<p>The agent is also responsible for updating the status of the spawned processes: running, failed, completed or cancelled. When a node is idle for longer than a pre-configured time, the agent will queue it for removal.</p>
<p>The autoscaler only deploys or removes worker nodes - it does not keep track of task status.</p>
<p>Note that this distributed architecture has the advantage of being highly resilient when compared to a traditional &quot;head node&quot; running the scheduler and apportioning work.</p>
<p>The nodes can fail at any time and the incomplete jobs will pop back into the queues for other nodes to pick up.</p>
<p>The autoscaler can fail at any time and simply be restarted without affecting computations in process.</p>
<p>Azure storage tables and queues are used to handle and keep track of tasks. Service bus topics are used for notifications and commands. Geres benefits from their built-in redundancy and resilience.</p>
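<p>The self-healing behaviour described above comes from the storage queue&#39;s visibility timeout: a received message is hidden rather than removed, and reappears if the worker never deletes it. A toy model of that mechanism (illustrative only; the real service is the Azure storage queue API):</p>

```python
class VisibilityQueue:
    """Toy model of a queue with a visibility timeout, driven by an explicit
    `now` clock. If a worker dies before deleting a received message, the
    message becomes visible again for another node to pick up."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.messages = {}   # id -> (body, time at which it becomes visible)
        self._next = 0

    def put(self, body):
        self.messages[self._next] = (body, 0.0)
        self._next += 1

    def get(self, now):
        for mid, (body, visible_at) in list(self.messages.items()):
            if visible_at <= now:
                # Hide the message for `timeout` instead of removing it.
                self.messages[mid] = (body, now + self.timeout)
                return mid, body
        return None

    def delete(self, mid):
        self.messages.pop(mid, None)

q = VisibilityQueue(timeout=30)
q.put("task-1")
mid, _ = q.get(now=0)    # node A takes the task...
# ...node A crashes without calling q.delete(mid)
print(q.get(now=10))     # → None (still invisible)
print(q.get(now=40)[1])  # → task-1 (reappeared for node B)
```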
<p>For further details on the architecture, please read the release notes on Codeplex.</p>
<p>This short video will show you a typical usage scenario.</p>
<p>(Please visit the site to view this video.)</p>
<h3>Python and Azure VMs</h3>
<p>gmarchetti, Fri, 14 Jun 2013 (http://blogs.technet.com/b/gmarchetti/archive/2013/06/13/python-and-azure-vms.aspx)</p>
<p>I have been experimenting lately with the Azure SDK for Python, in particular with the service management API. I found some points in the documentation that were unclear to me, so I am posting what I discovered in the process.</p>
<p>If you want to create a virtual machine, you need to:</p>
<ol>
<li>Create a cloud service of which the VM will be an instance</li>
<li>Select a vhd image for the machine, either from the Microsoft catalog or one you created.</li>
<li>Indicate which blob in which storage account will store the vhd file of the VM</li>
<li>Create a configuration set containing machine metadata</li>
<li>Finally, deploy the VM</li>
</ol>
<p>Note that you'll need a management certificate to interface with the Azure Service Management Service. The documentation suggests using openssl to create one, which is fine on Linux and Mac, but not on Windows. On Windows, use makecert instead:</p>
<p><span style="font-family: 'courier new', courier; font-size: small;">makecert -sky exchange -r -n "CN=&lt;CertificateName&gt;" -pe -a sha1 -len 2048 -ss My "&lt;CertificateName&gt;.cer"</span></p>
<p>This will put the certificate into the current user's personal store and generate a .cer file for you to upload to Azure.</p>
<p>Also note that there is a bug in the Python SDK which affects the creation of endpoints. See&nbsp;<a href="https://github.com/WindowsAzure/azure-sdk-for-python/pull/83">https://github.com/WindowsAzure/azure-sdk-for-python/pull/83</a>&nbsp;for a description.</p>
<p>You will need to edit the __init__.py file as described on GitHub.</p>
<p>You can do the rest in Python:</p>
<pre style="font-family: Courier; font-size: 10pt;">
from azure import *
from azure.servicemanagement import *

subscription_id = '&lt;enter your subscription id&gt;'
# The certificate comes from the local store, not from a .pem file, on Windows
certificate_path = "CURRENT_USER\\my\\&lt;enter your certificate name&gt;"

# Instantiate a service management service
sms = ServiceManagementService(subscription_id, certificate_path)

# Provide a service name and location
name = '&lt;name of service you want&gt;'
location = 'West US'

# You can either set the location or an affinity_group
sms.create_hosted_service(service_name=name, label=name, location=location)

# Name of an os image as returned by list_os_images (catalog and yours)
image_name = '&lt;your image name&gt;.vhd'

# Destination url://storage account/container/blob where the VM disk will be created
media_link = 'http://&lt;account name&gt;.blob.core.windows.net/vhds/' + name + '.vhd'

# The documentation shows a Linux VM. Windows is more complicated.
# WindowsConfigurationSet contains metadata for a Windows VM
windows_config = WindowsConfigurationSet(&lt;machine name&gt;, '&lt;admin password&gt;')
# By default the api will look for domain credentials. If you want no domain:
windows_config.domain_join = None

# Here's the hard disk for the os
os_hd = OSVirtualHardDisk(image_name, media_link)

# Unless you specify endpoints, you won't be able to connect to the VM.
# The documentation is unclear on the matter.
endpoint_config = ConfigurationSet()
endpoint_config.configuration_set_type = 'NetworkConfiguration'

endpoint1 = ConfigurationSetInputEndpoint(name='rdp', protocol='tcp', port='33890', local_port='3389', load_balanced_endpoint_set_name=None, enable_direct_server_return=False)
endpoint2 = ConfigurationSetInputEndpoint(name='web', protocol='tcp', port='8080', local_port='80', load_balanced_endpoint_set_name=None, enable_direct_server_return=False)

# Endpoints must be specified as elements in a list
endpoint_config.input_endpoints.input_endpoints.append(endpoint1)
endpoint_config.input_endpoints.input_endpoints.append(endpoint2)

# Finally you can deploy the VM
sms.create_virtual_machine_deployment(service_name=name,
    deployment_name=name,
    deployment_slot='production',
    label=name,
    role_name=name,
    system_config=windows_config,
    network_config=endpoint_config,
    os_virtual_hard_disk=os_hd,
    role_size='Small')
</pre>
<h3>Using CloudBlitz to Submit Jobs</h3>
<p>gmarchetti, Thu, 16 May 2013 (http://blogs.technet.com/b/gmarchetti/archive/2013/05/16/using-cloudblitz-to-submit-jobs.aspx)</p>
<p><span style="font-family: arial,helvetica,sans-serif; font-size: small;">In this second post about Cloudblitz (<a href="http://cloudblitz.codeplex.com">http://cloudblitz.codeplex.com</a>), we&#39;ll examine how to use it to submit jobs to a deployed Azure cluster.</span></p>
<p><span style="font-family: arial,helvetica,sans-serif; font-size: small;">Keep in mind that in Microsoft's HPC scheduler implementation, jobs are just containers. They contain one or more tasks to be executed and have properties associated to them (e.g. credentials, priority).</span></p>
<p><span style="font-family: arial,helvetica,sans-serif; font-size: small;">Such a container can be created empty, then filled with tasks and finally submitted for execution. If you have a look at the Create-InstallJob.ps1 script that comes with Cloudblitz, you'll see an example of that.</span></p>
<p><span style="font-family: arial,helvetica,sans-serif; font-size: small;">Let us suppose that you want to&nbsp;write a simple job submission script:</span></p>
<p><span style="font-family: arial,helvetica,sans-serif; font-size: small;">1.&nbsp;Import the Cloudblitz.Powershell module and get the hpc&nbsp;object types:</span></p>
<p><span style="color: #000000; font-family: courier new,courier; font-size: small;">Import-MyModule -Name "CloudBlitz.Powershell" -Path $cmdletsPath<br />$hpcTypesPath = [System.IO.Path]::GetDirectoryName($cmdletsPath) + "\HpcSchedulerManagement.dll"<br />Add-Type -Path $hpcTypesPath -ErrorAction Stop</span></p>
<p><span style="color: #000000; font-family: arial,helvetica,sans-serif; font-size: small;">2. Provide the credentials for the jobs to be submitted</span></p>
<p><span style="color: #000000; font-family: courier new,courier; font-size: small;"># Fill in hpc scheduler credentials<br />&nbsp;$hpcCredentials = New-Object CloudBlitz.Powershell.Cmdlets.HpcSchedulerCredentials<br />&nbsp;$hpcCredentials.ClusterName = $ClusterName<br />&nbsp;$hpcCredentials.AdminLogin = $AdminLogin<br />&nbsp;$hpcCredentials.AdminPassword = $AdminPassword<br />&nbsp;$hpcCredentials.CertificateThumbprint = $mgmtcert.Thumbprint</span></p>
<p><span style="color: #000000; font-family: arial,helvetica,sans-serif; font-size: small;">Note that you will need the thumbprint of the certificate associated with the service at deployment. That is required because the PowerShell cmdlets use the HPC scheduler REST API to communicate with the service. All the parameters can be passed on the command line or retrieved from the ConfigurationVariables.ps1 file that you edited before deployment.</span></p>
<p><span style="color: #000000; font-family: arial,helvetica,sans-serif; font-size: small;">3. Create an empty job container</span></p>
<p><span style="color: #000000; font-family: courier new,courier; font-size: small;"># Create the job on the scheduler<br />&nbsp;$Job = New-Object HpcSchedulerManagement.DataContracts.SchedulerJob<br />&nbsp;$Job.FailOnTaskFailure = $True<br />&nbsp;$Job.Name = "Job"+$(get-random 1000000)<br />&nbsp;$Job.Id = Add-SchedulerJob -Credentials $hpcCredentials -Job $Job -ErrorAction Stop</span></p>
<p><span style="color: #000000; font-family: arial,helvetica,sans-serif; font-size: small;">The first line creates an empty container, then we set some properties (e.g. random name) and finally we create the job on the hpc scheduler and retrieve its id.</span></p>
<p><span style="color: #000000; font-family: arial,helvetica,sans-serif; font-size: small;">4. Add tasks into the job container</span></p>
<p><span style="color: #000000; font-family: courier new,courier; font-size: small;"># Create the task<br />&nbsp;$Task = New-Object HpcSchedulerManagement.DataContracts.SchedulerTask<br />&nbsp;$Task.Name = "Task"+$(Get-Random 1000000)<br />&nbsp;$Task.CommandLine = $CommandLine<br />&nbsp;$Task.Id = Add-SchedulerTask -Credentials $hpcCredentials -JobId $Job.Id -Task $Task</span></p>
<p><span style="color: #000000; font-family: arial,helvetica,sans-serif; font-size: small;">As with the job, we create an empty object, set its properties, and then add it to the job container whose ID we retrieved earlier. Note that one of those properties is $Task.CommandLine, which takes a string: an arbitrary command to be executed on the remote nodes. If you want to start an MPI job, it will look like</span></p>
<p><span style="color: #000000; font-family: courier new,courier; font-size: small;">mpiexec &lt;mpi parameters&gt; &lt;your mpi&gt;.exe &lt;app parameters&gt;</span></p>
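<p>For instance, a concrete command line for a 16-process MPI run might be built like this (the executable name and its arguments are placeholders for your own application):</p>

```powershell
# Hypothetical example: launch 16 MPI processes of a placeholder binary.
$CommandLine = "mpiexec -n 16 mympiapp.exe input.dat output.dat"
$Task.CommandLine = $CommandLine
```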
<p><br /><span style="font-family: arial,helvetica,sans-serif; font-size: small;">5. Submit the job for execution</span></p>
<p><span style="font-family: courier new,courier; font-size: small;"># Submit the job</span><br /><span style="font-family: courier new,courier; font-size: small;">&nbsp; Write-Host "Submitting the job"</span><br /><span style="font-family: courier new,courier; font-size: small;">&nbsp; Submit-SchedulerJob -Credentials $hpcCredentials -JobId $Job.Id</span></p>
<p><span style="font-family: courier new,courier; font-size: small;"># Wait for the job to complete</span><br /><span style="font-family: courier new,courier; font-size: small;">&nbsp; Write-Host "Waiting for job $($Job.Id) to complete"</span><br /><span style="font-family: courier new,courier; font-size: small;">&nbsp; $finalState = Wait-SchedulerJob -Credentials $hpcCredentials -JobId $Job.Id</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">&nbsp;&nbsp;Write-Host "Job $finalState"</span></p>
<p><span style="font-family: arial,helvetica,sans-serif; font-size: small;">Note that you need not wait for the job to complete. Those lines just show a way to retrieve and display the final job state. Normally, you will submit, then query for status later with something like:</span></p>
<p><span style="color: #000000; font-family: courier new,courier; font-size: small;">Get-SchedulerJob -Credentials $hpcCredentials -JobId $JobId </span></p>
<p><span style="font-family: arial,helvetica,sans-serif; font-size: small;">That line will retrieve the job object and also display all its properties (not just those set in these samples).</span></p>
<p><span style="font-family: arial,helvetica,sans-serif; font-size: small;">You can retrieve the whole list of jobs in the scheduler with:</span></p>
<p><span style="color: #000000; font-family: courier new,courier; font-size: small;">Get-SchedulerJobList -Credentials $hpcCredentials </span></p>
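<p>Because the cmdlets return ordinary objects, you can post-process the list with standard PowerShell. For example, to show only the jobs that have not finished yet (a sketch, assuming the cmdlet returns one object per job):</p>

```powershell
# Display the id, name and state of all jobs not yet in the Finished state.
Get-SchedulerJobList -Credentials $hpcCredentials |
    Where-Object { $_.State -ne "Finished" } |
    Select-Object Id, Name, State
```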
<p><span style="font-family: arial,helvetica,sans-serif; font-size: small;">The job properties are those supported by&nbsp;Microsoft's HPC scheduler (<a href="http://msdn.microsoft.com/en-us/library/microsoft.hpc.scheduler.ischedulerjob_properties(v=vs.85).aspx">http://msdn.microsoft.com/en-us/library/microsoft.hpc.scheduler.ischedulerjob_properties(v=vs.85).aspx</a>).</span></p>
<p><span style="font-family: arial,helvetica,sans-serif; font-size: small;">Let's see an example of those, retrieved using Get-SchedulerJob:</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">Id&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 8</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">Name&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : Job641676</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">UserName&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : topguy</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">Project&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">RuntimeSeconds&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 0</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">MinCores&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">MaxCores&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">MinSockets&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">MaxSockets&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">MinNodes&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">MaxNodes&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">UnitType&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : Core</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">RequestedNodes&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">IsExclusive&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : False</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">RunUntilCanceled&nbsp;&nbsp;&nbsp; : False</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">NodeGroups&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">FailOnTaskFailure&nbsp;&nbsp; : True</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">AutoCalculateMax&nbsp;&nbsp;&nbsp; : True</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">AutoCalculateMin&nbsp;&nbsp;&nbsp; : True</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">Preemptable&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : True</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">MinMemory&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 0</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">MaxMemory&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 0</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">MinCoresPerNode&nbsp;&nbsp;&nbsp;&nbsp; : 0</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">MaxCoresPerNode&nbsp;&nbsp;&nbsp;&nbsp; : 0</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">SoftwareLicense&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">OrderBy&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">ClientSource&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : AzureRestServiceAgent</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">Progress&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 100</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">ProgressMessage&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">TargetResourceCount : 0</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">ExpandedPriority&nbsp;&nbsp;&nbsp; : 2000</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">ServiceName&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">NotifyOnStart&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : False</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">NotifyOnCompletion&nbsp; : False</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">EmailAddress&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">Priority&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : Normal</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">Password&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">JobTemplate&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : Default</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">JobType&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">RequeueCount&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 0</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">AutoRequeueCount&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">RequestCancel&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">Owner&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : topguy</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">SubmitTime&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 5/15/2013 11:20:47 PM</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">CreateTime&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 5/15/2013 11:20:46 PM</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">StartTime&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 5/15/2013 11:20:48 PM</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">EndTime&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 5/15/2013 11:20:50 PM</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">State&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : Finished</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">HoldUntil&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">HasRuntime&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : False</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">CanGrow&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : True</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">CanShrink&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : True</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">PreviousState&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : Running</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">ChangeTime&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 5/15/2013 11:20:50 PM</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">ExcludedNodes&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :</span></p>
<p><span style="font-family: courier new,courier; font-size: small;">AllocatedNodes&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :&nbsp;</span></p>
<p>&nbsp;</p>
<p><span style="font-family: arial,helvetica,sans-serif; font-size: small;">At job submission you will typically want to set MinCores and MaxCores (the minimum and maximum number of cores required; the default for both is 1). Note that job properties are inherited by the tasks unless overridden by those set at the task level (<a href="http://msdn.microsoft.com/en-us/library/microsoft.hpc.scheduler.ischedulertask_properties(v=vs.85).aspx">http://msdn.microsoft.com/en-us/library/microsoft.hpc.scheduler.ischedulertask_properties(v=vs.85).aspx</a>). For MPI jobs the number of cores is fixed at submission, so MinCores=MaxCores.</span></p>
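<p>Putting this together with the sample in step 3, a job requesting a fixed 16 cores for an MPI run could be prepared as follows before calling Add-SchedulerJob. Disabling AutoCalculateMin/AutoCalculateMax here is my assumption, to keep the scheduler from adjusting the fixed core count:</p>

```powershell
# Prepare an MPI job with a fixed core count (MinCores = MaxCores).
$Job = New-Object HpcSchedulerManagement.DataContracts.SchedulerJob
$Job.Name = "MpiJob" + $(Get-Random 1000000)
$Job.MinCores = 16
$Job.MaxCores = 16
$Job.AutoCalculateMin = $False   # assumption: prevent automatic resizing
$Job.AutoCalculateMax = $False
$Job.Id = Add-SchedulerJob -Credentials $hpcCredentials -Job $Job -ErrorAction Stop
```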
<p>&nbsp;</p><div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3573234&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">Cloudblitz: A Tool to Deploy HPC Clusters in Azurehttp://blogs.technet.com/b/gmarchetti/archive/2013/05/13/cloudblitz-a-tool-to-deploy-hpc-clusters-in-azure.aspxMon, 13 May 2013 18:31:00 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:dcfabc81-d5cb-4f99-a928-c666c592f6e8gmarchetti1http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3572395http://blogs.technet.com/b/gmarchetti/archive/2013/05/13/cloudblitz-a-tool-to-deploy-hpc-clusters-in-azure.aspx#comments<p>I'm happy to announce the availability of Cloudblitz, a tool (framework) to deploy HPC clusters in Azure programmatically via Powershell, then submit&nbsp;jobs to them&nbsp;via Powershell as well.</p>
<p>You can find it at <a href="http://cloudblitz.codeplex.com">http://cloudblitz.codeplex.com</a></p>
<p>The tool builds on the HPC scheduler for Azure and the PowerShell cmdlets for Azure to solve a common problem: providing a scriptable way to create and interact with an Azure PaaS cluster. I refer you to the Codeplex page for the complete description and documentation. In this blog post, I'll cover how to build and use it.</p>
<p>1. Make sure that you download and install all the dependencies listed on the codeplex page.</p>
<p>2. Download and extract the source code in its own directory.</p>
<p>3. Open a Visual Studio 2012 command prompt as administrator (dear old cmd.exe)</p>
<p>4. Go to the "BuildScripts" directory and run <em>BuildCloudBlitz.cmd</em>. This will build the required modules and copy the results into a "Packages" directory.</p>
<p>5. Open a Windows Azure Powershell prompt as administrator. Go to the CloudBlitz.ClusterTemplate directory</p>
<p>6. Edit the ServiceDefinition.csdef file (it is just an XML file). In particular note:</p>
<p><em>&lt;WorkerRole name="ComputeNode" vmsize="Small"&gt;</em></p>
<p><em>...</em></p>
<p><em>&lt;WebRole name="HeadNode" vmsize="Small"&gt;</em></p>
<p>Choose your VM size and type it in. In general, a small head node is sufficient (the head node does NOT run SQL, rather it uses SQL Azure,&nbsp;which&nbsp;reduces its memory requirements). If you're planning to run thousands of tasks at a time, go for a large instance.</p>
<p>7. Go to the CloudBlitz.Powershell\Scripts directory</p>
<p>8. Edit the ConfigurationVariables.ps1 file. In particular note:</p>
<p><em>$subscriptionName = "&lt;your subscription name&gt;"</em><br /><em>$subscriptionId = "&lt;your subscription id&gt;"</em><br /><em>$administratorName = "&lt;admin name for the hpc cluster&gt;"</em><br /><em>$systemPassword = "&lt;admin password for hpc cluster&gt;"</em><br /><em>$certificateFileNameRoot="&lt;name of certificate&gt;"</em><br /><em>$deploymentName="&lt;name of service&gt;"</em><br /><em>$deployLocation="&lt;data center location&gt;"</em><br /><em>$HeadNodeInstanceCount = 1</em><br /><em>$ComputeNodeInstanceCount = &lt;n. of desired compute nodes&gt;</em><br /><em>$removePreviousDeploymentAndService = $true</em></p>
<p>9. Run <em>InstallAndExportCertificate.ps1</em>. This will create .cer and .pfx self-signed certificate files and install the certificate in the local user certificate store.</p>
<p>10. Via the Azure portal, upload the .cer file as a management certificate.</p>
<p>11. Run the <em>DeployDeployerRole.ps1</em> script. It will upload the packages, create an Azure worker role that deploys an HPC cluster as per the configuration, and then remove the deployer role.&nbsp;Such a temporary role is necessary to avoid firewall and connectivity issues. Most people will have ports 80 and 443 open for http and https, but the&nbsp;Azure HPC scheduler&nbsp;deployment also requires a connection to SQL Azure on port 1433, which is normally closed. By letting the temporary role&nbsp;manage the process, such connections are all internal to Azure.</p>
<p>You can now connect to your cluster.cloudapp.net via RDP and start submitting jobs in the "traditional" way via the job manager or the command line.</p>
<p>Alternatively, you can use&nbsp;the set of PowerShell cmdlets we developed&nbsp;from your workstation. They will be the subject of the next blog post :-)</p>
<p>&nbsp;</p><div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3572395&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">Endpoints, firewalls and other annoyanceshttp://blogs.technet.com/b/gmarchetti/archive/2012/01/05/endpoints-firewalls-and-other-annoyances.aspxThu, 05 Jan 2012 22:22:49 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:3cba2e00-4179-406d-a5dc-7fd59e59dba4gmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3474154http://blogs.technet.com/b/gmarchetti/archive/2012/01/05/endpoints-firewalls-and-other-annoyances.aspx#comments<p>When you deploy a HPC cluster on Azure, you typically want to run some application in it besides those provided in the azure samples. Those applications may require their own ports to be opened on the internal network and endpoints to be established for both internal and external communication. There is no hpc cluster manager gui in a pure azure implementation, so you'll have to perform such configuration manually.</p>
<p><strong>1. Enabling endpoints</strong></p>
<p>Endpoints must be configured prior to deployment, in the Azure service definition file. For instance, if you want to allow file sharing between nodes, you'll need SMB endpoints.</p>
<p><em><span style="font-family: Courier; font-size: 12pt">&lt;Endpoints&gt;<br /> &lt;InternalEndpoint name=&quot;SMB&quot; protocol=&quot;tcp&quot; port=&quot;445&quot; /&gt;<br /> &lt;InternalEndpoint name=&quot;Netbios&quot; protocol=&quot;tcp&quot; port=&quot;139&quot; /&gt;<br />&lt;/Endpoints&gt;</span></em></p>
<p>Note that the default configuration for azure hpc samples enables port 445 but not 139. That can be corrected via Visual Studio or by manually editing the .csdef file.</p>
<p><img height="263" width="712" style="margin: 5px" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03-metablogapi/4186.Windows-XP-Professional.jpg" /></p>
<p>If you define a public endpoint too (as in the picture), remember that it will be subject to the azure load balancers and no &quot;sticky&quot; sessions are supported.</p>
<p>After deployment, you'll be able to share files and directories as usual. If you use a directory on an azure drive mounted as per my previous post, you will have a form of shared permanent storage for computation inputs and results that will persist between deployments.</p>
<p> <br /><strong>2. Opening ports on the nodes</strong></p>
<p>Establishing endpoints may not be sufficient. You will also need to open the relevant ports on the o/s firewalls of the nodes in question. For instance, the command line below will open ports 55000-55500:</p>
<p><span style="font-family:Courier">clusrun /all netsh advfirewall firewall add rule name=&quot;myapp&quot; dir=in protocol=tcp localport=55000-55500 action=allow</span></p>
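<p>To check that the rule was actually created on every node, you can run the corresponding show command through clusrun as well (a sketch):</p>

```
clusrun /all netsh advfirewall firewall show rule name="myapp"
```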
<p>You may also need to register an exception with the windows firewall for the executables that you plan to run and any other executables that they may invoke in turn:</p>
<p><span style="font-family:Courier">clusrun /all hpcfwutil register myapp.exe &quot;d:\Program Files\myapp\myapp.exe&quot;</span></p>
<p>At this point you can submit jobs via the portal or the command line, e.g.</p>
<p><span style="font-family:Courier">job submit /numcores:16 /stdout:\\headnode1\share\myapp.out /stderr:\\headnode1\share\myapp.err mpiexec myapp.exe -&lt;parameters&gt;</span></p>
<p>The command line above submits an mpi job requiring 16 cores to run myapp.exe (assuming it is installed on the nodes), redirects outputs and errors to files on a share on the headnode.</p>
<p><strong>3. Observations</strong></p>
<p>Note that in the azure hpc sample deployment, the headnode is always called<span style="font-family:Courier"> headnode1</span> and the compute nodes <span style="font-family:Courier">computenode1, 2, 3,</span> etc... This has no relationship with the actual host names, which are assigned randomly by the azure fabric controller at deployment and can change. Also, computenode1 is not necessarily the first instance of the computenode role listed on the azure portal.</p>
<p><img height="306" style="margin: 5px" width="435" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03-metablogapi/6560.Management-Portal-_1320_-Windows-Azure-Platform.jpg" /></p>
<p>A list of aliases is maintained automatically in %WINDIR%\system32\drivers\etc\hosts.</p>
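<p>For illustration, such a hosts file might contain entries like these (the addresses are made up; yours are assigned by the fabric controller and can change between deployments):</p>

```
# %WINDIR%\system32\drivers\etc\hosts (illustrative entries)
10.28.202.45    headnode1
10.28.210.12    computenode1
10.28.214.77    computenode2
```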
<p>I have noticed however that for all mpi jobs I have submitted, mpi rank 0 was always on computenode1. That may be a sheer coincidence, as there is no reason for it to be there that I can think of. I'd be curious to know whether you find the same situation or not.</p>
<p>This may be useful anyway, because certain applications may require a connection to rank 0 for their GUI, in order to display the status of the running computation.</p>
<p></p>
<div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3474154&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">AzureHPCGetting data and applications to / from Azurehttp://blogs.technet.com/b/gmarchetti/archive/2011/12/13/getting-data-and-applications-to-from-azure.aspxTue, 13 Dec 2011 18:08:11 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:0770def8-d341-48c0-8155-5dcb491bac93gmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3470638http://blogs.technet.com/b/gmarchetti/archive/2011/12/13/getting-data-and-applications-to-from-azure.aspx#comments<p><font size="3">After <a href="http://blogs.msdn.com/b/hpctrekker/archive/2011/12/06/deploying-an-hpc-cluster-using-just-powershell-part-i.aspx">Wenming’s</a> post on deploying an azure hpc cluster with powershell, I have been looking for an easy way to transfer applications &amp; data to / from such cluster.</font></p> <p><font size="3">I have found that in HPC SP3 the hpcpack command includes the option to upload and mount a VHD drive on azure nodes. The command is also available with the free client utilities. That is very useful because:</font></p> <p><font size="3">- You can put whatever you like in the vhd file, including the executable that you need to run.</font></p> <p><font size="3">- It is stored in blob storage, so it is permanent. If you need to delete the cluster deployment in order to save money, the data in the vhd drive will stay.</font></p> <p><font size="3">- The vhd drive can be snapshotted and the snapshots mounted read-only on multiple nodes.</font></p> <p><font size="3">- You can download the vhd back to your PC when you’re done and mount it.</font></p> <p><font size="3">Here’s a summary of the process:</font></p> <p><font size="3">1. 
On your PC, create and attach a <em>fixed-size </em>vhd drive with the storage management tools (GUI or diskpart).</font></p> <p><a href="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03-metablogapi/1200.image_5F00_207FC044.png"><img style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03-metablogapi/1307.image_5F00_thumb_5F00_26C696D2.png" width="496" height="297" /></a>&#160;</p> <p><font size="3">2. Copy your files to the drive.</font></p> <p><font size="3">3. Again with storage manager gui or diskpart, detach the vhd.</font></p> <p><font size="3">4. Use hpcpack (or a gui tool like <a href="http://www.cloudberrylab.com/free-microsoft-azure-explorer.aspx">cloudberry explorer</a>) to upload the vhd file to blob storage</font></p> <p><font size="3">- <em>hpcpack upload &lt;file&gt;.vhd /account:&lt;storage account name&gt; /key:&lt;storage key&gt; /container:&lt;container name&gt; /description:&lt;description string&gt;</em></font></p> <p><font size="3">5. Connect via rdp to the headnode in azure (or any other node where you want to mount the drive).</font></p> <p><font size="3">6. To mount read-write on one node, run</font></p> <p><font size="3">- <em>hpcpack mount &lt;file&gt;.vhd /account:&lt;storage account name&gt; /key:&lt;storage key&gt; /container:&lt;container name&gt;</em></font></p> <p><font size="3">7. To mount read-only, add the /snapshot parameter. You can also run the command on specific nodes via job manager or clusrun.</font></p> <p><font size="3">Once the drive is mounted, you can share it out via the GUI or with net share. The smb endpoints and firewall exceptions must be configured on all nodes for that to work. 
</font></p> <p><font size="3">When you’re done, you can use the same gui tool or hpcpack to download the vhd file back to your pc. The download does not erase the file from blob storage.</font></p> <p><font size="3">- <em>hpcpack download &lt;file&gt;.vhd /account:&lt;storage account name&gt; /key:&lt;storage key&gt; /container:&lt;container name&gt;</em></font></p> <p><font size="3">Mount the vhd drive on your pc and retrieve your results.</font></p><div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3470638&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">SQL Server in Windows Azurehttp://blogs.technet.com/b/gmarchetti/archive/2011/09/09/sql-server-in-windows-azure.aspxSat, 10 Sep 2011 06:08:36 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:a22fd8f5-b9c2-4612-9e48-7ba7ec110e2fgmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3452281http://blogs.technet.com/b/gmarchetti/archive/2011/09/09/sql-server-in-windows-azure.aspx#comments<p><font size="2">It is certainly possible to run SQL Server 2008 R2 in Azure virtual machines, but keep in mind that they are not persistent between deployments, hence you want to use them for testing only and be aware of potential data loss. 
</font></p> <p><font size="2">In order to mitigate that risk you may want to set sql up in such a way that logs and data files are stored on Azure drives mounted by the VM – but that’s a topic for another blog post <img style="border-bottom-style: none; border-left-style: none; border-top-style: none; border-right-style: none" class="wlEmoticon wlEmoticon-smile" alt="Smile" src="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03-metablogapi/4846.wlEmoticon_2D00_smile_5F00_194F7A63.png" /></font></p> <p><font size="2">Azure VMs must be uploaded in a sysprepped state (see <a href="http://blogs.technet.com/b/gmarchetti/archive/2011/03/12/put-a-vm-on-azure.aspx">previous post</a>) for their deployment to succeed. Fortunately in SQL 2008R2 there is a way to “package” the application so that it can be installed after the virtual machine deployment. The trick is to automate the completion of such installation. </font></p> <p><font size="2">Here’s an outline of the process to follow:</font></p> <p><font size="2">1. Build a Sever 2008 R2 virtual machine in hyper-v</font></p> <p><font size="2">2. On that virtual machine, prepare a SQL Server 2008 R2 image for deployment with sysprep</font></p> <blockquote> <p><font size="2">a. copy the installation media on a local directory (e.g. c:\sqlinstall)</font></p> <p><font size="2">b. prepare the image as described here: </font><a href="http://msdn.microsoft.com/en-us/library/ee210664.aspx"><font size="2">http://msdn.microsoft.com/en-us/library/ee210664.aspx</font></a></p> <p><font size="2">c. Capture the sql server setup configuration file configurationfile.ini</font></p> </blockquote> <p><font size="2">3. Edit the configuration file to instruct sql setup to complete the installation as required. Keywords are:</font></p> <blockquote> <p><font size="2">a. ACTION=&quot;CompleteImage&quot;</font></p> <p><font size="2">b. QUIET=&quot;True&quot;</font></p> <p><font size="2">c. 
IACCEPTSQLSERVERLICENSETERMS=&quot;TRUE&quot;</font></p> <p><font size="2">d. SECURITYMODE=&quot;SQL&quot;</font></p> <p><font size="2">e. SAPWD=&quot;&lt;insert your sa password&gt;&quot;</font></p> </blockquote> <p><font size="2">Remember that no active directory exists on Azure, so&#160; you will need to use SQL authentication and run the sql services under local credentials, e.g. “NT AUTHORITY\NETWORK SERVICE”. If you plan to connect to sql server, make sure that tcpip is enabled and the relevant port (1433 by default) is open on the windows firewall.</font></p> <p><font size="2">4. Save the configuration file with a new name, e.g. c:\sqlcompleteconfig.ini</font></p> <p><font size="2">5. Install the Windows Azure components</font></p> <p><font size="2">6. Create a %WINDIR%\Setup\Scripts\SetupComplete.cmd file. This file is executed once after windows setup completes. It can be used to perform additional unattended setup tasks, including installing applications. It runs under the system authority context. The file should contain the commands to complete the sql installation, such as:</font></p> <p><font size="2" face="Courier New">“C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\SQLServer2008R2\ setup.exe&quot; </font></p> <p><font size="2" face="Courier New">/CONFIGURATIONFILE=&quot;C:\SQLCOMPLETECONFIG.INI&quot; </font></p> <p><font size="2" face="Courier New">/installmediapath=&quot;c:\sqlinstall\x64\setup”</font></p> <p><font size="2">7. Sysprep the VM and upload it to Azure as instructed in my <a href="http://blogs.technet.com/b/gmarchetti/archive/2011/03/12/put-a-vm-on-azure.aspx">previous post</a>.</font></p> <p><font size="2">Remember to configure the appropriate ports &amp; endpoints for the VM when creating the package to upload. 
Enjoy SQL running on Azure <img style="border-bottom-style: none; border-left-style: none; border-top-style: none; border-right-style: none" class="wlEmoticon wlEmoticon-smile" alt="Smile" src="http://blogs.technet.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-57-03-metablogapi/4846.wlEmoticon_2D00_smile_5F00_194F7A63.png" /></font></p><div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3452281&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">Azure Connect - a vpn between Azure and your machineshttp://blogs.technet.com/b/gmarchetti/archive/2011/04/18/azure-connect-a-vpn-between-azure-and-your-machines.aspxTue, 19 Apr 2011 04:18:37 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:8f7e77d3-d411-4176-8faa-1bcae0c514dcgmarchetti1http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3422631http://blogs.technet.com/b/gmarchetti/archive/2011/04/18/azure-connect-a-vpn-between-azure-and-your-machines.aspx#comments<p>Azure Connect is a service that lets you establish a vpn tunnel (for want of a better word) between a virtual machine running in Azure and another running on premises. The connection is point-to-point, meaning that you will need to configure it on every machine involved. Nonetheless, it is a valuable tool when you need to access resources within your network from Azure and viceversa.</p>
<p>To set up Azure Connect, you will need:</p>
<p>1. <strong>An activation token. </strong>Open the management portal. If you are on the Connect beta program, you'll see a &quot;connect&quot; icon on the main ribbon. Click on it and then on your subscription name. Select &quot;Get Activation Token&quot; in the top ribbon. Your token will be displayed. Copy it and save it.</p>
<p><img height="250" style="margin: 5px" width="387" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/1680.Management-Portal-_1320_-Windows-Azure-Platform.jpg" /></p>
<p>2. <strong>A local endpoint</strong>, i.e. a communication service. To install it on the local machine, click on &quot;Install Local Endpoint&quot; in the top ribbon, copy the link provided (note that it contains the token) and use it to open another browser tab. When asked, click &quot;run&quot; to download and install the endpoint. This is a service running as LocalSystem; it uses the HTTPS protocol, so make sure that TCP port 443 outbound is open.</p>
<p>3. <strong>An endpoint group. </strong>This is made of local endpoints and Azure roles (including VMs) amongst which connectivity is authorized. </p>
<p><strong>Local machines</strong> will register with Azure Connect after you install the endpoint. </p>
<p><strong>For Azure roles</strong>, you need to modify in Visual Studio:</p>
<p>- <em>The configuration file .cscfg</em> and include the following line under &quot;Settings&quot;:</p>
<p>&lt;Setting name=&quot;Microsoft.WindowsAzure.Plugins.Connect.ActivationToken&quot; value=&quot;_token_&quot; /&gt;</p>
<p>where _token_ is the activation token string that you copied in step 1.</p>
<p>- <em>The service definition file .csdef</em> and include the following line under &quot;Imports&quot;:</p>
<p>&lt;Import moduleName=&quot;Connect&quot; /&gt;</p>
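<p>If you prefer to patch the .cscfg outside Visual Studio, the edit above can be scripted. The Python sketch below assumes a simplified ServiceConfiguration layout without the real schema namespace; treat it as an illustration of the edit, not a drop-in tool.</p>

```python
# Sketch: inject the Azure Connect activation token setting into a .cscfg.
# The sample XML is simplified (no schema namespace) for illustration only.
import xml.etree.ElementTree as ET

SETTING_NAME = "Microsoft.WindowsAzure.Plugins.Connect.ActivationToken"

def add_activation_token(cscfg_xml: str, token: str) -> str:
    root = ET.fromstring(cscfg_xml)
    # Add the setting to every role's ConfigurationSettings block.
    for settings in root.iter("ConfigurationSettings"):
        ET.SubElement(settings, "Setting",
                      {"name": SETTING_NAME, "value": token})
    return ET.tostring(root, encoding="unicode")

sample = """<ServiceConfiguration serviceName="MyService">
  <Role name="WebRole1">
    <ConfigurationSettings></ConfigurationSettings>
  </Role>
</ServiceConfiguration>"""

patched = add_activation_token(sample, "_token_")
```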
<p><img height="249" style="margin: 5px" width="593" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/6761.Windows-XP-Professional.jpg" /></p>
<p><img height="187" style="margin: 5px" width="623" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/3718.Windows-XP-Professional.jpg" /></p>
<p>The same can be done via the GUI. Right-click on the role name, select properties and then &quot;Virtual Network&quot;. Select &quot;Activate Windows Azure Connect&quot;, then paste the token in the relevant field.</p>
<p><img height="200" style="margin: 5px" width="684" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/4375.Windows-XP-Professional.jpg" /></p>
<p>When you publish the project to Azure with Visual Studio, Azure Connect will be activated for those roles. </p>
<p><strong>Virtual Machines</strong> are the exception: you have complete control of what the VM contains, so no endpoint agent will be installed for you. You must install it in the base image for the VM role before uploading it to Azure. You can obtain the package at </p>
<p><em>http://waconnect.blob.core.windows.net/client/latest/x64/wacendpointpackagefull.exe</em></p>
<p>This is not the same URL that you get by clicking &quot;install local endpoint&quot;. Also make sure that IPv6 is enabled in your template (it is by default).</p>
<p>You must also enable Azure Connect in the configuration files for the VM role, as shown before; installing the endpoint package in the template is not sufficient by itself. If you have a running VM where you forgot to install the agent, you can still log into it and perform the installation manually. Alas, this change won't be persisted to the stored template.</p>
<p><strong>Firewall</strong> ports on the local machines and on Azure roles (except VMs) will be configured automatically during the endpoint installation. If you enforce firewall policies, please make sure to:</p>
<p>- allow TCP 443 outbound.</p>
<p>- allow IPv6 protocol, ICMPv6 type 133 and 134 messages (router solicitation and advertisement).</p>
<p>In the end, you will see a list of the activated machines and roles in your management portal.</p>
<p><img height="123" style="margin: 5px" width="765" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/5732.Management-Portal-_1320_-Windows-Azure-Platform.jpg" /></p>
<p>You can then create a group and specify which local machines and Azure roles are allowed to connect.</p>
<p><img height="450" width="665" style="margin: 5px" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/8080.Management-Portal-_1320_-Windows-Azure-Platform.jpg" /></p>
<p>The most common use of Azure virtual networks is domain connectivity. To join an Azure role to a corporate domain, you must install a local endpoint on a domain controller that is also a DNS server, then include both the role and the DC in the same endpoint group (other servers can be included too). In the configuration for the role, you must set specific parameters to enable the domain join. You'll find them in the settings for the role.</p>
<p><img height="227" style="margin: 5px" width="648" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/0284.Windows-XP-Professional.jpg" /></p>
<p style="color:#008;text-align:right;"><small><em>Powered by</em> <a href="http://www.qumana.com/">Qumana</a></small></p>
<div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3422631&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">AzureAzure billing clarificationhttp://blogs.technet.com/b/gmarchetti/archive/2011/03/14/azure-billing-clarification.aspxTue, 15 Mar 2011 04:00:33 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:d7150ca2-47b4-4d0b-8413-2439d49da129gmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3412798http://blogs.technet.com/b/gmarchetti/archive/2011/03/14/azure-billing-clarification.aspx#comments<p>A quick clarification: Azure allocates resources to your role as long as it is deployed, whether it is running or not. Billing starts when the deployment is complete and finishes when it is deleted. Roles that are deployed will be billed for the deployed hours, whether they are running or not. </p>
<p>In summary, if you are not using a role, remove it!</p>
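<p>To see why this matters, here is a back-of-envelope sketch of what a forgotten deployment costs. The hourly rate is a made-up placeholder, not a real Azure price:</p>

```python
# Sketch: a deployed role accrues compute hours whether it is running or not.
HOURLY_RATE = 0.12  # hypothetical $/hour for one instance, NOT a real price

def idle_cost(instances: int, days_deployed: int,
              rate: float = HOURLY_RATE) -> float:
    # Billing is per deployed hour, so idle time counts in full.
    return instances * days_deployed * 24 * rate

# e.g. two staging instances left deployed for a month
monthly = idle_cost(2, 30)
```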
<div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3412798&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">AzurePut a VM on Azurehttp://blogs.technet.com/b/gmarchetti/archive/2011/03/12/put-a-vm-on-azure.aspxSat, 12 Mar 2011 09:57:17 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:76b0dbb6-5adc-4ada-bf11-28bb51c8d7b0gmarchetti3http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3412242http://blogs.technet.com/b/gmarchetti/archive/2011/03/12/put-a-vm-on-azure.aspx#comments<p>I have summarized here all the steps you need to take in order to deploy an Azure VM. </p>
<p><span style="font-size:16pt"><strong>Step 1: Get your certificates</strong></span></p>
<p>I assume that you have an active Azure subscription and that you have installed Visual Studio 2010, the Azure SDK and tools, and activated the VM role. You will need a management certificate for your subscription to deploy services and one or more service certificates to communicate with those securely. To generate an X.509 certificate for use with the management API:</p>
<p>1. Open the IIS manager, click on your server.</p>
<p>2. Select &quot;Server Certificates&quot; in the main panel.</p>
<p>3. Click &quot;Create Self-Signed Certificate&quot; in the actions panel.</p>
<p><img height="471" style="margin: 5px" width="631" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/6242.rd.microsoftlab.net.jpg" /></p>
<p>4. Give the certificate a friendly name.</p>
<p>5. Close IIS manager and run certmgr.msc</p>
<p>6. Find your certificate in &quot;Trusted Root Certification Authorities&quot;</p>
<p>7. Right-Click on it, select All Tasks / Export</p>
<p><img height="446" style="margin: 5px" width="637" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/6064.Windows-XP-Professional.jpg" /></p>
<p>8. Do not export the private key, choose the DER format, give it a name.</p>
<p>9. Navigate to the Windows Azure management portal.</p>
<p>10. Select Hosted Services / Management Certificates / Add a Certificate</p>
<p>11. Browse to the management certificate file and upload it.</p>
<p><span style="font-size:16pt"><strong>Step 2: Prepare the VM</strong></span></p>
<p>I assume that you are familiar with Hyper-V and how to build a virtual machine on a hyper-v host.</p>
<ol>
<li style="padding-top:0px;">Create a virtual machine on Hyper-V. Note that the maximum size of virtual hard disk you specify will determine what size of Azure VM you will be able to choose. An extra-small machine will mount a VHD of up to 15 GB, a small one up to 35 GB, and medium or larger up to 65 GB. This is just the size of the system VHD. You will still receive local storage, mounted as a separate volume.</li>
<li>Install Windows Server 2008 R2 on the VHD. It is the only supported OS as of this writing.</li>
<li>Install the Azure integration components in the VM. They are contained in the wavmroleic.iso file, which is typically located in c:\program files\windows azure sdk\&lt;version&gt;\iso. You need to mount that file on the VM and then run the automatic installation process. This provisions the device drivers and management services required by the Azure hypervisor and fabric controller. Note that the setup process asks you for a local administrator password and reboots the VM. The password is encrypted and stored in c:\unattend.xml for future unattended deployment.</li>
<li>Install and configure any application, role or update as you normally would.</li>
<li>Configure the windows firewall within the VM to open the ports that your application requires. It is recommended that you use fixed local ports.</li>
<li>Open an administrator command prompt and run c:\windows\system32\sysprep\sysprep.exe</li>
<li>Select &quot;OOBE&quot;, Generalize and Shutdown</li>
</ol>
<p><img height="252" style="margin: 5px" width="345" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/7282.rd.microsoftlab.net.jpg" /></p>
<p>This process removes any system-specific data (including the name and SID) from the image, in preparation for re-deployment on Azure. If your application depends on that data, you will have to take appropriate measures at startup on Azure (e.g. run a setup script for your application). The VHD is now ready to be uploaded. It is recommended to make a copy of it to keep as a template.</p>
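<p>As a quick aid when sizing the image, the VHD limits quoted in step 1 can be captured in a small lookup. The figures are the ones stated above for the VM role; the helper itself is just an illustrative sketch:</p>

```python
# Maximum system VHD per Azure VM role size, per the limits quoted above.
MAX_SYSTEM_VHD_GB = {
    "extra-small": 15,
    "small": 35,
    "medium": 65,  # medium and larger mount up to 65 GB
}

def smallest_size_for_vhd(vhd_gb: int) -> str:
    """Smallest VM size whose limit accommodates the given system VHD."""
    for size in ("extra-small", "small", "medium"):
        if vhd_gb <= MAX_SYSTEM_VHD_GB[size]:
            return size
    raise ValueError(f"no VM size supports a {vhd_gb} GB system VHD")
```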
<p>Note that any deployment to Azure starts from this VHD. No state is saved to the local disk if the Azure VM is recycled for any reason.</p>
<p><strong><span style="font-size:16pt">Step 3: Upload the VM to Azure</span></strong></p>
<p>For this you will need a command-line utility provided with the Azure SDK.</p>
<ol>
<li>Open a windows azure command prompt as administrator.</li>
<li>Type </li>
</ol>
<blockquote>
<p><span style="font-family: Courier; font-size: 10pt">csupload Add-VMImage -Connection &quot;SubscriptionId=&lt;YOUR-SUBSCRIPTION-ID&gt;; CertificateThumbprint=&lt;YOUR-CERTIFICATE-THUMBPRINT&gt;&quot; -Description &quot;&lt;IMAGE DESCRIPTION&gt;&quot; -LiteralPath &quot;&lt;PATH-TO-VHD-FILE&gt;&quot; -Name &lt;IMAGENAME&gt;.vhd -Location &lt;HOSTED-SERVICE-LOCATION&gt; -SkipVerify</span></p>
</blockquote>
<blockquote>
<p>The subscription ID can be retrieved from the Azure portal and the certificate thumbprint refers to the management certificate you created and uploaded before. The thumbprint can be retrieved from the portal as well. The description is an arbitrary string, the literal path is the full absolute path on the local disk where you stored your vhd. The image name is the name of the file once stored in Azure and the location is one of those available in the Azure portal. Note that the location must be specific, e.g. &quot;North Central US&quot;. A region is not accepted (e.g. Anywhere US). SkipVerify will save you some time.</p>
<p>This command will create a blob in configuration storage and load your VHD file into it for future use, but it will not create a service or start a VM for you. In the Azure portal, the stored virtual machine templates can be found under &quot;VM Images&quot;.</p>
</blockquote>
<p><img height="181" width="849" style="margin: 5px" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/7558.rd.microsoftlab.net.jpg" /></p>
<p><strong><span style="font-size:16pt">Step 4: Prepare the service model</span></strong></p>
<p>Azure requires a service definition and a service configuration file before deploying any role. These are .xml files that are packaged and uploaded to the fabric controller for interpretation. You can generate one for the VM using Visual Studio 2010.</p>
<p>1. Open Visual Studio 2010 and create a new Windows Azure project.</p>
<p>2. Do NOT add any role to the project from the project setup wizard. </p>
<p>3. In the solution explorer panel, right click on the project name and select New Virtual Machine Role. Note that a service may be made of several roles, including multiple VMs.</p>
<p><img height="274" style="margin: 5px" width="438" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/5582.rd.microsoftlab.net.jpg" /></p>
<p>4. In the VHD configuration dialog, specify your Azure account credentials and which of the stored virtual machine templates you'd like to use.</p>
<p><img height="218" style="margin: 5px" width="574" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/8308.rd.microsoftlab.net.jpg" /></p>
<p>5. In the Configuration panel specify how many instances you'd like and what type. Remember the size constraints on the system VHDs.</p>
<p><img height="218" style="margin: 5px" width="574" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/1440.rd.microsoftlab.net_2D00_1.jpg" /></p>
<p>6. In Endpoints, specify which ports and protocol must be open for your applications within the virtual machine (they should match those configured before).</p>
<p><img height="173" style="margin: 5px" width="704" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/3678.rd.microsoftlab.net_2D00_1.jpg" /></p>
<p>7. Note that RDP connections are configured elsewhere.</p>
<p>8. Once the VM role configuration is done, right-click on the project name and select Publish. You have the option to create the service configuration package only, to be uploaded later via the portal, or to actually deploy the project. I am assuming that you do not have a service defined yet. It is advisable to configure RDP connections, for debugging purposes, at least during staging.</p>
<p><img height="408" style="margin: 5px" width="727" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/4188.rd.microsoftlab.net_2D00_1.jpg" /></p>
<p>9. Select Enable connections, then specify a service certificate. This will contain a private key used to encrypt your credentials. If you have none, you can create one from this interface. If you do create a new certificate, click View, Details and Copy to File to export it. Make sure to include the private key. </p>
<p><img height="512" style="margin: 5px" width="412" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/6320.rd.microsoftlab.net_2D00_1.jpg" /></p>
<p>10. Specify a user name and password to connect to this virtual machine. Change the account expiration date as necessary (but set it before the certificate expires).</p>
<p>11. Select &quot;Create Service Package Only&quot; and save the package file. </p>
<p><strong><span style="font-size:16pt">Step 5. Create the service in Azure</span></strong></p>
<p>1. In the Azure Management Portal, select Hosted Services / New Service</p>
<p>2. Populate the form, specifying a name for your service and deployment options. Note that the location you select must be the same specified at upload time for the virtual machine you want to use. Select the configuration package and file that you saved before. Add the certificate that you exported before for RDP.</p>
<p><img height="610" style="margin: 5px" width="533" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/2100.rd.microsoftlab.net_2D00_1.jpg" /></p>
<p>3. Click OK to deploy. Start your deployed machines.</p>
<p><img height="248" width="451" style="margin: 5px" alt="" src="http://blogs.technet.com/cfs-file.ashx/__key/CommunityServer-Blogs-Components-WeblogFiles/00-00-00-57-03-metablogapi/0083.rd.microsoftlab.net_2D00_1.jpg" /></p>
<p><strong><span style="font-size:16pt">Step 6: Connect and enjoy.</span></strong></p>
<p>From the machine where you generated the RDP certificate, connect to your virtual machines and test. Simply select the virtual machine in the Azure portal and click &quot;connect&quot;. An RDP file will be generated for you to save and open. Once debugging is finished, it is recommended to disable RDP connections for production.</p>
<div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3412242&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">AzureThe Hyper-V Cloud - no clusters?http://blogs.technet.com/b/gmarchetti/archive/2010/11/24/the-hyper-v-cloud-no-clusters.aspxThu, 25 Nov 2010 02:19:02 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:8df650ba-b64c-453d-8357-c31cd0cfaec4gmarchetti3http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3371118http://blogs.technet.com/b/gmarchetti/archive/2010/11/24/the-hyper-v-cloud-no-clusters.aspx#comments<p>Microsoft has recently published a set of guides to build your own private cloud solution using Hyper-V, System Center Virtual Machine Manager and its Self-Service Portal 2.0</p>
<p>They cover planning, deployment and operations. You can find them <a href="http://www.microsoft.com/virtualization/en/us/private-cloud-get-started.aspx">here.</a></p>
<p>Note however that the guides assume a deployment on stand-alone servers. There is no discussion of clustering for high availability.</p>
<p>I'd like to add a few observations derived from experience of running a private cloud implementation that includes clusters too.</p>
<p><strong>Storage</strong></p>
<p>It seems obvious, but local storage is not relevant for clusters. SCVMM will explicitly place virtual machines ONLY on cluster volumes (dedicated or shared). You can still manually create virtual machines on directly-attached disks, but it just complicates things, as they are not easily distinguishable from highly available ones in VMM. It makes sense to purchase systems with a hardware-mirrored boot volume only, plus whatever dedicated storage adapter is appropriate to your workload.</p>
<p>Clusters tend to drive high consolidation ratios to the storage, for the simple fact that it is shared amongst several nodes. The number of sustained IOps often becomes more relevant than the maximum theoretical bandwidth, as during VM operations you may often find that a lot of relatively small I/Os are performed with a random access pattern. Fibre Channel may fit such a profile better than iSCSI. Ideally, you'll profile the load before consolidation, but in a private cloud / infrastructure-on-demand environment you may not have the luxury.</p>
<p>You'll be well advised to consult the storage vendor's planning guide beforehand.</p>
<p>Storage vendors often publish the IOps and bandwidth ratings of their arrays. For instance, for the <a href="http://h18000.www1.hp.com/products/quickspecs/13551_div/13551_div.html">HP P2000 G3</a>, you will find that 10 Gb/s iSCSI (with dedicated adapters) and 8 Gb/s FC are comparable in bandwidth utilization for large-block sequential reads and writes. However, FC still sustains 20% more IOps than iSCSI with a random 60/40 mix of read-write operations of 8KB blocks.</p>
<p>Interestingly, 6Gb SAS is equal or slightly better than FC in HP's measurements, which used 4 directly-attached servers (no fabric) for testing. Results may vary when a fabric is involved.</p>
<p><strong>Servers</strong></p>
<p>Blade servers have grown in popularity and are often recommended for private cloud solutions, thanks to their good price / performance ratio, density and flexibility. However, in a highly available implementation, due consideration must be given to:</p>
<p>- Connectivity: Microsoft recommends <em>at least</em> 4 network ports + 1 storage port (2 is better) for each node in a Hyper-V cluster. Blade connectivity may limit your options.</p>
<p>- I/O performance: i/o bandwidth and operations per second depend on the midplane capabilities. Chassis capable of full redundancy with non-blocking backplanes and 10 Gb/s per lane are available - at a price (e.g. HP <a href="http://h20000.www2.hp.com/bc/docs/support/SupportManual/c00810839/c00810839.pdf">c7000</a>). Cheap solutions typically involve some element of oversubscription or no redundancy.</p>
<p>- Reliability of the shared components. For instance, in a study published in the <a href="http://portal.acm.org/citation.cfm?id=1660947">IBM Systems Journal</a>, the lowest mean time to failure (MTTF) belonged to the chassis switch modules, followed by the blade base board. An equivalent external Cisco switch lasts about twice as long between failures, according to published specifications. Even so, according to the same study, blade availability can reach 99.99%, and it is possible to build a 99.999% available infrastructure with blades by having at least 1 &quot;hot-standby&quot; in the chassis, in addition to using redundant components where possible.</p>
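<p>The availability figures above can be sanity-checked with a very crude redundancy model. It assumes blade failures are independent and that fail-over to the hot-standby is instantaneous, which real chassis won't achieve; it is a sketch, not a sizing tool:</p>

```python
# Crude model: probability that at least one of n redundant blades is up,
# assuming independent failures and instantaneous fail-over.
def combined_availability(per_blade: float, n: int) -> float:
    return 1 - (1 - per_blade) ** n

single = combined_availability(0.9999, 1)        # one blade: four nines
with_standby = combined_availability(0.9999, 2)  # one hot-standby added
```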
<p>High-end servers (e.g. <a href="http://www-03.ibm.com/systems/x/hardware/enterprise/">IBM x3850</a>) are also interesting virtualization platforms, because of hardware becoming more affordable and Windows Datacenter unlimited virtualization rights. They drive high consolidation and operating efficiency, mainly thanks to capacities of up to 1 TB of RAM and 64 cores. They enhance the availability of the basic platform by providing features not commonly found elsewhere, like:</p>
<p>- <a href="http://en.wikipedia.org/wiki/Chipkill">ChipKill</a> or SDDC memory with hot add / replace.</p>
<p>- Hot-add / remove of i/o adapters.</p>
<p>- Hot-add CPUs.</p>
<p>Particular attention must be given to the memory configuration of the servers, especially with the recent NUMA chipsets from Intel and AMD:</p>
<p>- Populate processor local memory banks with equal amounts of RAM.</p>
<p>- Populate memory controllers with equal capacity.</p>
<p>- Populate all memory channels for each controller evenly to exploit the maximum memory bandwidth.</p>
<p>- Use dual-rank DIMMs.</p>
<p><strong>Network</strong></p>
<p>Whilst hyper-v will work with any supported network adapters, it is important to notice that certain features will be available only with the appropriate combination of chipsets.</p>
<p><a href="http://en.wikipedia.org/wiki/VMQ">VMQs</a> (hardware-managed network queues for VMs) are highly recommended for best performance and consolidation ratios, but they require support for interrupt coalescing (in R2), the latest Intel Pro or Broadcom chipsets, appropriate drivers and some registry hacking, as explained <a href="http://technet.microsoft.com/en-us/library/gg162696(WS.10).aspx">here</a> and <a href="http://technet.microsoft.com/en-us/library/gg162696(WS.10).aspx#BKMK_Tuning">here</a>.</p>
<p>VM Chimney (TCP offload for VMs) has proven unreliable due to driver issues in my experience. I'd rather have VMQs.</p>
<p>Note that if you enable IPSec or any filter driver on a particular connection (e.g. Windows firewall), that connection may not be offloaded.</p>
<p>Microsoft does NOT support network teaming with Hyper-V. Teaming is supported by the OEMs. VMQs and teaming may be mutually exclusive, depending on vendor.</p>
<p><strong>Operating System Editions</strong></p>
<p>In order to build fail-over clusters, you will need Enterprise or Datacenter editions. Note that <a href="http://www.microsoft.com/hyper-v-server/en/us/overview.aspx">Hyper-V Server R2</a> is also capable of clustering and is similar in many respects to Enterprise edition. The smaller footprint of Hyper-V Server R2 implies the need to patch it less often than a full edition, hence it is ideal to minimize planned downtime.</p>
<p><em>It is possible and legal to purchase datacenter edition (for the unlimited licensing) but deploy Hyper-V Server R2 (for the reduced footprint)</em>, <em>transferring the licenses.</em></p>
<p><strong>Cluster Size</strong></p>
<p>There are several considerations to determine the number of nodes and machine per node in a cluster:</p>
<p>- The officially supported maximum is 64 VMs per node in a hyper-v R2 cluster (increasing to 384 in SP1).</p>
<p>- The officially supported max virtual / physical core ratio is 8:1.</p>
<p>- Large clusters are more likely to incur the WMI issues mentioned in my <a href="http://blogs.technet.com/b/gmarchetti/archive/2010/11/19/patches-and-kb-articles-for-hyper-v-r2.aspx">previous post</a>.</p>
<p>- You may have the CPU capacity, but how about the I/O? How many IOps per node can you sustain with your adapter / SAN combination?</p>
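<p>The first two limits combine into a quick per-node capacity estimate. The sketch below assumes the R2 (pre-SP1) figures quoted above and, by default, single-vCPU guests:</p>

```python
# Per-node VM capacity under the Hyper-V R2 supported limits quoted above.
MAX_VMS_PER_NODE = 64   # rises to 384 with SP1
MAX_VCPU_PER_CORE = 8   # supported virtual:physical core ratio

def max_vms(physical_cores: int, vcpus_per_vm: int = 1) -> int:
    by_ratio = (physical_cores * MAX_VCPU_PER_CORE) // vcpus_per_vm
    return min(MAX_VMS_PER_NODE, by_ratio)

# On a 16-core node with 1-vCPU guests, the 64-VM cap binds;
# with 4-vCPU guests, the 8:1 core ratio binds instead.
```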
<p><strong>Cluster Shared Volumes</strong></p>
<p>Assuming that you want live migration to minimize planned downtime and optimize allocation of resources, you will need CSVs. A common question is how many machines to deploy on each CSV and how many CSVs to have per cluster. The answer depends on several <a href="http://itinfras.blogspot.com/2010/07/factors-that-influence-how-many-cluster.html">factors</a>. In my experience, the most troublesome one is the amount of data you need to back up and how long you can tolerate reduced performance during the backup.</p>
<p>Each time you perform a CSV backup, the server hosting the VM to back up requires ownership of the whole CSV volume in order to snapshot it. I/O to the volume is redirected over the network for all VMs hosted on that volume, with a consequent performance impact. The time you can tolerate that, multiplied by the backup throughput, gives you the maximum amount of data to put on that CSV. Divide that by the average size of a VHD and you'll have a rough estimate of how many VMs will fit.</p>
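<p>That rule of thumb is easy to put into numbers. Every input in the example below is an illustrative assumption; substitute your own measurements:</p>

```python
# Estimate how many VMs fit on one CSV, given the redirected-I/O window you
# can tolerate during backup. All figures in the example are assumptions.
def vms_per_csv(tolerable_minutes: float,
                backup_gb_per_min: float,
                avg_vhd_gb: float) -> int:
    max_data_gb = tolerable_minutes * backup_gb_per_min
    return int(max_data_gb // avg_vhd_gb)

# 30 min of tolerable redirected I/O, 20 GB/min backup, 40 GB average VHDs
estimate = vms_per_csv(30, 20, 40)
```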
<p><strong>Rules of Thumb</strong></p>
<p>I claim no scientific basis for the following rules, other than my empirical observations. Here they go, in no particular order:</p>
<p>1. Keep the number of nodes in a cluster small (2-4) to avoid annoying SCVMM bugs.</p>
<p>2. Assume a random 60/40 i/o pattern if you don't know in advance what your VM workload will be. It is quite common, in my experience.</p>
<p>3. Plan for at least 1 management network, 1 for heartbeat, 1 for live migration, 1 dedicated to VMs and 1 dedicated storage adapter. For higher consolidation ratios and availability, plan on 2 storage adapters with MPIO.</p>
<p>4. Use separate VLANs for management and VMs to isolate traffic and for ease of administration. Consider 10GbE for the shared VM networks in order to minimize the number of adapters and cables.</p>
<p>5. The quality of VSS (snapshot) and VDS (disk) providers varies greatly with the OEM. Be sure to test them. A snapshot should NOT take longer than 10 seconds or it will fail.</p>
<p>6. If you don't know how long you can tolerate redirected i/o, 30 minutes is a useful maximum. I have seen CSVs crash in several occasions (again, depending on OEM) after that.</p>
<p>7. Fail-over clusters do NOT take into account connectivity to the VMs. In other words, if you are sharing the VM network with the host o/s and for some reason that connection fails, the cluster may fail over to another node its own IP addresses on that network but not the VMs attached to it.</p>
<p>8. A few large servers are easier to manage than many small blades, if you implement appropriate procedures to minimize downtime and have a support contract when things go wrong to fix them quickly :-) They may also be more cost-effective, if you drive consolidation and take advantage of power optimization technology.</p>
<p>9. Use group policies to control patching with WSUS or similar. Do NOT use the default &quot;download and install at 3am&quot; option on all cluster nodes, or they will all reboot at the same time.</p>
<p>10. If you don't know your storage vendor's iops ratings, use these ballpark figures on <a href="http://en.wikipedia.org/wiki/IOPS">wikipedia</a>.</p>
<p>11. On your hosts, make sure that the antivirus excludes .vhd, .vsv, .avhd files and vmms.exe, vmwp.exe. If you are running hyper-v server only, do you need an antivirus on the host? This is not a rhetorical question by the way; I am interested in opinions.</p>
<p>12. If you don't know the size of your CSVs in advance, 2 TB works on all MBR and GPT disks. Most backup and restore utilities, snapshot providers etc... can handle 2 TB. It is also a tolerable size should you ever need to run chkdsk or defrag on the volume (few, large vhd files should not cause much trouble or take much time to fix in that respect).</p>
<p>13. Both fail-over clustering and PRO have no notion of virtual applications, i.e. applications that require a set of interconnected virtual machines. They may move the machines to different nodes. A way around it is to script the appropriate migration sequence with Powershell.</p>
<div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3371118&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">Dynamic DatacenterVirtualizationPatches and KB articles for Hyper-v R2http://blogs.technet.com/b/gmarchetti/archive/2010/11/19/patches-and-kb-articles-for-hyper-v-r2.aspxSat, 20 Nov 2010 01:25:21 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:b4393a89-9a4d-4f79-96dc-1228e3f61f0egmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3369983http://blogs.technet.com/b/gmarchetti/archive/2010/11/19/patches-and-kb-articles-for-hyper-v-r2.aspx#comments<p>I'd like to share with you some tips to mitigate a few frustrating intermittent problems with hyper-v and system center.</p>
<p><b>1. Data Protection Manager 2010</b></p>
<p>BEFORE installing DPM agents on systems running Hyper-V R2, you must install hotfixes <b>KB975921 and KB975354</b>. Note that if you had a beta or RC version of DPM, upgrading the DPM server will NOT upgrade the agents, and you will not be prompted to do so. You will only see "inconsistent replicas".</p>
<p>The first KB refers to a situation where the volume snapshot provider for your clustered volume crashes and does not return a "completed" signal. When that happens, the volume is not returned to its original state, the cluster resource fails, and all its virtual machines fail with it. You may be lucky and never encounter this problem, but the quality of snapshot providers varies widely, so I suggest you install the patch anyway.</p>
<p>The second KB refers to a common scenario where you want to back up virtual machines running on different nodes of a cluster at the same time. If the machines sit on the same cluster volume, ownership of the volume must be transferred to the node requesting the snapshot at that point. Alas, this transfer may happen before the post-snapshot steps of a virtual machine backup are complete, leaving the replica inconsistent. If at all possible, I suggest building the DPM protection groups and timing the backups in such a way that frequent transfers of volumes are avoided.</p>
<p><b>2. Virtual Machine Manager and Operations Manager</b></p>
<p>VMM and OM agents rely on WMI heavily. When you consolidate dozens of virtual machines on a cluster and want to take advantage of the PRO functionality (hence run both agents on all nodes), the WMI service is heavily loaded and may crash. It restarts without too much fuss and no data is lost, but anything depending on it fails. In your VMM console you may notice that all machines on a node seem to fail at the same time, or that the connection to the VMM agent running on that node times out. Operations manager will also issue critical alerts. Live migrations and deployments may be interrupted. To mitigate the problem, you must install <b>KB974930 and KB981314</b>. </p>
<p>The first KB addresses a memory leak of the Win32_Service WMI class. The second KB addresses timeouts in WMI queries about a failover cluster object. </p>
<p>You may also want to increase the number of concurrent connections and the timeout period for the Windows Remote Management service (winRM). To do so, you could run this script on all nodes of the cluster:</p>
<p><i>winrm set winrm/config/Service @{MaxConcurrentOperationsPerUser="400"}</i></p>
<p><i>winrm set winrm/config @{MaxTimeoutms="1800000"}</i></p>
<p><i>net stop vmmagent</i></p>
<p><i>net stop winrm</i></p>
<p><i>net start winrm</i></p>
<p><i>net start vmmagent</i></p>
<p>The script increases the number of concurrent WinRM operations to 400 (the default is 200) and sets the default timeout to 30 minutes (which should be plenty). Note that this is just the default timeout - operations may specify their own.</p>
<p>For those values to be taken into consideration, the script then restarts the vmmagent and winrm services.</p>
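<p>After the restart you can confirm that the new values are in effect by reading the configuration back, for example:</p>

```powershell
# Read back the WinRM settings; MaxConcurrentOperationsPerUser should
# now show 400 and MaxTimeoutms should show 1800000.
winrm get winrm/config/Service
winrm get winrm/config
```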
<p>In my experience applying these fixes reduces the frequency and duration of such timeouts, but does not eliminate them completely. You may have to tweak the numbers over time to improve the situation further.</p><div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3369983&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">Live Migration, Cluster Shared Volumes & Networkshttp://blogs.technet.com/b/gmarchetti/archive/2009/10/01/live-migration-cluster-shared-volumes-networks.aspxFri, 02 Oct 2009 02:42:49 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:10e0f4e3-74e3-4880-a520-cf09f4661753gmarchetti1http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3284469http://blogs.technet.com/b/gmarchetti/archive/2009/10/01/live-migration-cluster-shared-volumes-networks.aspx#comments<p>The recommendation for people setting up live migration clusters is to isolate different kinds of traffic on their own networks:</p> <ol> <li>Public network to access the cluster and the virtual machines running on it</li> <li>“Private” cluster heartbeat network</li> <li>“live migration” network</li> <li>iSCSI network, if required to access shared storage</li> </ol> <p>How do we determine what traffic goes where?</p> <p>For public and private, the failover cluster manager tool is self-explanatory:</p> <p><a href="http://blogs.technet.com/blogfiles/gmarchetti/WindowsLiveWriter/LiveMigrationClusterSharedVolumesNetwork_E746/image_2.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://blogs.technet.com/blogfiles/gmarchetti/WindowsLiveWriter/LiveMigrationClusterSharedVolumesNetwork_E746/image_thumb.png" width="205" height="244" /></a> </p> <p>We select the appropriate cluster network properties. 
If we want to limit that network to private traffic, we do not allow clients to connect through it.</p> <p>If we don’t want the cluster to use that network at all, e.g. because it is dedicated to iSCSI, we select the “Do not allow…” button.</p> <p>How about the live migration traffic, though? It can be quite heavy, as we are copying memory pages from one host to another. We can select in which order to use cluster networks for such traffic through the failover cluster manager.</p> <p><a href="http://blogs.technet.com/blogfiles/gmarchetti/WindowsLiveWriter/LiveMigrationClusterSharedVolumesNetwork_E746/image_4.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://blogs.technet.com/blogfiles/gmarchetti/WindowsLiveWriter/LiveMigrationClusterSharedVolumesNetwork_E746/image_thumb_1.png" width="480" height="369" /></a> </p> <p>The property requires some digging: expand “services and applications”, select the virtual machine in question, then in the main panel right-click on “virtual machine &lt;name&gt;” and you’ll see a tab called “network for live migration”. You can then select and sort in order of priority the networks that you want to use. By default, live migration will select a network that is NOT used for CSV traffic. Note that you may have networks in this panel that were not selected for cluster use before. If you use iSCSI, de-select the relevant entry to make sure that the live migration traffic does not go through that network.</p> <p>This brings me to cluster shared volumes. One of the great features of CSVs is that if the storage link (iSCSI, fibre) becomes unavailable for any reason on a node, storage traffic can be redirected over the cluster network to another node and hence to the storage device.
But which cluster network?</p> <p>Inter-node communications and CSV traffic will use the available network authorized for cluster use that has the lowest metric value. We can see the metrics with old cluster.exe</p> <p><font size="1" face="Courier">C:\Windows\system32&gt;cluster net /prop <br />Listing properties for all networks: </font></p> <p><font size="1" face="Courier">T&#160; Network&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; Name&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; Value <br />-- -------------------- ------------------------ ----------------------- <br />SR Cluster Network 1&#160;&#160;&#160; Name&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; Cluster Network 1 <br />MR Cluster Network 1&#160;&#160;&#160; IPv6Addresses&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;cut on purpose&gt;&#160; <br />MR Cluster Network 1&#160;&#160;&#160; IPv6PrefixLengths&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;..&gt; <br />MR Cluster Network 1&#160;&#160;&#160; IPv4Addresses&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;cut on purpose&gt; <br />MR Cluster Network 1&#160;&#160;&#160; IPv4PrefixLengths&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;..&gt; <br />SR Cluster Network 1&#160;&#160;&#160; Address&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;..&gt; <br />SR Cluster Network 1&#160;&#160;&#160; AddressMask&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;..&gt; <br />S&#160; Cluster Network 1&#160;&#160;&#160; Description <br />D&#160; Cluster Network 1&#160;&#160;&#160; Role&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; 3 (0x3) <br />D&#160; Cluster Network 
1&#160;&#160;&#160; Metric&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; 10001 (0x2711) <br />D&#160; Cluster Network 1&#160;&#160;&#160; AutoMetric&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; 0 (0x0) <br />SR Cluster Network 2&#160;&#160;&#160; Name&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; Cluster Network 2 <br />MR Cluster Network 2&#160;&#160;&#160; IPv6Addresses <br />MR Cluster Network 2&#160;&#160;&#160; IPv6PrefixLengths <br />MR Cluster Network 2&#160;&#160;&#160; IPv4Addresses&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;..&gt; <br />MR Cluster Network 2&#160;&#160;&#160; IPv4PrefixLengths&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;..&gt; <br />SR Cluster Network 2&#160;&#160;&#160; Address&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;..&gt; <br />SR Cluster Network 2&#160;&#160;&#160; AddressMask&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;..&gt; <br />S&#160; Cluster Network 2&#160;&#160;&#160; Description <br />D&#160; Cluster Network 2&#160;&#160;&#160; Role&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; 1 (0x1) <br />D&#160; Cluster Network 2&#160;&#160;&#160; Metric&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; 1000 (0x3e8) <br />D&#160; Cluster Network 2&#160;&#160;&#160; AutoMetric&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; 1 (0x1) </font></p> <p>Note the 3 values:</p> <ul> <li>Role: 1 for a private network, 0 for ignored by cluster, 3 for mixed traffic</li> <li>Metric: the “weight” of the connection, generally in the 10,000 range for public networks, 1,000 for private 
ones. If a network has a default gateway, it is considered public; if not, private. Should there be more than one private or public network, the metric is incremented by 100 in order of enumeration (e.g. private network 2 will have a default metric of 1,100)</li> <li>Autometric: 1 if the metric is set automatically by the cluster, 0 if you have set it manually.</li> </ul> <p>So in my simple case the heartbeat network will also be used for CSV traffic. If you have more than 1 private network and you want to prioritize them, you can set the metric with cluster.exe, e.g.</p> <p><font size="1" face="Courier">C:\Windows\system32&gt;cluster net &quot;Cluster Network 2&quot; /prop metric=1001 </font></p> <p><font size="1" face="Courier">C:\Windows\system32&gt;cluster net &quot;Cluster Network 2&quot; /prop </font></p> <p><font size="1" face="Courier">Listing properties for 'Cluster Network 2': </font></p> <p><font size="1" face="Courier">T&#160; Network&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; Name&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; Value <br />-- -------------------- ------------------------ ----------------- <br />SR Cluster Network 2&#160;&#160;&#160; Name&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; Cluster Network 2 <br />MR Cluster Network 2&#160;&#160;&#160; IPv6Addresses <br />MR Cluster Network 2&#160;&#160;&#160; IPv6PrefixLengths <br />MR Cluster Network 2&#160;&#160;&#160; IPv4Addresses&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;..&gt; <br />MR Cluster Network 2&#160;&#160;&#160; IPv4PrefixLengths&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;..&gt; <br />SR Cluster Network 2&#160;&#160;&#160; 
Address&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;..&gt; <br />SR Cluster Network 2&#160;&#160;&#160; AddressMask&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; &lt;..&gt; <br />S&#160; Cluster Network 2&#160;&#160;&#160; Description <br />D&#160; Cluster Network 2&#160;&#160;&#160; Role&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; 1 (0x1) <br />D&#160; Cluster Network 2&#160;&#160;&#160; Metric&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; 1001 (0x3e9) <br />D&#160; Cluster Network 2&#160;&#160;&#160; AutoMetric&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; 0 (0x0)</font></p> <p>Redirection of the traffic is automatic: if a network becomes unavailable, the next-lowest-metric one will be used. If another network with a lower metric becomes available, it will be used from that point onwards.</p> <h4>In Summary</h4> <p>By default, live migration traffic will be put on the network with the second-lowest metric. CSV traffic will be put on the network with the lowest metric.
In this simple example, I just have a public and private network, so the public one is used for live migration and the private one for csv and cluster traffic.</p><img src="http://blogs.technet.com/aggbug.aspx?PostID=3284469&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">VirtualizationHigh AvailabilityHyper-VP2V with SCVMM – a quick tiphttp://blogs.technet.com/b/gmarchetti/archive/2009/07/21/p2v-with-scvmm-a-quick-tip.aspxWed, 22 Jul 2009 00:54:53 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:35b040ab-c2e0-47bb-8bf8-ac8d2320b723gmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3266913http://blogs.technet.com/b/gmarchetti/archive/2009/07/21/p2v-with-scvmm-a-quick-tip.aspx#comments<p>System Center Virtual Machine Manager (SCVMM) has been offering a relatively simple way of doing physical-to-virtual migrations (P2V) for a while. You just click on the “Convert Physical Server” icon and off you go. Despite the name, it also works with client target machines. It’s simple, provided you do some preparation work beforehand.</p> <p><a href="http://blogs.technet.com/blogfiles/gmarchetti/WindowsLiveWriter/P2VwithSCVMM_D0EA/image_2.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://blogs.technet.com/blogfiles/gmarchetti/WindowsLiveWriter/P2VwithSCVMM_D0EA/image_thumb.png" width="244" height="157" /></a> </p> <p>In fact, VMM will ask you for the name or IP address of the machine in question and for administrator credentials on it. Those will be used to reach the machine and install a p2v agent on it. For the process to work correctly, you must let through the firewall of the target machine:</p> <ul> <li>WMI traffic</li> <li>http</li> <li>file and print</li> <li>remote management</li> </ul> <p>Also, make sure that the ADMIN$ share exists and start the Windows remote management service on the target machine.
</p> <p>By default, most of these ports and services are closed.</p><img src="http://blogs.technet.com/aggbug.aspx?PostID=3266913&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">VirtualizationWindows Server 2008Hyper-VVHD Boothttp://blogs.technet.com/b/gmarchetti/archive/2009/07/14/vhd-boot.aspxWed, 15 Jul 2009 01:35:30 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:8e0fc341-98fa-4609-a252-96906c707dd2gmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3264126http://blogs.technet.com/b/gmarchetti/archive/2009/07/14/vhd-boot.aspx#comments<p>With Windows 7 and Server 2008 R2 you get the opportunity to boot directly from a vhd file. The operating system in the vhd file will have direct access to the machine hardware. It will not run as a virtual machine with synthetic or emulated adapters, but as a “real machine”. VHD happens to be the format that is used to represent a disk to the o/s. The physical disk will contain a set of vhd files and still be visible as a disk to the o/s you boot. Thus, you won’t require a partition per o/s.</p> <p>Assuming that you are running Windows 7 or 2008 R2, here’s how you can set it up: </p> <p><strong>1. Create a vhd file to contain your o/s.</strong> </p> <p>I found that 15 GB are enough for Server 2008 R2 + Hyper-V role (you can enable any role on the o/s in that VHD). You can use diskpart from the command line or the disk management tool. Make sure to <strong>select a fixed disk size</strong>.</p> <p><a href="http://blogs.technet.com/blogfiles/gmarchetti/WindowsLiveWriter/VHDBoot_B18C/image_2.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" border="0" alt="image" src="http://blogs.technet.com/blogfiles/gmarchetti/WindowsLiveWriter/VHDBoot_B18C/image_thumb.png" width="206" height="244" /></a> </p> <p>Mount that vhd file, e.g. to drive letter W:</p> <p><strong>2. 
Apply a WIM image to the VHD file you just created.</strong></p> <p>You can generate a WIM image of a pre-installed “golden” machine with the imagex tool, part of the Windows Automated Installation Kit (WAIK)</p> <p>You can use the WIM image provided with the Windows installation media in sources\install.wim</p> <p>If all you have is an iso file (e.g. a 2008 R2 evaluation version you’ve just downloaded), there are utilities like MagicISO which will let you mount it as a disk, so you can use the install.wim within the file.</p> <p>The simplest way to apply the wim image is to use the Install-Windowsimage powershell script, which you’ll find on <a href="http://code.msdn.microsoft.com/InstallWindowsImage">MSDN</a>. The installation media contains several versions of Windows, so make sure you select the one you are licensed for. </p> <p>In powershell, type:</p> <p>.\Install-WindowsImage.ps1 -WIM D:\Sources\Install.wim</p> <p>to obtain a list of the available images on your installation DVD (D:\). Note the index number of the image you are interested in.</p> <p>Type </p> <p>.\Install-WindowsImage.ps1 -WIM D:\Sources\Install.wim -Apply -Index 3 -Destination W:\</p> <p>to apply the 3rd image on the VHD drive you mounted previously (W:\)</p> <p><strong>3. Make the VHD bootable</strong></p> <p>Open a command prompt and type:</p> <p>W:\windows\system32\bcdboot w:\windows</p> <p>Bcdboot creates the boot control data (bcd) block to boot Windows from the vhd file and sets it as default option.</p> <p><strong>4. 
Check the bcd entry and set your preferred default</strong></p> <p>At an administrator’s command prompt, type:</p> <p>bcdedit /v</p> <p><a href="http://blogs.technet.com/blogfiles/gmarchetti/WindowsLiveWriter/VHDBoot_B18C/image_4.png"><img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://blogs.technet.com/blogfiles/gmarchetti/WindowsLiveWriter/VHDBoot_B18C/image_thumb_1.png" width="244" height="232" /></a> </p> <p>Note the identifier of the entry that you want to use as default boot option.</p> <p>Type:</p> <p>bcdedit /default {identifier}</p> <p>to set the default.</p> <p>That’s it – you can now boot from your vhd file.</p> <p>You need not stop here, however: there is no need to start from a regular o/s installation: you can configure vhd boot from the installation media and have all your Windows o/s boot from vhd files. Keith Combs explains how on his <a href="http://blogs.technet.com/keithcombs/archive/2009/05/22/dual-boot-from-vhd-using-windows-7-and-windows-server-2008-r2.aspx">blog</a>.</p><img src="http://blogs.technet.com/aggbug.aspx?PostID=3264126&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">VirtualizationA Free Book on Microsoft Virtualizationhttp://blogs.technet.com/b/gmarchetti/archive/2009/03/12/a-free-book-on-microsoft-virtualization.aspxFri, 13 Mar 2009 02:01:59 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:9b44fdf7-010e-40ca-bef5-bfa0db62a233gmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3212219http://blogs.technet.com/b/gmarchetti/archive/2009/03/12/a-free-book-on-microsoft-virtualization.aspx#comments<p><strong>Understanding Microsoft Virtualization Solutions</strong> from Microsoft Press is available as a FREE download.&#160;&#160; </p> <p>This 15MB E-Book gives an overview of all current Microsoft Virtualization technologies: Hyper-V, the Microsoft Enterprise Desktop Virtualization (MED-V), and VDI. 
It also describes which management solutions are available for them (e.g. System Center Virtual Machine Manager) and how they fit together. It is worth reading when planning the virtualization of your infrastructure.</p> <p>You can find it here: <a title="http://csna01.libredigital.com/?urmvs17u33" href="http://csna01.libredigital.com/?urmvs17u33">http://csna01.libredigital.com/?urmvs17u33</a></p><img src="http://blogs.technet.com/aggbug.aspx?PostID=3212219&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">VirtualizationHyper-VUpdated Infiniband on Server 2008 Paperhttp://blogs.technet.com/b/gmarchetti/archive/2009/03/02/updated-infiniband-on-server-2008-paper.aspxTue, 03 Mar 2009 04:31:21 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:0d22a39b-a682-437f-89f8-26c3a43bc633gmarchetti1http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3208509http://blogs.technet.com/b/gmarchetti/archive/2009/03/02/updated-infiniband-on-server-2008-paper.aspx#comments<p>I have finally updated my notes on the installation of Infiniband on Windows Server 2008. They now cover the released version 2.0 of Mellanox <a href="http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=32&menu_section=34">WinOF</a> stack. You can find the document in my <a href="http://cid-a7eb7d62d4966068.skydrive.live.com/browse.aspx/Public?authkey=QVGZAxZiQ98%24">skydrive public folder</a>.</p>
<p>Let me know if you find it useful.</p>
<img src="http://blogs.technet.com/aggbug.aspx?PostID=3208509&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">HPCFaking Networkshttp://blogs.technet.com/b/gmarchetti/archive/2009/02/23/faking-networks.aspxTue, 24 Feb 2009 01:46:26 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:9d7cc7d9-4eb1-4799-a9fb-6082c38d1dc2gmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3205884http://blogs.technet.com/b/gmarchetti/archive/2009/02/23/faking-networks.aspx#comments<p>On a Windows HPC Server 2008 head node, that is...</p>
<p><strong>1. No Infiniband on the head node</strong></p>
<p>In many cases people want to save themselves some money by not installing an Infiniband adapter on the head node, thereby also sparing a port on that expensive Infiniband switch. It makes a lot of sense, especially when you do not plan to perform any calculations on that machine. So, how do we make the software believe it has an Infiniband adapter?</p>
<p>The HPC management tools do not care too much about the type of connection you have, as long as they can get an IP address to communicate with. So, you can install a &quot;loopback adapter&quot;, give it a fixed IP address and pretend it is a real network card. Of course, you will not be able to use it to communicate with the compute nodes, but if all you want to carry on IB is MPI traffic amongst those, the trick will work.</p>
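<p>As a sketch, the loopback adapter can be installed from the command line with devcon.exe (part of the Windows support tools); the connection name below is whatever Windows assigns to the new adapter, and the address is just an example on the IB subnet:</p>

```powershell
# Sketch: install a Microsoft Loopback Adapter and give it a fixed
# address. The connection name varies per machine - check it first.
devcon.exe install "$env:windir\inf\netloop.inf" *msloop
netsh int ip set address "Local Area Connection 2" static 192.168.3.1 255.255.255.0
```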
<p>The only caveat is that you lose the ability to use dhcp on the infiniband network, hence you will have to provide a mechanism to assign fixed IP addresses for IPoIB communication. Of course the subnet you use on the &quot;fake&quot; IB and the real one must be the same.</p>
<p>The easiest way is possibly to write a small script that uses the <a href="http://technet.microsoft.com/en-us/library/cc785383.aspx"><span style="font-family: Courier; font-size: 12pt">netsh</span></a> command, then run it on all the compute nodes. You will need at least 1 private Ethernet network for management traffic across the cluster.</p>
<p>For instance, the command below will assign the ip address 192.168.3.100 and a 24-bit mask to the network connection called &quot;Application&quot;</p>
<p><span style="font-family: Courier; font-size: 10pt">netsh int ip set address &quot;Application&quot; static 192.168.3.100 255.255.255.0</span></p>
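<p>To push per-node addresses from the head node, a hypothetical loop over clusrun could look like this (node names, node count and subnet are assumptions for illustration):</p>

```powershell
# Sketch: assumes compute nodes are named NODE001..NODE064 and that the
# IPoIB connection is called "Application" on every node.
foreach ($n in 1..64) {
    $name = "NODE{0:D3}" -f $n
    $ip   = "192.168.3.$n"
    clusrun /nodes:$name netsh int ip set address Application static $ip 255.255.255.0
}
```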
<p><strong>2. No public ethernet</strong></p>
<p>In several cases I found that the head node has only 1 Ethernet card. Out of the box, our HPC software prevents the use of Windows Deployment Services and DHCP unless you have at least 2 adapters, in order to avoid conflicts with existing deployment solutions. You may choose to install a fake &quot;public&quot; network on a loopback adapter and thus enable WDS on the real &quot;private&quot; network.</p>
<p><strong>3. No private ethernet</strong></p>
<p>Another interesting case is the one you get with many pre-built clusters, which provide 1 Ethernet and 1 Infiniband network in the box.</p>
<p>Note that when you install an Infiniband stack (e.g. WinOF 2.0), you typically get an IP-over-IB protocol provider. Thus, it is possible to use the infiniband network to route private cluster traffic, with the exception of deployment (no PXE-boot over IB). For &quot;heavy&quot; mpi applications, you will want to keep the two networks separate anyway. </p>
<img src="http://blogs.technet.com/aggbug.aspx?PostID=3205884&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">HPCLive Migration in R2http://blogs.technet.com/b/gmarchetti/archive/2009/02/12/live-migration-in-r2.aspxThu, 12 Feb 2009 21:00:00 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:2b3796d5-2da6-4e74-8d18-c5075f8af6a3gmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3201396http://blogs.technet.com/b/gmarchetti/archive/2009/02/12/live-migration-in-r2.aspx#comments<p>I've got a lot of questions about Live Migration in 2008 R2. Rather than writing a long post on it, I thought I'd point you at some resources I found useful whilst setting up my test environment, so you can build one too:</p><span class="Apple-style-span" style="color: rgb(51, 51, 51); font-family: Tahoma;"><div>-&nbsp;<font><a href="https://cid-a7eb7d62d4966068.skydrive.live.com/browse.aspx/Public/Virtualization" style="color: rgb(72, 138, 203); text-decoration: none;" mce_href="https://cid-a7eb7d62d4966068.skydrive.live.com/browse.aspx/Public/Virtualization">Frank Cicalese's (virtualization specialist) paper on r2 lab setup</a></font></div><div>-&nbsp;<font><a href="http://technet.microsoft.com/en-us/library/dd446679.aspx" style="color: rgb(72, 138, 203); text-decoration: none;">Technet step-by-step guide to live migration</a></font></div><div>-&nbsp;<font><a href="http://technet.microsoft.com/en-us/library/cc732181.aspx" style="color: rgb(72, 138, 203); text-decoration: none;" mce_href="http://technet.microsoft.com/en-us/library/cc732181.aspx">Technet guide to hyper-v fail-over</a></font></div><div><font><a href="http://technet.microsoft.com/en-us/library/cc732181.aspx" style="color: rgb(72, 138, 203); text-decoration: none;" mce_href="http://technet.microsoft.com/en-us/library/cc732181.aspx"></a><span class="Apple-style-span" style="color: rgb(0, 0, 0); font-family: Arial;"><span class="Apple-style-span" style="color: rgb(51, 51, 51); font-family: 
Tahoma;">-&nbsp;<font><a href="http://www.virtual-strategy.com/Features/Tutorial-Microsoft-HyperV-Server-R2-on-BladeCenter-S.html" style="color: rgb(72, 138, 203); text-decoration: none;" mce_href="http://www.virtual-strategy.com/Features/Tutorial-Microsoft-HyperV-Server-R2-on-BladeCenter-S.html">Massimo Re Ferre` (IBM) tutorial on Hyper-V Server R2</a></font></span>&nbsp;</span></font></div><div>- <a href="http://www.microsoft.com/downloads/details.aspx?FamilyID=fdd083c6-3fc7-470b-8569-7e6a19fb0fdf&amp;DisplayLang=en%20" mce_href="http://www.microsoft.com/downloads/details.aspx?FamilyID=fdd083c6-3fc7-470b-8569-7e6a19fb0fdf&amp;DisplayLang=en ">2008 R2 Live Migration Architecture Guide</a></div><div>&nbsp;</div><div>I also recorded the steps to build such environment in a series of screencasts that will be appearing on http://edge.technet.com. The first one is already there, so check it out. They are short out of necessity, so it will take a couple of weeks for all of them to appear.</div><!--StartFragment--></span><div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3201396&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">Hyper-VTurning hyper-v on and offhttp://blogs.technet.com/b/gmarchetti/archive/2008/12/07/turning-hyper-v-on-and-off.aspxMon, 08 Dec 2008 00:19:00 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:fc9c01a4-fe26-4f3a-a5ea-ccb474d9bf67gmarchetti3http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3164875http://blogs.technet.com/b/gmarchetti/archive/2008/12/07/turning-hyper-v-on-and-off.aspx#comments<P>I use hyper-v on my laptop. When I know I don't need&nbsp;VMs for the day, I can squeeze a bit more performance out of the machine by turning hyper-v off with:</P>
<P>bcdedit /set hypervisorlaunchtype off</P>
<P>and a reboot. To turn it back on:</P>
<P>bcdedit /set hypervisorlaunchtype auto</P>
<P>and reboot. </P><div style="clear:both;"></div><img src="http://blogs.technet.com/aggbug.aspx?PostID=3164875&AppID=5703&AppType=Weblog&ContentType=0" width="1" height="1">VirtualizationWindows Server 2008What is new in virtualization with Windows Server 2008 R2?http://blogs.technet.com/b/gmarchetti/archive/2008/11/12/what-is-new-in-virtualization-with-windows-server-2008-r2.aspxThu, 13 Nov 2008 01:43:00 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:a0295a7d-0484-49b3-9c8d-3fa9022133c9gmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3152012http://blogs.technet.com/b/gmarchetti/archive/2008/11/12/what-is-new-in-virtualization-with-windows-server-2008-r2.aspx#comments<p>There are some quite interesting improvements in Windows Server 2008 R2 (what was wrong with W7 as a name?) that help us progress toward a dynamic infrastructure. Three of them are worthy of highlighting: live migration of virtual machines in hyper-v, cluster shared volumes and core parking. </p><h3>1. Live Migration</h3> <p>Live migration refers to the ability of moving a running virtual machine from one host server to another without loss of service. For this to happen, we have to transfer the current virtual machine state and memory pages between machines and we have to warrant both servers the same level of access to the virtual machine files. The process can be summarized as follows: </p><ol> <li>Create a virtual machine on the target server </li><li>Copy the memory pages of the running virtual machine in question from the source to the target server via Ethernet. While we copy, those memory pages may change, so after an initial pass we have to go back and copy the changed set again, until a minimum threshold number of pages is reached. It is hard to fix the threshold: ideally, it will be the number of pages that can be copied within a TCP connection timeout, so the clients won’t notice. </li><li>Pause the source machine; copy its state across. 
</li><li>Resume the target machine, issue ARP command to update routing tables. </li></ol> <p>For (3) to happen quickly and transparently to the clients, the target server must have immediate access to the virtual machine files. It cannot wait for a disk volume to fail-over and possibly go through file system checks. That’s where cluster shared volumes come in. </p><h3>2. Cluster Shared Volumes</h3> <p>Cluster Shared Volumes enable concurrent access to the same LUN by several nodes. Consequently, all the nodes see the same NTFS file-system and namespace. By the way, CSV is not a parallel or a cluster file system. It was designed with the live migration scenario in mind.</p><p>Since the host servers already mount the CSV, there is no need to arbitrate for disk access and fail over the volume hosting the virtual machine files. All you need to do is transfer ownership of those files and their locks to the target server. </p><p>CSVs are implemented via a filter driver mechanism, which is used to establish the access path to the underlying LUNs. This also enhances our fail-over ability, as file system requests will be redirected over the network to another server if a direct SAN access is no longer available. </p><h3>3. Core idling or parking</h3> <p>Changes in Windows 7 power management allow for “density” scheduling, i.e. minimizing the number of processor cores on which work is done, hence maximizing their utilization. The idle cores can be put to sleep (low-power state Cx under the ACPI specifications), thus reducing power consumption. Hyper-V can take advantage of this feature and schedule its virtual machines accordingly. Power management policies can be controlled via WMI, policies and scripts. </p><p>If you combine “density” scheduling with the ability to move virtual machines among hosts, you achieve quite a scalable, efficient and dynamic solution to the distributed resource allocation problem. Now, all that remains to do is automate it. Stay tuned. </p><h3>4. 
References</h3> <p>ACPI explanation on <a href="http://en.wikipedia.org/wiki/Advanced_Configuration_and_Power_Interface">Wikipedia</a> </p><p>WinHEC 2008 conference <a href="http://www.microsoft.com/whdc/winhec/2008/papers.mspx">whitepapers</a> </p><p>Engineering Windows 7 <a href="http://blogs.msdn.com/e7/">blog</a> </p><p>The Windows <a href="http://windowsteamblog.com/blogs/">blog</a></p><p>The Windows Server 2008 R2 <a href="http://www.microsoft.com/windowsserver2008/en/us/r2.aspx">Reviewers' Guide</a> </p>Windows Server 2008Hyper-VUpgrading from an evaluation versionhttp://blogs.technet.com/b/gmarchetti/archive/2008/11/03/upgrading-from-an-evaluation-version.aspxTue, 04 Nov 2008 02:39:00 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:e984358e-8c24-4d1c-b95c-36da392f1093gmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3146793http://blogs.technet.com/b/gmarchetti/archive/2008/11/03/upgrading-from-an-evaluation-version.aspx#comments<p>I have received a few questions about upgrading from the evaluation version that you can download from microsoft.com/hpc to a full version.</p><p>The good news is that the evaluation version is fully functional, so you won't need a complete re-installation. All you need to do is obtain a full licence key, then: <br></p><p>- To upgrade the HPC Pack tools, run “upgrade.exe” (found on the HPC Pack CD) on the head node.</p><p>
- To upgrade the o/s, obtain a full licence key for all the nodes, then run slmgr.vbs -ipk &lt;new licence key&gt; across the cluster. You can do that from the command line (clusrun /all) or via the GUI.</p><p>You can also use slmgr.vbs to extend the evaluation period by another 60 days: when you are approaching the end of the evaluation, simply run slmgr.vbs -rearm across the cluster. Note that the evaluation does not require activation, but a full licence does. </p><p>Please see <a href="http://support.microsoft.com/kb/948472">http://support.microsoft.com/kb/948472</a> for more information. <br></p><div style="clear:both;"></div>HPCWindows Server 2008Proxies and Compute Nodeshttp://blogs.technet.com/b/gmarchetti/archive/2008/10/22/proxies-and-compute-nodes.aspxThu, 23 Oct 2008 00:22:40 GMTd5e57398-b9ef-4490-9955-07cbb4e4a80d:7f8ef34c-0452-49a9-8026-654b7ec22c77gmarchetti0http://blogs.technet.com/b/gmarchetti/rsscomments.aspx?WeblogPostID=3140501http://blogs.technet.com/b/gmarchetti/archive/2008/10/22/proxies-and-compute-nodes.aspx#comments<p>You’ve prepared your templates, configured your network, your firewalls and everything else you could think of, yet your automated provisioning takes forever and eventually fails…</p> <p>Check whether you have a patching task in your node template. If you do, the nodes need a way to reach the Microsoft Update service and download patches, which may require setting a proxy on them. Alas, the GUI does not offer an option to do that, and any proxy setting that you specify in Internet Options is effective only for the logged-in user. So, how can you set a proxy for Windows Update to use?</p> <p>The Windows Update service uses WinHTTP. You can set a machine-wide proxy at that level with:</p> <p>netsh winhttp set proxy proxy-server="http=&lt;your proxy:port&gt;" bypass-list="&lt;local&gt;"</p> <p>Here &lt;local&gt; is typed literally. 
You could have that command run before the patching task in the template.</p> <p>Alternatively, you could deploy the nodes without the patching task, run the command across the cluster, then apply a template that includes a patching task. </p> <p>Last but not least, you could set up a Windows Server Update Services (WSUS) server on your corporate network and then use group policies to direct the update service on the nodes to that server.</p> <p>Anyway, if your nodes go anywhere near the Internet, please keep them patched!</p>HPC
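<p>As a footnote to the proxy command above: if you prefer to script it rather than type it on each node, a small helper can assemble the command string for dispatch with clusrun. This is only an illustrative sketch, not part of HPC Pack; the helper name and the proxy address are hypothetical placeholders.</p>

```python
# Sketch: build the machine-wide WinHTTP proxy command for compute nodes.
# The proxy address used below is a placeholder; substitute your own.

def winhttp_proxy_cmd(proxy, bypass="<local>"):
    """Return the netsh command line that sets a machine-wide WinHTTP proxy."""
    return ('netsh winhttp set proxy '
            'proxy-server="http={0}" '
            'bypass-list="{1}"'.format(proxy, bypass))

# Prefix with "clusrun /all" to execute the command on every node of the cluster.
print("clusrun /all " + winhttp_proxy_cmd("myproxy.example.com:8080"))
```

<p>Generating the string in one place keeps the quoting consistent, which matters here because the smart quotes that blog engines substitute for straight quotes will break netsh.</p>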