You can now freely download by FTP and save the following two online-only PDF chapters of Cloud Computing with the Windows Azure Platform, which have been updated for SQL Azure’s January 4, 2010 commercial release:

HTTP downloads of the two chapters are available at no charge from the book's Code Download page.

Tip: If you encounter articles from MSDN or TechNet blogs that are missing screen shots or other images, click the empty frame to generate an HTTP 404 (Not Found) error, and then click the back button to load the image.

In this posting we provide an overview of the Windows Azure Storage architecture to give some understanding of how it works. Windows Azure Storage is a distributed storage software stack built completely by Microsoft for the cloud.

3 Layer Architecture

The storage access architecture has the following 3 fundamental layers:

Front-End (FE) layer – This layer takes the incoming requests, authenticates and authorizes the requests, and then routes them to a partition server in the Partition Layer. The front-ends know what partition server to forward each request to, since each front-end server caches a Partition Map. The Partition Map keeps track of the partitions for the service being accessed (Blobs, Tables or Queues) and what partition server is controlling (serving) access to each partition in the system.

Partition Layer – This layer manages the partitioning of all of the data objects in the system. As described in the prior posting, all objects have a partition key. An object belongs to a single partition, and each partition is served by only one partition server. This is the layer that manages what partition is served on what partition server. In addition, it provides automatic load balancing of partitions across the servers to meet the traffic needs of Blobs, Tables and Queues. A single partition server can serve many partitions.

Distributed and replicated File System (DFS) Layer – This is the layer that actually stores the bits on disk and is in charge of distributing and replicating the data across many servers to keep it durable. A key concept to understand here is that the data is stored by the DFS layer, but all DFS servers are (and all data stored in the DFS layer is) accessible from any of the partition servers.

A high-level overview of these layers is shown in the figure below:

Here we can see that the Front-End layer takes incoming requests, and a given front-end server can talk to all of the partition servers it needs to in order to process the incoming requests. The partition layer consists of all of the partition servers, with a master system to perform the automatic load balancing (described below) and assignment of partitions. As shown in the figure, each partition server is assigned a set of object partitions (Blobs, Entities, Queues). The Partition Master constantly monitors the overall load on each server as well as the individual partitions, and uses this for load balancing. Then the lowest layer of the storage architecture is the Distributed File System layer, which stores and replicates the data, and all partition servers can access any of the DFS servers.
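The front-end routing described above can be sketched as a lookup into a cached Partition Map. This is an illustrative toy (the class, boundaries, and server names are invented, not the actual Azure implementation): partition-key ranges are kept sorted, and each incoming request is routed to the server currently serving the range its key falls into.

```python
import bisect

# Hypothetical sketch of a front-end's cached Partition Map: sorted range
# boundaries mapped to the partition server currently serving each range.
class PartitionMap:
    def __init__(self, boundaries, servers):
        # boundaries[i] is the inclusive lower bound of the key range served
        # by servers[i]; boundaries must be sorted.
        self.boundaries = boundaries
        self.servers = servers

    def route(self, partition_key):
        # Find the last range whose lower bound <= partition_key.
        i = bisect.bisect_right(self.boundaries, partition_key) - 1
        return self.servers[max(i, 0)]

pmap = PartitionMap(["", "g", "n", "t"], ["ps1", "ps2", "ps3", "ps4"])
print(pmap.route("container1/blob.txt"))  # falls in range "".."g" -> ps1
print(pmap.route("photos/img42"))         # falls in range "n".."t" -> ps3
```

Because the map is only an assignment of key ranges to servers, updating it (after load balancing or a server failure) changes routing without touching any stored data.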

Lifecycle of a Request

To understand how the architecture works, let’s first go through the lifecycle of a request as it flows through the system. The process is the same for Blob, Entity and Message requests:

DNS lookup – the request to be performed against Windows Azure Storage does a DNS resolution on the domain name for the object’s Uri being accessed. For example, the domain name for a blob request is “<your_account>.blob.core.windows.net”. This is used to direct the request to the geo-location the storage account is assigned to, as well as to the blob service in that geo-location.

Front-End Server Processes Request – The request reaches a front-end, which does the following:

Authenticate and authorize the request.

Look up the request's partition key in the cached Partition Map to find the partition server serving that partition, and forward the request to it.

Get the response from the partition server, and send it back to the client.

Partition Server Processes Request – The request arrives at the partition server, and the following occurs depending on whether the request is a GET (read operation) or a PUT/POST/DELETE (write operation):

GET – See if the data is cached in memory at the partition server.

If so, return the data directly from memory.

Else, send a read request to one of the DFS Servers holding one of the replicas for the data being read.

PUT/POST/DELETE

Send the request to the primary DFS Server (see below for details) holding the data to perform the insert/update/delete.

DFS Server Processes Request – the data is read/inserted/updated/deleted from persistent storage and the status (and data if read) is returned. Note, for insert/update/delete, the data is replicated across multiple DFS Servers before success is returned back to the client (see below for details).
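The GET path above can be sketched end-to-end. This is a minimal illustration (all class names and the single-replica read are invented for brevity): the partition server answers from its in-memory cache when it can, and otherwise reads from one of the DFS servers holding a replica.

```python
# Illustrative sketch of the GET lifecycle described above.
class DFSServer:
    def __init__(self, store):
        self.store = store            # stands in for data on disk

    def read(self, key):
        return self.store[key]

class PartitionServer:
    def __init__(self, replicas):
        self.cache = {}
        self.replicas = replicas      # DFS servers holding the data's replicas

    def get(self, key):
        if key in self.cache:         # step 1: serve from memory if cached
            return self.cache[key]
        value = self.replicas[0].read(key)  # step 2: read one DFS replica
        self.cache[key] = value
        return value

dfs = DFSServer({"blob1": b"hello"})
ps = PartitionServer([dfs])
print(ps.get("blob1"))   # first call reads from the DFS layer
print(ps.get("blob1"))   # second call is served from the partition cache
```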

Most requests are to a single partition, but listing Blob Containers, Blobs, Tables, and Queues, and Table Queries can span multiple partitions. When a listing/query request that spans partitions arrives at a FE server, we know via the Partition Map the set of servers that need to be contacted to perform the query. Depending upon the query and the number of partitions being queried over, the query may only need to go to a single server to process its request. If the Partition Map shows that the query needs to go to more than one partition server, we serialize the query by performing it across those partition servers one at a time sorted in partition key order. Then at partition server boundaries, or when we reach 1,000 results for the query, or when we reach 5 seconds of processing time, we return the results accumulated thus far and a continuation token if we are not yet done with the query. Then when the client passes the continuation token back in to continue the listing/query, we know the Primary Key from which to continue the listing/query.
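The continuation-token pattern described above can be sketched as follows. This toy (function name and the page size are illustrative; the real limits are the 1,000-result and 5-second bounds in the text) returns a page of results plus a token marking where the next request should resume.

```python
# Hedged sketch of continuation tokens for a listing/query over sorted keys.
def list_entities(sorted_keys, start_after=None, page_size=3):
    begin = 0
    if start_after is not None:
        # Resume just past the key recorded in the continuation token.
        while begin < len(sorted_keys) and sorted_keys[begin] <= start_after:
            begin += 1
    page = sorted_keys[begin:begin + page_size]
    # Hand back a token only if there are more results to fetch.
    token = page[-1] if begin + page_size < len(sorted_keys) else None
    return page, token

keys = ["a", "b", "c", "d", "e"]
page1, tok = list_entities(keys)        # first page plus a token
page2, tok = list_entities(keys, tok)   # client passes the token back
```

The client simply loops until the returned token is `None`, which is how a listing can span partition-server boundaries one hop at a time.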

Fault Domains and Server Failures

Now we want to touch on how we maintain availability in the face of hardware failures. The first concept is to spread out the servers across different fault domains, so if a hardware fault occurs only a small percentage of servers are affected. The servers for these 3 layers are broken up over different fault domains, so if a given fault domain (rack, network switch, power) goes down the service can still stay available for serving data. The following is how we deal with node failures for each of the three different layers:

Front-End Server Failure – If a front-end server becomes unresponsive, then the load balancer will realize this and take it out of the available servers that serve requests from the incoming VIP. This ensures that requests hitting the VIP get sent to live front-end servers that are waiting to process the request.

Partition Server Failure – If the storage system determines that a partition server is unavailable, it immediately reassigns any partitions it was serving to other available partition servers, and the Partition Map for the front-end servers is updated to reflect this change (so front-ends can correctly locate the re-assigned partitions). Note, when assigning partitions to different partition servers no data is moved around on disk, since all of the partition data is stored in the DFS server layer and accessible from any partition server. The storage system ensures that all partitions are always served.

DFS Server Failure – If the storage system determines a DFS server is unavailable, the partition layer stops using the DFS server for reading and writing while it is unavailable. Instead, the partition layer uses the other available DFS servers which contain the other replicas of the data. If a DFS Server is unavailable for too long, we generate additional replicas of the data in order to keep the replicas at a healthy state for durability.
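The fault-domain idea above has a simple arithmetic core: spread a layer's servers evenly over X fault domains so that losing one domain takes out at most 1/X of the layer. A toy placement (round-robin here purely for illustration; the real placement policy is not described in the text):

```python
# Sketch: spread N servers evenly over X fault domains.
def assign_fault_domains(servers, num_domains):
    return {s: i % num_domains for i, s in enumerate(servers)}

servers = [f"fe{i}" for i in range(12)]
placement = assign_fault_domains(servers, 4)

# Simulate fault domain 2 (e.g., one rack/switch) going down:
lost = [s for s, d in placement.items() if d == 2]
print(len(lost), "of", len(servers), "servers affected")  # 3 of 12 = 1/4
```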

Upgrade Domains and Rolling Upgrade

A concept orthogonal to fault domains is what we call upgrade domains. Servers for each of the 3 layers are spread evenly across the different fault domains and upgrade domains for the storage service. This way if a fault domain goes down we lose at most 1/X of the servers for a given layer, where X is the number of fault domains. Similarly, during a service upgrade at most 1/Y of the servers for a given layer are upgraded at a given time, where Y is the number of upgrade domains. To achieve this, we use rolling upgrades, which allows us to maintain high availability when upgrading the storage service.

We upgrade a single domain at a time for our storage service using rolling upgrades. A key part for maintaining availability during upgrade is that before upgrading a given domain, we proactively offload all the partitions being served on partition servers in that upgrade domain. In addition, we mark the DFS servers in that upgrade domain so they are not used while the upgrade is going on. This preparation is done before upgrading the domain, so that when we upgrade we reduce the impact on the service to maintain high availability.

After an upgrade domain has finished upgrading we allow the servers in that domain to serve data again. In addition, after we upgrade a given domain, we validate that everything is running fine with the service before going to the next upgrade domain. This process allows us to verify production configuration, above and beyond the pre-release testing we do, on just a small percentage of servers in the first few upgrade domains before upgrading the whole service. Typically if something is going to go wrong during an upgrade, it will occur when upgrading the first one or two upgrade domains, and if something doesn’t look quite right we pause upgrade to investigate, and we can even rollback to the prior version of the production software if need be.
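The rolling-upgrade procedure described above can be summarized as a loop over upgrade domains. This sketch is purely illustrative (the function names and the pause-on-failure behavior are assumptions drawn from the prose, not a real API):

```python
# Illustrative rolling upgrade: prepare each domain, upgrade it, then
# validate health before moving on; pause if validation fails.
def rolling_upgrade(upgrade_domains, offload, mark_dfs, upgrade, validate):
    for domain in upgrade_domains:
        offload(domain)     # proactively move partitions off this domain
        mark_dfs(domain)    # stop using its DFS servers during the upgrade
        upgrade(domain)
        if not validate(domain):   # pause/rollback hook on bad health
            return f"paused at {domain}"
    return "complete"

log = []
result = rolling_upgrade(
    ["ud0", "ud1", "ud2"],
    offload=lambda d: log.append(("offload", d)),
    mark_dfs=lambda d: log.append(("mark", d)),
    upgrade=lambda d: log.append(("upgrade", d)),
    validate=lambda d: True,
)
print(result)
```

The validate step is where problems surface on the first one or two domains, which is why the text notes that upgrades can be paused or rolled back early.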

Now we will go through the two lower layers of our system in more detail, starting with the DFS Layer.

DFS Layer and Replication

Durability for Windows Azure Storage is provided through replication of your data, where all data is replicated multiple times. The underlying replication layer is a Distributed File System (DFS) with the data being spread out over hundreds of storage nodes. Since the underlying replication layer is a distributed file system, the replicas are accessible from all of the partition servers as well as from other DFS servers.

The DFS layer stores the data in what are called “extents”. This is the unit of storage on disk and the unit of replication, where each extent is replicated multiple times. Typical extents range from approximately 100MB to 1GB in size.

When storing a blob in a Blob Container, entities in a Table, or messages in a Queue, the persistent data is stored in one or more extents. Each of these extents has multiple replicas, which are spread out randomly over the different DFS servers providing “Data Spreading”. For example, a 10GB blob may be stored across 10 one-GB extents, and if there are 3 replicas for each extent, then the corresponding 30 extent replicas for this blob could be spread over 30 different DFS servers for storage. This design allows Blobs, Tables and Queues to span multiple disk drives and DFS servers, since the data is broken up into chunks (extents) and the DFS layer spreads the extents across many different DFS servers. This design also allows a higher number of IOps and network BW for accessing Blobs, Tables, and Queues as compared to the IOps/BW available on a single storage DFS server. This is a direct result of the data being spread over multiple extents, which are in turn spread over different disks and different DFS servers, and because any of the replicas of an extent can be used for reading the data.
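The 10GB-blob example above can be made concrete with a small placement sketch (the extent size, replica count, and random placement are taken from the text; the function itself is invented for illustration):

```python
import random

# Sketch of "Data Spreading": split a blob into ~1GB extents and place
# each extent's replicas on distinct, randomly chosen DFS servers.
EXTENT_SIZE = 1 << 30  # 1 GB, within the ~100MB-1GB range in the text

def place_extents(blob_size, replica_count, dfs_servers, rng):
    extent_count = -(-blob_size // EXTENT_SIZE)  # ceiling division
    return [rng.sample(dfs_servers, replica_count)  # distinct servers
            for _ in range(extent_count)]

rng = random.Random(42)                      # seeded only for repeatability
servers = [f"dfs{i}" for i in range(30)]
layout = place_extents(10 * (1 << 30), 3, servers, rng)  # 10GB blob
print(len(layout), "extents x", len(layout[0]), "replicas")
```

With 10 extents and 3 replicas each, the blob's 30 extent replicas can land on up to 30 different servers, which is exactly the IOps/bandwidth spreading the text describes.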

For a given extent, the DFS has a primary server and multiple secondary servers. All writes go through the primary server, which then sends the writes to the secondary servers. Success is returned back from the primary to the client once the data is written to at least 3 DFS servers. If one of the DFS servers is unreachable when doing the write, the DFS layer will choose more servers to write the data to so that (a) all data updates are written at least 3 times (3 separate disks/servers in 3 separate fault domains) before returning success to the client and (b) writes can make forward progress in the face of a DFS server being unreachable. Reads can be processed from any up-to-date extent replica (primary or secondary), so reads can be successfully processed from the extent replicas on its secondary DFS servers.
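The write path above (primary forwards to secondaries, success only after at least 3 replicas are written, with replacement servers chosen if one is unreachable) can be sketched as follows; all server names and the simplified ack model are illustrative:

```python
# Sketch of the replicated write path: acknowledge the client only after
# the data is written to at least 3 reachable DFS servers, falling back
# to spare servers if a replica is unreachable.
def replicated_write(primary, secondaries, spares, data, is_up):
    written = []
    for server in [primary] + secondaries + spares:
        if is_up(server):
            written.append(server)   # pretend the durable write succeeded
        if len(written) >= 3:
            return "success", written
    return "failure", written

status, replicas = replicated_write(
    "dfs1", ["dfs2", "dfs3"], ["dfs4"], b"block",
    is_up=lambda s: s != "dfs3",     # simulate dfs3 being unreachable
)
print(status, replicas)
```

Note how the unreachable secondary does not block progress: the spare takes its place, satisfying both properties (a) and (b) in the text.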

The multiple replicas for an extent are spread over different fault domains and upgrade domains, therefore no two replicas for an extent will be placed in the same fault domain or upgrade domain. Multiple replicas are kept for each data item, so if one fault domain goes down, there will still be healthy replicas to access the data from, and the system will dynamically re-replicate the data to bring it back to a healthy number of replicas. During upgrades, each upgrade domain is upgraded separately, as described above. If an extent replica for your data is in one of the domains currently being upgraded, the extent data will be served from one of the currently available replicas in the other upgrade domains not being upgraded.

A key principle of the replication layer is dynamic re-replication and having a low MTTR (mean-time-to-recovery). If a given DFS server is lost or a drive fails, then all of the extents that had a replica on the lost node/drive are quickly re-replicated to get those extents back to a healthy number of replicas. Re-replication is accomplished quickly, since the other healthy replicas for the affected extents are randomly spread across the many DFS servers in different fault/upgrade domains, providing sufficient disk/network bandwidth to rebuild replicas very quickly. For example, to re-replicate a failed DFS server with many TBs of data, with potentially 10s of thousands of lost extent replicas, the healthy replicas for those extents are potentially spread across hundreds to thousands of storage nodes and drives. To get those extents back up to a healthy number of replicas, all of those storage nodes can be used to (a) read from the healthy replicas, and (b) write another copy of the lost replica to a random node in a different fault/upgrade domain for the extent. This recovery process allows us to leverage the available network/disk resources across up to all of the nodes in the storage service to potentially re-replicate a lost storage node within minutes, which is a key property to having a low MTTR in order to prevent data loss.

Another important property of the DFS replication layer is checking and scanning data for bit rot. All data written has a checksum (internal to the storage system) stored with it. The data is continually scanned for bit rot by reading the data and verifying the checksum. In addition, we always validate the internal checksum when reading the data for a client request. If an extent replica is found to be corrupt by one of these checks, then it is re-replicated using one of the valid replicas in order to bring the extent back to healthy level of replication.

Geo-Replication

Windows Azure Storage provides durability by constantly maintaining multiple healthy replicas for your data. To achieve this, replication is provided within a single location (e.g., US South), across different fault and upgrade domains as described above. This provides durability within a given location. But what if a location has a regional disaster (e.g., wild fire, earthquake, etc.) that can potentially affect an area for many miles?

We are working on providing a feature called geo-replication, which replicates customer data hundreds of miles between two locations (i.e., between North and South US, between North and West Europe, and between East and Southeast Asia) to provide disaster recovery in case of regional disasters. The geo-replication is in addition to the multiple copies maintained by the DFS layer within a single location described above. We will have more details in a future blog post on how geo-replication works and how it provides geo-diversity in order to provide disaster recovery if a regional disaster were to occur.

Load Balancing Hot DFS Servers

Windows Azure Storage has load balancing at the partition layer and also at the DFS layer. The partition load balancing addresses the issue of a partition server getting too many requests per second for it to handle for the partitions it is serving, and load balancing those partitions across other partition servers to even out the load. The DFS layer is instead focused on load balancing the I/O load to its disks and the network BW to its servers.

The DFS servers can get too hot in terms of the I/O and BW load, and we provide automatic load balancing for DFS servers to address this. We provide two forms of load balancing at the DFS layer:

Read Load Balancing - The DFS layer maintains multiple copies of data through the multiple replicas it keeps, and the system is built to allow reading from any of the replica copies. The system keeps track of the load on the DFS servers. If a DFS server is getting too many requests for it to handle, partition servers trying to access that DFS server will be routed to read from other DFS servers that are holding replicas of the data the partition server is trying to access. This effectively load balances the reads across DFS servers when a given DFS server gets too hot. If all of the DFS servers are too hot for a given set of data accessed from partition servers, we have the option to increase the number of copies of the data in the DFS layer to provide more throughput. However, the partition layer caches hot data, so hot data is served directly from the partition server cache without going to the DFS layer.

Write Load Balancing – All writes to a given piece of data go to a primary DFS server, which coordinates the writes to the secondary DFS servers for the extent. If any of the DFS servers becomes too hot to service the requests, the storage system will then choose a set of different DFS servers to write the data to.
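The read-balancing decision described above is essentially "route each read to the least-loaded server among those holding a replica." A one-function sketch (load numbers and server names are made up):

```python
# Sketch of read load balancing across the DFS servers holding replicas.
def pick_replica(replica_servers, load):
    return min(replica_servers, key=lambda s: load[s])

load = {"dfs1": 950, "dfs2": 120, "dfs3": 430}   # illustrative req/sec
print(pick_replica(["dfs1", "dfs2", "dfs3"], load))
```

Because any up-to-date replica can serve a read, this selection is purely a routing decision at the partition layer; no data moves.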

Why Both a Partition Layer and DFS Layer?

When describing the architecture, one question we get is why do we have both a Partition layer and a DFS layer, instead of just one layer both storing the data and providing load balancing.

The DFS layer can be thought of as our file system layer: it understands files (these large chunks of storage called extents), how to store them, how to replicate them, etc., but it doesn't understand higher-level object constructs nor their semantics. The partition layer is built specifically for managing and understanding higher-level data abstractions, and storing them on top of the DFS.

The partition layer understands what a transaction means for a given object type (Blobs, Entities, Messages). In addition, it provides the ordering of parallel transactions and strong consistency for the different types of objects. Finally, the partition layer spreads large objects across multiple DFS server chunks (called extents) so that large objects (e.g., 1 TB Blobs) can be stored without having to worry about running out of space on a single disk or DFS server, since a large blob is spread out over many DFS servers and disks.

Partitions and Partition Servers

When we say that a partition server is serving a partition, we mean that the partition server has been designated as the server (for the time being) that controls all access to the objects in that partition. We do this so that for a given set of objects there is a single server ordering transactions to those objects and providing strong consistency, since a single server is in control of the access of a given partition of objects.

In the prior scalability targets post we described that a single partition can process up to 500 entities/messages per second. This is because all of the requests to a single partition have to be served by the assigned partition server. Therefore, it is important to understand the scalability targets and the partition keys for Blobs, Tables and Queues when designing your solutions (see the upcoming posts focused on getting the most out of Blobs, Tables and Queues for more information).
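The scalability consequence above is simple arithmetic: a single partition is capped at roughly 500 entities/messages per second, so aggregate throughput grows with the number of partitions your keys spread requests across. A back-of-the-envelope sketch (the per-partition figure comes from the text; the function is illustrative):

```python
# Why partition key design matters: throughput scales with partitions.
PER_PARTITION_LIMIT = 500  # entities/messages per second, from the text

def max_throughput(num_partitions):
    return num_partitions * PER_PARTITION_LIMIT

print(max_throughput(1))    # one hot partition key bottlenecks the app
print(max_throughput(20))   # spreading keys over 20 partitions raises it
```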

Load Balancing Hot Partition Servers

It is important to understand that partitions are not tied to specific partition servers, since the data is stored in the DFS layer. The partition layer can therefore easily load balance and assign partitions to different partition servers, since any partition server can potentially provide access to any partition.

The partition layer assigns partitions to partition servers based on each partition's load. A given partition server may serve many partitions, and the Partition Master continuously monitors the load on all partition servers. If it sees that a partition server has too much load, the partition layer will automatically load balance some of the partitions from that partition server to a partition server with low load.

When reassigning a partition from one partition server to another, the partition is offline only for a handful of seconds, in order to maintain high availability for the partition. Then, to make sure we do not move partitions around too much or make decisions too quickly, the decision to load balance a hot partition server takes on the order of minutes.
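The Partition Master's balancing decision can be sketched as moving a partition from the hottest server to the coldest one when a load threshold is exceeded. All names, the threshold, and the move-one-partition policy are illustrative assumptions; the key point the code mirrors is that only the assignment changes, while the data stays in the DFS layer:

```python
# Sketch of partition load balancing: reassign a partition from the
# hottest server to the least-loaded one. No data moves on disk.
def rebalance(assignments, load, threshold):
    # assignments: server -> list of partitions; load: server -> req/sec
    hot = max(load, key=load.get)
    if load[hot] <= threshold or len(assignments[hot]) <= 1:
        return assignments
    cold = min(load, key=load.get)
    moved = assignments[hot].pop()
    assignments[cold].append(moved)
    return assignments

assignments = {"ps1": ["p1", "p2", "p3"], "ps2": ["p4"]}
load = {"ps1": 1800, "ps2": 200}
rebalance(assignments, load, threshold=1000)
print(assignments)
```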

Summary

The Windows Azure Storage architecture has three main layers – Front-End layer, Partition layer, and DFS layer. For availability, each layer has its own form of automatic load balancing and of dealing with failures and recovery in order to provide high availability when accessing your data. Durability is provided by the DFS layer keeping multiple replicas of your data and using data spreading to keep a low MTTR when failures occur. For consistency, the partition layer provides strong consistency and optimistic concurrency by making sure a single partition server is always ordering and serving up access to each of your data partitions.

Brad Calder

This is the first comprehensive explanation of Windows Azure storage that I’ve seen to date. However, the post could have used a few more diagrams, especially of fault and upgrade domains.

CloudBerry Explorer makes managing files in Microsoft Azure Blob Storage EASY. By providing a user interface to Microsoft Azure Blob Storage accounts and files, CloudBerry lets you manage your files in the cloud just as you would on your own local computer.

Windows Azure

Windows Azure is a cloud services operating system that serves as the development, service hosting and service management environment for the Windows Azure Platform. Windows Azure provides developers with on-demand compute and storage to host, scale, and manage Web applications on the Internet through Microsoft data centers.

The cloud wants smart devices. It also wants Web access, and native applications that take advantage of the unique capabilities of user's laptops and desktop machines. The cloud wants much of us.

Once upon a time, the rush was on to produce native Windows applications, then it became "you need a web strategy", and then "you need a mobile strategy". Therefore it is natural to sigh and try to ignore it when told "you need a cloud strategy". However, one of the great advantages of Microsoft's development technologies is their integration across a wide range of devices connected to the cloud: phones and ultra-portable devices, laptops, desktops, servers, and scale-on-demand datacenters. For storage, you can use either the familiar relational DB technology that SQL Azure offers, or the very scalable but straightforward Azure Table Storage capability. For your user interfaces, if you write in XAML you can target WPF for Windows or Silverlight for the broadest array of devices. And for your code, you can use modern languages like C# and, shortly, Visual Basic.

Trying to simultaneously tackle a phone app, a web app, and a native Windows app is a little intimidating the first time, but the surprising thing is how easy this task becomes with Visual Studio 2010 and .NET technologies.

In this article, we don't want to get distracted by a complex domain, so we're going to focus on a simple yet functional "To Do" list manager. Figure 1 shows our initial sketch for the phone UI, using the Panorama control. Ultimately, this To Do list could be accessed not only by Windows Phone 7, but also by a browser-based app, a native or Silverlight-based local app, or even an app written in a different language (via Azure Table Storage's REST support).

Figure 1.

Enterprise developers may be taken aback when they learn that Windows Phone 7 does not yet have a Microsoft-produced relational database. While there are several 3rd party databases that can be used, those expecting to use SQL Server Compact edition are going to be disappointed.

Having said this, you can access and edit data stored in Windows Azure from Windows Phone 7. This is exactly what this article is going to concentrate on: creating an editable Windows Azure Table Storage service that works with a Windows Phone 7 application. [Emphasis added.]

Join the Windows Phone 7 Developer Program today! It's free and gives you access to expert technical support, marketing benefits, and a $99 Marketplace rebate if you develop two or more apps.

Installing the Tools

You will need to install the latest Windows Phone Developer Tools (this article is based on the September 2010 Release-to-Manufacturing drop) and the Windows Azure tools. Installation for both is very easy from within the Visual Studio 2010 environment: the first time you go to create a project of that type, you will be prompted to install the tools.

Download the OData Client Library for Windows Phone 7 Series CTP. At the time of writing, this CTP dates from Spring 2010, but the included version of the System.Data.Services.Client assembly works with the final bits of the Windows Phone 7 SDK, so all is well.

You do not need to run Visual Studio 2010 with elevated permissions for Windows Phone 7 development, but in order to run and debug Azure services locally, you need to run Visual Studio 2010 as Administrator. So, although you should normally run Visual Studio 2010 with lower permissions, you might want to just run it as Admin for this entire project. The Windows Phone 7 SDK does not support edit-and-continue, so you might want to turn that off as well.

While you are installing these components, you might as well also add the Silverlight for Windows Phone Toolkit. This toolkit provides some nice controls, but it is especially useful because it provides GestureService and GestureListener classes that make gesture support drop-dead simple.

Since Windows Phone 7 applications are programmed using Silverlight (or XNA, for 3-D simulations and advanced gaming) we naturally will use the Model-View-ViewModel (MVVM) pattern for structuring our phone application. So let's start by creating a simple Windows Phone 7 application.

Although Windows Phone 7 (WP7) has support for a Windows Communication Foundation (WCF) client, connecting to Web services that require authentication can be a little quirky.

After working on a project over the holidays putting together a WP7 client to connect to a Web service requiring authentication, this is what I found:

1. NTLM/Kerberos is a no go…

As of today, the WCF client on WP7 does not support NTLM or Kerberos authentication. If you are accessing Web services from a device, you’ll want to make sure that your server is set up to handle Basic authentication (preferably over SSL).

2. And even Basic authentication needs a helping hand…

Even with Basic authentication enabled on the server, I was noticing that the client was still not sending the right HTTP headers – even when setting the ClientCredentials property on the client and playing with the config file. No matter what I tried, I couldn’t get it to work.

To overcome this, however, you can force the Basic authentication details to be written as part of the request header. To do this, first create a method that generates a UTF8 encoded string for your domain, username, and password:

(In the above, client is your service reference.)
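The original post's C# snippet is not reproduced above, but the underlying mechanics can be sketched language-neutrally: a Basic authentication header is just the UTF-8 bytes of "domain\username:password", base64-encoded and prefixed with "Basic ". A Python illustration (the domain/username/password values are hypothetical):

```python
import base64

# Sketch of hand-building an HTTP Basic authentication header value.
def basic_auth_header(domain, username, password):
    credentials = f"{domain}\\{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

header = basic_auth_header("CONTOSO", "alice", "s3cret")
print(header)  # Basic Q09OVE9TT1xhbGljZTpzM2NyZXQ=
```

On WP7 the equivalent string would be attached to the outgoing request's Authorization header before the WCF call is made.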

And voila! You should now have Basic authentication headers as part of your client requests, and you can access username/password protected Web services from your WP7 device!

One final tip – during all of this you will likely need to use a tool like Fiddler to trace your requests and responses from the device. If you are having troubles getting Fiddler working with the WP7 emulator, here is a great post that outlines what to do.

From the Beta SDK onwards, it seemed that there was no easy way of getting Fiddler to work with the emulator. For those unfamiliar with Fiddler, in short, it’s a must-have tool if you’re dealing with web requests for your applications. It lets you monitor the HTTP traffic between your computer and the web, which is very useful when it comes to debugging web traffic problems in your apps.

Eric Law has uploaded a post explaining the steps to follow in order to get Fiddler working with the RTM emulator. You can read the instructions below, or see the full post here.

The majority of the session will look at deploying and using SQL Server’s Transparent Data Encryption feature with live demos, while some of the terminology used in the world of public key encryption and SQL Server’s security hierarchy will also be covered.

Concepts such as keys and certificates will be explored, along with how SQL Server uses service master keys, database master keys and certificates. The main focus, Transparent Data Encryption, will be discussed at a conceptual, use case and technical level, looking at both deployment and some of the challenges TDE brings to the DBA!

Many SQL Azure users (including me) are lobbying for TDE in SQL Azure to protect confidential information from access by unauthorized persons, including datacenter workers.

Unless you’ve been living under a social media rock and have not yet joined our Facebook Fan Page or connected through our LinkedIn group, you know all the amazing things we have planned for 2011. One of our biggest and most exciting announcements was our “Analyze This!” healthcare challenge. No, this isn’t the movie starring Billy Crystal and Robert De Niro; it’s a challenge for developers to create applications that improve healthcare.

We’ve teamed up with Microsoft’s Windows Azure Marketplace [DataMarket] to give web developers and designers a platform that they can use to improve public healthcare and revolutionize the way we use de-identified data to answer some of the most pressing issues facing healthcare in the United States. The HIPAA-compliant clinical dataset contains 5,000 records of patient visits, diagnoses, prescriptions, immunizations and allergies.

Practice Fusion believes in helping doctors practice better, safer medicine every day at their practice. We also believe that healthcare as a whole can be better. By utilizing raw, anonymous data to identify adverse drug reactions and recognizing demographics that are more susceptible to certain ailments, we can begin to examine our healthcare system in a new, more proactive light. Practice Fusion is creating a place where IT visionaries and healthcare experts can pool their knowledge and confront underlying themes in healthcare, such as how to make patients better, faster, for less money, and why people get sick in the first place.

Entries should WOW us. We’re giving you the opportunity to use the data however you’d like — whether you want to depict healthcare trends, find adverse drug effects, chart chronic disease or build an application. The winning team gets a $5000 cash prize plus a ticket to the Health 2.0 Conference in San Diego, CA, where the winners will demo their winning entry live on stage.

For more information regarding entry terms and conditions please click here. Submission ends February 28, 2011, so all you data geeks out there, get your engines ready: this contest has a big prize and an even bigger impact on health research overall.

Practice Fusion provides the key information needed to power clinical research for drug interaction studies, disease outbreak monitoring and other public health projects. Top universities across the country use Practice Fusion's research information to power studies providing new insight into healthcare and how it is delivered. All research information utilized by Practice Fusion is de-identified and fully compliant with HIPAA.

Practice Fusion has teamed up with Microsoft Windows Azure Marketplace DataMarket to provide a sample of 5,000 de-identified medical records for research purposes at no cost. These records include details for researching trends in:

The Windows Azure Marketplace DataMarket offers an integrated store of up-to-date data in one format, which users can search, rent and download. Simply giving users access to a variety of up-to-date datasets isn't enough, however; it is essential that the enterprise understands the stories hidden in the data. By adopting an empirical approach to investigating data, data visualisation reveals patterns and insights which may challenge long-held beliefs within an enterprise. However, the messages can be distorted or misunderstood if they are not visualised effectively. Using Tableau, this session focuses on the practical implementation of data visualisation principles for investigating your DataMarket data and displaying it appropriately.

One very important concept in cloud computing is the notion of fabric, which represents an abstraction layer connecting resources to be dynamically allocated on demand. Fabric knows the what, where, when, why, and how of the resources in the cloud. With fabric, all cloud resources become manageable. As far as a cloud application is concerned, fabric is the cloud OS. We use the abstraction, fabric, to shield us from the need to know all the complexities of inventorying, storing, connecting, deploying, configuring, initializing, running, monitoring, scaling, terminating, and releasing resources in the cloud.

A virtual machine is physically a virtual hard disk (VHD) file. Not only is it easier to manage files than to work with physical partitions, disks, and machines, but a VHD file can be maintained while offline, i.e. without the need to boot up the OS image installed in the VHD file. The Virtual Machine Servicing Tool (VMST), for instance, is one such tool freely available from Microsoft. Fundamentally, virtualization provides abstraction for manageability and isolation for security, to dynamically scale and secure instances of workloads deployed to the cloud.

Make no mistake, however: virtualization is not a destination, but a stepping stone for enterprise IT to transform from a hardware-dependent, infrastructure-focused deployment vehicle into a user-centric, cloud-friendly environment. Although virtualization is frequently motivated by cost savings, the ultimate business benefits result from deployment flexibility. Facing the many challenges and unknowns brought by the Internet, IT needs to transform existing establishments into something strategic and agile. IT needs the ability to manage computing resources, both physical and virtualized, transparently and on a common management platform, while securely deploying applications to authorized users anytime, anywhere, and on any device. For enterprise IT, virtualization is imperative: a critical and strategic step towards a cloud-friendly and cloud-ready environment. There have been many active discussions on server virtualization, desktop virtualization, Application Virtualization (App-V), and Virtual Desktop Infrastructure (VDI). And many IT organizations have already started consolidating servers and introducing various forms of virtualization into their existing computing environments, as reported in many case studies.

The takeaway is that virtualization should be in every enterprise IT’s roadmap, if it isn’t already. And a common management platform with the ability to manage physical and virtualized resources transparently is essential and should be put in place as soon as possible.

Magic Fabric

The concept of fabric in Microsoft’s production implementation exhibits itself in the so-called Fabric Controller, or FC, an internal subsystem of Windows Azure. The FC, also a distribution point in the cloud, inventories and stores images in a repository, and:

Manages all Compute and Storage resources

Deploys and activates services

Monitors the health of each deployed service

Provisions the necessary resources for a failed service and re-deploys the service from bare metal, as needed

Shown here, for example, is an Azure configuration with 7 VMs managed by the FC. In the process, the FC first boots the physical machine remotely, then installs the hypervisor (a version optimized for Windows Azure), the host VM or root partition, and finally the guest VMs. Each guest VM runs Windows Server 2008 Enterprise with IIS 7.0 and the .NET Framework, and is initialized with a Compute or Storage instance. For the FC to control an instance inside a guest VM, there are agents in place.

The following schematic illustrates how the FC works with Fabric Agents (FAs). An FA is included in, and automatically initialized during the creation of, the root partition of every machine in a data center. These machines run a hypervisor specifically optimized for Windows Azure. FAs expose an API letting an instance interact with the FC and are then used to manage Guest Agents (GAs) running in guest VMs, i.e. child partitions. The fabric is logically formed by the FC’s ability to monitor, interact with, trust, and instruct each FA, which then manages its GAs accordingly. Behind the scenes, the FC also makes itself highly available by replicating itself across groups of machines. In short, the FC is the kernel of the cloud OS and manages both servers and services in the data center.
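The FC → FA → GA control hierarchy just described can be sketched as a toy model. This is purely illustrative (not Microsoft code); all class names, role-instance names, and the polling logic are invented for the illustration:

```python
# Purely illustrative model of the FC -> FA -> GA hierarchy; all names invented.
class GuestAgent:
    """GA running inside a guest VM (child partition)."""
    def __init__(self, role_instance):
        self.role_instance = role_instance
        self.healthy = True

class FabricAgent:
    """FA in the root partition; tracks the GAs of its child partitions."""
    def __init__(self, guests):
        self.guests = guests

    def report(self):
        # What the FA would hand back when the FC polls it.
        return [(g.role_instance, g.healthy) for g in self.guests]

class FabricController:
    """FC: polls every FA and decides which instances need redeployment."""
    def __init__(self, agents):
        self.agents = agents

    def health_sweep(self):
        failed = []
        for fa in self.agents:
            for instance, healthy in fa.report():
                if not healthy:
                    failed.append(instance)
        return failed

# One machine with two guest VMs, one of which has stopped responding:
fa = FabricAgent([GuestAgent("Web_IN_0"), GuestAgent("Web_IN_1")])
fa.guests[1].healthy = False
print(FabricController([fa]).health_sweep())  # -> ['Web_IN_1']
```

The point of the sketch is the indirection: the FC never talks to a guest VM directly; it trusts and instructs the FA, which manages its GAs.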

Windows Azure Application Fabric, or AppFabric, is another term we have frequently heard and read about in Windows Azure. And many seem to assume AppFabric and the FC are the same, similar, or related. This is incorrect; they are not. AppFabric is middleware offering a common infrastructure to name, discover, expose, secure, and orchestrate web services on the Windows Azure Platform. AppFabric includes a number of services:

In an oversimplified description, the FC is the kernel of Windows Azure (a cloud OS) and manages the hardware and services in a data center, while AppFabric is middleware for developing cloud applications. Further discussion of AppFabric and the FC is available elsewhere. And a nicely packaged resource, the Windows Azure Platform Training Kit, is also a great way to learn more about the technology.

Customers leveraging Windows Azure AppFabric have at their disposal resources designed to make it easier for them to build Single Sign On and centralized authorization into their web applications.

A CodePlex site is dedicated to providing samples and documentation for the Labs flavor of Access Control Service.

As developers currently building apps for Microsoft’s Cloud platform undoubtedly already know, the Labs release of ACS is designed to offer an array of new features compared to what is available in the first version of ACS.

ACS “works with most modern platforms, and integrates with both web and enterprise identity providers. This CodePlex project contains the documentation and samples for the Labs release of ACS,” reveals a description of the CodePlex ACS site.

According to the project, ACS v.2 comes with a range of new features and capabilities, including:

Integration with Windows Identity Foundation (WIF) and tooling

Out-of-the-box support for popular web identity providers including: Windows Live ID, Google, Yahoo, and Facebook

Out-of-the-box support for Active Directory Federation Server v2.0

Support for OAuth 2.0 (draft 10), WS-Trust, and WS-Federation protocols

Microsoft has announced the Community Technology Preview of Microsoft Server Application Virtualization (Server App-V), and the Server Application Virtualization Packaging Tool.

App-V builds on the technology used in client Application Virtualization, allowing for the separation of application configuration and state from the underlying operating system. This separation and packaging enables existing Windows applications, not specifically designed for Windows Azure, to be deployed on a Windows Azure worker role. We can do this in a way where the application state is maintained across reboots or movement of the worker role. This process allows existing, on-premises applications to be deployed directly onto Windows Azure, providing yet more flexibility in how organizations can take advantage of Microsoft’s cloud capabilities.

Microsoft Server Application Virtualization and the Windows Azure VM role are complementary technologies that provide options for migrating your existing Windows applications to Windows Azure. With the Windows Azure VM role, you are taking a full Hyper-V VHD file with the OS and Application installed on it, and copying that Virtual Machine up to Windows Azure. With Server App-V, you are capturing an image of the application with the Server Application Virtualization Sequencer, copying that image up to Windows Azure with the Server Application Virtualization Packaging Tool, and deploying it on a Windows Azure worker role.

Server Application Virtualization delivers:

Application mobility: Server Application Virtualization enables organizations to move their applications from on-premises datacenters to Windows Azure to take advantage of Windows Azure’s scalability and availability. This application mobility provides a unique level of flexibility to organizations as their needs evolve, enabling movement from one environment to another as their business needs dictate without the need to re-compile or rewrite the application.

Simplified deployment: With Server Application Virtualization, organizations are able to virtualize applications once and then deploy these packages as needed. This process creates a method to manage applications, simply and efficiently across their Windows Server® platform or to Windows Azure.

Lower operational costs: By using Server Application Virtualization, organizations can gain the lower-management benefits of the Windows Azure platform for their existing applications. This is delivered through the virtualized application being deployed on the Windows Azure platform, meaning organizations get the benefits of Windows Azure without the need to manage a Windows Server operating system instance or image for that application.

Microsoft Server Application Virtualization converts traditional server applications into state-separated, “XCopyable” images without requiring code changes to the applications themselves, allowing you to host a variety of Windows 2008 applications on the Windows Azure worker role. The conversion process is accomplished using the Server App-V Sequencer. When the server application is sequenced, the configuration settings, services, and resources that the application uses are detected and stored. The sequenced application can then be deployed via the Server Application Virtualization Packaging Tool to the worker role in Windows Azure as a file.

It is possible that you may encounter the following error when using this tool:

c:>csmanage.exe /list-hosted-services

Could not establish secure connection for SSL/TLS with authority 'management.core.windows.net'

Based on the above problem details, it is possible that the certificate you are using with CSMANAGE.EXE.CONFIG is a self-signed certificate. The application may not trust the key used in CSMANAGE.EXE.CONFIG, or there may be no way for the application to trust the certificate, because there is no chain to validate and no certificate revocation list.

To solve this problem, you can create a simple certificate with MAKECERT, as described below.

Step 1: Create Certificate in the “My” Certificate store:

1. Open the VS2010 command window to create a certificate (change the CN value to your word of choice):
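A typical makecert invocation for a self-signed management certificate is sketched below; the CN value “AzureMgmt” and the output file name are placeholders you should change:

```shell
makecert -sky exchange -r -n "CN=AzureMgmt" -pe -a sha1 -len 2048 -ss My "AzureMgmt.cer"
```

Here -r makes the certificate self-signed, -pe marks the private key as exportable, and -ss My places the certificate in the personal (“My”) store; the exported .cer file is what you upload to the Windows Azure portal as a management certificate.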

Cloud Architecture is becoming more and more relevant in the software industry today. A lot of effort is becoming less about software and more about cloud software. The whole gamut of cloud technology (platform, service, infrastructure, or platforms on platforms) is growing rapidly in number. The days when you didn’t need to know what the cloud was are rapidly coming to an end.

As I move further along in my efforts with cloud development, both at application level and services levels, I’ve come to a few conclusions about what I do and do not need. The following points are what I have recently drawn conclusions about, that cause frustration or are drawn from the beauty of cloud architecture.

REST Architecture is fundamental to the web and to the web’s continuous uptime. The web, or Internet, doesn’t go down, doesn’t crash, and is always available in some way. REST Architecture and the principles around it are what enable the web to be this way. The cloud seeks to have the same abilities and functionality, to basically be always on; thus REST Architecture is key to that underpinning.

Core Development

Currently I do most of my development with C# using the .NET Framework. This is great for developing in cloud environments in Windows Azure and Amazon Web Services. The .NET Framework has a lot of libraries that work around, with, and attempt to provide good RESTful Architecture. However, there are also a lot of issues, such as the fact that WCF and ASP.NET at their core aren’t built with good intent toward RESTful Architecture. ASP.NET MVC and some of the latest WCF releases (the out-of-band stuff) are extensively cleaning this up, and I hear that this may come sooner rather than later. I look forward to having cleaner, bare-bones, fast implementations to use for RESTful development in Azure or AWS.

The current standing ASP.NET (Web Forms) Architecture is built well above the level it should be to utilize REST Architecture well. Thus it creates a lot of overhead and unnecessary hardware utilization, especially for big sites that want to follow good RESTful Practice.

WCF, on the other hand, had some great ideas behind it, but as REST increased in popularity and demanded better use of the core principles the web is founded on, the WCF model has had to bend and give up some of its functional context in order to meet basic REST Architecture.

Anyway, that’s enough for my ramblings right now. Just wanted to get some clear initial thoughts written down around Cloud Architecture needs for the software developer.

Project Overview

The Windows Azure platform is a cloud platform that allows applications to be hosted and run at Microsoft datacenters. It provides a cloud operating system called Windows Azure that serves as the runtime for the applications and provides a set of services that allow development, management and hosting of applications.

With its standards-based and interoperable approach, the Windows Azure platform supports multiple Internet protocols including HTTP, XML, SOAP and REST. As an open platform, the Windows Azure platform allows developers to use multiple languages (.NET, Java, PHP and other dynamic languages) to build applications which run on Windows Azure and/or consume any of the Windows Azure platform offerings.

The Windows Azure Companion aims to provide a seamless experience for deploying platform-level components as well as applications on to Windows Azure. The first community preview of the Windows Azure Companion has been tested with PHP runtimes, frameworks, and applications. However, the system may be extended to be used for deploying any component or application that has been tested to run on Windows Azure.

What’s in the Windows Azure Companion?

We are pleased to announce the Windows Azure Companion December 2010 CTP release. This release includes the following new features:

Built using Windows Azure SDK 1.3

Uses Full IIS for admin and PHP web sites.

Uses the standard port 3306 for MySQL or MariaDB. Now you do not need to specify the port number in the MySQL client.

SSL support for admin and PHP web sites. There are two separate projects for SSL and non-SSL. For the admin web site the SSL port is 8443, and for the PHP web site the SSL port is 443. If you want to modify these ports, you need to use the source code, update the ServiceDefinition.csdef file and create the package.

Exposes a WCF service that can be used to programmatically perform the operations done by the admin web site, so one can automate PHP application installation on Windows Azure. The endpoints of this WCF service are as follows. You need to use the same admin credentials to authenticate with this WCF service. SSL support is also available for this WCF service. The base URL for the WCF service is:

Without SSL: http://****.cloudapp.net:8081/WindowsAzureVMManager

With SSL: https://****.cloudapp.net:8081/WindowsAzureVMManager

The following packages are available for download in Dec 2010 CTP release.

WindowsAzureCompanion-SmallVM-Dec2010CTP.zip: This zip file contains a ready-to-deploy Windows Azure Cloud Package (.cspkg) and a Service Configuration (.cscfg) file for the Small Windows Azure VM to get you on your way. Once you deploy the application to your Windows Azure compute account, you will have an administration web site for deploying PHP runtime components, frameworks, and several applications. We expect most users to start with this package.

WindowsAzureCompanionWithHttps-SmallVM-Dec2010CTP.zip: As above, this zip file contains a ready-to-deploy Windows Azure Cloud Package (.cspkg) and a Service Configuration (.cscfg) file for the Small Windows Azure VM. In addition, it also enables HTTPS support for the admin and PHP web sites. Please make sure to upload the certificate to the Windows Azure portal and use the same thumbprint in the ServiceConfiguration.cscfg file.

WindowsAzureCompanion-Source-Dec2010CTP: Using a data-driven deployment architecture, the Windows Azure Companion Application provides the flexibility that most developers and users need for deploying PHP components and applications. However, there may be some scenarios where editing source-code might be the best way to customize the application. This zip file contains the source-code for the Windows Azure Companion web-site.

For other compute sizes (ExtraSmall, Medium, Large, or ExtraLarge), please use the source code and rebuild the package after modifying the VMSize attribute in the .csdef file. Details of the compute instance sizes and related billing information are available here.

Application Architecture

The schematic below illustrates the high-level architecture of the application:

Getting Started

Getting started with using the Windows Azure Companion for deploying PHP runtimes, frameworks, and applications is as easy as 1, 2, 3. OK, it’s more like 1, 2, 3, 4.

Update the ApplicationTitle and ApplicationDescription. This is the text that is shown on the home page of the Windows Azure Companion web-site.

Update the WindowsAzureStorageAccountName and WindowsAzureStorageAccountKey to the appropriate values for your Windows Azure account

Update the AdminUserName and AdminPassword for the Windows Azure Companion web-site

Update the application ProductListXmlFeed to point to the URL of the application feed. You may write your own feed or, to get started, check out the feed that Maarten Balliauw has made available (Maarten's application feed). If you decide to write a custom feed, please make sure it is accessible at a public URL. You can also save the feed to a public blob container on your Windows Azure account.

There are several other settings in the serviceconfiguration file, but they are optional. The settings are well documented in the configuration file code comments.

Step 4: Deploy the WindowsAzureCompanion.cspkg to your Windows Azure account. Please refer to this walkthrough for step-by-step instructions on how to deploy the cloud package. Since you already have the cspkg, you can skip the first section and go directly to the section titled To Select a Project and Create a Compute Service.

Now you can begin installing the runtimes, frameworks, and applications that are available via the application feed! Remember that the admin port of the application is 8080 by default. So the URL for the Windows Azure Companion will be http://<yourdeploymentname>.cloudapp.net:8080

Contributors

Eric Knorr wrote “2010 saw Microsoft make a big grab for the cloud, while big business got serious about building clouds of its own” as a lead to his What you need to know about the year of the cloud article of 12/30/2010 for NetworkWorld’s DataCenter blog:

More than anything, 2010 will be remembered as the year Microsoft jumped into the cloud with both feet. Less obvious, but just as important, was growing clarity around the discrete services offered by public and private clouds -- and who is likely to use them.

The cloud is a matrix of services. The categories along one axis should sound familiar: SaaS (software as a service), where applications are delivered through the browser; IaaS (infrastructure as a service), which mainly offers remote hosting for virtual machines; and PaaS (platform as a service), which offers complete application development and deployment environments. On the other axis is the public vs. the private cloud.

In the public cloud, SaaS showed phenomenal momentum this year. Salesforce.com, the leading SaaS provider, saw its revenue nearly double in 2010 to $2 billion. Microsoft, SAP, and even latecomer Oracle all have SaaS offerings and major SaaS initiatives underway. And Intuit has quietly evolved into a company that enjoys $1 billion in SaaS revenue serving individual and small-business tax and accounting needs.

By contrast, leading IaaS provider Amazon.com was expected to earn just $650 million from its Amazon Web Services business in 2010, according to a Citigroup estimate in April, which also noted that second-place Rackspace would likely come in at a mere $56 million. While revenue numbers for the three main PaaS platforms -- Salesforce's Force.com, Microsoft Windows Azure, and Google App Engine -- are not readily available, by all indications these top three play to a very small market.

Small businesses have always led in consuming public cloud services; we hear more and more anecdotes about startups going nearly "all cloud," from accounting to data backup. Big businesses, on the other hand, seem to have overcome their aversion to SaaS only recently, while their pickup of IaaS and PaaS in the public cloud remains spotty. Just as small businesses tend to rent their office space, corporations prefer to own the building -- servers included.

The rise of the private cloud. Instead of turning to IaaS in the public cloud, many big businesses are looking closely at adopting public cloud technologies and techniques as their own. In other words, they want to build so-called private clouds.

Of course, implementing a private cloud requires real work, which InfoWorld's Cloud Computing columnist, David Linthicum, says will be harder than many IT organizations realize. As a result, many will have trouble going the private cloud route.

Why private rather than public? Here's a classic response from Intel's CIO, Diane Bryant: "I have a very large infrastructure -- I have 100,000 servers in production -- and so I am a cloud. I have the economies of scale, I have the virtualization, I have the agility. For me to go outside and pay for a cloud-based service ... I can't make the total cost of ownership work."

Microsoft's Jobs Blog a while back addressed what that company looks for as "cloud computing skills" in job applicants. The answer? Basically experience with large-scale software-as-a-service (SaaS) projects or infrastructure-as-a-service (IaaS) initiatives.

The hype is over: Adoption of cloud applications and software as a service (SaaS) is for real, and this means jobs for now and the future.

Liz Herbert, principal analyst at Forrester Research, writes at CIO.com that clients are asking increasingly sophisticated questions about SaaS and approaching it much more strategically. But one of the "clouds" might be the fog of confusion over the true definitions of "cloud" and SaaS, and their differences. Both figure prominently in the 2011 predictions piece by Bernard Golden, CEO of consulting firm HyperStratus.

Most chief financial officers think that cloud computing is all about cutting the IT budget so they can drop more profit to the bottom line. That creates a perilous situation for an IT department that could quickly become a shadow of its former self.

Sears posed the question about skills to Apprenda CEO Sinclair Schuller. Apprenda makes the cloud middleware SaaSGrid. Schuller spoke of four areas of skills to comprehend and master:

concurrent programming

building for Web scale

employing high-availability software infrastructure

performance-based architecture

Said Schuller:

Another way to look at SaaS skill sets is to group them in two buckets: industrial and academic. Industrial skills are those that you can to some degree learn on the job but employ knowledge of programming languages, of the framework du jour and understanding how to correct bugs. Academic skill sets are higher-level skills with knowledge of how to design systems for high availability and scale, many of which came out of technology research communities, and have experience in areas like memory management systems, thread scheduling and many skills aligned with operating systems.

He says a paradigm shift is under way for software developers, as Sears explains:

That shift means understanding multitenancy and efficient distribution at the data level. It also means knowing how to design systems that weight performance mechanics against cost and how to optimize for speed and service-level requirements.

Schuller said certifications are nice to have, but he'd never hire a person solely on that, adding:

The industry does need a standards body, and those standards need to be rigorous, deep and define competency levels.

The Economist claimed “Computing services are both bigger and smaller than assumed” in a deck for its Tanks in the cloud article in the 12/29/2010 print edition:

CLOUDS bear little resemblance to tanks, particularly when the clouds are of the digital kind. But statistical methods used to count tanks in the second world war may help to answer a question that is on the mind of many technology watchers: How big is the computing cloud?

This is not just a question for geeks. Computing clouds—essentially digital-service factories—are the first truly global utility, accessible from all corners of the planet. They are among the world’s biggest energy hogs and thus account for a lot of carbon dioxide emissions. More happily, they allow firms in developing countries to leapfrog traditional information technology (IT) and benefit from advanced computing services without having to build expensive infrastructure. …

The “cloud of clouds” has three distinct layers. The outer one, called “software as a service” (SaaS, pronounced sarse), includes web-based applications such as Gmail, Google’s e-mail service, and Salesforce.com, which helps firms keep track of their customers. This layer is by far the easiest to gauge. Many SaaS firms have been around for some time and only offer such services. In a new study Forrester Research, a consultancy, estimates that these services generated sales of $11.7 billion in 2010.

Going one level deeper, there is “platform as a service” (PaaS, pronounced parse), which means an operating system living in the cloud. Such services allow developers to write applications for the web and mobile devices. Offered by Google, Salesforce.com and Microsoft, this market is also fairly easy to measure, since there are only a few providers and their offerings have not really taken off yet. Forrester puts revenues at a mere $311m. …

The tank angle refers to statistical techniques used by the Allies in WWII to estimate monthly production of German tanks by analyzing the serial numbers of captured tanks.

Using this approach, Guy Rosen, a blogger, and Cloudkick, a San Francisco start-up which was recently acquired by Rackspace, have come up with a detailed estimate of the size of at least part of Amazon’s cloud. [Link added.] Mr Rosen decrypted the serial numbers of Amazon’s “virtual machines”, the unit of measurement for buying computing power from the firm. Alex Polvi, the founder of Cloudkick, then used these serial numbers to calculate the total number of virtual computers plugged in every day. This number is approaching 90,000 for Amazon’s data centres on America’s East Coast alone (see chart).

The results suggest that Amazon’s cloud is a bigger business than previously thought. Randy Bias, the boss of Cloudscaling, an IT-engineering firm, did not use these results when he put Amazon’s annual cloud-computing revenues at between $500m and $700m in 2010. And in August UBS, an investment bank, predicted that they will total $500m in 2010 and $750m in 2011. …

The article concludes:

At any rate, the cloud is not simply “water vapour”, as Larry Ellison, the boss of Oracle, a software giant, has deflatingly suggested. One day the cloud really will be big. Given a little more openness, more people might actually believe that.

“Necessity is the mother of invention”

The Windows Azure Platform currently offers the ability to have an administrator and 10 co-administrators associated with every account (thanks to Steve Marx for helping figure out that number), which introduces a limitation when 11 or more team members want to share the same account. In this post, I’m going to illustrate the different ways to avoid this limitation, and I’m pretty confident that at some point in the near future the platform will support a much more sophisticated user management interface.

Some rules of thumb: treat the account like your online banking credentials. Whoever is paying the bill should frequently keep an eye on the consumption for any suspicious activity. Periodically renew the password or refresh the management certificates, and remember that the more people share a secret, the less secret it is.

Even though we all know the rules, sometimes (in some cases, many times) we don’t follow them. So here are the different ways I’ve used to share the same account.

Share Co-Admin Credentials

One of the lowest-overhead ways to share the account is to create a single Live ID account and give the credentials to every team member. This way everyone can log in to the Windows Azure Portal as a co-administrator. To add an additional layer of safety, you can periodically change the password; this way you can avoid cases where someone left the group and still has the password, or where someone engraved the password in their favorite pub’s bathroom on a drunken night (yes ladies, you would be surprised with what’s in there).

Share A Jump Box

A jump box can be a dedicated machine or virtual machine with the co-admin credentials saved in the browser; anyone who needs to deploy a service must log in to the jump box with operating-system credentials. This technique is painful (imagine the process: package your service –> remote into the machine –> copy the package and config file –> deploy through the portal) but more secure, because the jump box can live inside the corporate network and the credentials of the Windows Azure account are never shared with the users. Oh, and by the way, only one user can be logged in to the box at a time, so depending on your team size you might need more jump boxes.

I followed this process for a couple of weeks until the team-wide bug-bash day arrived, when I had to apply a few fixes and deploy multiple times. Trust me, I wasn’t in the best mood afterwards; the extra steps of copying the files and deploying through the portal felt like they took ages.

Share A Certificate And Subscription ID

(My favorite and most practical way)

At the moment, the Windows Azure Portal allows you to upload a maximum of 10 management certificates to your account, which let you use the Service Management APIs to manage the account. You can create a single password-protected certificate and share it with everyone on your team, or you can have different certificates for different employee categories (for instance, one certificate for full-time employees, another for contingent staff, and a temporary one for developers who aren’t on the services side of the house but want to experiment with cloud-based services as part of their out-of-the-box projects). Once users have the certificate installed on their machines, along with the associated Azure subscription ID, they’ll be able to use the handy Visual Studio Publish button to package and deploy their service, or any of their favorite Azure Service Management tools.
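As a rough sketch of what the client-side tools do under the hood: a request to the classic Service Management API is authenticated by presenting the management certificate over TLS and carrying the subscription ID in the URL. The API version string and the request shape below reflect the classic (pre-ARM) API; the subscription ID and certificate paths are placeholders, not real values.

```python
# Sketch: composing a classic Service Management API request that is
# authenticated with a client (management) certificate.
import http.client
import ssl

MANAGEMENT_HOST = "management.core.windows.net"

def build_request(subscription_id):
    """Return the URL path and headers for listing hosted services."""
    path = f"/{subscription_id}/services/hostedservices"
    # The service requires an API-version header on every call.
    headers = {"x-ms-version": "2010-10-28"}
    return path, headers

def list_hosted_services(subscription_id, cert_file, key_file):
    """Issue the GET over TLS, presenting the management certificate."""
    context = ssl.create_default_context()
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    conn = http.client.HTTPSConnection(MANAGEMENT_HOST, context=context)
    path, headers = build_request(subscription_id)
    conn.request("GET", path, headers=headers)
    return conn.getresponse().read()
```

Because the certificate, not a password, is the credential, revoking access for a departing team member is just a matter of deleting that certificate from the portal.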

Finalement (finally in French)

Sharing a certificate and subscription ID is my favorite approach because I mostly use the client side tools to deploy and manage my applications.

Please share any other ways your team applied and what you like and don’t like about them.

Guy is a software developer at Microsoft who’s mostly focused on implementing cloud-based services on the Windows Azure platform.

Today I received my credentials for the NIST Cloud Computing Collaboration Site.

"The National Institute of Standards and Technology (NIST) has been designated by Federal Chief Information Officer Vivek Kundra to accelerate the federal government’s secure adoption of cloud computing by leading efforts to develop standards and guidelines in close consultation and collaboration with standards bodies, the private sector, and other stakeholders. Computer science researchers at NIST are working on two complementary efforts to speed the government’s quick and secure adoption of cloud computing.

NIST's long term goal is to provide thought leadership and guidance around the cloud computing paradigm to catalyze its use within industry and government. NIST aims to shorten the adoption cycle, which will enable near-term cost savings and increased ability to quickly create and deploy safe and secure enterprise applications. NIST aims to foster cloud computing systems and practices that support interoperability, portability, and security requirements that are appropriate and achievable for important usage scenarios." Membership in this Cloud Computing Wiki site is open to the public. If you want to contribute content to the wiki, please go to the NIST Cloud Computing Program website, read the page carefully, and follow the instructions. The current working-group meeting schedule is provided below.

The National Institute of Standards and Technology issued two special publications Wednesday: SP 800-119, Guidelines for the Secure Deployment of IPv6 and SP 800-135, Recommendation for Application-Specific Key Derivation Functions.

SP 800-119 aims to help with the deployment of the next generation Internet protocol, IPv6. It describes and analyzes IPv6's new and expanded protocols, services and capabilities, including addressing, domain name system, routing, mobility, quality of service, multihoming and Internet protocol security. For each component, the publication provides a detailed analysis of the differences between IPv4 - the existing Net protocol - and the newer IPv6, the security ramifications and any unknown aspects. The publication characterizes new security threats posed by the transition to IPv6 and provides guidelines on IPv6 deployment, including transition, integration, configuration, and testing. It also addresses more recent significant changes in the approach to IPv6 transition.

The current state of cloud and API standards is almost an exact match for early SOA and Web services standards, and we expect the standards movement will follow a very similar trend. Hopefully, the cloud standards groups will stand a better chance by learning from the mistakes and successes of the Web services standards.

The discussion of cloud standards at Cloud Camp Boston started by asking the question “What do we want to standardize?” As we looked at standards we found that there are three attributes of apparent concern. These include “API lock-in” (a similar concept to vendor lock-in), migration issues, and the richness or functionality of an API.

One interesting problem with setting standards (for both APIs and services) is the granularity of the work you’re standardizing. Some APIs have a very limited scope and affect only a single application with a single purpose. Others address a broad range of applications with any number of different purposes.

One example of scale issues affecting functionality is libcloud, which simplifies the process of integrating with multiple popular cloud server providers. Its methods rely on the “lowest common denominator” to ensure compatibility with all the providers it supports. In other words, it provides the basic functionality that all providers share, at the expense of losing the specifics that help each system excel in its particular niche. One potential pitfall of standards could be the “dumbing-down” of new cloud applications built for mass appeal.
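The lowest-common-denominator pattern can be sketched in a few lines. The provider classes below are hypothetical stand-ins, not libcloud’s actual drivers; the point is that the unified wrapper exposes only the intersection of provider capabilities, so anything provider-specific is lost.

```python
# Sketch of the "lowest common denominator" abstraction: the unified
# wrapper exposes only the operations every provider supports.
# ProviderA and ProviderB are hypothetical, for illustration only.

class ProviderA:
    def create_server(self, name):
        return f"A:{name}"
    def snapshot(self, server):          # extra feature unique to A
        return f"snap({server})"

class ProviderB:
    def create_server(self, name):
        return f"B:{name}"
    def autoscale(self, server, n):      # extra feature unique to B
        return f"scaled({server},{n})"

class UnifiedCloud:
    """Only create_server survives: both providers offer it."""
    def __init__(self, driver):
        self.driver = driver
    def create_server(self, name):
        return self.driver.create_server(name)
    # snapshot() and autoscale() are deliberately absent here: the
    # common API is the intersection of capabilities, not the union.
```

The trade-off is exactly the one described above: the wrapper is portable across providers, but a caller who needs ProviderA’s snapshots must break out of the abstraction.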

There are also virtualization issues in the cloud. It would be wonderful if there was a single best way to optimize the automatic scaling of applications. But there are complications involved in creating “standardized triggers”. At what point should a system be triggered to bring more servers on line? At what point should the system know to let servers go? What metrics should be involved in the calculations?

In every case, the answer is a resounding “it depends.” Different applications have different needs. If we want to ensure five nines availability, we might be willing to suffer a little latency to avoid the risk of down time. And speed versus availability is just one dichotomy to resolve. Taken together with other issues around hardware, scripting languages and anything else developers argue over, it means once again you can’t be everything to everyone.
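To make the “it depends” concrete, here is a minimal sketch of what a “standardized trigger” might look like. Every threshold (CPU ceiling and floor, queue depth per server, minimum fleet size) is an assumption exposed as a parameter, because no single set of values suits all applications.

```python
# Sketch of a threshold-based autoscaling trigger. The default
# thresholds are illustrative assumptions, not recommendations.

def scaling_decision(cpu_percent, queue_depth, servers,
                     scale_up_cpu=75.0, scale_down_cpu=25.0,
                     max_queue_per_server=100, min_servers=2):
    """Return +1 to bring a server on line, -1 to let one go, 0 to hold."""
    # Scale up on either signal: CPU pressure or a backed-up queue.
    if cpu_percent > scale_up_cpu or queue_depth > max_queue_per_server * servers:
        return +1
    # Scale down only when both signals are quiet, and never below the
    # floor that protects availability (the five-nines trade-off).
    if cpu_percent < scale_down_cpu and queue_depth == 0 and servers > min_servers:
        return -1
    return 0
```

Even this toy version forces the application-specific questions from the paragraph above: which metrics, which thresholds, and how low the fleet may safely shrink.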

But there are areas where the cloud is actually fairly well standardized already, even if only in practice. For example, the infrastructure APIs for service providers are seen by some as the shining example of cloud standards.

In fact, the idea of the cloud came from telecommunications service providers as they transitioned from point-to-point services to VPNs. The cloud was their way of denoting where the clients’ responsibilities ended and the providers’ responsibilities took over.

As 2010 draws to a close we're taking a look at a few cloud startups that show promise and that we haven't covered on ReadWriteCloud.

RethinkDB is a MySQL storage engine optimized for solid-state drives. Solid-state drives do away with moving parts and are extremely low-latency. Most database storage engines are designed for traditional hard drives and assume relatively high latency. RethinkDB aims to let database developers take full advantage of the performance benefits of solid-state drives.

RethinkDB features:

Append-only algorithms

Lock-free concurrency

Live schema changes

Instantaneous recovery from power failure

Simple replication

Hot backups
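The first two features on that list fit together: if writes only ever append, there is no in-place update to corrupt, and recovery after a power failure reduces to replaying the log. The sketch below illustrates that idea in miniature; it is not RethinkDB’s actual on-disk format.

```python
# Minimal sketch of an append-only key-value store: every put appends
# a record to the log, and recovery rebuilds the in-memory index by
# replaying the log from the start. Illustrative only.
import io

class AppendOnlyStore:
    def __init__(self, log=None):
        self.log = log if log is not None else io.StringIO()
        self.index = {}          # key -> latest value, rebuilt on open
        self._replay()

    def _replay(self):
        """Recovery: scan the log once; the last record for a key wins."""
        self.log.seek(0)
        for line in self.log:
            key, _, value = line.rstrip("\n").partition("=")
            self.index[key] = value

    def put(self, key, value):
        self.log.seek(0, io.SEEK_END)
        self.log.write(f"{key}={value}\n")   # append, never overwrite
        self.index[key] = value

    def get(self, key):
        return self.index.get(key)
```

Because the log is never rewritten, a consistent hot backup is just a copy of the file up to any record boundary, which is one reason append-only designs pair naturally with the backup and replication features listed above.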

From cloud-based databases to NoSQL, developers are looking for new ways to manage the deluge of data they must now manage. RethinkDB is doing some important work towards building the database of the future.

RethinkDB is based in Mountain View, CA and received seed funding from Y Combinator in 2009. Its other investors include Highland Capital Partners, Avalon Ventures, Andreessen Horowitz, and Charles River Ventures.

Klint’s first “Startup to Watch” was ClearDB (scroll down), which is more cloud-oriented than RethinkDB. Disk latency is a minor issue compared to the latency of Internet connections to cloud databases.

Everybody's head was in the cloud, or so it seemed in 2010. Both well established and startup vendors developed solutions and strategies designed to extend their reach or provide entry into this booming market. After all, IDC estimated the cloud market will be worth $55 billion by 2014; Gartner predicted the cloud world could be valued at $148 billion at that time, in part because Gartner included Google AdWords advertising revenue in its figures, said Gregor Petri, adviser, lean IT and cloud computing, at CA Technologies.

Whether cloud computing reaches $55 billion, $148 billion, or a completely different figure, all research firms appear to agree that public and private sector organizations increasingly are adopting the technology. With each proven test site, cloud implementations also are expanding in scope and complexity, as businesses depend on the technology to support their multi-national operations.

Cisco's Connected World Report is a three-part study that examines the needs and expectations of an increasingly mobile and distributed workforce. The study, which involved surveys of 2,600 workers and IT professionals in 13 countries, also reports on how IT professionals are managing the security and data governance challenges involved in this transformation.

Part III of the study, released Dec. 8, examines the evolution of data center, virtualization and cloud computing technologies as businesses adapt to the changing nature of work.

Part II of the study reveals a disconnect between IT policies and workers, especially as employees strive to work in a more mobile fashion and utilize numerous devices, social media, and new forms of communication such as video.

In Part I of the study, three of five workers around the world said they do not need to be in the office to be productive. In fact, their desire to be mobile and flexible in accessing corporate information is so strong that the same percentage of workers would choose a lower-paying job with lenient access to information outside the office over a higher-salaried job that lacked flexibility.

The dual Web role application has been running in Microsoft's South Central US (San Antonio) data center since September 2009. I believe it is the oldest continuously running Windows Azure application.

About Me

I'm a Windows Azure Insider, a retired Windows Azure MVP, the principal developer for OakLeaf Systems and the author of 30+ books on Microsoft software. The books have more than 1.25 million English copies in print and have been translated into 20+ languages.

Full disclosure: I make part of my livelihood by writing about Microsoft products in books and for magazines. I regularly receive free evaluation software from Microsoft and press credentials for Microsoft Tech•Ed and PDC. I'm also a member of the Microsoft Partner Network.