I've been making a point of telling my live audiences about AWS CloudFormation lately. Many large-scale AWS customers are starting to appreciate the fact that they can describe and instantiate entire application stacks using parameterized templates (see my original CloudFormation blog post for more info), allowing them to create a repeatable process around it.

Today we are adding some powerful new features to CloudFormation to give you additional control over the resource creation process. We have also added some new application bootstrapping features that will give you full control of the configuration of each EC2 instance launched by a template.

Here's what is new:

Template Composition - Your CloudFormation templates can now reference other templates by URL. This looks like a parameterized function call in a procedural programming language (although CloudFormation templates are declarative). You can use this feature to create a series of reusable templates, each with a specific responsibility, such as installing a particular package or setting up an architectural component such as a load balancer or a database.
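A composed template might look something like the following sketch. The bucket URL, resource name, and parameter names here are illustrative assumptions, not taken from the post; the fragment simply shows the shape of a parent template referencing a child template by URL.

```python
import json

# A minimal sketch of template composition: the outer template embeds a
# hypothetical child template by URL, passing parameters much like a
# function call. The TemplateURL and parameters below are illustrative only.
outer_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "DatabaseTier": {
            "Type": "AWS::CloudFormation::Stack",
            "Properties": {
                "TemplateURL": "https://s3.amazonaws.com/my-templates/database.template",
                "Parameters": {
                    "DBName": "appdb",
                    "DBUser": "admin"
                }
            }
        }
    }
}

print(json.dumps(outer_template, indent=2))
```

The child template would declare its own Parameters section, so each reusable piece stays self-contained.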

IAM Integration - Your CloudFormation templates can now specify the creation of IAM (Identity and Access Management) users, groups, and the associated policies. Existing CloudFormation functions provide you with access to attributes of the users, including access keys and secret access keys. Like all other resources created by a CloudFormation template, the users, groups, and policies are associated with the application stack and will be deleted when the stack is deleted, unless you explicitly choose to retain them.
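As a rough sketch, a template fragment that creates a user and surfaces its generated keys could look like this. The resource and output names are hypothetical; the access-key attributes follow CloudFormation's Ref and Fn::GetAtt conventions.

```python
import json

# Illustrative fragment: declare an IAM user plus an access key, then expose
# the generated credentials as stack outputs. Resource names are made up.
iam_fragment = {
    "Resources": {
        "AppUser": {"Type": "AWS::IAM::User"},
        "AppUserKey": {
            "Type": "AWS::IAM::AccessKey",
            "Properties": {"UserName": {"Ref": "AppUser"}}
        }
    },
    "Outputs": {
        "AccessKeyId": {"Value": {"Ref": "AppUserKey"}},
        "SecretKey": {"Value": {"Fn::GetAtt": ["AppUserKey", "SecretAccessKey"]}}
    }
}

print(json.dumps(iam_fragment, indent=2))
```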

Stack Updating - You can now update a running CloudFormation stack by supplying an updated template. CloudFormation will carefully update the resources in the stack to match the new template. Resources that are unchanged will be left as-is. Resources with changed attributes will be updated "in-place" if possible, and replaced only as a last resort. CloudFormation supports updating of the following resource types: AutoScaling Groups and Launch Configurations, CloudWatch Alarms, EC2 Instances, Load Balancers, DB Instances, and Route 53 RecordSets. Read more about stack updating.

Application Bootstrapping - You now have a wide variety of options to bootstrap (install and configure) the applications on each EC2 instance that you launch. You can continue to create "golden images" -- static AMIs that contain the OS and the application, all pre-configured and ready to go. Or, you can choose from new options like the following:

Running a shell script at boot time using the cloud-init package from Canonical. The shell script is passed to the instance using EC2's user data facility.
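A minimal sketch of this approach, assuming a hypothetical Apache install script; the EC2 API expects user data to be base64-encoded before it is attached to the launch request.

```python
import base64

# Illustrative only: a shell script passed to an instance via the EC2 user
# data facility. cloud-init on the instance runs it at first boot.
bootstrap_script = """#!/bin/bash
yum -y update
yum -y install httpd
service httpd start
"""

# The EC2 API expects user data to be base64-encoded.
user_data = base64.b64encode(bootstrap_script.encode("utf-8")).decode("ascii")

# This string would then be supplied as the UserData parameter when
# launching the instance (e.g. via RunInstances).
print(len(user_data) > 0)
```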

Encoding configuration metadata in the CloudFormation template and accessing it using a set of CloudFormation helper scripts running on the instance. You can use the cfn-init script to download and unpack archive files, install packages, create and populate files, and configure services.
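A rough sketch of what such instance metadata might look like, using the helper scripts' packages/files/services structure under the AWS::CloudFormation::Init key; the package, file contents, and mode shown here are illustrative assumptions.

```python
import json

# Sketch of the configuration metadata that cfn-init reads on the instance.
# It installs a package, writes a file, and ensures a service is running.
instance_metadata = {
    "AWS::CloudFormation::Init": {
        "config": {
            "packages": {"yum": {"httpd": []}},
            "files": {
                "/var/www/html/index.html": {
                    "content": "<h1>Hello from cfn-init</h1>",
                    "mode": "000644"
                }
            },
            "services": {
                "sysvinit": {
                    "httpd": {"enabled": "true", "ensureRunning": "true"}
                }
            }
        }
    }
}

print(json.dumps(instance_metadata, indent=2))
```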

If you have been statically configuring your instances (or your physical servers), the move to a more dynamic, declarative model is a pretty big change. My advice: Spend your time learning to do this the right way now, and then benefit from it for years to come! Learning how to set up servers dynamically is at least as worthwhile as learning a new programming language or a new text editor!

We've just added an edge location in Brazil (number 20 to be precise) to Amazon CloudFront and Amazon Route 53. This is our first edge location in South America.

The new location will speed up references to static and streamed content that are made from locations in South America, and will also accelerate the resolution of DNS queries that originate from within the area.

Our customers have put CloudFront to use in a variety of ways. Check out our case studies from the likes of IMDB, PBS, Playfish, Second Life, and Virgin Atlantic to learn more. It is very easy to get started with CloudFront. Once you have done so, your content will be available more quickly, your application will be more responsive, and your users will be happier!

Hackathons are participatory events. You don't show up to watch or to hang out; you show up to lead or to join a team of other developers to imagine, create, and build something cool and useful in the course of a single day. This may sound impossibly ambitious. It is not. You can get a lot done when you can ignore your inbox and simply focus on the task at hand. The ability to tap into high-level services certainly helps, as does the fact that everyone shows up expecting to work.

We'll hack from 8:30 AM to 5:30 PM on Sunday, October 23rd. The day will be structured as follows:

I am also looking for an additional sponsor or two. The ideal sponsor would have a product or service that would be of interest to someone who is working with Big Data. Please get in touch with me (email to awseditor@amazon.com) and we can discuss the details.

I never get tired of writing posts that announce price decreases on the various AWS services!

Today, we are reducing the price to host a set of DNS records for a domain (which we call a hosted zone) using Amazon Route 53. Here's the new pricing structure:

$0.50 per hosted zone per month for the first 25 hosted zones.

$0.10 per hosted zone per month for all additional zones.

The original pricing was $1.00 per hosted zone per month, with no volume discounts.

The per-query pricing ($0.50 per million for the first billion queries per month, and $0.25 per million afterward) has not changed.

Add it all up, multiply it all out, and you will see savings of between 50% and 90% when compared with the original prices. The AWS Simple Monthly Calculator's example shows that if you managed 100 hosted zones, your bill would drop from $100 to $20. We enjoy making it easier and more cost-effective for our customers to use AWS, and this is one more step in that direction.
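The tiered arithmetic is easy to check; here is a small sketch that reproduces the $20 figure for 100 hosted zones from the pricing above.

```python
# Reproducing the hosted-zone arithmetic from the post: $0.50/month for each
# of the first 25 zones, $0.10/month for each additional zone.
def monthly_zone_cost(zones: int) -> float:
    first_tier = min(zones, 25) * 0.50
    second_tier = max(zones - 25, 0) * 0.10
    return first_tier + second_tier

# 100 zones: 25 * $0.50 + 75 * $0.10 = $12.50 + $7.50 = $20.00,
# versus $100 at the old flat $1.00 per zone.
print(monthly_zone_cost(100))   # → 20.0
```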

In case you forgot, the Amazon Route 53 Service Level Agreement specifies 100% availability over the course of a month. Over the last couple of months we've seen a number of large-scale customers come on board.

With 19 locations world-wide, Route 53 is able to provide low-latency, high-quality service to all of your customers, regardless of their location.

We introduced the Amazon Linux AMI in beta form about a year ago with the goal of providing a simple, stable, and secure Linux environment for server-focused workloads. We've been really happy with the adoption we've seen so far, and we continue to improve the product and further integrate it with other Amazon Web Services tools.

Today we are zapping the "beta" tag from the Amazon Linux AMI, and moving it to full production status. We are also releasing a new version (2011.09) of the AMI with some important new features. Here's a summary:

The Message of the Day now tells you when updates to installed packages are available.

While the AMI’s default configuration is set to provide a smooth upgrade path from release-to-release, you can now lock the update repositories to a specific version to inhibit automatic updates to newer releases.

Security updates are automatically applied on the initial boot of the AMI. This behavior can be modified by passing user data into the AMI with cloud-init.

Puppet has been added to the repositories and is available for system configuration management.

Access to the Extra Packages for Enterprise Linux (EPEL) repository is configured, though not enabled by default. EPEL provides additional packages beyond those shipped in the Amazon Linux AMI repositories, but these third party packages are not supported.

The cfn-init daemon is installed by default to simplify CloudFormation configuration.
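For example, the initial-boot security-update behavior mentioned above could be adjusted with user data along these lines. This is a sketch: the repo_upgrade cloud-config directive and its values are assumptions based on the AMI's cloud-init integration, not taken from the post.

```python
# Sketch of cloud-config user data that would change the automatic
# security-update behavior on first boot of the Amazon Linux AMI.
# The repo_upgrade directive and the value "none" are assumed here.
cloud_config = """#cloud-config
repo_upgrade: none
"""

# Passed as user data at launch time, this would inhibit the automatic
# application of updates during the initial boot.
print(cloud_config)
```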

Over the summer months, we'd like to share a few stories from startups around the world: what they are working on and how they are using the cloud to get things done. Today, we're profiling Filter Squad from Perth, Australia!

Filter Squad is a startup focused on building “apps that find what you like”, according to CTO Stuart Hall. They began with a #1 selling iPad/iPhone app called Discovr Music in January 2011 and expanded the discovery product suite to include Discovr App in June 2011, which has been a #1 category application in 17 countries. As the name implies, Discovr Music makes it easy for users to find music they like based on their preferences, while Discovr App recommends apps users might like based on the ones they're already using. “We have been extremely happy with AWS and we also plan to use it for our future products. We are big fans of products such as Amazon RDS and the Elastic Load Balancer to give a complete app scaling solution with Amazon EC2”, says Stuart.

Take a look at the Discovr Music app review from Fox News:

AWS & Lean Startups

Because we are a small, lean team, we were looking for a hosting solution that was going to be easy for us to setup, be reliable, and be easy to scale up and down throughout our product iterations. We looked at a large number of providers, but AWS stood out immediately for a number of reasons:

Low maintenance

Easy to scale

Simple to setup

Provided good redundancy

We couldn't find anyone else who could match the AWS products and price. The number of other large, successful companies also using the service was very reassuring.

Building a Native iPhone/iPad App on AWS

Native mobile apps often need server-side components to create a rich user experience. For our Discovr App and Discovr Music apps, we have used the following AWS products:

Amazon EC2 - because we had no idea of the market reaction to the application when we launched, flexibility in adding and removing virtual servers based on demand was key.

Amazon RDS - we needed a database that would also be easy to scale and be easy to maintain. Amazon RDS provided easy scaling, easy replication for slave instances, and a system where minor software updates are handled entirely by AWS.

Amazon S3 - S3 provides a great and cheap way to host static resources, one with which we had worked before and found ideal for our use case.

Amazon Elastic Load Balancer - the load balancer is provided straight out of the box: it doesn't require any installation and it needs very little configuration. The load balancer provides built-in health checks and takes out instances that are not behaving. Elastic load balancing has been faultless since we launched.

Caching - the only thing missing was a caching solution, which AWS has since launched and which we will soon be moving to. The pace at which AWS iterates and improves its services was also a big consideration; it matches our philosophy of application development.

We are also big fans of New Relic for monitoring our AWS instance performance.

Scaling up Ruby on Rails with AWS

We use Ruby on Rails on the server side, and Objective-C and Java on the client side. More details of our stack, including our architecture and test data, can be found on our blog.

Words of Wisdom for Other Startups:

Understand that you can do it from anywhere, you don’t have to be based in Silicon Valley, or even a big city. With the help of the internet and web services such as the AWS cloud, anyone can deliver great products from anywhere in the world.

For example we’re based in Perth, Australia. It’s a five hour flight to Sydney and our hometown is most definitely not the tech capital of the world! To sum up:

Build a great product, then don't forget to market it!

Treat your customers like precious gold.

Make it easy for your customers to talk to you and listen to what they say.

8 Days Left to Enter Your Startup in the AWS Start-up Challenge! This year's AWS Start-up Challenge is a worldwide competition with prizes at all levels, including up to $100,000 in cash, AWS credits, and more for the grand prize winner. Learn more and enter today!

You can also follow @AWSStartups on Twitter for startup-related updates.

In honor of today's Facebook Developer Conference, I'd like to recognize the success of our existing Facebook app developers and invite even more developers to kick-start their next Facebook app project with Amazon Web Services.

Quick Numbers

We crunched some numbers and found out that 70% of the 50 most popular Facebook apps leverage one or more AWS services. Many of their developers rely on AWS to provide them with compute, network, storage, database, and messaging services on a pay-as-you-go basis. In addition to Zynga’s popular FarmVille and CafeWorld, or games from Playfish and Wooga, many of the most exciting and popular Facebook apps are also running on AWS.

Here are a few examples:

RootMusic's BandPage app (currently the #1 Music App on Facebook, and #8 overall app on Facebook) helps bands and musicians build fan pages that will attract and hold the interest of an audience. RootMusic enables artists to tap into the passion their fans feel for their art and keep them engaged with an interactive experience. More than 250,000 bands of all shapes and sizes, from Rihanna and Arctic Monkeys to bands you haven't heard of yet but may soon discover, have already made RootMusic’s BandPage their central online space for connecting with their fans. Artists use it to share music, release special edition songs and albums, share photos, and list events and shows. BandPage now supports 30 million monthly active users from all over the world. Behind all the capabilities that ignite BandPage’s music fan communities lies a well-thought-out, highly-distributed, and highly-scalable backend, powered by Amazon Web Services:

"In 20 seconds, we can double our server capacity. In a high-growth environment like ours, it's very important for us to trust that we have the best support to give to the music community around the world. Five years ago, we would have crashed and been down without knowing when we would be back. Now, because of Amazon’s continued innovation, we can provide the best technology and scale to serve music communities' needs around the world.” - Christopher Tholen, RootMusic CTO.

Funzio's Crime City is #7 in the top 10 Facebook apps, and it’s the highest rated Facebook game to reach 1 million daily users with an average user rating of 4.9 out of 5. Crime City currently has 5.5 million monthly active users, with 10 million monthly active users at its peak. The iPhone version was recently listed among the top 5 games in the Apple Appstore and #1 free game in 11 countries and counting. Crime City sports modern, 3D-like graphics that look great on both Facebook and iPhone, and has a collection of hundreds of virtual items that players can collect.

Powering this incredibly rich user experience across multiple platforms is their business acumen in promoting the app, as well as a strong backend that leverages many AWS products to serve their viral and highly active user base. Funzio uses Amazon EC2 to quickly scale up and down based on demand and Amazon RDS to store game and current state information. They use Amazon CloudFront to optimize delivery to a global, widely-distributed audience and to meet Facebook's SSL certificate requirements.

"At Funzio, we use AWS exclusively to host the infrastructure for our games. When developing social games, you need to be ready for that traffic burst for a hit game in a moment's notice. AWS provides us with the flexibility to quickly and efficiently scale our applications at all layers, from increasing database capacity in RDS, to adding more application or caching servers within minutes in EC2. Amazon's cloud services allow us to focus our efforts on developing quality games and not on worrying about managing our technology operations.” - Ram Gudavalli, Funzio CTO.

50Cubes, the creator of Mall World, is a startup that has developed one of the most highly-regarded and longest-running successful female-focused social games on Facebook. With over 5 million monthly active users, Mall World has a track record of being not only one of the first but also the top game of its kind for the past 1.5 years, and it continues to entice users world-wide.

50Cubes powers Mall World and the other games they have developed with a suite of AWS products. Of these, they value the Amazon Auto Scaling and EBS features the most – these features help them effortlessly scale their fleet of Amazon EC2 instances up and down with user demand. Their database clusters are a mix of MySQL and other key-value storage databases, all hosted and managed by the team on Amazon EC2, using EBS for storage.

"One thing that impresses me the most about AWS is that they have rapidly iterated and improved their products and services over the past year and a half, executing almost like a startup of our scale." - Fred Jin, 50Cubes CTO.

“AWS is great for Facebook developers – you can start small, test and prove your ideas. As your app grows, you can easily scale up your resources to keep your users engaged and connected. AWS allows developers to build highly-available, highly-scalable, cost-efficient apps that provide the type of rich and responsive user experiences that our global audience has grown to expect.”

Within Amazon, we often use the phrase "drinking our own champagne" to describe our practice of using our own products and services to prove them out under actual working conditions. We build products that we can use ourselves. We believe in them.

Amazon's Corporate IT recently wrapped up an important project and they have just documented the entire project in a new technical whitepaper.

Amazon's Corporate IT team deployed its corporate intranet to Amazon EC2 instances running Microsoft SharePoint 2010 and SQL Server 2008, all within a Virtual Private Cloud (Amazon VPC). This is a mission-critical internal corporate application that must deal with a large amount of very sensitive data.

During the deployment process, our Corporate IT team treated AWS as they would treat any other vendor. They leveraged the same products that our other customers use. They paid for the AWS Premium Support service and received pre-implementation advice from our AWS Solution Architects, the same way we give it to other enterprise customers. They conducted a thorough security review and decided to encrypt all data at rest and in flight. They used EBS snapshots to reduce the risk of losing data, and also implemented a failover mechanism that can attach an existing EBS volume to a fresh EC2 instance when necessary.

This project involved commercial software licenses and demonstrates that the flexibility of AWS allows our customers to run commercial enterprise-grade software (like Microsoft SharePoint and SQL Server Enterprise) in the cloud. The whitepaper not only discusses the technical architecture and implementation details but also how you can leverage key security features (like Windows DPAPI for Key management) to further enhance the security and reliability of your applications. Today, with Microsoft License Mobility with Software Assurance, you can bring your existing licenses of several Microsoft Windows server applications to the cloud.

Real benefits emerged:

Infrastructure procurement time was reduced from four-to-six weeks to minutes.

The server image build process, which had previously taken half a day, is now automated.

The operational overhead of server lease returns was eliminated, freeing up approximately two weeks of engineering time per year by replacing servers with equivalent cloud resources.

Today, you can run enterprise software from Microsoft, Oracle, SAP, IBM and several other vendors in the AWS Cloud. If you are an ISV and you'd like to move your products to the cloud, we're ready to help. The AWS ISV program offers a wide variety of sales, technical, marketing, PR, and alliance benefits to qualified ISVs and solution providers.

The paper is a great example of how a complex mission-critical application can be deployed to the cloud in a way that makes it more reliable, more flexible, and less expensive to operate. Read it now and let me know what you think.

Update: We are checking with our team-mates to see if we can release some of the documentation and scripts described in the whitepaper. It appears that encryption of EBS volumes is a topic of interest to many people!

Do you use EC2 Spot Instances in your application? Do you understand how they work and how they can save you a lot of money? If you answered no to any of these questions, then you are behind the times and you need to catch up.

I'm dead-serious.

The scientific community was quick to recognize that their compute-intensive, batch workloads (often known as HPC or Big Data) were a perfect fit for EC2 Spot Instances. These AWS customers have seen cost savings of 50% to 66% when compared to running the same jobs using On-Demand instances. They are able to make the best possible use of their research funds. Moreover, they can set their bid price to reflect the priority of the work, bidding higher in order to increase their access to cycles.

Our friends over at Cycle Computing have used Spot Instances to create a 30,000 core cluster that spans 3 AWS Regions. They were able to run this cluster for nine hours at a cost of $1279 per hour (a 57% savings vs. On-Demand). The molecular modeling job running on the cluster consumed 10.9 compute years and had access to over 30 TB of RAM.

Harvard’s Laboratory for Personalized Medicine (LPM) uses Amazon EC2 Spot Instances to run genetic testing models and simulations, and stretch their grant money even further. One day of engineering allowed them to save roughly 50% on their instance costs moving forward.

Based on the number of success stories that we have seen in the scientific arena, we have created a brand new (and very comprehensive) resource page dedicated to scientific researchers using Spot Instances. We've collected a number of scientific success stories, videos, and other resources. Our new Scientific Computing Using Spot Instances page has all sorts of goodies for you:

Detailed technical and business information about the use of Spot Instances for scientific applications including a guide to getting started and information on migrating your applications.

Common architectures (MapReduce, Grid, Queue, and Checkpoint) and best practices.

Additional case studies from DNAnexus, Numerate, University of Melbourne/University of Barcelona, BioTeam, CycleComputing, and EagleGenomics.

A list of great Solution Providers who can help you get started if you need a little extra assistance migrating to Spot Instances.

Documentation and tutorials.

Links to a number of research papers on the use of Spot Instances.

Other resources like our Public Data Sets on AWS and AWS Academic programs.

Spot Instances work great for scientific research, but there are a huge number of other customers out there who also love Spot. For example, Spot works really well for use cases like analytics, big data, financial modeling, geospatial analysis, image and media encoding, testing, and web crawling. Check out this brand new video for more information on common use cases and example customers who leverage them.

Again, if you don't grasp the value of Spot Instances, you are behind the times. Check out our new page and bring yourself up to date today.

If you have a scientific computing success story of your own (with or without Spot) or have feedback on how to make Spot even better, we'd love to hear more about it. Please feel free to post a comment to the blog or to email it to us at spot-instance-feedback@amazon.com.

Finally, if you are excited about Spot and want to join our team, please contact Kelly O’Mara at komara@amazon.com to learn more about the team and our open positions.

Over the summer months, we'd like to share a few stories from startups around the world: what they are working on and how they are using the cloud to get things done. Today I’m speaking to Jonathan and Thomas, two of the creators of Scalarium, from Berlin, Germany!

R: Hi guys, could you briefly describe Scalarium and the background of your team?

Thomas:

With Scalarium, we've created an easy management service for EC2 clusters. Scalarium helps our customers deploy Rails, Node.js, PHP, Java, Python or any other stack. It automates the initial setup and continuous configuration of servers. Scalarium also takes care of scaling, security, monitoring, and a lot more.

We started as an IT consultancy in 2005 and used EC2 from the early days on to help our clients scale out. In doing so, we realized that we were repeating ourselves in these kinds of projects, so we created Scalarium as a framework that helps customers automate EC2 deployments.

R: How have you incorporated Amazon Web Services as part of your own architecture? What services are you using and how?

Jonathan:

We heavily use EC2, EBS and S3. And in our stack you will find Ruby, CouchDB, Redis, RabbitMQ, Chef and other nice and shiny stuff. We brought you a little illustration that shows you how we run Scalarium on EC2. But before that, you will need to understand a little more about what we do.

As mentioned, Scalarium helps customers run apps on EC2. But instead of offering you a restrictive and expensive PaaS solution, we offer you an elegant way to automate everything on your servers. You still maintain root access to all servers and are able to configure each and every setting.

R: How does Scalarium help customers run apps on AWS?

Thomas:

In the cloud, each server goes through something that we would describe as a server life cycle. Each and every server in your cluster comes into existence at some time, experiences some changes, and goes away at some point later. Some of them have a rather short lifespan, like application servers that are used to burst out; others have long lifespans, like database servers. But all of them go through this cycle.

We defined events in this life cycle that both we and you can hook into to execute scripts on the servers. Scalarium uses the following life cycle events:

Setup is used to update a base image and install everything you need on the fly as soon as the server comes into existence.

Configure is triggered by any change in the cluster - new servers coming or old ones going.

Deploy executes scripts that should run during the deployment of an application on the servers. You can hook into the deployment with before_migrate or any other hook you know from Capistrano.

Undeploy - this is triggered if you want to remove an application.

Shutdown is triggered if you gracefully stop a server. You can copy stuff around or inform other servers about the absence of the server in advance.
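The life cycle model above can be sketched as a simple event dispatcher. This is a toy illustration of the hook idea, not Scalarium's actual implementation; the event names mirror the list above.

```python
# A minimal sketch of dispatching scripts hooked to server life cycle events.
LIFECYCLE_EVENTS = ("setup", "configure", "deploy", "undeploy", "shutdown")

hooks = {event: [] for event in LIFECYCLE_EVENTS}

def on(event, callback):
    """Register a script/callback to run when the event fires."""
    hooks[event].append(callback)

def fire(event, server):
    """Run every hook registered for this life cycle event on a server."""
    return [callback(server) for callback in hooks[event]]

# Example: a configure hook that would rewrite a load balancer's upstream
# list whenever a server joins or leaves the cluster.
on("configure", lambda server: f"reconfigure load balancer for {server}")
print(fire("configure", "app-server-2"))   # → ['reconfigure load balancer for app-server-2']
```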

Now imagine a very basic setup with one load balancer, a couple of app servers, and a single database. What would you need to do if you wanted to add another app server to your stack? (click image below to enlarge)

You would need to boot an AMI (Amazon Machine Image), log in to the machine, install updates and dependencies, configure all services, cron jobs, and so on, and last but not least deploy your application. But you are not done yet. You also need to log in to the database server and grant access to the new app server by adding its IP to your ACL. After that, you have to log in to your load balancer and add the app server to the load cycle.

This procedure is rather tedious even for easy and basic setups like this, but as you can imagine, the number of dependencies and tasks grows very fast as soon as you have more tiers and servers in your cluster.

What would you do if one of your servers dies or isn’t reachable due to some temporary network issue? Have a look at the Netflix Tech Blog and learn about the chaos monkey and his friends if you think your servers will always be on and flawless forever.

We created Scalarium to take care of these types of concerns automatically. You can extend the abilities of Scalarium as you like, because you can react to all life cycle events and hook into them. This enables you to do just about everything. You always start with a “vanilla” OS, and in the end you have a totally customized setup on your server, and all other servers in the cluster know how to react and reconfigure themselves. We offer a broad selection of predefined stacks and examples. You can change them easily or add your own.

R: How does the bootstrapping of an instance work?

Jonathan:

This picture shows roughly what happens behind the scenes when a new server is added to a cluster (click image to enlarge):

As soon as a new server is requested, we ask Amazon for it. Once the server has finished booting, it downloads the Scalarium agent and a custom certificate, installs the agent, and connects back to Scalarium in an encrypted and signed way. We check what kind of server you instructed it to be and execute the appropriate Chef recipes. Chef is an open-source system integration framework, similar to Puppet or CFEngine. Check out our example cookbooks on github to get a feeling for how easy it is to use Chef. You will find the main Scalarium cookbooks there too.

The server bootstraps and will be your new app server, database or whatever you wanted it to be. This process usually takes just one or two minutes depending on the stack you want to install and the size of the server.

After a new server bootstraps successfully, all existing servers in the cluster are informed. This step is very important, because now the recipes bound to the configure event are executed on each server in the cluster. That way, load balancers can execute recipes that ensure they are aware of all running app servers and can safely remove stopped app servers from their load cycle. A database server can check whether it has granted access to the available app servers. But of course you could also do advanced things like adding new database servers and re-balancing your data, or updating your nagios alerting or your graylog2 server to catch all the logs you want.

If you are done with your basic setup you can easily add time or load based servers, add and deploy applications to your cluster or clone the complete environment to create a staging system. All that can be done via the UI or the Scalarium API.

R: How do you run on AWS yourself?

Thomas:

Below is a simplified visualization of our own architecture. We use two main databases for Scalarium. One is CouchDB, used to store information like the cluster configurations, server descriptions and current state, applications, and deployment definitions. The other is Redis, used for accounting, events, monitoring, and metering data.

We chose CouchDB for high availability, easy replication, clustering, robustness, and a short recovery time. Redis is awesome for the very dynamic, fast-growing, and non-critical data we have.

Our setup spans multiple regions and availability zones to guarantee a high uptime. CouchDB’s awesome replication features are used to have a master/master replication across regions. Redis uses a master/slave setup for data replication. (click image below to enlarge)

R: Why did you decide to use AWS?

Thomas:

That’s simple. We use AWS because it’s the only big, globally distributed, and reliable source for IaaS out there. Amazon kicks some serious ass and develops tons of new features and services. Last but not least, we eat our own dog food - Scalarium runs on Amazon and is managed with Scalarium.

By using AWS and Scalarium, we can grow in no time to handle as many customers as we like, spin up staging environments, and deploy fast and often, all in a completely automated way. All failover, scaling, backup, and monitoring tasks are automated. You will love doing that. You can concentrate on developing your app without hassling with data centers and servers.

Amazon enables us to have clients ranging from startups with one server, through SaaS offerings and agencies with a couple of servers, to the world’s biggest social game providers like wooga or Plinga, with an incredible number of servers running their games all over the globe.

Yes. Take part in the Global AWS Start-Up Challenge! It is a short application form. You can win cash and AWS credits, and get a lot of visibility. And if that’s not enough, we give every semi-finalist half a year of Scalarium for free on top.