Category: Amazon EC2

NASA’s Jet Propulsion Laboratory, the agency’s lead center for robotic exploration of the solar system, is doing some pretty extraordinary things with the AWS cloud. I had the opportunity to meet with their CTO to discuss some of the interesting projects they are working on. Like the early explorers of deep space, they were also early explorers of the AWS cloud. Cloud computing is high on the JPL CTO’s radar, and various teams are leveraging the cloud for a range of mission-related projects.

They operate 19 spacecraft and 7 instruments across the solar system. Each instrument sends back huge volumes of data and images, all of which need to be analyzed and processed. From one such instrument, they processed and analyzed 180,000 images of Saturn in the AWS cloud in less than 5 hours, a job that would have taken more than 15 days in their own data center.
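The speedup is easy to sanity-check with a bit of arithmetic. The figures below come straight from the numbers above; the per-image rate is my own derived estimate, not a JPL-published one:

```python
# Back-of-the-envelope check on the Saturn image processing numbers.
images = 180_000
cloud_hours = 5
datacenter_hours = 15 * 24  # "more than 15 days"

cloud_rate = images / (cloud_hours * 3600)  # images per second in the cloud
speedup = datacenter_hours / cloud_hours    # how many times faster

print(f"{cloud_rate:.0f} images/second, roughly {speedup:.0f}x faster")
```

That works out to about 10 images per second sustained, and a roughly 72x reduction in wall-clock time.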

Last April, I was at EclipseCon and I was stunned when I saw the e4 Mars Rover Application, built by NASA folks and hosted in the AWS cloud, controlling the movement of Mindstorm robots. I started to wonder what would happen when we combine the potential of NASA, the AWS cloud, and robots.

The super smart folks at NASA Jet Propulsion Lab (Jeff Norris, Khawaja Shams, and team) developed a system that allows developers to interact with a Mindstorm robot in a 4x4 arena. The system exposes a REST API that accepts commands from e4 clients and sends them to a RESTful server (running Equinox) hosted on Amazon EC2. The server stores all of the commands in Amazon SimpleDB using conditional PUTs. A poller application on the arena server pulls each developer’s commands sequentially and immediately, using consistent reads from Amazon SimpleDB, and executes them on the robot, making it move around the arena. A sample command looks like “set the velocity of the robot to 5 and move to the left.” In essence, developers can send commands to the app in the AWS cloud and move the robot in the arena.

The system also stores logs, game queues, registrations, and scores in separate Amazon SimpleDB domains. The RESTful server sits behind a fault-tolerant Elastic Load Balancer so that it can serve many concurrent requests, routing traffic to multiple EC2 instances. The setup uses the Auto Scaling service and is ready to scale out in case of a sudden surge in traffic from clients. Any developer can send commands using the API and make the robot move; the app will elastically scale with demand.

It does not stop there. A camera in the arena captures four images every second (4 FPS), processes them to find the robot’s actual coordinates, and stores them in Amazon S3 using the versioning feature. Millions of versioned images accumulate in Amazon S3 and can be downloaded almost in real time to provide a live photo feed of the robot’s location and movement in the 4x4 arena. With all of this in place, any developer, anywhere in the world, can write a program that sends commands to the auto-scaling app hosted on EC2 and watch the robot move in the arena.
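The conditional-PUT pattern at the heart of the command queue is worth a closer look. The sketch below simulates it against an in-memory dictionary standing in for a SimpleDB domain; the item names, attribute names, and error class are illustrative, not JPL’s actual schema:

```python
# Local simulation of SimpleDB-style conditional PUTs for a command queue.
# Each command is written only if its sequence slot is still empty, so two
# clients can never silently overwrite each other's commands.

class ConditionalCheckFailed(Exception):
    pass

domain = {}  # stands in for a SimpleDB domain: item name -> attributes

def conditional_put(item, attrs, expected_missing):
    """Store attrs under item only if attribute expected_missing is absent."""
    if expected_missing in domain.get(item, {}):
        raise ConditionalCheckFailed(item)
    domain.setdefault(item, {}).update(attrs)

def enqueue(seq, command):
    conditional_put(f"cmd-{seq}", {"command": command}, expected_missing="command")

def poll(seq):
    """A consistent read: always sees the latest successful write."""
    return domain.get(f"cmd-{seq}", {}).get("command")

enqueue(1, "velocity 5; turn left")
try:
    enqueue(1, "velocity 9; turn right")   # second writer loses the race
except ConditionalCheckFailed:
    pass
print(poll(1))  # -> velocity 5; turn left
```

The same idea, backed by real conditional PUTs and consistent reads, is what lets the poller execute each developer’s commands exactly once and in order.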

In summary, large numbers of developers can write innovative clients to interact with the robot, because the architecture runs on scalable, on-demand, auto-scaling infrastructure.

Kudos to the NASA JPL team, who not only conceived this idea but also built the system using some of the most advanced features of AWS. They were already using features that had been released just weeks earlier, such as consistent reads and conditional PUTs in Amazon SimpleDB and the versioning feature in S3. They have what I call a “full house” of AWS: ELB, Auto Scaling, and EC2. They are also planning to assemble the robot-movement images into a video feed and stream it using CloudFront.

The architecture of the system is as follows:

To entice developers to write innovative apps (client apps to manage robot movements, client apps to interpret the messages and logs from the robot, and apps that move the robot along a path), they announced a contest at EclipseCon named the “e4 Mars Rover Challenge.” They made the system available to EclipseCon developers, opening up the server only to attendees by using EC2 security groups to limit access to EclipseCon CIDR IP ranges. I was amazed at the creativity. Contestants built innovative client apps ranging from iPhone apps (that use the accelerometer to control the velocity and direction of the robot’s movement) to intelligent e4 clients that display the telemetry. Winners have been announced. The event was Mars-themed: the arena was dressed up to look like Mars with panoramas acquired by Spirit (Husband Hill), an orbiter, LED lights, and cool sound effects. It was the ultimate crowd-pleaser at EclipseCon.

Imagine the potential when any developer can actually play with the robot from anywhere in the world and build innovative apps.

The fact that NASA engineers are utilizing cloud computing services to develop the contest brings a significant level of credibility to the whole notion of the cloud.

I was so excited after seeing the arena at EclipseCon that I had to interview the masterminds behind this project. See the video below.



Today I would like to talk about ways to make your website run faster and more efficiently. There are several benefits from doing this including better search engine rankings, a better customer experience, and an increase in sales or conversions.

You can always improve your architecture, tune your database, and optimize your code. In each case you locate a bottleneck, figure out a way around it, and then proceed to do so. You basically spend your time answering the question “What part of my application is running too slowly and how can I speed it up?” This is the classical approach to performance tuning; many of the techniques to measure and to improve performance date back to the time before the web (you do remember those days, don’t you?).

In today’s web world, “where” can be just as important as “what.” Let’s talk about that.

The distance between the user and the application is a major factor in the performance of a website or a web-based application. Latency increases with distance, and increased distance also means that each network packet will take more “hops” through intermediate routers and switches as it makes its way from browser to server and back.
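A rough rule of thumb makes the effect concrete. Light in fiber travels at about 200,000 km/s, so distance alone puts a hard floor under round-trip time, before any router hops are counted. The distances below are approximate, for illustration only:

```python
# Lower-bound round-trip time implied by distance alone (light in fiber).
FIBER_KM_PER_SEC = 200_000  # roughly 2/3 the speed of light in vacuum

def min_rtt_ms(distance_km):
    """Best-case round trip in milliseconds, ignoring routing and queuing."""
    return 2 * distance_km / FIBER_KM_PER_SEC * 1000

for label, km in [("same metro", 50),
                  ("cross-country US", 4_000),
                  ("US to Singapore", 15_000)]:
    print(f"{label}: at least {min_rtt_ms(km):.0f} ms per round trip")
```

Real round trips are worse than these floors, and every extra round trip a page requires multiplies the penalty.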

You probably can’t move your users closer to your application, but you can often deploy your application closer to the user. AWS provides you with two different ways to do this:

Choose a Region – You can put your application into any of four different AWS Regions. You should choose the Region that’s closest to the majority of your customers. As of this writing, you can choose to host your AWS-powered application on the east coast (Northern Virginia) or west coast (Northern California) of the United States, in Europe (Ireland), or in Asia Pacific (Singapore).

Use CloudFront Content Distribution – You can also speed up your application by using Amazon CloudFront to distribute web content (static HTML pages, CSS style sheets, JavaScript, images, file downloads, and video streams) stored in Amazon S3. Each user’s requests will be served up from the nearest CloudFront edge location. There are currently fifteen such edge locations: eight in the United States, four in Europe, and three in Asia.

You might be thinking that you run a “small” or “unimportant” site and that you don’t need or can’t benefit from CloudFront. Given that a lot of the value of the web is found in the long tail content, I disagree. There’s no harm (and plenty of potential benefit) from making your site quicker to load and more responsive. How do you think that the large, popular sites got that way? At least in part, they did so by being fast and responsive from the very beginning.

Or, you might think that CloudFront is somehow too expensive for you. Well, it’s not. I’ve been serving up all of the images for this blog from CloudFront for a while. Here’s my AWS account, reflecting an entire month of CloudFront usage:

That’s right, less than $2 per month to make sure that readers all over the world are able to get to the images in the blog as quickly as possible. If nothing else, think of this as an inexpensive insurance policy that will protect you from overload in case your site shows up on Slashdot, Digg, Reddit, Hacker News, or TechCrunch one fine morning.
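If you want to estimate your own bill before signing up, the arithmetic is simple. The rates below are illustrative of first-tier US-edge pricing at the time of writing; check the CloudFront pricing page for current numbers:

```python
# Rough CloudFront cost estimate for a small site (illustrative rates only).
PER_GB = 0.15             # data transfer out, first pricing tier
PER_10K_REQUESTS = 0.0075 # per 10,000 HTTP requests

def monthly_cost(gb_out, requests):
    """Approximate monthly CloudFront charge in dollars."""
    return gb_out * PER_GB + (requests / 10_000) * PER_10K_REQUESTS

# For example, 10 GB of images served via 500,000 requests in a month:
print(f"${monthly_cost(10, 500_000):.2f}")
```

For a typical blog’s worth of images, the total lands comfortably under $2 — consistent with the bill shown above.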

If you are interested in speeding up access to your application’s web content, start out by reading our guide to Migrating to Amazon CloudFront. This guide will walk you through the five basic steps: signing up for CloudFront and S3, downloading and installing the proper tools, uploading your content to an S3 bucket, creating a CloudFront distribution, and linking to the content.

If you are using a CMS (Content Management System), take a look at these:

Encoding.com has put together a guide to Apple HTTP Streaming with Amazon CloudFront. The guide includes complete step-by-step directions and gives you all the information you’ll need to stream video to an iPad or an iPhone.

There are a number of good testing tools that you can use to measure the speed of your site while you are in the process of migrating to CloudFront. Here are a few:

Yahoo’s YSlow is a Firefox add-in and one of the best-known browser-based performance measurement tools. You’ll need to install FireBug and spend some time learning how to use YSlow and to interpret the results. YSlow will measure performance as seen from your desktop. If you plan to use YSlow, don’t forget to tell it that you are using CloudFront as a CDN by following these instructions.

Cloudelay measures and displays several different CloudFront performance metrics including request latency and network throughput.

BrowserMob’s free Website Performance Test. I really like this one. You can run a test from four locations around the world and see how your site performs when viewed from each of them.

As an example of just how location affects perceived website performance, take a look at this chart from the BrowserMob test:

There’s a 4:1 ratio between the fastest and the slowest location. I compared the detailed output for two separate objects: the blog’s home page and the volcano picture that I posted last week. The results show that CloudFront performance is reasonably consistent, regardless of where the test is run:

Location          Home Page   Volcano
Washington, DC    323 ms      140 ms
Dublin            631 ms      227 ms
San Francisco     45 ms       210 ms
Dallas            200 ms      252 ms

You can also see the amount of time it takes to look up each item’s DNS address, the time until the first byte of data arrives, and the amount of time spent reading the data:

Still need more info? Take a look at some of these case studies:

UrbanSpoon – “CloudFront took a day or two to implement. Switching to CloudFront helped us stay within our bandwidth caps, which saves us several thousand dollars a month.”

Linden Lab – “The Second Life Viewer is an application that each Resident runs on their own computer to interact with the Second Life world. Downloads of the Viewer were ideally suited for CloudFront delivery. The Viewer can be downloaded over 40 thousand times each day by different users all over the world and using CloudFront helps Residents download their software faster by storing copies at edge locations close to them.”

Existing toolkits and tools can make use of the new Region with a simple change of endpoints. The documentation for each service lists all of its endpoints.

I know that RightScale will be supporting the new Region and will be migrating their templates and images in the next couple of weeks. They already have their infrastructure up and running in the Region to provide fast communication between managed instances and their platform. Read more about how they are accelerating cloud adoption with support for the new Region, Windows, and simplified image creation.

George Reese of enStratus wrote to let me know that all 46 of their AMIs are already up and running in the new Region. This includes all of the AWS services, user management, financial controls, infrastructure automation with automatic application deployment, multiple auto-scaling options, automatic recovery, encryption, and key management.

A number of customers are already planning to migrate their applications to the new Region. Here’s a sampling:

Singapore

Kim Eng, one of Asia’s leading securities brokers, is using AWS to host a trading application for the iPhone that will be launched in the near future. They chose AWS and the Singapore Region in order to minimize latency and to make sure that they can provide quick and efficient service even if there’s a sudden spike in usage.

iTwin, creators of the iTwin remote device for plug and play access to remote files, recently started using EC2. Lux Anantharaman, Founder and CEO, told us that “After evaluating multiple cloud service providers, we realized AWS is the only one which is globally available and which satisfies our deployment needs. AWS is extremely simple to configure and use, provides a comprehensive suite of web services which are unmatched by any other provider, and the billing system is convenient. Last but not least, our engineers love AWS.”

Australia

Altium provides software for electronic design. They have moved their content delivery over to Amazon S3 and CloudFront, reducing their bandwidth and storage costs to just 25% of that charged by their former provider. As their CIO, Alan Perkins, said, “At times we have thousands of clients asking for 1.5GB files at the same time and AWS has delivered without a glitch.” They have also moved a lot of their research and development work to AWS.

Also in Australia, 99designs.com runs their crowdsourced graphic design site on AWS. They use a cluster of application servers on EC2, backed up by database servers and proxy servers that also run on EC2. They count on Amazon S3 to store massive amounts of data — a new design is uploaded every seven seconds. Lachlan Donald, CTO of 99designs.com, told us that “AWS has virtually eliminated our infrastructure concerns and allowed us to focus on our core business.”

India

Rediff provides online news, information, communication, entertainment, and shopping services. They’ve been using AWS for highly demanding data processing such as data mining and analytics. Sumit Rajwade, Rediff’s Vice President of Technology, told us that “The flexibility and scalability of the AWS platform is unparalleled and we are pleased that AWS is opening up a region in Asia Pacific.”

Indiagames decided to use AWS to launch and run their popular T20Fever game on Facebook. They turned to AWS as an alternative to building and scaling their own infrastructure. As a result, they were able to keep their staffers focused on development instead of on managing physical hardware. Their application makes use of EC2, S3, RDS, and CloudFront. According to Vishal Gondal (CEO), “By leveraging Amazon EC2, Amazon S3, Amazon Relational Database Service (RDS), and Amazon CloudFront, we’ve been able to handle thousands of gamers concurrently without having to spend a rupee on physical infrastructure.”

We also need to hire some Solutions Architects, Technical Sales Representatives, a Data Center Manager, and some Data Center Technicians in Singapore. If you are interested in any of these positions, please send your resume or CV to tina@amazon.com .

The new product, CloudTest Analytics, builds on SOASTA’s existing CloudTest product. It consists of a data extraction layer that can pull real-time performance metrics from a number of existing APM (Application Performance Management) tools from vendors such as IBM, CA, RightScale, Dynatrace, New Relic, and Nimsoft. The data is pulled from the entire application stack, including resources in the cloud, behind the firewall, or at the content distribution layer. All of the metrics are aggregated, stored in a single scalable data warehouse, and displayed on a single, correlated timeline. Performance engineers can use this information to understand, optimize, and improve application and system performance.

Of course, CloudTest runs on Amazon EC2 and is available on a cost-effective, on-demand basis.

If you are interested in learning more about cloud-powered load and scale testing, you may find this recent article to be of value. SOASTA used 800 EC2 instances (3200 cores) to generate a load equivalent to one million concurrent active users. The test ultimately transferred data at a rate of 16 gigabits per second (6 terabytes per hour).

In January I wrote about the availability of a conceptual whitepaper describing various scenarios for using Windows ADFS to federate with services running on Amazon EC2 and mentioned that a step-by-step guide was forthcoming. I’m very pleased to announce that the guide is now finished and available for download. To give you a flavor for what you can learn by following the steps in the guide, I’ll quote from its introduction:

This document provides step-by-step instructions for creating a test lab demonstrating identity federation between an on-premise Windows Server Active Directory domain and an ASP.NET web application hosted on Amazon’s Elastic Compute Cloud (EC2) service, using Microsoft’s Active Directory Federation Services (ADFS) technology. The document is organized in a series of scenarios, with each building on the ones before it. It is strongly recommended that the reader follow the document’s instructions in the order they are presented. The scenarios covered are:

Corporate application, accessed from anywhere: An external, non-domain-joined client (e.g., at a coffee shop) accessing the same EC2-hosted application, using ADFS v1.1 with an ADFS proxy. In addition to external (forms-based) authentication, the proxy also provides added security for the corporate federation server.

Service provider application: Domain-joined and external Windows clients accessing an EC2-hosted application operated by a service provider, using one ADFS v1.1 federation server for each organization (with the service provider’s federation server hosted in EC2) and a federated trust between the parties.

Service provider application with added security: The same clients accessing the same vendor-owned EC2-hosted application, but with an ADFS proxy deployed by the software vendor for security purposes.

We hope you find this information useful and that it helps to simplify migrating existing applications or developing entirely new solutions that leverage the power of Amazon EC2 with your existing internal IT environment.

Amazon EC2’s Elastic Load Balancing feature just became a bit more powerful. Up until now each load balancer had the freedom to forward each incoming HTTP or TCP request to any of the EC2 instances under its purview. This resulted in a reasonably even load on each instance, but it also meant that each instance would have to retrieve, manipulate, and store session data for each request without any possible benefit from locality of reference.

Suppose two separate web browsers each request three separate web pages in turn. Each request can go to any of the EC2 instances behind the load balancer, like this:

When a particular request reaches a given EC2 instance, the instance must retrieve information about the user from globally stored state data. There is little opportunity for the instance to cache any data, since the odds that several requests from the same user / browser will reach the same instance go down as more instances are added to the load balancer.

With the new sticky session feature, it is possible to instruct the load balancer to route repeated requests to the same EC2 instance whenever possible.

In this case, the instances can cache user data locally for better performance. A series of requests from the user will be routed to the same EC2 instance if possible. If the instance has been terminated or has failed a recent health check, the load balancer will route the request to another instance. Of course, in a real world scenario, there would be more than two users, and the third EC2 instance wouldn’t be sitting idle.
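You can see why stickiness matters with a small simulation. The sketch below compares random routing with sticky (per-user) routing and counts how often an instance already has the user’s session data cached locally. It is a model of the routing behavior, not of the ELB implementation itself:

```python
# Compare local-cache hit rates: random routing vs. sticky sessions.
import random

def cache_hits(route, users=100, requests_per_user=10, instances=3):
    """Fraction of requests that land on an instance that has seen the user."""
    caches = [set() for _ in range(instances)]
    hits = 0
    for user in range(users):
        for _ in range(requests_per_user):
            i = route(user, instances)
            if user in caches[i]:
                hits += 1
            caches[i].add(user)
    return hits / (users * requests_per_user)

random.seed(1)
print(f"random: {cache_hits(lambda u, n: random.randrange(n)):.0%}")
print(f"sticky: {cache_hits(lambda u, n: u % n):.0%}")
```

With sticky routing, only each user’s first request misses the local cache; with random routing across three instances, a sizable fraction of requests land on an instance that has never seen the user.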

A couple of members of the AWS Developer Support team put together the following tips and tricks to help you get the most from the Elastic Load Balancer.

— Jeff;

Are you thinking about using Amazon EC2 with Elastic Load Balancing, but want to make sure you set it up right the first time? Are you already using an ELB but are seeing intermittent problems with your page loads? Well, you’ve come to the right place! Let’s uncover a couple of common pitfalls. They’re easy to avoid, once you know about them.

For those of you who aren’t familiar, Elastic Load Balancing helps you distribute incoming network traffic across multiple Amazon EC2 instances. Your Elastic Load Balancer (ELB) will automatically route traffic to only the EC2 instances it deems to be healthy, so you needn’t worry about manually enforcing which instances handle the traffic. With ELB, it’s all managed for you. Furthermore, your ELB will also scale itself up and down to meet the demands of your traffic load. You can ensure that the EC2 instances themselves do the same by using Amazon Auto Scaling, but that’s beyond the scope of today’s discussion. You can read more about Auto Scaling here.

A key feature of ELB is that it will distribute incoming traffic equally across all of the Availability Zones you’ve configured it to use. This means that if you enabled, say, Availability Zones us-east-1a and us-east-1d, but only registered instances in us-east-1a, half of your traffic will go to us-east-1d, but there will be no EC2 instances there to handle it. The traffic will be redirected back to us-east-1a, but this redirection could increase latency for your users. Thus, you’ll want to keep track of which Availability Zones your ELB is set up to use. You can use the ELB API command line tools to do this:

elb-describe-lbs --show-xml

If you don’t already have the ELB command line API tools, then you can grab them here. Once you know which Availability Zones are enabled for your ELB, you can run this next command to see which instances are currently registered with your ELB:

elb-describe-instance-health Your_ELB_Name_Goes_Here

The command above will return the instance IDs of the registered instances, which you can then use to determine which Availability Zones each is in:

ec2-describe-instances instance_ID_1 instance_ID_2 …

You can glean a lot of potentially useful information at this point regarding each of the instances behind your ELB. Here are some things to check:

1) Does each enabled Availability Zone contain at least one instance registered with your ELB?

If not, you have two approaches to remedy the situation: launch and register instances in the empty Availability Zones, or simply disable those zones. The quick fix is to disable the empty Availability Zones (the zone name below is an example):

elb-disable-zones-for-lb Your_ELB_Name_Goes_Here --availability-zones us-east-1d

Great, so at this point you should have at least one instance in each of your ELB’s Availability Zones. But could we still strengthen the setup even further? How about digging into the details of the individual instances behind your ELB? How robust is your configuration? This brings us to a second important item to check:

2) Do you have an equal number of instances in each Availability Zone? And are they the same type?

Since your ELB will distribute incoming traffic equally across your Availability Zones, you really don’t want to have, say, one m1.large instance in us-east-1a and five c1.medium instances in us-east-1d. The single m1.large instance will receive roughly 50% of all of your traffic and, under high traffic volume, may not be able to keep up. Meanwhile, your five c1.medium instances are each under a much lower load. This is definitely a suboptimal arrangement.

The ec2-describe-instances command above returned not only the Availability Zone of each instance but also its instance type. We suggest populating each Availability Zone with an equal number of instances of the same type. You may even want to check if they are all based on the same AMI. Cycling out older instances for replacements based on your most recent AMI will help ensure that your instances remain up-to-date and service requests in a consistent manner, and can simplify debugging in the future.
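The checks in (1) and (2) are easy to automate once you have the instance list. Here is a hypothetical helper (the instance IDs, zone names, and message strings are made up for illustration — feed it the output you parsed from the commands above):

```python
# Given each registered instance's Availability Zone and type, flag the two
# problems discussed above: empty enabled zones and uneven or mixed fleets.
from collections import Counter

def check_elb_balance(enabled_zones, instances):
    """instances: list of (instance_id, zone, instance_type) tuples."""
    problems = []
    per_zone = Counter(zone for _, zone, _ in instances)
    for zone in enabled_zones:
        if per_zone[zone] == 0:
            problems.append(f"{zone} is enabled but has no registered instances")
    if len(set(per_zone[z] for z in enabled_zones)) > 1:
        problems.append("instance counts differ between zones")
    if len(set(t for _, _, t in instances)) > 1:
        problems.append("mixed instance types behind one ELB")
    return problems

print(check_elb_balance(
    ["us-east-1a", "us-east-1d"],
    [("i-11111111", "us-east-1a", "m1.large"),
     ("i-22222222", "us-east-1d", "c1.medium"),
     ("i-33333333", "us-east-1d", "c1.medium")]))
```

An empty result means your fleet is balanced; any message it prints points at one of the pitfalls covered in this post.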

We hope this helps you understand Elastic Load Balancing. Do you have more questions? Post them to the AWS forums!

Update: George Cook left a good question as a comment. I took it to the leader of the ELB team and here’s what he told me:

Thank you for the feedback. You make some good points. Under certain failure modes, the behavior you described is the right thing to do, and that is on our roadmap. However, in other cases, it is still necessary to bounce traffic between Availability Zones. For example, it is possible that all instances in an Availability Zone become unhealthy (or get deregistered) while there are requests in-flight to that Availability Zone. The load balancer will then bounce these requests to a different Availability Zone in order to minimize any failed requests.

We’ve had more than our fair share of technical challenges along the way, but the time is right for me to talk about our newest product, the Quantum Compute Cloud, or QC2 for short.

This is the first production-ready quantum computer. You can use it to solve certain types of math and logic problems with breathtaking speed.

Ordinary computers use collections of bits to represent their state. Each bit is definitively 0 or 1, and the number of possible states for n bits is 2^n: 1 bit can be in either of 2 states, 2 bits can be in any one of 4 states, and so forth.

Quantum computers such as the QC2 use a more sophisticated data representation known as a qubit or quantum bit. Each qubit exists in all of its possible states simultaneously, and the probability that the qubit will be found in any given state can change. Quantum computers work by manipulating the probability distribution across these states.

How do you program a quantum computer? With quantum algorithms, of course. Pretty much everything that you know about traditional programming becomes obsolete when you step up to the QC2. You need to think in terms of probabilities, distributions of probabilities, and so forth. Take a look at Shor’s Algorithm for finding prime factors to get a better idea of the power of a quantum computer.

We are also planning to support Bernhard Omer’s QCL programming language. Take a look at his thesis on Structured Quantum Programming to learn more. Here’s a QCL code sample to get you started:

Once you’ve launched a QC2 instance and loaded up your algorithm, you must sample the output (also known as “collapsing the quantum state”) in order to retrieve the probability distribution that represents your answer. You’ll want to do this more than once for any particular problem in order to increase your confidence in the solution. Collapsing the quantum state is a destructive operation (much like reading from a magnetic core memory); be sure to account for this in your algorithm. In effect, the answer doesn’t exist until you ask for it.
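To make the idea of sampling a probability distribution concrete, here is a toy classical simulation of a small qubit register. It only mimics measurement statistics; a real quantum state also carries phase information that this sketch ignores entirely:

```python
# Toy simulation of measuring an n-qubit register: the state is a probability
# distribution over 2^n basis states, and each measurement samples it (on real
# hardware, destructively).
import random

n = 3
states = 2 ** n                 # 3 qubits -> 8 basis states
probs = [1 / states] * states   # uniform superposition, as probabilities

random.seed(0)
def measure(distribution):
    """Draw one basis state according to the distribution."""
    return random.choices(range(len(distribution)), weights=distribution)[0]

# Sample repeatedly to build confidence in the answer, as described above.
samples = [measure(probs) for _ in range(1000)]
print(f"{states} states, most common outcome: {max(set(samples), key=samples.count)}")
```

With a uniform distribution every outcome is equally likely; a real quantum algorithm shapes the distribution so that the correct answer dominates the samples.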

Until now, the largest quantum computer contained fewer than 8 qubits. Because we’re really, really smart, we’ve been able to push this all the way to 32 qubits in the first-generation QC2. This will allow you to represent problems with up to 2^32 distinct states.

We’re launching the QC2 in the US East Region in multiple Availability Zones. Using the amazing “spooky action at a distance” property of quantum entanglement, you can actually replicate QC2 instances across Zones.

The QC2 beta is limited, and will definitely close before the end of the day.

— Jeff;

PS – We need to hire lots of world-class people to help us with leading edge technologies like QC2, EC2, and the like. Please check out our AWS jobs page.

As the final talk of my trip to the East Coast, I will be speaking to the RubyNation conference in Reston, Virginia on Saturday, April 10th. I worked in Reston back when the unofficial motto was “We’re not dead, we’re Reston.” Things have livened up considerably since then and I’m looking forward to connecting with some old friends and colleagues while I am in the area.

There will be a Mechanical Turk Meetup in New York at 6:00 PM on April 13th. Learn more about Mechanical Turk‘s global on-demand workforce, discover best practices, talk to existing Requesters, and mingle with members of the Mechanical Turk team. Preregister here.

Terry Wise, Director of Business Development for Amazon Web Services, will be speaking at PegaWORLD in Philadelphia on April 26th. Terry will talk about how Tenet Healthcare uses Pega’s Cloud Computing solution to radically improve the way it builds its business process applications, reducing delivery time and cost by a factor of 5. Discount registrations for the conference are available here.

— Jeff;

PS – Despite the route implied by my map, I will be traveling by plane and train!