I've been a big fan of Amazon Web Services (AWS) because they lower the costs of startup experimentation. I've sponsored their events, judged their startup competition, etc. I have friends on the team. I've also had frank conversations with them about service level agreements and what it means to be an infrastructure provider in a mashup world. Mashups increase the need for high availability and uptime. If the user experience of a mashup application requires, say, five web services from three separate companies to be available, the overall probability of failure goes up substantially. It's the weakest-link-in-the-chain argument.

The Net learned this the hard way yesterday when multiple AWS services (S3, EC2, SQS, SimpleDB, etc.) had a multi-hour outage. The problem was exacerbated by the fact that, internally, various AWS services depend on one another, especially on the storage service, S3.

It looks like the cause of the outage was a particular usage pattern of S3:

What caused the problem however was a sudden unexpected surge in a particular type of usage (PUT’s and GET’s of private files which require cryptographic credentials, rather than GET’s of public files that require no credentials). As I understand what Kathrin said, the surge was caused by at least one very large customer plus several other customers suddenly and unexpectedly increasing their usage.
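The usage pattern described above, authenticated PUTs and GETs, carries cryptographic overhead that anonymous GETs of public files do not: the service must recompute and verify a signature on every request. As a rough sketch of why, here is the outline of the classic S3 request-signing scheme ("Signature Version 2"); the secret key and bucket names below are invented for illustration:

```python
# Each authenticated S3 request carries an HMAC signature over a canonical
# string; S3 must rebuild that string and recompute the HMAC to verify it.
# SECRET_KEY and the bucket/object names are made-up illustrative values.
import base64
import hashlib
import hmac

SECRET_KEY = b"illustrative-secret-access-key"

def sign_request(verb: str, resource: str, date: str,
                 content_md5: str = "", content_type: str = "") -> str:
    """Return the base64 HMAC-SHA1 signature for a classic S3 request."""
    string_to_sign = "\n".join([verb, content_md5, content_type, date, resource])
    digest = hmac.new(SECRET_KEY, string_to_sign.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

sig = sign_request("GET", "/mybucket/private/report.pdf",
                   "Thu, 21 Feb 2008 12:00:00 GMT")
print("Authorization: AWS AKIAEXAMPLE:" + sig)
```

A surge of requests like this means a surge of per-request signature verification on Amazon's side, which is where the extra load comes from.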

I would highly recommend that anyone who is building a developer community, providing SaaS infrastructure, or relying on SaaS infrastructure take the time to read the many posts on the AWS forums about the outage. You can hear the real pain and frustration of people whose businesses depend on AWS. The key complaint was not that the service failed (failures do happen) but that Amazon was not prepared to engage with the developer community around the failure.

It’s amazing that there is no info on what’s happening. Absolutely unacceptable. Come on, people on this forum are all tech guys, so we understand that bad things happen from time to time. However, you MUST be transparent with your customers and give them details on what’s going on (yes, we want to know exactly what’s happening and not a standard response like ‘The issue is resolved’). In fact, it is not. So please, escalate these complaints to the right person and post the technical explanation of the issue as soon as possible.

As I said before, you need to be transparent with your customers. No service can provide 100% uptime. It’s a fact. No matter if you have a redundant anycast network or supercalifragilisticexpialidocious elastic clouds. I just want to get notified and know exactly what’s happening. Nothing else. That said, the issue was resolved very fast, so you should be very proud. Hats off to Amazon’s IT staff.
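The commenter is right that 100% uptime is impossible, and a little arithmetic shows how quickly availability erodes for a mashup that needs several services up at once. This is the weakest-link argument made concrete (a sketch, assuming independent failures and a "three nines" figure per service):

```python
# Availability math: what a single availability figure means in downtime
# per year, and how requiring N services at once compounds the risk.
# Failures are assumed independent for simplicity.

HOURS_PER_YEAR = 365 * 24  # 8760

def combined_availability(per_service: float, num_services: int) -> float:
    """Probability that all num_services are up simultaneously."""
    return per_service ** num_services

def downtime_hours_per_year(availability: float) -> float:
    """Expected yearly downtime implied by an availability figure."""
    return HOURS_PER_YEAR * (1 - availability)

one = 0.999                            # a single 99.9% service
five = combined_availability(one, 5)   # mashup needing five such services
print(f"one service:   {downtime_hours_per_year(one):.1f} h/yr down")   # ~8.8
print(f"five services: {downtime_hours_per_year(five):.1f} h/yr down")  # ~43.7
```

Five nines-ish marketing aside, a five-service mashup built on 99.9% components should plan for roughly five times the downtime of any single provider.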

There is a very important question you have to ask yourself before deciding whether to use S3: what are you really looking for – remote storage, content delivery, or both? The distinction is crucial.

What I observe is that most people treat Amazon S3 as a content delivery service. While this is not inherently wrong, one has to notice that S3 was especially designed to be a STORAGE service. S3 does not claim to be a CDN.

The point is, since terabyte hard drives are affordable nowadays and internet traffic grows steadily, the stress falls much more on content delivery and network infrastructure than on storage. If remote storage is not what you actually need, there are better services suited specifically to content delivery.

SteadyOffload.com provides an innovative, subtle and convenient way to offload static content. The mechanism is quite different from Amazon S3. Instead of you permanently uploading your files to a third-party host, their cachebot crawls your site and mirrors the content in a temporary cache on their servers. Content remains stored on your server while it is being delivered from the SteadyOffload cache. The URL of the cached object on their server is generated dynamically at page-load time, heavily scrambled, and changes often, so you don’t have to worry about hotlinking. This means there is an almost non-existent chance that the cached content gets exposed outside of your web application.

It’s definitely worth trying because it’s not a storage service like S3 but rather a service built specifically for offloading static content.

I like the idea and how easy it would be to integrate SteadyOffload into a site. I don’t like that there isn’t any information on the site about who these guys are, what their network looks like, etc. Also, 99.9% availability is nothing to boast about.

Yep, the main point in my comment was that S3 is a good storage solution but definitely not the best content delivery solution.

It’s a young startup, still operating only in Europe. But the idea seems very promising, doesn’t it? They also provide their customers with detailed stats in the service’s control panel, which, by the way, is implemented in Adobe Flex.