Ahead of the upcoming 2nd annual re:Invent conference, inspired by Simone Brunozzi’s recent presentation at an AWS Meetup in San Francisco, and collected from a few of my recent Fluxcapacitor.com consulting engagements, I’ve compiled a list of 10 useful time- and clock-tick-saving tips about AWS.

1) Query AWS resource metadata

Can’t remember the EBS-Optimized IO throughput of your c1.xlarge cluster? How about the size limit of an S3 object on a single PUT? awsnow.info is the answer to all of your AWS-resource metadata questions. Interested in integrating awsnow.info with your application? You’re in luck. There’s now a REST API, as well!

Summary: degradation of a common Northern California peering point between the us-west-1 and us-west-2 regions led to the simultaneous loss of 2 AZs, each running in a different region. Those relying on this deployment configuration for things like HA, quorum (ZooKeeper, in PagerDuty’s example), or load balancing were hosed for those 86 minutes on April 13, 2013.

4) Use ZFS with EBS

Two great tastes come together.

Per the AWS documentation, EBS volumes can expect an annual failure rate (AFR) of between 0.1% and 0.5%, compared with an AFR of around 4% for commodity hard disks. Here, failure means a complete loss of the volume.

To protect yourself against this risk (albeit small), you can pool multiple EBS volumes within ZFS in a RAIDZ configuration. While this setup is relatively undocumented, Chip Schweiss has a great blog post - as well as some useful scripts - detailing this topic.
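As a back-of-the-envelope illustration of why pooling helps, here is a minimal sketch that compares the yearly loss probability of a single volume with that of a small single-parity pool. It assumes volume failures are independent, that a single-parity RAIDZ pool survives exactly one lost volume, and that a failed volume is never replaced during the year (which overstates the pool's risk). The function name and the 4-volume pool size are illustrative choices, not taken from the AWS documentation or from Chip's scripts.

```python
from math import comb


def prob_at_least_k_failures(n_volumes: int, afr: float, k: int) -> float:
    """P(at least k of n independent volumes fail within a year), binomial model."""
    return sum(
        comb(n_volumes, i) * afr**i * (1 - afr) ** (n_volumes - i)
        for i in range(k, n_volumes + 1)
    )


if __name__ == "__main__":
    afr = 0.005  # upper end of the 0.1%-0.5% range quoted above
    single = prob_at_least_k_failures(1, afr, 1)
    # Assumption: a single-parity pool is lost only when 2 or more volumes fail.
    pool = prob_at_least_k_failures(4, afr, 2)
    print(f"single volume:            {single:.4%}")
    print(f"4-volume single parity:   {pool:.4%}")
```

Even with the pessimistic no-replacement assumption, the pool's loss probability comes out well below that of a single volume.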

5) Stripe your RDS disks for better performance

There’s an easy trick to maximizing the performance of your MySQL and Oracle RDS instances. Initially, provision your instance with the minimum storage size possible. Then slowly increase your storage in 5GB increments. Each increment will create an additional disk stripe, which will increase IO and reduce seek time.

This, and many other optimization tips, can be found at the EX-AWS Engineer IAMA.
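The incremental resize described above can be sketched as a small planning helper. This is only an illustration: `storage_steps` and `grow_in_steps` are hypothetical names, and `resize` is a caller-supplied callback (for example, a wrapper around the RDS ModifyDBInstance call that also waits for the instance to become available again). Whether each increment really yields a new stripe is the claim from the IAMA, not something this code verifies.

```python
from typing import Callable, List


def storage_steps(start_gb: int, target_gb: int, step_gb: int = 5) -> List[int]:
    """Return the successive allocated-storage sizes from start to target."""
    if step_gb <= 0:
        raise ValueError("step_gb must be positive")
    if target_gb < start_gb:
        raise ValueError("target_gb must be >= start_gb")
    sizes = list(range(start_gb + step_gb, target_gb, step_gb))
    if target_gb > start_gb:
        sizes.append(target_gb)  # always finish exactly on the target size
    return sizes


def grow_in_steps(
    resize: Callable[[int], None], start_gb: int, target_gb: int, step_gb: int = 5
) -> None:
    """Call resize(size_gb) once per step; resize is assumed to block until done."""
    for size_gb in storage_steps(start_gb, target_gb, step_gb):
        resize(size_gb)
```

Keeping the schedule separate from the API call makes it easy to dry-run the plan, for instance with `grow_in_steps(print, 5, 30)`, before touching a real instance.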

6) Avoid noisy neighbors

In any multi-tenant, non-dedicated, virtualized environment like AWS, you will certainly experience the noisy neighbor effect measured by CPU Steal. CPU Steal is the percentage of time that your virtualized guest is involuntarily waiting for physical host CPU from another guest. Great explanations of CPU steal can be found at Stackdriver’s blog here and here.
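On Linux, steal time is exposed as the eighth counter on the aggregate `cpu` line of `/proc/stat`. The following is a minimal sketch of measuring it over an interval; the function names are made up for this example, and the one-second default interval is an arbitrary choice.

```python
import time
from typing import List


def parse_cpu_line(line: str) -> List[int]:
    """Parse the aggregate 'cpu' line of /proc/stat into integer counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def steal_percent(before: List[int], after: List[int]) -> float:
    """Percentage of elapsed CPU time reported as steal between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    if len(deltas) < 8:
        return 0.0  # very old kernels do not report a steal column
    # Counters: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already counted in user and nice, so skip them.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def sample_steal_percent(interval: float = 1.0, path: str = "/proc/stat") -> float:
    """Take two samples `interval` seconds apart and return the steal percentage."""
    with open(path) as stat_file:
        before = parse_cpu_line(stat_file.readline())
    time.sleep(interval)
    with open(path) as stat_file:
        after = parse_cpu_line(stat_file.readline())
    return steal_percent(before, after)
```

Calling `sample_steal_percent()` periodically and alerting on a sustained high value is one way to decide when it is time to move the instance.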

When using EBS-backed EC2 instances, you can flee a noisy environment by simply stopping and starting the instance. Of course, you’ll need to make sure your healthcheck policies don’t terminate the instance before it restarts. For S3-backed EC2 instances, the only option is to terminate/launch a new instance.

You can also avoid noisy neighbors by preferring larger EC2 instance types. Larger virtualized guest instances require more physical host resources - and therefore the host can support fewer tenants overall. Tenants are always of equal instance type. t1.micros are extremely noisy and bursty - similar to my old college-dorm neighbors above.

The most effective way to avoid noisy neighbors, however, is to pay extra for dedicated EC2 instances. Can’t remember how much a dedicated instance costs? Our buddies at awsnow.info have your answer.

7) Embrace CloudFormation

AWS has developed a tool called CloudFormer that allows you to create CloudFormation templates from your existing AWS resources. This tool is as slick and mysterious as CloudFormation itself.

Maintenance of CloudFormation templates is not easy. However, you can use #include-like macros on your template sources to cobble together per-resource snippets into a complete template. This facilitates template reuse and snippet comments.
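As one possible way to do this without a C preprocessor, here is a minimal sketch of an include expander. The `#include "file"` syntax, the `//` comment convention, and the function name are assumptions made for this example rather than anything CloudFormation itself understands; the expanded output is what you would hand to CloudFormation.

```python
import re
from pathlib import Path

INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"\s*$')


def expand_includes(path: Path, _seen: frozenset = frozenset()) -> str:
    """Return the text of `path` with #include lines replaced by file contents.

    Lines starting with // are treated as snippet comments and dropped,
    since plain JSON templates cannot carry comments themselves.
    """
    path = path.resolve()
    if path in _seen:
        raise ValueError(f"circular include: {path}")
    output = []
    for line in path.read_text().splitlines():
        if line.lstrip().startswith("//"):
            continue  # strip snippet comments
        match = INCLUDE_RE.match(line)
        if match:
            # Included paths are resolved relative to the including file.
            snippet = path.parent / match.group(1)
            output.append(expand_includes(snippet, _seen | {path}))
        else:
            output.append(line)
    return "\n".join(output)
```

For example, a `main.json` containing `#include "bucket.json"` inside its `Resources` block would be expanded into a single JSON document that can be validated and uploaded as usual.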

The new unified AWS CLI has support for CloudFormation templates per a recent post by AWS’s Movember-friendly Evan Brown.

Different CPU architectures enable better performance for certain workloads depending on the NUMA characteristics, cache sizes, number of cores, number of hardware threads, GPU, etc.

Virtualized environments like AWS do not guarantee a particular CPU architecture. Physical hardware is being refreshed throughout AWS data centers on a daily basis. One of AWS’s oldest regions, us-east-1, contains a few different types of CPUs including AMD Opteron, Intel Xeon, and Intel Sandy Bridge. My British colleague, Adrian Cockcroft, from Netflix loves Sandy Bridge as much as mayonnaise on his fries.

If your workload is sensitive to the physical CPU architecture, it’s best to detect it at startup (cat /proc/cpuinfo on Linux) and, if you landed on the wrong hardware, restart or terminate the instance. If you’re sensitive to cost, you can run the instance for 55 minutes, then terminate it. With AWS, you’re paying for the hour either way.
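The startup check can be sketched as follows. Note that `/proc/cpuinfo` reports a model string rather than a microarchitecture name, so the check matches on substrings of the `model name` field. The function names and the example substrings are hypothetical; you would supply the model strings your own workload prefers.

```python
from typing import Iterable, Optional


def cpu_model(cpuinfo_text: str) -> Optional[str]:
    """Return the first 'model name' value from /proc/cpuinfo text, if any."""
    for line in cpuinfo_text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "model name":
            return value.strip()
    return None


def is_preferred_cpu(cpuinfo_text: str, preferred: Iterable[str]) -> bool:
    """True if the CPU model contains any preferred substring (case-insensitive)."""
    model = cpu_model(cpuinfo_text)
    if model is None:
        return False
    return any(tag.lower() in model.lower() for tag in preferred)


def running_on_preferred_cpu(preferred: Iterable[str]) -> bool:
    """Check the local machine; only works where /proc/cpuinfo exists (Linux)."""
    with open("/proc/cpuinfo") as cpuinfo:
        return is_preferred_cpu(cpuinfo.read(), preferred)
```

A boot script could call `running_on_preferred_cpu(["E5-2670"])` (an illustrative substring) and signal the rest of your tooling to recycle the instance when it returns False.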

9) Use Virtual Private Cloud (VPC) from the start

VPC is now the default configuration for new accounts in most regions. Retrofitting a non-VPC environment to use VPC is a very time-consuming task - particularly in large deployments with lots of security groups.

There is currently no migration path from non-VPC security groups (ingress only) to VPC security groups (ingress and egress). In fact, Netflix is still running non-VPC given this lack of security-group migration.

Another benefit is that VPC enables ELBs to act as a middle-tier load balancer inside a private subnet. Without VPC, ELBs are public-facing.

A common configuration is to use ELB with HAProxy to enable more fine-grained control over the load balancing algorithms, since ELBs are black boxes and can be stickier than expected.

Reader Comments (5)

Chris, if you look for "5 things you don't know about Amazon Web Services" on Slideshare (they are actually 9!), you will notice that I have covered a few of your points, specifically ZFS, awsnow.info, and deleting buckets.

Nice details Craig! It was amazing for me to discover this post while I was writing a very similar one just yesterday :) You can check it out at http://www.elekslabs.com/2013/11/aws-10-things-youre-probably-doing.html

I'm jealous that something similar that I wrote isn't getting the same amount of attention! I should talk about what I wrote as if you are duplicating my work, then end by faintly praising your post so that I don't come across as a total dick.

Note that new instance types such as c3.* explicitly state processor architecture in AWS documentation - in this case, IvyBridge. No need to cherry pick to guarantee you can use proc-specific optimizations for these new ones.