I'm looking to optimize the cost of our auto-scaling EC2 groups by having them launch spot instances instead of on-demand instances.

What I really want is to be able to keep some servers in the group as on-demand instances, regardless of what happens to the spot instance pricing market. Then I want any additional servers in the group, above my configured minimum, to be spot instances. I'm generally OK with the delay in adding servers via spot requests.

I can't seem to find any way to do this and I've tried to scour the AWS documentation. It appears that an ASG can either be on-demand or spot, but not a hybrid.

I could possibly manually add an on-demand instance to the Elastic Load Balancer assigned to the auto-scaling group, but then the load of that server would not be factored into the auto-scaling measurements and triggers.

I suppose I could enter a ridiculously high bid price in order to ensure that I always get the servers I need, but then I look at the pricing history and see occasional large spikes.

The AWS documentation is at odds with itself, since in one place it says that if you enter a server minimum, that number is "ensured" to be there. But then when you read about spot instances, there are no assurances. The price differential for spot is compelling, so I'd like to leverage that as much as I can while still maintaining an always-on baseline. Is this possible?

2 Answers

Unfortunately, this hybrid Auto Scaling approach indeed doesn't seem to be available out of the box.

However, you might be able to work around this limitation as follows (untested, just a system design I've been juggling around for a while):

Potential Workaround

As outlined in Using Auto Scaling to Launch Spot Instances, the spot price bid is a parameter of the Launch Configuration in use. As you pointed out, there is no hybrid launch configuration available, rather it must be either on-demand or spot, which means the use case requires two different launch configurations.

This doesn't seem to help right away, because you can attach only one launch configuration to an Auto Scaling group at a time, with the following (partially outdated) constraints (see Launch Configuration):

When you attach a new or updated launch configuration to your Auto
Scaling group, any new instances will be launched using the new
configuration parameters. Existing instances are not affected. When
Auto Scaling needs to scale down, it first terminates instances that
have an older launch configuration. [emphasis mine]

The emphasized parts are key, though. The former covers the requirement to keep the on-demand instances running after switching from the initial on-demand launch configuration to the additional spot launch configuration. The latter is not necessarily the case anymore, thanks to the recently introduced Auto Scaling Termination Policies (for a change, there hasn't been the usual fanfare via an accompanying AWS blog post), documented in Instance Termination Policy for Your Auto Scaling Group:

Before Auto Scaling selects an instance to terminate, it first
identifies the Availability Zone that has more instances than the
other Availability Zones used by the group. If all Availability Zones
have the same number of instances, it identifies a random Availability
Zone. Within the identified Availability Zone, Auto Scaling uses the
termination policy to select the instance for termination. [emphasis mine]

As outlined in How Your Termination Policy Works, you can now specify NewestInstance, if you want the last launched instance to be terminated, which would be one of the more recently launched spot instances:

Auto Scaling uses the instance launch time to identify the instance
that was launched last.
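The swap described above could be sketched roughly as follows, assuming Python and a boto3-style Auto Scaling client. The helper names, the group and launch configuration names, and the AMI ID are all hypothetical, and like the design itself this is untested against a live account. The helpers only build the request parameters, so nothing here talks to AWS on its own:

```python
def build_spot_launch_config_request(name, image_id, instance_type, max_bid):
    """Parameters for a launch configuration that requests spot instances.

    Setting SpotPrice is what turns a launch configuration into a spot one;
    the API expects the bid as a string.
    """
    return {
        "LaunchConfigurationName": name,
        "ImageId": image_id,
        "InstanceType": instance_type,
        "SpotPrice": f"{max_bid:.3f}",
    }


def build_spot_switch_request(group_name, spot_launch_config):
    """Parameters that point an existing group at the spot launch configuration.

    Instances already running from the on-demand launch configuration are
    not affected; NewestInstance makes scale-in prefer the most recently
    launched (i.e. spot) instances.
    """
    return {
        "AutoScalingGroupName": group_name,
        "LaunchConfigurationName": spot_launch_config,
        "TerminationPolicies": ["NewestInstance"],
    }


# Hypothetical usage, once the on-demand baseline is up and running:
#   import boto3
#   client = boto3.client("autoscaling")
#   client.create_launch_configuration(
#       **build_spot_launch_config_request("web-spot", "ami-12345678", "m1.small", 0.05))
#   client.update_auto_scaling_group(
#       **build_spot_switch_request("web-asg", "web-spot"))
```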

Obviously there is a bit more to this (e.g. you can either specify a single policy on its own or list multiple policies in order of precedence), but this approach should ensure that the load of all instances is factored into the auto-scaling measurements and triggers. One caveat remains, though:

Caveat

If Auto Scaling terminates one of the on-demand instances for any other reason (e.g. because a health check has marked it unhealthy), it won't automatically be replaced by another on-demand instance, since the spot launch configuration is the active one by then. So you'd need to monitor and account for this event separately, e.g. by temporarily activating the on-demand launch configuration again.
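One possible way to detect that situation, sketched in Python under the assumption that the group description comes from a boto3-style describe_auto_scaling_groups call (the function name and the launch configuration names are hypothetical): count the in-service instances that were launched from the on-demand launch configuration and compare that against the desired baseline.

```python
def on_demand_shortfall(group, on_demand_launch_config, baseline):
    """Return how many on-demand instances are missing from the baseline.

    `group` is assumed to be one entry of the "AutoScalingGroups" list in a
    describe_auto_scaling_groups response. Each instance entry records the
    launch configuration it was started from, which is used here to tell
    on-demand instances apart from spot ones.
    """
    healthy_on_demand = [
        inst for inst in group.get("Instances", [])
        if inst.get("LaunchConfigurationName") == on_demand_launch_config
        and inst.get("LifecycleState") == "InService"
        and inst.get("HealthStatus") == "Healthy"
    ]
    return max(0, baseline - len(healthy_on_demand))
```

A non-zero result would be the signal to switch the group back to the on-demand launch configuration until the baseline has been restored.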

This makes sense - great detective work. There's still an outage risk, but it appears you've uncovered several new ways to reduce that risk. Hopefully someday we will have a simple checkbox for ASGs, "Instances at-or-below the server minimum are On-Demand instances." Thanks!
– platforms, Nov 15 '12 at 18:09

The approach discussed above would be a little messy and not so flexible. The more canonical approach is to just create two ASGs (one for spot, one for on-demand) and then register them both with the same ELB (discussed here). This gives you the ability to control each independently rather than trying to muck with LC swaps in a single ASG.
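A minimal sketch of that layout in Python, assuming a boto3-style client and a classic ELB (the function name, group names and sizes are hypothetical): the on-demand group is pinned at the baseline, while the spot group starts at zero and carries all the scaling.

```python
def build_group_requests(elb_name, zones, on_demand_lc, spot_lc, baseline, max_spot):
    """Build create_auto_scaling_group parameters for two groups on one ELB."""
    on_demand = {
        "AutoScalingGroupName": "web-on-demand",  # hypothetical name
        "LaunchConfigurationName": on_demand_lc,
        # Fixed size: the always-on baseline never scales in or out.
        "MinSize": baseline,
        "MaxSize": baseline,
        "AvailabilityZones": list(zones),
        "LoadBalancerNames": [elb_name],
    }
    spot = {
        "AutoScalingGroupName": "web-spot",  # hypothetical name
        "LaunchConfigurationName": spot_lc,
        # Everything above the baseline comes from spot capacity.
        "MinSize": 0,
        "MaxSize": max_spot,
        "AvailabilityZones": list(zones),
        "LoadBalancerNames": [elb_name],
    }
    return on_demand, spot


# Hypothetical usage:
#   import boto3
#   client = boto3.client("autoscaling")
#   for request in build_group_requests("web-elb", ["us-east-1a", "us-east-1b"],
#                                       "web-od", "web-spot", 2, 10):
#       client.create_auto_scaling_group(**request)
```

Scaling policies and alarms would then be attached to the spot group only, so a spike in the spot market can never take the on-demand baseline with it.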