Friday, March 03, 2017

cloud: AWS typo brings down internet

Summary of the Amazon S3 Service Disruption in the Northern Virginia (US-EAST-1) Region

"an authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process. Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended... Removing a significant portion of the capacity caused each of these systems to require a full restart. While these subsystems were being restarted, S3 was unable to service requests..."