Amazon Releases Its Own Chaos Gorilla

Posted by Jason Bloomberg on Aug 27, 2013

Millions of Netizens were forced to go outside and get some fresh air for an hour on Sunday when Amazon Web Services (AWS) experienced a brief outage, taking down sites such as Instagram and Vine, among others. The downtime affected only the Northern Virginia US-EAST datacenter, so any Cloud-based company that actually followed Amazon’s own recommendations (as well as the recommendations of any Cloud consultant worth his or her salt, including yours truly) was unaffected.

Why? Because the Cloud is not built to avoid failure. It’s built to work around failure and recover automatically from failure. If you followed the recommendations and geographically distributed your instances, along with implementing a proper Cloud architecture for delivering basic availability, then your service would have remained standing.
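To make "work around failure" concrete, here is a minimal sketch of zone-aware routing. It is a toy model, not AWS's or anyone's actual load balancer: the `Instance` class, the `route` function, and the zone names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    zone: str
    healthy: bool = True


def route(instances, preferred_zone):
    """Pick a healthy instance, preferring one zone but falling back to any other."""
    healthy = [i for i in instances if i.healthy]
    if not healthy:
        raise RuntimeError("no healthy instances in any zone")
    for inst in healthy:
        if inst.zone == preferred_zone:
            return inst
    # The preferred zone is down: work around the failure instead of giving up.
    return healthy[0]


# Hypothetical fleet spread across zones and regions.
fleet = [
    Instance("web-1", "us-east-1a"),
    Instance("web-2", "us-east-1b"),
    Instance("web-3", "us-west-2a"),
]

# Simulate losing an entire zone.
for inst in fleet:
    if inst.zone == "us-east-1a":
        inst.healthy = False

chosen = route(fleet, "us-east-1a")
print(chosen.name, chosen.zone)  # prints "web-2 us-east-1b"
```

Had every instance lived in the failed zone, `route` would have had nothing to fall back to, which is roughly the position the affected sites found themselves in.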

Netflix, for one, kept perking along, because Netflix follows this recommendation. In fact, Netflix tests its deployment on a regular basis via its Simian Army – a collection of processes and applications that routinely wreak havoc on its production environment in order to test whether the company has done things properly.

In this instance, the simian in question is the Chaos Gorilla – an application that takes down an entire AWS Availability Zone supporting the Netflix deployment. What Netflix runs on purpose, Amazon deployed accidentally – or at least, we can presume it was accidental. But maybe they should have taken down a data center on purpose, essentially running their own Chaos Gorilla. How else will AWS customers know they’ve properly architected their Cloud-based apps?
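The idea behind a Chaos Gorilla run can be sketched in a few lines. This is not Netflix's tool, only a simulation under stated assumptions: a deployment is modeled as a hypothetical mapping from zone name to instance count, and the `chaos_gorilla` function below is an illustrative name that fails each zone in turn and reports the ones whose loss would take the service down.

```python
def chaos_gorilla(deployment, min_instances=1):
    """Fail each zone in turn; return the zones whose loss would be fatal.

    `deployment` maps a zone name to the number of instances running there.
    The service is assumed to survive if at least `min_instances` remain.
    """
    fatal = []
    for zone in deployment:
        remaining = sum(n for z, n in deployment.items() if z != zone)
        if remaining < min_instances:
            fatal.append(zone)
    return fatal


# Hypothetical deployments; the zone names are illustrative only.
single_zone = {"us-east-1a": 4}
multi_zone = {"us-east-1a": 2, "us-east-1b": 2}

print(chaos_gorilla(single_zone))  # prints ['us-east-1a']
print(chaos_gorilla(multi_zone))   # prints []
```

An empty result means no single zone outage would bring the service down in this model. A real test would take actual infrastructure offline, which is the whole point of running it in production.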
