Netflix's Chaos Monkey [1] and the Simian Army [2] are great projects for fault-injection testing. I am interested in building my own version of Chaos Gorilla that takes down a whole availability zone (AZ) or region.
How should I approach it? AWS offers APIs to terminate and restart instances, but it's hard to automate the whole process. Any ideas would be appreciated.
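
To make the instance-level part concrete, here is a minimal sketch, assuming a boto3-style EC2 client is passed in. The function names `instances_in_az` and `take_down_az` are hypothetical, and the sketch only covers terminating running instances in one AZ. It defaults to a dry run that lists the targets without touching them.

```python
from typing import Any, List


def instances_in_az(ec2: Any, az: str) -> List[str]:
    """Return the IDs of running instances in one availability zone.

    `ec2` is assumed to be a boto3 EC2 client (or anything with the
    same get_paginator / paginate interface).
    """
    paginator = ec2.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[
            {"Name": "availability-zone", "Values": [az]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    ids: List[str] = []
    for page in pages:
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                ids.append(instance["InstanceId"])
    return ids


def take_down_az(ec2: Any, az: str, dry_run: bool = True) -> List[str]:
    """Terminate every running instance in `az` and return the target IDs.

    With dry_run=True (the default) nothing is terminated; the function
    only reports which instances would be affected.
    """
    ids = instances_in_az(ec2, az)
    if ids and not dry_run:
        ec2.terminate_instances(InstanceIds=ids)
    return ids
```

With boto3 installed, this could be called as `take_down_az(boto3.client("ec2", region_name="us-east-1"), "us-east-1a")`, where the region and AZ names are placeholders. Two caveats: an Auto Scaling group will usually launch replacements for terminated instances, possibly in the same AZ, and a real AZ outage also affects networking and managed services, which this sketch does not simulate.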
[1] https://github.com/Netflix/chaosmonkey
[2] https://medium.com/netflix-techblog/the-netflix-simian-army-16e57fbab116