Lord of chaos

Becoming a Chaos Engineer

BEKK Radar

The simple act of logging

Application
|> Logging framework
|> Local RabbitMQ instance
|> Federated downstream RabbitMQ instance
|> Logstash Instance One
|> File ( |> Backup on fileshare )
|> Logstash Instance Two
|> ElasticSearch
|> Kibana
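
A rough back-of-the-envelope sketch of why a chain like this is fragile. It assumes that every hop is required, that hops fail independently, and that each hop has the same availability; the 99% figure is an invented illustration, not a measurement of any of the products named above.

```python
def chain_availability(per_hop: float, hops: int) -> float:
    """Availability of a serial chain where every hop must be up.

    Assumes independent failures and identical per-hop availability.
    """
    return per_hop ** hops


# Eight hops after the application in the pipeline above
# (the file stage with its backup is counted as a single hop).
HOPS = 8

if __name__ == "__main__":
    for per_hop in (0.99, 0.999):
        total = chain_availability(per_hop, HOPS)
        print(f"per hop {per_hop:.3f} -> whole chain {total:.4f}")
```

Under these assumptions, eight hops at 99% each leave the whole chain at roughly 92%, so a log line is lost or delayed far more often than any single component's numbers suggest.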

Distributed systems are complex

DON'T ROLL YOUR OWN!

Putting components together is hard

Kafka Bug

https://aphyr.com/posts/293-call-me-maybe-kafka
https://people.eecs.berkeley.edu/~palvaro/molly.pdf
http://blog.empathybox.com/post/62279088548/a-few-notes-on-kafka-and-jepsen

CAP Theorem

  • Consistency
  • Availability
  • Partition Tolerance

Even Google goes down!

Google Compute Engine Incident

What is Chaos Engineering?

Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production.

Embrace Chaos

http://principlesofchaos.org/

Principles

  1. Define ‘steady state’
  2. Hypothesize that this steady state continues during failure
  3. Introduce failures
  4. Try to disprove the hypothesis
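
The four steps above can be sketched as a small experiment loop. This is a toy illustration, not how any particular chaos tool works: `run_experiment`, the in-memory `replicas` dictionary, the success-rate metric and the 5% tolerance are all hypothetical stand-ins for a real system and a real steady-state measurement.

```python
import random
from typing import Callable


def run_experiment(
    call_service: Callable[[], bool],
    inject_failure: Callable[[], None],
    restore: Callable[[], None],
    samples: int = 100,
    tolerance: float = 0.05,
) -> bool:
    """Return True if the steady-state hypothesis survived the failure."""

    def success_rate() -> float:
        return sum(call_service() for _ in range(samples)) / samples

    baseline = success_rate()      # 1. define steady state
    inject_failure()               # 3. introduce a failure
    try:
        during = success_rate()    # 4. try to disprove the hypothesis
    finally:
        restore()                  # always undo the injected failure
    # 2. hypothesis: the steady state continues (within tolerance)
    return baseline - during <= tolerance


# Toy system: the service answers as long as any replica is up.
replicas = {"a": True, "b": True}


def call_service() -> bool:
    return any(replicas.values())


def kill_one_replica() -> None:
    replicas[random.choice(sorted(replicas))] = False


def restore_replicas() -> None:
    for name in replicas:
        replicas[name] = True


if __name__ == "__main__":
    held = run_experiment(call_service, kill_one_replica, restore_replicas)
    print("hypothesis held" if held else "hypothesis disproved")
```

With two replicas the hypothesis holds when one is killed; with a single replica the same experiment disproves it, which is exactly the kind of weakness the experiment is meant to surface.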

Advanced Principles

  1. Build a Hypothesis around Steady State Behavior
  2. Vary Real-world Events
  3. Run Experiments in Production
  4. Automate Experiments to Run Continuously

Lineage-driven Fault Injection

Lineage-driven Fault Injection Paper

QCon 2016: Monkeys in Lab Coats: Applying Failure Testing Research @Netflix

Chaos Engineering in the wild

Netflix - The Simian Army

https://github.com/Netflix/SimianArmy

Microsoft - Azure Search

https://azure.microsoft.com/nb-no/blog/inside-azure-search-chaos-engineering/

WazMonkey

https://github.com/smarx/WazMonkey

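// excerpt: instances, req and cert are set up earlier in the program

// choose one at random
var instance = instances[new Random().Next(instances.Length)];

// reboot it
req = (HttpWebRequest)WebRequest.Create("CloudProviderApi.com", ...);
req.Method = "POST";
req.ContentLength = 0;
req.Headers["x-ms-version"] = "2012-03-01";
req.ClientCertificates.Add(cert);

// make sure the response was "accepted"
var response = (HttpWebResponse)req.GetResponse();
if (response.StatusCode != HttpStatusCode.Accepted)
{
    throw new InvalidOperationException("Unexpected status: " + response.StatusCode);
}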

Jepsen

https://aphyr.com/tags/jepsen

How to start

Predictable outcomes

let yourApplication request =
    Process request
    |> response
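
One possible reading of the slide above, sketched in Python: if the application is a function from request to response, a first chaos test can inject a dependency failure and check that the response is still one of a small set of known outcomes. The names `your_application`, `healthy_lookup` and `broken_lookup`, and the 200/503 responses, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Tuple

Response = Tuple[int, str]


def your_application(request: str, lookup: Callable[[str], str]) -> Response:
    """Process a request and return one of a small set of known outcomes."""
    try:
        return (200, lookup(request))
    except ConnectionError:
        # Dependency is down: degrade to a known, documented response.
        return (503, "temporarily unavailable")


def healthy_lookup(key: str) -> str:
    """Stand-in for a dependency that is working normally."""
    return f"value-for-{key}"


def broken_lookup(key: str) -> str:
    """Stand-in for the same dependency with a failure injected."""
    raise ConnectionError("injected failure")


if __name__ == "__main__":
    print(your_application("42", healthy_lookup))
    print(your_application("42", broken_lookup))
```

Passing the dependency in as a parameter keeps the failure injection outside the application code, so the same function can be exercised with and without the fault.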

Selective test cases

Let the Lord of Chaos Rule

Questions?


twitter: @nikolaiii
slides: https://nikolaia.github.io/lord-of-chaos-slides/