voxxed vienna 2015 fault tolerant microservices

@chbatey #Voxxed Fault tolerant microservices Christopher Batey DataStax

Upload: christopher-batey

Post on 19-Jul-2015


TRANSCRIPT

Page 1: Voxxed Vienna 2015 Fault tolerant microservices

@chbatey #Voxxed

Fault tolerant microservices

Christopher Batey

DataStax

Page 2: Voxxed Vienna 2015 Fault tolerant microservices

Who am I?
• DataStax: Technical Evangelist / Software Engineer
  - Builds an enterprise-ready version of Apache Cassandra
• Sky: building a next-generation Internet TV platform
• Lots of time working on a test double for Apache Cassandra

Page 3: Voxxed Vienna 2015 Fault tolerant microservices

Agenda
• Setting the scene
• What do we mean by a fault?
• What is a micro(ish)service?
• Monolith application vs the micro(ish)service
• A worked example
• Identify an issue
• Reproduce/test it
• Show how to deal with the issue

Page 4: Voxxed Vienna 2015 Fault tolerant microservices

So… what do applications look like?

Page 5: Voxxed Vienna 2015 Fault tolerant microservices

So... what do systems look like now?

Page 6: Voxxed Vienna 2015 Fault tolerant microservices

But different things go wrong...

[Diagram labels: down, slow network, slow app, missing packets, GC :(. SLA: 2 second max]

Page 7: Voxxed Vienna 2015 Fault tolerant microservices

Example: Movie player service

[Diagram: a Play Movie request reaches the Movie Player, which depends on UserService, DeviceService and PinService]

Page 8: Voxxed Vienna 2015 Fault tolerant microservices

Time for an example...
• All examples are on GitHub
• Technologies used:
  - Dropwizard
  - Spring Boot
  - Wiremock
  - Hystrix
  - Graphite
  - Saboteur

Page 9: Voxxed Vienna 2015 Fault tolerant microservices

Testing microservices
• You don’t know a service is fault tolerant if you don’t test faults

Page 10: Voxxed Vienna 2015 Fault tolerant microservices

Isolated service tests

[Diagram: the Play Movie acceptance test primes mocks of the User, Device and Pin services and drives the Movie service over real HTTP/TCP]

Page 11: Voxxed Vienna 2015 Fault tolerant microservices

Fault tolerance
1. Don’t take forever - Timeouts
2. Don’t try if you can’t succeed
3. Fail gracefully
4. Know if it’s your fault
5. Don’t whack a dead horse
6. Turn broken stuff off

Page 12: Voxxed Vienna 2015 Fault tolerant microservices

1 - Don’t take forever
• If at first you don’t succeed, don’t take forever to tell someone
• Timeout and fail fast

Page 13: Voxxed Vienna 2015 Fault tolerant microservices

Which timeouts?
• Socket connection timeout
• Socket read timeout

Page 14: Voxxed Vienna 2015 Fault tolerant microservices

Your service hung for 30 seconds :(

Customer

You :(

Page 15: Voxxed Vienna 2015 Fault tolerant microservices

Which timeouts?
• Socket connection timeout
• Socket read timeout
• Resource acquisition
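The three timeouts above can be sketched with nothing but the JDK. This is a minimal illustration, not code from the talk: the class name TimeoutSketch, the 200 ms values and the use of a Semaphore to stand in for a connection pool are all assumptions made for the example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
final class TimeoutSketch {

    /** Connects and reads one byte, bounding both the connect and the read. */
    static String readWithTimeouts(int port, int connectMillis, int readMillis) {
        try (Socket socket = new Socket()) {
            // Socket connection timeout: bounds the TCP handshake.
            socket.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(), port), connectMillis);
            // Socket read timeout: bounds each blocking read.
            socket.setSoTimeout(readMillis);
            int firstByte = socket.getInputStream().read();
            return firstByte < 0 ? "closed" : "data";
        } catch (SocketTimeoutException e) {
            return "timed out";
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }

    /** Resource acquisition timeout: give up if no permit (e.g. a pooled connection) frees up in time. */
    static boolean acquireWithTimeout(Semaphore pool, long waitMillis) {
        try {
            return pool.tryAcquire(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** A listening socket that never calls accept() or writes: the client connects, then waits for data. */
    static String demoSilentServer() {
        try (ServerSocket silent = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            return readWithTimeouts(silent.getLocalPort(), 200, 200);
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demoSilentServer());
        System.out.println(acquireWithTimeout(new Semaphore(0), 100));
    }
}
```

Note that the read timeout bounds each individual read, not the whole response, so a dependency that trickles data can still hold the caller far longer than the configured value.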

Page 16: Voxxed Vienna 2015 Fault tolerant microservices

Your service hung for 10 minutes :(

Page 17: Voxxed Vienna 2015 Fault tolerant microservices

Let’s think about this

Page 18: Voxxed Vienna 2015 Fault tolerant microservices

A little more detail

Page 19: Voxxed Vienna 2015 Fault tolerant microservices

Wiremock + Saboteur + Vagrant
• Vagrant - launches + provisions local VMs
• Saboteur - uses tc, iptables to simulate network issues
• Wiremock - used to mock HTTP dependencies
• Cucumber - acceptance tests

Page 20: Voxxed Vienna 2015 Fault tolerant microservices

I can write an automated test for that?

[Diagram: Wiremock (User Service, Device Service, Pin Service) and Saboteur run inside a Vagrant + VirtualBox VM; the AcceptanceTest primes Saboteur to drop traffic, exercises the MovieService, then resets]

Page 21: Voxxed Vienna 2015 Fault tolerant microservices

Implementing reliable timeouts
• Protect the container thread!
• Homemade: Worker Queue + Thread pool (executor)

Page 22: Voxxed Vienna 2015 Fault tolerant microservices

Implementing reliable timeouts
• Protect the container thread!
• Homemade: Worker Queue + Thread pool (executor)
• Hystrix
• Spring Cloud Netflix
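The "homemade" option can be sketched as follows: the dependency call runs on a worker thread and the calling (container) thread waits for at most a fixed time. This is an assumption-laden illustration rather than the talk's code; TimeLimitedCall, the pool sizes and the fallback value are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical wrapper, for illustration only.
final class TimeLimitedCall {
    private final ExecutorService workers;

    TimeLimitedCall(ExecutorService workers) {
        this.workers = workers;
    }

    /** Runs the call on a worker thread; the caller waits at most timeoutMillis. */
    String call(Callable<String> dependency, long timeoutMillis, String fallback) {
        Future<String> future = workers.submit(dependency);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so the thread can be reused
            return fallback;
        } catch (ExecutionException e) {
            return fallback;     // the dependency itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }

    /** One fast call and one slow call against a small, bounded pool. */
    static String[] demo() {
        ExecutorService workers = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<Runnable>(10));
        try {
            TimeLimitedCall guard = new TimeLimitedCall(workers);
            String fast = guard.call(() -> "Scary String", 500, "fallback");
            String slow = guard.call(() -> {
                Thread.sleep(5000); // simulated slow dependency
                return "Slow Scary String";
            }, 100, "fallback");
            return new String[] {fast, slow};
        } finally {
            workers.shutdownNow();
        }
    }

    public static void main(String[] args) {
        for (String result : demo()) {
            System.out.println(result);
        }
    }
}
```

Unlike a socket timeout, the limit here covers the whole call, whatever the dependency is doing, which is what an SLA needs.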

Page 23: Voxxed Vienna 2015 Fault tolerant microservices

A simple Spring RestController

@RestController
public class Resource {
    private static final Logger LOGGER = LoggerFactory.getLogger(Resource.class);

    @Autowired
    private ScaryDependency scaryDependency;

    @RequestMapping("/scary")
    public String callTheScaryDependency() {
        LOGGER.info("Resource later: I wonder which thread I am on!");
        return scaryDependency.getScaryString();
    }
}

Page 24: Voxxed Vienna 2015 Fault tolerant microservices

Scary dependency

@Component
public class ScaryDependency {
    private static final Logger LOGGER = LoggerFactory.getLogger(ScaryDependency.class);

    public String getScaryString() {
        LOGGER.info("Scary Dependency: I wonder which thread I am on! Tomcats?");
        if (System.currentTimeMillis() % 2 == 0) {
            return "Scary String";
        } else {
            try {
                Thread.sleep(5000); // simulate a slow dependency
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "Slow Scary String";
        }
    }
}

Page 25: Voxxed Vienna 2015 Fault tolerant microservices

All on the tomcat thread

13:47:20.200 [http-8080-exec-1] INFO info.batey.examples.Resource - Resource later: I wonder which thread I am on!
13:47:20.200 [http-8080-exec-1] INFO info.batey.examples.ScaryDependency - Scary Dependency: I wonder which thread I am on! Tomcats?

Page 26: Voxxed Vienna 2015 Fault tolerant microservices

Scary dependency

@Component
public class ScaryDependency {
    private static final Logger LOGGER = LoggerFactory.getLogger(ScaryDependency.class);

    @HystrixCommand() // run this method on a Hystrix thread instead of the Tomcat thread
    public String getScaryString() {
        LOGGER.info("Scary Dependency: I wonder which thread I am on! Tomcats?");
        if (System.currentTimeMillis() % 2 == 0) {
            return "Scary String";
        } else {
            try {
                Thread.sleep(5000); // simulate a slow dependency
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "Slow Scary String";
        }
    }
}

Page 27: Voxxed Vienna 2015 Fault tolerant microservices

What an annotation can do...

13:51:21.513 [http-8080-exec-1] INFO info.batey.examples.Resource - Resource later: I wonder which thread I am on!
13:51:21.614 [hystrix-ScaryDependency-1] INFO info.batey.examples.ScaryDependency - Scary Dependency: I wonder which thread I am on! Tomcats? :P

Page 28: Voxxed Vienna 2015 Fault tolerant microservices

Timeouts take home
● You can’t use network level timeouts for SLAs
● Test your SLAs - if someone says you can’t, hit them with a stick
● Scary things happen without network issues

Page 30: Voxxed Vienna 2015 Fault tolerant microservices

2 - Don’t try if you can’t succeed

Page 31: Voxxed Vienna 2015 Fault tolerant microservices

Complexity

“When an application grows in complexity it will eventually start sending emails”

Page 32: Voxxed Vienna 2015 Fault tolerant microservices

Complexity

“When an application grows in complexity it will eventually start using queues and thread pools”

Or use Akka :)

Page 33: Voxxed Vienna 2015 Fault tolerant microservices

Don’t try if you can’t succeed

Page 34: Voxxed Vienna 2015 Fault tolerant microservices

Don’t try if you can’t succeed
• Executors with unbounded queues :(
  - newFixedThreadPool
  - newSingleThreadExecutor
  - newCachedThreadPool (unbounded threads rather than an unbounded queue)
• Bound your queues and threads
• Fail quickly when the queue / maxPoolSize is met
• Know your drivers
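To make the point about unbounded executors concrete, the sketch below inspects what the JDK factory methods hand back and contrasts them with an explicitly bounded pool. It is only an illustration: BoundedPools is a hypothetical name, and the casts to ThreadPoolExecutor rely on how current JDKs implement newFixedThreadPool and newCachedThreadPool rather than on a documented guarantee.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
final class BoundedPools {

    /** Threads and queue are both capped; AbortPolicy throws RejectedExecutionException when full. */
    static ThreadPoolExecutor bounded(int threads, int queueSize) {
        return new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize),
                new ThreadPoolExecutor.AbortPolicy());
    }

    /** How many more tasks the work queue behind newFixedThreadPool will accept. */
    static int fixedPoolQueueCapacity() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // Assumption: the returned object is a ThreadPoolExecutor (an implementation detail).
            return ((ThreadPoolExecutor) pool).getQueue().remainingCapacity();
        } finally {
            pool.shutdown();
        }
    }

    /** The thread ceiling behind newCachedThreadPool. */
    static int cachedPoolMaxThreads() {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // Assumption: the returned object is a ThreadPoolExecutor (an implementation detail).
            return ((ThreadPoolExecutor) pool).getMaximumPoolSize();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("fixed pool queue capacity: " + fixedPoolQueueCapacity());
        System.out.println("cached pool max threads:   " + cachedPoolMaxThreads());
        ThreadPoolExecutor pool = bounded(2, 5);
        System.out.println("bounded queue capacity:    " + pool.getQueue().remainingCapacity());
        pool.shutdown();
    }
}
```

With the factory methods, work piles up in memory instead of being refused; with the bounded pool, a caller gets an immediate rejection it can turn into a fast failure.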

Page 35: Voxxed Vienna 2015 Fault tolerant microservices

This is a functional requirement
• Set the timeout very high
• Use Wiremock to add a large delay to the requests

Page 36: Voxxed Vienna 2015 Fault tolerant microservices

This is a functional requirement
• Set the timeout very high
• Use Wiremock to add a large delay to the requests
• Set queue size and thread pool size to 1
• Send in 2 requests to use the thread and fill the queue
• What happens on the 3rd request?
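The same experiment can be tried in-process before doing it over HTTP. In this sketch a CountDownLatch stands in for the delayed Wiremock response, and a plain ThreadPoolExecutor stands in for the service's worker pool; both substitutions, and the name ThirdRequestScenario, are assumptions made for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical scenario, for illustration only.
final class ThirdRequestScenario {

    /** Submits three slow tasks to a pool with one thread and one queue slot; reports each outcome. */
    static String[] run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<Runnable>(1));
        CountDownLatch release = new CountDownLatch(1); // stands in for the delayed dependency
        String[] outcomes = new String[3];
        try {
            for (int i = 0; i < 3; i++) {
                try {
                    pool.execute(() -> awaitQuietly(release));
                    outcomes[i] = "accepted";
                } catch (RejectedExecutionException e) {
                    outcomes[i] = "rejected"; // thread busy and queue full: fail immediately
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return outcomes;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        for (String outcome : run()) {
            System.out.println(outcome);
        }
    }
}
```

The first request occupies the only thread, the second fills the only queue slot, and the third is refused straight away rather than waiting behind work that cannot complete in time.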

Page 38: Voxxed Vienna 2015 Fault tolerant microservices

3 - Fail gracefully

Page 39: Voxxed Vienna 2015 Fault tolerant microservices

Expect rubbish
• Expect invalid HTTP
• Expect malformed response bodies
• Expect connection failures
• Expect huge / tiny responses
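One way to act on this list is to validate a response strictly before trusting it. The sketch below is an assumption-based illustration for a pin check that should answer with a literal true or false: DefensiveParsing, the 16 character limit and the use of Optional are all hypothetical choices. It matters because Boolean.valueOf, used in the talk's later PinCheckDependency, quietly turns any unrecognised body into false.

```java
import java.util.Optional;

// Hypothetical helper, for illustration only.
final class DefensiveParsing {
    private static final int MAX_BODY_CHARS = 16; // assumption: a pin check body is tiny

    /** Accepts only HTTP 200 with a body of "true" or "false"; anything else is treated as rubbish. */
    static Optional<Boolean> parsePinCheck(int statusCode, String body) {
        if (statusCode != 200 || body == null || body.length() > MAX_BODY_CHARS) {
            return Optional.empty();
        }
        String trimmed = body.trim();
        if (trimmed.equalsIgnoreCase("true")) {
            return Optional.of(Boolean.TRUE);
        }
        if (trimmed.equalsIgnoreCase("false")) {
            return Optional.of(Boolean.FALSE);
        }
        return Optional.empty(); // malformed body: let the caller pick a fallback
    }

    public static void main(String[] args) {
        System.out.println(parsePinCheck(200, "true"));
        System.out.println(parsePinCheck(200, "<html>Bad Gateway</html>"));
        System.out.println(parsePinCheck(500, "true"));
    }
}
```

Returning an empty Optional forces the caller to decide what a rubbish response means, instead of silently treating it as a failed pin check.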

Page 40: Voxxed Vienna 2015 Fault tolerant microservices

Testing with Wiremock

stubFor(get(urlEqualTo("/dependencyPath"))
    .willReturn(aResponse()
        .withFault(Fault.MALFORMED_RESPONSE_CHUNK)));

{
  "request": { "method": "GET", "url": "/fault" },
  "response": { "fault": "RANDOM_DATA_THEN_CLOSE" }
}

{
  "request": { "method": "GET", "url": "/fault" },
  "response": { "fault": "EMPTY_RESPONSE" }
}

Page 41: Voxxed Vienna 2015 Fault tolerant microservices

Stubbed Cassandra

Page 43: Voxxed Vienna 2015 Fault tolerant microservices

4 - Know if it’s your fault

Page 44: Voxxed Vienna 2015 Fault tolerant microservices

Record stuff
• Metrics:
  - Timings
  - Errors
  - Concurrent incoming requests
  - Thread pool statistics
  - Connection pool statistics
• Logging: boundary logging, ElasticSearch / Logstash
• Request identifiers
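As a rough idea of what recording timings and errors around a dependency call involves, here is a JDK-only sketch. It is not the Codahale metrics library the talk uses; CallRecorder and its methods are hypothetical names, and a real setup would publish these numbers to something like Graphite rather than keep them in memory.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical recorder, for illustration only.
final class CallRecorder {
    private final LongAdder calls = new LongAdder();
    private final LongAdder errors = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    /** Times the call and counts it, whether it succeeds or throws. */
    <T> T record(Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } catch (RuntimeException e) {
            errors.increment();
            throw e; // recording must not hide the failure
        } finally {
            calls.increment();
            totalNanos.add(System.nanoTime() - start);
        }
    }

    long calls() {
        return calls.sum();
    }

    long errors() {
        return errors.sum();
    }

    /** Mean duration in milliseconds, or 0 if nothing has been recorded. */
    double meanMillis() {
        long n = calls.sum();
        return n == 0 ? 0.0 : totalNanos.sum() / 1_000_000.0 / n;
    }

    public static void main(String[] args) {
        CallRecorder recorder = new CallRecorder();
        recorder.record(() -> "ok");
        try {
            recorder.record(() -> {
                throw new IllegalStateException("dependency down");
            });
        } catch (IllegalStateException expected) {
            // the error is counted and still reaches the caller
        }
        System.out.println(recorder.calls() + " calls, " + recorder.errors() + " errors");
    }
}
```

Keeping one recorder per dependency is what lets you tell whether a slow response was your fault or someone else's.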

Page 45: Voxxed Vienna 2015 Fault tolerant microservices

Graphite + Codahale

Page 46: Voxxed Vienna 2015 Fault tolerant microservices

Response times

Page 47: Voxxed Vienna 2015 Fault tolerant microservices

Separate resource pools
• Don’t flood your dependencies
• Be able to answer the questions:
  - How many connections will you make to dependency X?
  - Are you getting close to your max connections?
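A small sketch of the idea, assuming one Semaphore per dependency as the "pool": each dependency gets its own cap, a call is refused at once when that cap is reached, and the current usage can be read off to answer the two questions above. DependencyPools and its method names are hypothetical; Hystrix achieves a similar isolation with a thread pool per command group.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical registry, for illustration only.
final class DependencyPools {
    private final Map<String, Semaphore> permits = new ConcurrentHashMap<>();
    private final Map<String, Integer> limits = new ConcurrentHashMap<>();

    /** Declares how many concurrent calls a dependency may receive from this service. */
    void register(String dependency, int maxConcurrent) {
        limits.put(dependency, maxConcurrent);
        permits.put(dependency, new Semaphore(maxConcurrent));
    }

    /** Runs the call only if the dependency's own pool has a free slot; otherwise returns the fallback at once. */
    <T> T call(String dependency, Supplier<T> call, T fallback) {
        Semaphore pool = permits.get(dependency);
        if (pool == null) {
            throw new IllegalArgumentException("unknown dependency: " + dependency);
        }
        if (!pool.tryAcquire()) {
            return fallback;
        }
        try {
            return call.get();
        } finally {
            pool.release();
        }
    }

    /** How many calls to this dependency are in flight right now. */
    int inUse(String dependency) {
        return limits.get(dependency) - permits.get(dependency).availablePermits();
    }

    public static void main(String[] args) {
        DependencyPools pools = new DependencyPools();
        pools.register("pin", 1);
        pools.register("user", 1);
        // While the outer pin call holds the only pin slot, a second pin call is refused...
        System.out.println(pools.call("pin", () -> pools.call("pin", () -> "inner", "rejected"), "outer-rejected"));
        // ...but the user service has its own pool and is unaffected.
        System.out.println(pools.call("pin", () -> pools.call("user", () -> "user-ok", "rejected"), "outer-rejected"));
    }
}
```

Because the pools are separate, a saturated dependency cannot starve calls to the others.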

Page 48: Voxxed Vienna 2015 Fault tolerant microservices

So easy with Dropwizard + Hystrix

metrics:
  reporters:
    - type: graphite
      host: 192.168.10.120
      port: 2003
      prefix: shiny_app

@Override
public void initialize(Bootstrap<AppConfig> appConfigBootstrap) {
    HystrixCodaHaleMetricsPublisher metricsPublisher =
        new HystrixCodaHaleMetricsPublisher(appConfigBootstrap.getMetricRegistry());
    HystrixPlugins.getInstance().registerMetricsPublisher(metricsPublisher);
}

Page 50: Voxxed Vienna 2015 Fault tolerant microservices

5 - Don’t whack a dead horse

[Diagram: a Play Movie request reaches the Movie Player, which depends on UserService, DeviceService and PinService]

Page 51: Voxxed Vienna 2015 Fault tolerant microservices

What to do…
• Yes this will happen…
• Mandatory dependency - fail *really* fast
• Throttling
• Fallbacks

Page 52: Voxxed Vienna 2015 Fault tolerant microservices

Circuit breaker pattern

Page 53: Voxxed Vienna 2015 Fault tolerant microservices

Implementation with Hystrix

@Path("integrate")
public class IntegrationResource {
    private static final Logger LOGGER = LoggerFactory.getLogger(IntegrationResource.class);

    // userService, deviceService and pinService are fields declared elsewhere (not shown on the slide)

    @GET
    @Timed
    public String integrate() {
        LOGGER.info("integrate");
        // Each dependency call is wrapped in its own Hystrix command
        String user = new UserServiceDependency(userService).execute();
        String device = new DeviceServiceDependency(deviceService).execute();
        Boolean pinCheck = new PinCheckDependency(pinService).execute();
        return String.format("[User info: %s] \n[Device info: %s] \n[Pin check: %s] \n", user, device, pinCheck);
    }
}

Page 54: Voxxed Vienna 2015 Fault tolerant microservices

Implementation with Hystrix

public class PinCheckDependency extends HystrixCommand<Boolean> {
    private HttpClient httpClient;

    public PinCheckDependency(HttpClient httpClient) {
        super(HystrixCommandGroupKey.Factory.asKey("PinCheckService"));
        this.httpClient = httpClient;
    }

    @Override
    protected Boolean run() throws Exception {
        HttpGet pinCheck = new HttpGet("http://localhost:9090/pincheck");
        HttpResponse pinCheckResponse = httpClient.execute(pinCheck);
        int statusCode = pinCheckResponse.getStatusLine().getStatusCode();
        if (statusCode != 200) {
            throw new RuntimeException("Oh dear no pin check, status code " + statusCode);
        }
        String pinCheckInfo = EntityUtils.toString(pinCheckResponse.getEntity());
        return Boolean.valueOf(pinCheckInfo);
    }
}

Page 55: Voxxed Vienna 2015 Fault tolerant microservices

Implementation with Hystrix

public class PinCheckDependency extends HystrixCommand<Boolean> {
    private HttpClient httpClient;

    public PinCheckDependency(HttpClient httpClient) {
        super(HystrixCommandGroupKey.Factory.asKey("PinCheckService"));
        this.httpClient = httpClient;
    }

    @Override
    protected Boolean run() throws Exception {
        HttpGet pinCheck = new HttpGet("http://localhost:9090/pincheck");
        HttpResponse pinCheckResponse = httpClient.execute(pinCheck);
        int statusCode = pinCheckResponse.getStatusLine().getStatusCode();
        if (statusCode != 200) {
            throw new RuntimeException("Oh dear no pin check, status code " + statusCode);
        }
        String pinCheckInfo = EntityUtils.toString(pinCheckResponse.getEntity());
        return Boolean.valueOf(pinCheckInfo);
    }

    // Used when run() fails, times out, is rejected, or the circuit is open
    @Override
    public Boolean getFallback() {
        return true;
    }
}

Page 56: Voxxed Vienna 2015 Fault tolerant microservices

Triggering the fallback
• Error threshold percentage
• Bucket of time for the percentage
• Minimum number of requests to trigger
• Time before trying a request again
• Disable
• Per instance statistics
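How the first four settings interact can be shown with a deliberately small circuit breaker. This is a sketch under stated assumptions, not Hystrix's implementation: SimpleCircuitBreaker is a hypothetical class, it uses a single fixed window instead of rolling buckets, it takes an injectable clock so the behaviour can be tested without sleeping, and it is synchronized for brevity, which serialises calls.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical breaker, for illustration only.
final class SimpleCircuitBreaker {
    private final int errorThresholdPercent; // error threshold percentage
    private final int minimumRequests;       // minimum number of requests to trigger
    private final long windowMillis;         // bucket of time for the percentage
    private final long sleepMillis;          // time before trying a request again
    private final LongSupplier clock;        // injectable so tests control time

    private long windowStart;
    private int requests;
    private int failures;
    private long openedAt = -1;              // -1 means the circuit is closed

    SimpleCircuitBreaker(int errorThresholdPercent, int minimumRequests,
                         long windowMillis, long sleepMillis, LongSupplier clock) {
        this.errorThresholdPercent = errorThresholdPercent;
        this.minimumRequests = minimumRequests;
        this.windowMillis = windowMillis;
        this.sleepMillis = sleepMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    synchronized <T> T execute(Supplier<T> call, T fallback) {
        long now = clock.getAsLong();
        boolean open = openedAt >= 0;
        if (open && now - openedAt < sleepMillis) {
            return fallback;                 // open: do not touch the dependency
        }
        if (now - windowStart >= windowMillis) {
            resetWindow(now);
        }
        requests++;
        try {
            T result = call.get();
            if (open) {                      // trial request succeeded: close again
                openedAt = -1;
                resetWindow(now);
            }
            return result;
        } catch (RuntimeException e) {
            failures++;
            boolean tripped = requests >= minimumRequests
                    && failures * 100 >= errorThresholdPercent * requests;
            if (open || tripped) {
                openedAt = now;              // open, or re-open after a failed trial
            }
            return fallback;
        }
    }

    synchronized boolean isOpen() {
        return openedAt >= 0;
    }

    private void resetWindow(long now) {
        windowStart = now;
        requests = 0;
        failures = 0;
    }
}
```

A usage example: `new SimpleCircuitBreaker(50, 4, 10_000, 5_000, System::currentTimeMillis)` opens once at least 4 requests in a 10 second window have been made and half or more failed, then lets a single trial request through after 5 seconds.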

Page 58: Voxxed Vienna 2015 Fault tolerant microservices

6 - Turn off broken stuff
• The kill switch
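A kill switch can be as small as a flag checked before each call. The sketch below is a hypothetical illustration (KillSwitch and its methods are not from the talk); in practice the flag would be flipped through an admin endpoint or a dynamic configuration property rather than from code.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical switch, for illustration only.
final class KillSwitch {
    private final AtomicBoolean disabled = new AtomicBoolean(false);

    void turnOff() {
        disabled.set(true);
    }

    void turnOn() {
        disabled.set(false);
    }

    /** Skips the call entirely and answers with the fallback while the feature is turned off. */
    <T> T guard(Supplier<T> call, T fallback) {
        return disabled.get() ? fallback : call.get();
    }

    public static void main(String[] args) {
        KillSwitch pinCheck = new KillSwitch();
        System.out.println(pinCheck.guard(() -> "checked", "skipped"));
        pinCheck.turnOff();
        System.out.println(pinCheck.guard(() -> "checked", "skipped"));
    }
}
```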

Page 59: Voxxed Vienna 2015 Fault tolerant microservices

To recap
1. Don’t take forever - Timeouts
2. Don’t try if you can’t succeed
3. Fail gracefully
4. Know if it’s your fault
5. Don’t whack a dead horse
6. Turn broken stuff off

Page 60: Voxxed Vienna 2015 Fault tolerant microservices

Links
• Examples:
  - https://github.com/chbatey/spring-cloud-example
  - https://github.com/chbatey/dropwizard-hystrix
  - https://github.com/chbatey/vagrant-wiremock-saboteur
• Tech:
  - https://github.com/Netflix/Hystrix
  - https://www.vagrantup.com/
  - http://wiremock.org/
  - https://github.com/tomakehurst/saboteur

Page 61: Voxxed Vienna 2015 Fault tolerant microservices

Questions?

Thanks for listening!
Questions: @chbatey

http://christopher-batey.blogspot.co.uk/

Page 62: Voxxed Vienna 2015 Fault tolerant microservices

Developer takeaways
● Learn about TCP
● Love Vagrant, Docker etc. to enable testing
● Don’t trust libraries

Page 63: Voxxed Vienna 2015 Fault tolerant microservices

Hystrix cost - do this yourself

Page 64: Voxxed Vienna 2015 Fault tolerant microservices

Hystrix metrics
● Failure count
● Percentiles from Hystrix point of view
● Error percentages

Page 65: Voxxed Vienna 2015 Fault tolerant microservices

How to test metric publishing?
● Stub out Graphite and verify calls?
● Programmatically call Graphite and verify numbers?
● Make metrics + logs part of the story demo