how netflix does microservices

How Netflix does Microservices ... Manuel Correa

Upload: manuel-correa

Post on 22-Jul-2015


Microservices

“Small Autonomous Services that Work Together” (Sam Newman)

Microservices: “Conway’s Law”

“Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.”

Microservices Principles

http://www.slideshare.net/spnewman/principles-of-microservices-ndc-2014

- Modeled around Business Domain
- Culture of Automation
- Hide Implementation Details
- Decentralize All Things
- Design for Failure
- Highly Observable
- Deploy Independently

Culture of Automation

- Immutable infrastructure in AWS

Decentralize All Things: Hide Implementation Details

Agree on:

- Routing
- Contracts
- Resiliency
- Discovery
- How services work together

Free for all:

- NodeJS
- Ruby
- Clojure

Decentralize All Things: Smart Endpoints and Dumb Pipes

- Dynamic routing
- Gateway for all Netflix services
- Pluggable system that takes care of:
  - Authorization and authentication
  - Monitoring and tracking requests
  - Load shedding
  - First level of resilience
  - Caching at the gateway level
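The pluggable gateway described above can be pictured as a chain of filters that every request passes through before it is routed to a backend. The following is a minimal sketch in Python under that assumption, not the API of any Netflix library: `Request`, `Response`, `Gateway`, `auth_filter`, `LoadSheddingFilter`, and the route table are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Request:
    path: str
    headers: Dict[str, str] = field(default_factory=dict)


@dataclass
class Response:
    status: int
    body: str


# A filter returns a Response to stop the request early, or None to continue.
Filter = Callable[[Request], Optional[Response]]


def auth_filter(request: Request) -> Optional[Response]:
    # Hypothetical check: any non-empty Authorization header is accepted.
    if not request.headers.get("Authorization"):
        return Response(401, "unauthorized")
    return None


class LoadSheddingFilter:
    """Rejects requests once an assumed in-flight limit is reached."""

    def __init__(self, max_in_flight: int) -> None:
        self.max_in_flight = max_in_flight
        # A real server loop would maintain this counter; here it is set by hand.
        self.in_flight = 0

    def __call__(self, request: Request) -> Optional[Response]:
        if self.in_flight >= self.max_in_flight:
            return Response(503, "shedding load")
        return None


class Gateway:
    def __init__(
        self,
        filters: List[Filter],
        routes: Dict[str, Callable[[Request], Response]],
    ) -> None:
        self.filters = filters
        self.routes = routes  # path prefix -> backend handler

    def handle(self, request: Request) -> Response:
        # Run every filter first; the first one that answers wins.
        for request_filter in self.filters:
            early = request_filter(request)
            if early is not None:
                return early
        # Then route by path prefix to the matching backend.
        for prefix, backend in self.routes.items():
            if request.path.startswith(prefix):
                return backend(request)
        return Response(404, "no route")


gateway = Gateway(
    filters=[auth_filter, LoadSheddingFilter(max_in_flight=100)],
    routes={"/service/jobs": lambda request: Response(200, "jobs")},
)
print(gateway.handle(Request("/service/jobs", {"Authorization": "token"})).status)
```

Keeping each concern in its own filter is what makes the gateway pluggable: a new filter can be added to the list without touching routing or the backends.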

Decentralize All Things: Service Discovery

- Service registry
- Middle-tier load balancing
- Carries metadata of each service
- Dynamic service repository
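A service registry of the kind listed above can be sketched as a table of instances that register themselves, keep themselves alive with heartbeats, and are picked round-robin by callers. This is an illustrative in-memory sketch only; `ServiceRegistry`, its methods, the TTL value, and the addresses are assumptions made for the example and do not reflect the API of a real discovery server.

```python
import itertools
import time
from typing import Dict, List, Optional, Tuple


class ServiceRegistry:
    """In-memory registry: instances register, send heartbeats, and expire."""

    def __init__(self, ttl_seconds: float = 30.0) -> None:
        self.ttl_seconds = ttl_seconds
        # service name -> {instance address -> (metadata, last heartbeat time)}
        self._instances: Dict[str, Dict[str, Tuple[dict, float]]] = {}
        self._counters: Dict[str, "itertools.count[int]"] = {}

    def register(self, service: str, address: str,
                 metadata: Optional[dict] = None,
                 now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._instances.setdefault(service, {})[address] = (metadata or {}, now)

    def heartbeat(self, service: str, address: str,
                  now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        metadata, _ = self._instances[service][address]
        self._instances[service][address] = (metadata, now)

    def metadata(self, service: str, address: str) -> dict:
        return self._instances[service][address][0]

    def lookup(self, service: str, now: Optional[float] = None) -> List[str]:
        """Return the addresses whose last heartbeat is within the TTL."""
        now = time.monotonic() if now is None else now
        live = [
            address
            for address, (_, seen) in self._instances.get(service, {}).items()
            if now - seen <= self.ttl_seconds
        ]
        return sorted(live)

    def choose(self, service: str, now: Optional[float] = None) -> Optional[str]:
        """Round-robin over live instances (middle-tier load balancing)."""
        live = self.lookup(service, now)
        if not live:
            return None
        counter = self._counters.setdefault(service, itertools.count())
        return live[next(counter) % len(live)]


registry = ServiceRegistry(ttl_seconds=30.0)
registry.register("jobs", "10.0.0.1:9292", {"zone": "us-east-1a"})
registry.register("jobs", "10.0.0.2:9292", {"zone": "us-east-1b"})
print(registry.choose("jobs"), registry.choose("jobs"))
```

Expiring instances that stop sending heartbeats is what keeps the repository dynamic: callers never need a hard-coded list of hosts.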

Decentralize All Things: Dynamic Configuration

- Dynamic typed properties = feature flag system
- Lets you change properties at runtime
- Polling framework
- Multiple sources (e.g. Cassandra and DynamoDB)
- Callbacks when a property changes

CB’s Zuul uses Archaius to change properties across AWS regions, HttpClient configurations, and logging levels.
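The combination of typed properties, polling, multiple sources, and change callbacks can be sketched as follows. This is a simplified Python illustration of the idea, not Archaius itself: `DynamicProperty`, `PollingConfig`, `poll_once`, and the property name `log.level` are hypothetical, and a plain dictionary stands in for a source such as Cassandra or DynamoDB.

```python
from typing import Any, Callable, Dict, List


class DynamicProperty:
    """A typed value that can change at runtime and notifies listeners."""

    def __init__(self, name: str, default: Any, cast: Callable[[Any], Any] = str) -> None:
        self.name = name
        self._cast = cast
        self._value = cast(default)
        self._callbacks: List[Callable[[Any, Any], None]] = []

    def get(self) -> Any:
        return self._value

    def add_callback(self, callback: Callable[[Any, Any], None]) -> None:
        self._callbacks.append(callback)

    def update(self, raw: Any) -> None:
        new_value = self._cast(raw)
        if new_value != self._value:
            old_value, self._value = self._value, new_value
            for callback in self._callbacks:
                callback(old_value, new_value)


class PollingConfig:
    """Merges several sources on each poll; later sources override earlier ones."""

    def __init__(self, sources: List[Callable[[], Dict[str, Any]]]) -> None:
        self._sources = sources
        self._properties: Dict[str, DynamicProperty] = {}

    def property(self, name: str, default: Any,
                 cast: Callable[[Any], Any] = str) -> DynamicProperty:
        if name not in self._properties:
            self._properties[name] = DynamicProperty(name, default, cast)
        return self._properties[name]

    def poll_once(self) -> None:
        # A scheduler would call this periodically; here it is called by hand.
        merged: Dict[str, Any] = {}
        for source in self._sources:
            merged.update(source())
        for name, prop in self._properties.items():
            if name in merged:
                prop.update(merged[name])


remote_store = {"log.level": "INFO"}  # stand-in for a remote configuration source
config = PollingConfig([lambda: dict(remote_store)])
log_level = config.property("log.level", "WARN")
log_level.add_callback(lambda old, new: print(f"log.level: {old} -> {new}"))
config.poll_once()
remote_store["log.level"] = "DEBUG"
config.poll_once()
```

Because callers hold a `DynamicProperty` rather than a copied value, a change in the source takes effect on the next poll without a redeploy, which is what makes this usable as a feature flag system.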

Design for Failure

- HTTP library
- Load balancing on the client side
- Built-in retries
- Caching
- Request batching
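Client-side load balancing with built-in retries means the caller, not a central proxy, picks a server and moves on to the next one when a call fails. A minimal sketch of that behaviour is shown below; it is not the API of the HTTP library on the slide (presumably Ribbon, which appears in the demo diagram), and `ClientSideLoadBalancer`, `AllServersFailed`, and the server addresses are hypothetical.

```python
from typing import Callable, List, Sequence


class AllServersFailed(Exception):
    """Raised when every attempt failed."""


class ClientSideLoadBalancer:
    def __init__(self, servers: Sequence[str], max_attempts: int = 3) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._max_attempts = max_attempts
        self._next = 0

    def _pick(self) -> str:
        # Round-robin: each attempt goes to the next server in the list.
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server

    def execute(self, call: Callable[[str], str]) -> str:
        errors: List[str] = []
        for _ in range(self._max_attempts):
            server = self._pick()
            try:
                return call(server)
            except ConnectionError as exc:
                # Retry on the next server instead of failing the request.
                errors.append(f"{server}: {exc}")
        raise AllServersFailed("; ".join(errors))


def fetch_jobs(server: str) -> str:
    # Hypothetical backend call: one server is down, the other answers.
    if server == "10.0.0.1:9292":
        raise ConnectionError("connection refused")
    return f"jobs from {server}"


balancer = ClientSideLoadBalancer(["10.0.0.1:9292", "10.0.0.2:9292"], max_attempts=2)
print(balancer.execute(fetch_jobs))
```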

Design for Failure

- Java resilience library
- Stop cascading failures
- Fallback and gracefully degrade when possible
- Realtime monitoring
- Circuit breaker pattern
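The circuit breaker pattern with a fallback can be sketched in a few lines. This is a simplified Python illustration of the pattern, not the Java resilience library on the slide: the `CircuitBreaker` class, its thresholds, and the three state names are assumptions chosen for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Opens after repeated failures, then allows a trial call after a timeout."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, command: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.state == "open":
            # Fail fast: do not touch the struggling dependency at all.
            return fallback()
        try:
            result = command()
        except Exception:
            self._failures += 1
            # A failed trial call (half-open) or too many failures (closed)
            # opens the circuit and restarts the timeout.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            return fallback()
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def load_resumes() -> str:
    raise RuntimeError("resume service is down")  # hypothetical failing dependency


breaker = CircuitBreaker(failure_threshold=2, reset_timeout=5.0)
for _ in range(3):
    print(breaker.call(load_resumes, lambda: "cached resumes"), breaker.state)
```

Returning the fallback while the circuit is open is what stops a cascading failure: callers get a degraded answer immediately instead of piling up threads on a dependency that is already failing.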

Design for Failure

● No service has a 100% SLA
● With 30 dependencies at 99.99% uptime each: 0.9999^30 ≈ 0.997, i.e. 99.7% uptime
● 0.3% of 1 billion requests = 3,000,000 failures
● 2+ hours of downtime per month, even if all dependencies have excellent uptime
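The arithmetic behind these figures can be checked with a short calculation. The sketch below assumes 30 independent dependencies at 99.99% uptime each, a request volume of 1 billion, and a 30-day month; the function name `compound_availability` is made up for this example.

```python
def compound_availability(per_dependency: float, dependencies: int) -> float:
    """Uptime of a service that needs every one of its dependencies to be up."""
    return per_dependency ** dependencies


availability = compound_availability(0.9999, 30)
failure_rate = 1 - availability

failed_requests = failure_rate * 1_000_000_000  # out of 1 billion requests
downtime_hours = failure_rate * 30 * 24         # assuming a 30-day month

print(f"uptime: {availability:.2%}")
print(f"failed requests per billion: {failed_requests:,.0f}")
print(f"downtime per month: {downtime_hours:.2f} hours")
```

The result is roughly 99.7% uptime, about 3 million failed requests per billion, and a little over 2 hours of downtime per month, even though each dependency on its own looks excellent.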

[Diagram: Service1, Service2, Service3, and a Fallback]

Design for Failure: Circuit Breaker pattern

Design for Failure: Hystrix Dashboard

Decentralized Architecture

Demo

May the demo Gods be with us...

[Diagram: Client calling /service/jobs (:9292) and /service/resumes (:9292)]

Demo

[Diagram: Client, Zuul, Hystrix, Ribbon, SERVICE, Fallback Backup Service, Fallback Cache; ports :9090, :9292, :9393]

Design for Failure

- Testing resiliency in production
- Chaos Monkey => Kills instances randomly
- Latency Monkey => Induces latency in services
- Chaos Gorilla => Simulates an Availability Zone outage (region outages are the job of Chaos Kong)
- Conformity Monkey => Makes sure instances follow good practices
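The first two monkeys can be illustrated with a toy sketch: one picks a random instance to terminate, the other wraps a call and delays it. This is only an illustration of the idea and not the Simian Army code; `chaos_monkey`, `latency_monkey`, the instance ids, and the injected `terminate` and `sleep` functions are hypothetical.

```python
import random
import time
from typing import Callable, List, Optional


def chaos_monkey(instances: List[str], terminate: Callable[[str], None],
                 probability: float, rng: random.Random) -> Optional[str]:
    """With the given probability, terminate one randomly chosen instance."""
    if not instances or rng.random() >= probability:
        return None
    victim = rng.choice(instances)
    terminate(victim)
    return victim


def latency_monkey(call: Callable, delay_seconds: float, probability: float,
                   rng: random.Random,
                   sleep: Callable[[float], None] = time.sleep) -> Callable:
    """Wrap a call so that it is sometimes delayed before it runs."""
    def wrapped(*args, **kwargs):
        if rng.random() < probability:
            sleep(delay_seconds)
        return call(*args, **kwargs)
    return wrapped


rng = random.Random()
terminated: List[str] = []
# The terminate function only records the id; a real one would call the cloud API.
chaos_monkey(["i-001", "i-002", "i-003"], terminated.append, probability=1.0, rng=rng)
print("terminated:", terminated)

slow_lookup = latency_monkey(lambda job_id: f"job {job_id}", delay_seconds=0.01,
                             probability=0.5, rng=rng)
print(slow_lookup(42))
```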

Highly Observable

- Hystrix Stream aggregator

- AWS Change Tracker

- AWS Usage Tracker

Take Aways

- Each service must have a fallback strategy by design
- A routing layer is essential for the architecture
- To make services work together, there is a need for a highly reliable infrastructure around the microservices

Take Aways: “Conway’s Law”

“Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.”