Going cloud native with IBM Cloud and NetflixOSS for dev@Pulse

Going cloud native for your applications and services

Jerry Cuomo, Andrew Spyker

Topics

• Jerry is going to cover
– Our Journey to Cloud Services
– Stop along the way: Winning the Netflix Cloud Prize
– Our Goals in 2014 in delivering Cloud Services

• Andrew is going to
– Describe the “Zen, Methodology, Approach” to building world-class services
– Highlight new capabilities to support this methodology, running on IBM Cloud
– Prove this by example

@aspyker

@JerryCuomo

Our Journey to Cloud Services

• From my blog
– http://bit.ly/cuomoblog

• In 2014, we will continue driving our software to the cloud. To complement our packaged software business, we are transforming our development operations to also deliver our wares as self-service, cloud-native offerings within the IBM Cloud (SoftLayer, Bluemix, PureApp).

• You know you have a cloud service if it is addressable via URL, has Ts&Cs, and has an operations team running it 24x7x365.

Acme Air and winning the Netflix Cloud Prize

• Acme Air
– Cloud and Mobile Sample and Benchmark

• Acme Air + NetflixOSS + IBM SoftLayer
– IBM SoftLayer Port to embrace NetflixOSS platform
– Winner: Best Example Mash-Up Application Category

Cloud Services Goals

• We will follow the “Zen” of operating cloud services

• “We will rule the cloud, the cloud will not rule us”
– Proactive on failure and security testing and auto recovery

• Move from reactive model to predictive model
– We are always watching and anticipating

• Scalable service fabric services, ops excellence team
– Tools, libraries, services, and practices and COE for cloud

• Focus on key areas including
– Elastic and Web Scale
– High Availability and Automatic Recovery
– High Velocity Continuous Delivery

Elastic and Web Scale

Doing This

Not Doing That

Source: Programmableweb.com 2012

Elastic and Web Scale

[Diagram: load balancers; front end API (browser and mobile); Authentication Service; Booking Service; temporal caching; durable storage]

Strategy                                        Benefit
Make deployments automated                      Impossible without automation
Expose well-designed API to users               Offloads presentation complexity to clients
Remove state for mid tier services              Allows easy elastic scale out
Push temporal state to client and caching tier  Leverages clients, avoids data tier overload
Use partitioned data storage                    Data design and storage scales with HA
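One way to picture "remove state for mid tier services" and "push temporal state to client" is a signed token that the client carries between requests. The sketch below is a minimal illustration and is not taken from Acme Air; the secret key, function names, and sample state are all hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared secret; every mid tier instance would load the same
# key from secure configuration rather than hard-coding it.
SECRET_KEY = b"demo-only-secret"


def issue_token(state: dict) -> str:
    """Serialize session state and sign it so the client can carry it."""
    payload = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def read_token(token: str):
    """Return the state if the signature is valid, otherwise None."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload.encode()))


# Any instance holding the key can validate the token, so a request does
# not need to return to the instance that created the session.
token = issue_token({"user": "alice", "last_search": "DAL-JFK"})
restored = read_token(token)
tampered = read_token("x" + token)  # a modified payload fails verification
```

Because no instance keeps the session in memory, instances can be added or removed without draining sessions first, which is what makes elastic scale out easy.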

HA and Automatic Recovery

Feeling This

Not Feeling That

Highly Available Service Runtime Recipe

[Diagram: Web App Front End (REST services); call “Auth Service”; execute auth-service call inside Hystrix, with fallback implementation; Ribbon REST client with Eureka; Eureka Server(s); App Service (auth-service) micro service implementation on Karyon]

Implementation Detail / Benefits

Decompose into micro services
• Key user path always available
• Failure does not propagate across service boundaries

Karyon w/ automatic Eureka registration
• New instances are quickly found
• Failing individual instances disappear

Ribbon client with Eureka awareness
• Load balances & retries across instances with “smarts”
• Handles temporal instance failure

Hystrix as dependency circuit breaker
• Allows for fast failure
• Provides graceful cross service degradation/recovery
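The circuit breaker idea in the last row can be sketched in a few lines. This is a generic illustration of the pattern, not the Hystrix API: the class, its parameters, and the stand-in service functions are all hypothetical.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: fail fast after repeated errors, retry later."""

    def __init__(self, failure_threshold=3, reset_after_seconds=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_seconds = reset_after_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, primary, fallback):
        # While open, skip the dependency entirely until the reset window passes.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_seconds:
                return fallback()
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result


# Hypothetical stand-ins for a failing auth-service call and its fallback.
calls = {"count": 0}


def failing_auth_service():
    calls["count"] += 1
    raise ConnectionError("auth-service unavailable")


def fallback_response():
    return "fallback"


fake_now = [0.0]  # injected clock so the example does not need to sleep
breaker = CircuitBreaker(failure_threshold=2, reset_after_seconds=10.0,
                         clock=lambda: fake_now[0])
results = [breaker.call(failing_auth_service, fallback_response) for _ in range(5)]

fake_now[0] = 11.0  # reset window has passed, so one trial call is allowed
recovered = breaker.call(lambda: "ok", fallback_response)
```

After two failures the breaker opens, so the remaining three calls return the fallback without touching the failing dependency. That is the "fast failure" and "graceful degradation" behavior the table describes.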

IaaS High Availability

[Diagram: Region (Dallas) with datacenters DAL01, DAL05, and DAL06; global load balancers; local LBs; Eureka; Web App, Auth Service, and Booking Service clusters; cluster auto recovery and scaling services]

Rule                                 Why?
Always > 2 of everything             1 is a SPOF; 2 doesn’t web scale and makes DR recovery slow
Including IaaS and cloud services    You’re only as strong as your weakest dependency
Use auto scaler/recovery monitoring  Clusters guarantee availability and service latency
Use application level health checks  Instance on the network != healthy
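The last rule, "instance on the network != healthy", can be illustrated with a health check that looks at dependencies rather than only at whether the process answers. This is a minimal sketch; the check names and the failing storage check are hypothetical.

```python
def instance_health(checks):
    """Run named dependency checks; healthy only if every check passes.

    `checks` maps a name to a zero-argument callable returning True/False.
    A check that raises is treated as failed.
    """
    details = {}
    for name, check in checks.items():
        try:
            details[name] = bool(check())
        except Exception:
            details[name] = False
    return all(details.values()), details


def can_reach_durable_storage():
    # Hypothetical failure: the instance is up but its data tier is not.
    raise TimeoutError("durable storage not responding")


healthy, details = instance_health({
    "process_running": lambda: True,
    "registered_in_eureka": lambda: True,
    "durable_storage": can_reach_durable_storage,
})
```

Here the instance would still respond to a network ping, yet the application level check reports it unhealthy, so an auto recovery service could replace it.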

DEMO TIME!

Let’s prove it

• What if you lost a random instance?

• What if you lost a whole datacenter?

Demonstrated as part of the Netflix Cloud Prize

bit.ly/noss-sl-blog
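The two demo questions can be mimicked with a toy simulation: remove one instance, remove a whole datacenter, then let a recovery step restore the desired cluster size. This is only a sketch of the idea, not the demo code; the function names are hypothetical, and picking DAL05 as the failed datacenter is an arbitrary choice for illustration.

```python
import random

DATACENTERS = ["DAL01", "DAL05", "DAL06"]


def launch_cluster(per_datacenter):
    """Create a cluster spread evenly over the three datacenters."""
    return [{"id": f"{dc}-{i}", "dc": dc}
            for dc in DATACENTERS for i in range(per_datacenter)]


def kill_random_instance(cluster, rng):
    """Lose one random instance."""
    victim = rng.choice(cluster)
    return [inst for inst in cluster if inst is not victim]


def kill_datacenter(cluster, dc):
    """Lose every instance in one datacenter."""
    return [inst for inst in cluster if inst["dc"] != dc]


def recover(cluster, desired_size, available_dcs):
    """Auto recovery step: add replacements in the least loaded datacenter."""
    recovered = list(cluster)
    next_id = 0
    while len(recovered) < desired_size:
        dc = min(available_dcs,
                 key=lambda d: sum(1 for inst in recovered if inst["dc"] == d))
        recovered.append({"id": f"{dc}-replacement-{next_id}", "dc": dc})
        next_id += 1
    return recovered


cluster = launch_cluster(per_datacenter=2)
after_instance_loss = kill_random_instance(cluster, random.Random(0))
after_datacenter_loss = kill_datacenter(cluster, "DAL05")
surviving = [dc for dc in DATACENTERS if dc != "DAL05"]
healed = recover(after_datacenter_loss, desired_size=6, available_dcs=surviving)
```

With more than two datacenters, losing one still leaves capacity in two others, and the recovery step brings the cluster back to its desired size without the failed datacenter.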

DEMO Overview

[Diagram: Region (Dallas) with datacenters DAL01, DAL05, and DAL06; global load balancers; local LBs; Eureka; Chaos Gorilla marked with ✗]

DEMO Success!

[Diagram: Region (Dallas) with datacenters DAL01, DAL05, and DAL06; global load balancers; local LBs; Eureka; Chaos Gorilla marked with ✗]

Online Video (shows recovery as well)

http://bit.ly/sl-gorillavid

Continuous Delivery

Reading This

Not This

Continuous Delivery

[Diagram: Cluster v1, Canary v2, Cluster v2]

Step                                                       Technology
Developers test locally                                    Unit test frameworks
Continuous build                                           Continuous build server based on Gradle builds
Build “bakes” full instance image                          Imaginator (Aminator inspired) creates SoftLayer images
Developers work across dev and test                        Archaius allows for environment based context
Developers do canary tests, red/black deployments in prod  Asgard console provides app cluster common devops approach, security patterns, and visibility

[Diagram: continuous build server; baked to SoftLayer image templates]
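The canary and red/black step can be sketched as a simple decision: compare the canary's error rate with the running cluster's, then either switch traffic to the new cluster while keeping the old one on standby, or leave traffic where it is. This is an illustrative sketch and not Asgard's behavior or API; the function names, the error-rate metric, and the tolerance are assumptions.

```python
def canary_is_healthy(baseline_error_rate, canary_error_rate, tolerance=0.01):
    """Pass the canary if its error rate is not meaningfully worse than v1's."""
    return canary_error_rate <= baseline_error_rate + tolerance


def red_black_deploy(live_cluster, new_cluster, baseline_error_rate, canary_error_rate):
    """Return (cluster receiving traffic, cluster kept on standby or None)."""
    if canary_is_healthy(baseline_error_rate, canary_error_rate):
        # Switch traffic to v2 but keep v1 around for a fast rollback.
        return new_cluster, live_cluster
    # Canary failed: traffic stays on v1 and v2 is not promoted.
    return live_cluster, None


promoted = red_black_deploy("cluster-v1", "cluster-v2", 0.002, 0.004)
rejected = red_black_deploy("cluster-v1", "cluster-v2", 0.002, 0.05)
```

Keeping the old cluster on standby after a successful switch is what makes rollback a traffic change rather than a redeployment.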

More details?

• PAS-1418A - Porting the Netflix OSS Cloud Architecture to SoftLayer
– Today - 5:00 – 6:00, Room 116

• All code available on GitHub
– netflix.github.io
– github.com/EmergingTechnologyInstitute
– Blog - iSpyker.blogspot.com
– Twitter - @aspyker