Microservices, DevOps and Engineering Best Practices at Wix.com

Aviran Mordo (@aviranm), Head of Microservices, DevOps & Engineering at Wix.com. www.linkedin.com/in/aviran, http://www.aviransplace.com

Upload: aviran-mordo

Post on 08-Apr-2017


TRANSCRIPT

Page 1: Microservices, DevOps and Engineering Best Practices at Wix.com

Aviran Mordo

Head of Microservices, DevOps & Engineering at Wix.com

www.linkedin.com/in/aviran

@aviranm

http://www.aviransplace.com


Page 3

Wix In Numbers

Over 87M users

Static storage is >2 PB of data

3 data centers + 2 clouds (Google, Amazon)

2B HTTP requests/day

1200 people work at Wix

Page 4

Over 200 Microservices on Production

Page 5

Microservices - What Does it Take

Page 6

How to Get There? (Wix’s journey)

http://gpstrackit.com/wp-content/uploads/2013/11/VanishingPointwRoadSigns.jpg

Page 7

http://p1.pichost.me/i/11/1339236.jpg

In the Beginning of Time

About 6 years ago

Page 8

The Monolithic Giant

One monolithic server that handled everything

Dependency between features

Changes in unrelated areas caused deployment of the whole system

Failure in unrelated areas caused system-wide downtime

Diagram: Wix (Tomcat), Lighttpd (file serving), MySQL DB

Page 9

Breaking the System Apart

https://upload.wikimedia.org/wikipedia/commons/6/67/Broken_glass.jpg


Page 11

Concerns and SLA

Edit websites:

Many feature requests

Lower performance requirement

Lower availability requirement

Write intensive

View sites, created by Wix editor:

Not many product changes

High performance

High availability

Read intensive

Page 12

Mono-Wix

Phase 1

Page 13

Extract Public Service

Editor service (Mono-Wix) Public service

Page 14

Divide and Conquer

Editor service Public service

Guideline: No runtime, deployment or data dependency

Page 15

Separation by Product Lifecycle

Decouple architecture => Decouple teams

Deployment independence

Areas with frequent changes

Editor service Public service

Page 16

Separation by Service Level

Scale independently

Use different data stores

Optimize data per use case (read vs. write)

Run on different data centers / clouds / zones

System resiliency (degradation of service vs. downtime)

Faster recovery time

Editor service Public service


Page 18

Service Boundary

Page 19

Separation of Databases

Copy data between segments

Optimize data per use case (read vs. write intensive)

Different data stores

Public service, Editor service

Copy necessary data
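
One way to picture the copy between segments, as a minimal sketch: the editor side keeps a write-friendly model, and a publish step copies a denormalized, read-optimized document into a separate public store. The publish trigger is an assumption (the slide only says to copy the necessary data), `EditorStore`, `PublicStore` and `publish` are hypothetical names, and in-memory dicts stand in for real data stores.

```python
import json
from dataclasses import dataclass, field


@dataclass
class EditorStore:
    """Write-intensive side: pages are stored individually so edits stay small."""
    pages: dict = field(default_factory=dict)  # (site_id, page_id) -> page dict

    def save_page(self, site_id, page_id, title, body):
        self.pages[(site_id, page_id)] = {"title": title, "body": body}


@dataclass
class PublicStore:
    """Read-intensive side: one pre-assembled document per site."""
    sites: dict = field(default_factory=dict)  # site_id -> JSON string

    def get_site(self, site_id):
        return json.loads(self.sites[site_id])


def publish(site_id, editor, public):
    """Copy only the data the public side needs, shaped for a single read."""
    pages = {
        page_id: page
        for (sid, page_id), page in sorted(editor.pages.items())
        if sid == site_id
    }
    public.sites[site_id] = json.dumps({"site_id": site_id, "pages": pages})


editor = EditorStore()
public = PublicStore()
editor.save_page("site-1", "home", "Home", "Welcome")
editor.save_page("site-1", "about", "About", "Who we are")
publish("site-1", editor, public)

# Later edits do not affect the public copy until the next publish.
editor.save_page("site-1", "home", "Home", "Draft text")
site = public.get_site("site-1")
```

Because the public store holds its own copy, the two services share no database at runtime, which is in the spirit of the "no data dependency" guideline.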

Page 20

Serialization

Page 21

Serialization / Protocol

Binary?

JSON / XML / Text?

HTTP?

Public service ? Editor service

Page 22

Serialization / Protocol - Tradeoffs

Readability?

Performance?

Debug?

Tools?

Monitoring?

Dependency?

Public service ? Editor service
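
To make the readability vs. performance tradeoff concrete, here is a small sketch that encodes the same message as JSON text and as a fixed binary layout. The message fields and the `<IQ` layout are assumptions chosen for illustration, not a format taken from the talk.

```python
import json
import struct

# Hypothetical message: a site id (unsigned 32-bit) and a view counter (unsigned 64-bit).
message = {"site_id": 12345, "views": 9876543210}

# Text format: self-describing and easy to read in logs or a debugger.
as_json = json.dumps(message).encode("utf-8")

# Binary format: fixed layout ("<IQ" = little-endian uint32 + uint64), compact,
# but both sides must agree on the layout in advance.
as_binary = struct.pack("<IQ", message["site_id"], message["views"])

site_id, views = struct.unpack("<IQ", as_binary)
decoded_binary = {"site_id": site_id, "views": views}
decoded_json = json.loads(as_json)

print(len(as_json), "bytes as JSON:", as_json)
print(len(as_binary), "bytes as binary:", as_binary.hex())
```

The binary form is smaller, but it cannot be read without the schema, which is where the debugging, tooling and dependency questions on the slide come in.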

Page 23

API Transport/Protocol

Page 24

How to Expose an API

REST?

RPC?

SOAP?

Public service ? Editor service

Page 25

Wix’s Choices

REST

HTTP

Public service, Editor service

Binary

JSON-RPC

HTTP
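
As a rough illustration of what JSON-RPC over HTTP looks like on the wire, the sketch below builds and handles a request using the envelope from the public JSON-RPC 2.0 specification. It is not Wix's implementation: the method name `get_site_title` is hypothetical, and the HTTP transport is left out so the example stays self-contained.

```python
import json

# Hypothetical method table; a real service would register its own operations.
METHODS = {
    "get_site_title": lambda site_id: f"Site {site_id}",
}


def handle_request(body: str) -> str:
    """Handle one JSON-RPC 2.0 request body and return the response body."""
    request = json.loads(body)
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    method = METHODS.get(request.get("method"))
    if method is None:
        # -32601 is the JSON-RPC 2.0 code for "Method not found".
        response["error"] = {"code": -32601, "message": "Method not found"}
    else:
        response["result"] = method(*request.get("params", []))
    return json.dumps(response)


# The request body a caller would POST over HTTP.
request_body = json.dumps(
    {"jsonrpc": "2.0", "method": "get_site_title", "params": [42], "id": 1}
)
ok = json.loads(handle_request(request_body))
missing = json.loads(
    handle_request(json.dumps({"jsonrpc": "2.0", "method": "nope", "id": 2}))
)
```

Both the request and the response are plain JSON text, so they can be inspected with ordinary HTTP tooling.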

Page 26

API Versioning

Page 27

API Versioning

Public service, Editor service

Backward compatibility

Maybe here

API Schema /v1 /v2

YAGNI
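
One common way to stay backward compatible without /v1 and /v2 endpoints is a tolerant reader: new fields get defaults, and unknown fields are ignored. The sketch below assumes that approach; the `SiteInfo` type and its fields are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class SiteInfo:
    site_id: str
    title: str
    # Added in a later release; the default keeps older payloads valid.
    language: str = "en"


def parse_site_info(payload: dict) -> SiteInfo:
    """Tolerant reader: ignore unknown keys, fall back to defaults for missing ones."""
    known = {f.name for f in fields(SiteInfo)}
    return SiteInfo(**{k: v for k, v in payload.items() if k in known})


old_payload = {"site_id": "s1", "title": "Home"}
new_payload = {"site_id": "s1", "title": "Home", "language": "fr", "theme": "dark"}

from_old = parse_site_info(old_payload)
from_new = parse_site_info(new_payload)
```

With this style, an older caller and a newer caller can talk to the same endpoint, so a versioned URL is only needed for a genuinely breaking change.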

Page 28

Service Discovery

Page 29

Service Discovery

Public service, Editor service

Configuration (DNS+LB)

Zookeeper?

Consul?

Etcd?

Eureka?

YAGNI
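
Discovery by configuration can be as small as a map from service name to a DNS name that fronts a load balancer. A minimal sketch, with made-up hostnames and a hypothetical `service_url` helper:

```python
# Hypothetical static configuration: each logical service name maps to a DNS
# name that points at a load balancer. The hostnames are made up.
SERVICE_CONFIG = {
    "editor": "http://editor.internal.example.com",
    "public": "http://public.internal.example.com",
}


def service_url(service: str, path: str) -> str:
    """Build a URL for a service from configuration alone, with no registry lookup."""
    base = SERVICE_CONFIG[service]
    return f"{base}/{path.lstrip('/')}"


url = service_url("public", "/sites/42")
```

Moving instances behind the load balancer needs no client change, which may be enough until the number of services makes a registry worth its operational cost.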

Page 30

Resilience

Page 31

What does the Arrow Mean?

Public service, Editor service

Failure Point

Every Network Hop Reduces Availability

Page 32

Failure Points = Network I/O

Public service, Editor service

Failure Point

Retry policy

Circuit breaker

Throttlers

Be careful – you may cause downtime

Retry only on idempotent operations
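
The retry and circuit breaker ideas can be sketched as follows. This is a simplified illustration under stated assumptions, not a production library: the thresholds, the `CircuitBreaker` class and the `call_with_retry` helper are all hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is not attempted."""


class CircuitBreaker:
    """Open after `max_failures` consecutive failures; allow a trial call after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit is open")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def call_with_retry(func, idempotent, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry with exponential backoff, but only when the operation is idempotent."""
    tries = attempts if idempotent else 1
    for attempt in range(tries):
        try:
            return func()
        except CircuitOpenError:
            raise  # retrying an open circuit only adds load
        except Exception:
            if attempt == tries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Retries multiply traffic toward a service that is already struggling, which is one way a retry policy can itself cause downtime; the breaker and the backoff are there to limit that effect.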

Page 33

Degradation of Service

Public service, Editor service

Failure Point

Feature killer (Killer feature)

Fallbacks

Self healing
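
A feature killer and a fallback can be combined so that a failing dependency degrades one part of a page instead of taking the page down. The sketch below assumes a hypothetical "recommendations" feature and an in-memory switch table; real switches would come from runtime configuration.

```python
# Hypothetical runtime switches; in practice these would come from configuration
# that can change without a deployment.
FEATURE_ENABLED = {"recommendations": True}


def fetch_recommendations(site_id):
    """Stand-in for a call to a remote service that is currently failing."""
    raise ConnectionError("recommendation service unavailable")


def recommendations_or_fallback(site_id):
    """Serve a degraded result instead of failing the whole page."""
    if not FEATURE_ENABLED.get("recommendations", False):
        return []  # feature killed: skip the remote call entirely
    try:
        return fetch_recommendations(site_id)
    except ConnectionError:
        return []  # fallback: the page renders without recommendations


degraded = recommendations_or_fallback("site-1")
FEATURE_ENABLED["recommendations"] = False
killed = recommendations_or_fallback("site-1")
```

In both cases the caller gets an empty list rather than an exception, so the rest of the response is unaffected.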

Page 34

Testing

Page 35

Test a Distributed System (at Wix)

Public service, Editor service

Unit Test

Integration Test

Server E2E

Automation

Client

Page 36

Distributed Logging

Page 37

Build visibility into service
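
The slides do not say how distributed logging is implemented. One widely used approach, shown here purely as an assumption, is to tag every log line with a request ID that is passed from service to service, so that one request can be followed across logs. The names `request_id_var`, `RequestIdFilter` and `handle` are hypothetical.

```python
import contextvars
import io
import logging
import uuid

# Holds the ID of the request currently being handled (hypothetical convention).
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Attach the current request ID to every log record."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(request_id)s %(name)s %(message)s"))
handler.addFilter(RequestIdFilter())

logger = logging.getLogger("public-service")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False


def handle(incoming_request_id=None):
    """Reuse the caller's ID if one was passed along, otherwise start a new one."""
    request_id_var.set(incoming_request_id or uuid.uuid4().hex)
    logger.info("rendering site")
    return request_id_var.get()  # would be forwarded to downstream services


handle("req-123")
log_output = stream.getvalue()
```

Searching the aggregated logs for a single request ID then shows every service that request touched.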

Page 38

What is the Right Size of a Microservice?

Page 39

The Size of a Microservice is the Size of the Team That is Building it.

“Organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations”

Conway, Melvin

Page 40

How many Engineers can Work on a Single Project?

http://cdn.wp.sunmotors.co.uk/get/2014/03/cars.28.620x413.jpg

Page 41

Microservices = Small Teams

Small Teams = Small Rooms

Page 42

Ownership

Page 43

Team Work

Microservice is owned by a team

You build it – you run it

No microservice is left without a clear owner

Microservice is NOT a library – it is a live production system

Page 44

Culture that ROQS

Responsibility

Ownership

Quality

Sharing

Page 45

What Is The Common Denominator?

Product manager

Project manager

QA

Operations

DBA

Developers can do these jobs

Page 46

Dev Centric Culture – Involve The Developer

Product definition (with product)

Development

Testing (with QA engineers)

Deployment / Rollback (with operations)

Monitoring / BI (with BI team)

DevOps – to enable deployment and rollback, fully automated

Support Circle diagram: Developer, Product, QA, Management, Operation, BI

Page 47

What did you Learn from Just 2 Services

● Service boundary
● Monitoring infrastructure
● Serialization format
● Synchronous communication protocol (HTTP/Binary)
● Asynchronous (queuing infra)
● Service SLA
● API definition (REST / RPC / Versioning)
● Data separation
● Deployment strategy
● Testing infrastructure (integration test, e2e test)
● Compatibility (backwards / forward)

Page 48

Continue to Extract More Microservices

Page 49

HTML Editor

Flash Editor

MSM

Private Media

Public Media

Editor Segment Public Segment

Premium Services

List DB

App Builder

App Store

App Market

Dashboard

Mailer

TimeZone

Public HTML API

Public API (Flash)

MSP

Public Server

HTML Renderer

HTML SEO Renderer

Flash Renderer

Flash SEO Renderer

Sitemap Renderer

Robots.txt Renderer

User Server

Template Viewer

Contacts

HUB

Activity

Site Members

Store Mgr

Comments

Snapshoter

User Pref

Feed Me

Shout-out

Hotels

PETRI

Site Pref

Dist Logger

Slicer

eCom Renderer

eCom Cart

eCom Checkout

eCom Catalog

eCom Orders

Payment Facade

Account Info

HTML API

HTML Embeder

Blog

Mobile

Editor segment SLA:

Mostly writes

2 data centers

DB active-standby (preferably active-active)

Performance < 2s for 99% of requests

Serves mostly site builders

Uptime > 99.9%

Public segment SLA:

Mostly reads

>2 data centers

DB active-active(-active)

Performance < 500ms for 99% of requests

Serves mostly site viewers

Uptime > 99.99%
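
To get a feel for what these targets mean, the sketch below turns the two uptime figures into a yearly downtime budget and checks a latency sample against the 99% limit. Mapping the first column to the editor segment and the second to the public segment is an inference from "site builders" vs. "site viewers"; the `Sla` class and a 365-day year are assumptions.

```python
from dataclasses import dataclass

MINUTES_PER_YEAR = 365 * 24 * 60


@dataclass(frozen=True)
class Sla:
    name: str
    uptime_percent: float   # minimum uptime, e.g. 99.9
    p99_latency_ms: float   # 99% of requests must finish faster than this

    def downtime_budget_minutes_per_year(self) -> float:
        """Minutes of downtime per year that still satisfy the uptime target."""
        return MINUTES_PER_YEAR * (100.0 - self.uptime_percent) / 100.0

    def latency_ok(self, latencies_ms) -> bool:
        """True if at least 99% of the observed requests are under the limit."""
        fast = sum(1 for ms in latencies_ms if ms < self.p99_latency_ms)
        return fast >= 0.99 * len(latencies_ms)


editor_sla = Sla("editor segment", uptime_percent=99.9, p99_latency_ms=2000)
public_sla = Sla("public segment", uptime_percent=99.99, p99_latency_ms=500)

for sla in (editor_sla, public_sla):
    print(f"{sla.name}: {sla.downtime_budget_minutes_per_year():.1f} min/year of downtime")
```

Under these assumptions, 99.9% allows roughly 8.8 hours of downtime per year, while 99.99% allows under an hour, which helps explain why the public segment gets more data centers and an active-active database.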

Page 50

When to Extract a New Microservice

Page 51

Microservice or Library?

Do I create a deployment dependency?

What is the DevOps overhead (managing middleware)?

Who owns it?

Does it have its own development lifecycle?

Does it fit the scalability / availability concerns?

Can a different team develop it?

I need time zone from an IP address

Page 52

Microservice has Ops, Library is Only Computational
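
Taking the earlier "time zone from an IP address" example, the library form is just a function over data that ships with the caller. The sketch below is illustrative only: the lookup table is hypothetical and uses documentation-only address ranges, and `timezone_for_ip` is a made-up name.

```python
import ipaddress

# Hypothetical lookup table using documentation-only address ranges (RFC 5737).
# A real table would be far larger and would need regular updates.
ZONE_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "Asia/Jerusalem"),
    (ipaddress.ip_network("198.51.100.0/24"), "America/New_York"),
]


def timezone_for_ip(ip: str, table=ZONE_TABLE, default="UTC") -> str:
    """Pure computation: no network call, no state, nothing to operate."""
    address = ipaddress.ip_address(ip)
    for network, zone in table:
        if address in network:
            return zone
    return default


zone = timezone_for_ip("198.51.100.7")
unknown = timezone_for_ip("203.0.113.9")
```

If the table needs its own update cycle, owner and deployment, the checklist questions may point toward running this as a service instead; that judgment depends on the answers, not on the code.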

Page 53

Which Technology Stack to Use

Page 54

Free to Choose?

Microservices give the freedom to use different technology stacks.

Enables innovation

Page 55

Default to the Stack You Know how to Operate.

Page 56

Innovate on Non-Critical Microservices and Take Full Responsibility for Their Operation.


Page 58

Why Microservices

Scale engineering

Development Velocity

Scale system?

Page 59

Microservices is the First Post DevOps Architecture

Page 60

Every Microservice is an Overhead

Page 61

Wix’s Company Model

Diagram labels: Company, Guild, Gang, Product, Cross-Engineering Teams

A Guild is a group of people that share expertise, knowledge, tools, code and practice

A Company focuses on a large segment

Has all the resources it needs to be independent

People within the company are aligned with the Guilds

Page 62

Production State Changes Every 5 min


Page 64

Quality = Better Engineers

Better Engineers = Professional Growth

Professional Growth = Investment in People

Page 65

Guild Day

Engineers work 4 days with their company

Thursday is Guild day.

Developers conduct quality enhancing activities with the Guild.

Page 66

Guild Day Goals

Builds cross-team relationships

Shares knowledge

Assimilates the culture

Lessons learned

Continuous improvement

Promotes innovation

Professional development

Page 67

Guild Day Schedule

10:00-11:00 Open Space

11:00-11:15 Break

11:15-11:40 Project spotlight

11:45-13:00 Tech talk or Workshop

Page 68

Open Space

Page 69

Post Mortem

Any production downtime is investigated

Understand the cause

Lessons learned

Action items (AI) to avoid the same issue in the future

Findings are published in dapulse (public forum)

Page 70

Guild Week – Games of Gangs

One week each quarter

Pair programming with a person from another company

Enhancing infrastructure

Building tools

Helping companies

Work on open source

Goal #1 – Improve engineering skills and quality

Page 71

Fast Features

Better Quality

Page 72

Thank You

Page 73

Q&A

Aviran Mordo

Head of Engineering

@aviranm

http://www.aviransplace.com

www.linkedin.com/in/aviran

http://engineering.wix.com

@WixEng