Gilt: From Monolith Ruby App to Micro-Service Scala Service Architecture
DESCRIPTION
The presentation that I gave at the 'NYC Tech Talks' meetup on January 14, 2014
TRANSCRIPT
SCALING GILT
From Monolith Ruby App to Distributed Scala Micro-Services
#NYCTECHTALKS
Lead Engineer - GILT Podium => http://bit.ly/podiumapp
Yoni (Jonathan) Goldberg
ABOUT ME
- Leading the Popeye Team
- Sale Personalization, Loyalty, SEO Post-purchase, Login/Registration flows
- MIT CS BS/MEng | Google | IBM | IDF
- Brooklyn | Coffee | Arduino | Running | Kite Surfing | Online Collaboration | Poker
Excited to be part of the NYC Tech community
THE LESSONS AND CHALLENGES THAT WE HAD/HAVE WITH
MICRO-SERVICE ARCHITECTURE
WHAT IS GILT?
- Flash sales business founded in 2007
- Top 50 Internet retailer
- ~150 engineers
ANOTHER WAY TO LOOK AT GILT
Three day traffic pattern
THE CLASSIC STARTUP STORY
THE EARLY DAYS
- 2007: Ruby on Rails was the hottest new thing
- The goal was to get to market fast
WE WERE ABLE TO HANDLE OUR TRAFFIC PRETTY WELL
UNTIL LOUBOUTIN CAME TO GILT
TECHNOLOGY PAIN POINTS - 2009
- Spike required to launch 1,000s of Ruby processes
- Postgres was overloaded
- Routing traffic between Ruby processes sucked
|Note to self| - hide from the ruby fan boys
DEV PAIN POINTS
- 1000 Models/Controllers, 200K LOC, 100s of jobs
- Lots of contributors + no ownership
- Difficult deployments with long integration cycles
- Hard to identify root causes
WE NEEDED TO SOLVE THE PROBLEM FAST
THREE THINGS HAPPENED
- Started the transition to the JVM
- M(a/i)cro-Service Era Started
- Dedicated data stores
WHY JVM?
- Widely adopted
- Stable
- Better support for concurrency
- Better GC vs MRI
FIRST 10 SERVICES
We solved 90% of our architecture scaling problems
But not the dev pain points
PAIN POINTS
- Spike required to launch 1,000s of Ruby processes
- Postgres was overloaded
- Routing traffic between Ruby processes sucked
- New services became semi-monolithic
- 1000 Models/Controllers, 200K LOC, 100s of jobs
- Lots of contributors + no ownership
- Difficult deployments with long integration cycles
WHY WE DOUBLED DOWN ON MICRO-SERVICES
- Empower teams and ownership
- Smaller scope
- Simpler and easier deployments and rollbacks
MICRO SERVICE ARCHITECTURE STARTED TO GET TRACTION
AS OF LAST WEEK WE HAVE MORE THAN
450 SERVICES
APP BOOTSTRAP
rake bootstrap:admin-web           # Bootstrap a admin-web service
rake bootstrap:babylon-docs        # Bootstrap a babylon-docs service
rake bootstrap:client-server-core  # Bootstrap a client-server-core service
rake bootstrap:jersey-java         # Bootstrap a jersey-java service
rake bootstrap:jersey-scala        # Bootstrap a jersey-scala service
rake bootstrap:play                # Bootstrap a play service
rake bootstrap:play-ui-build       # Bootstrap a play-ui-build service
rake bootstrap:sbt-library         # Bootstrap a sbt-library service
rake bootstrap:schema              # Bootstrap a schema service
WE BEGAN THE TRANSITION TO SCALA AND PLAY
LOSA - Lots Of Small Apps
Same motivation and benefits as Micro-Service Architecture
NEW CHALLENGES
- Dev/Integration Environments
- Who owns this service!?
- Monitoring
- Deployments and Testing (Functional/Integration)
ON DEV/INTEGRATION ENVIRONMENTS
- The hardware is not strong enough
- No one wants to compile 20 services
EACH TEAM HAS A STAGING ENV
SERVICE_PORTS = [
  4001, # listing-service
  8235, # svc-user-set
  9420, # svc-free-fall
  7895, # svc-Loyalty
  8155, # web-loyalty
  9410, # web inventory status
  7898, # admin-loyalty
  7899, # notification
  7102, # rouge
  9530, # svc-component
  6802, # svc-waitlist-submit
  4066, # svc-action-sale
  # ....
]
# One "-L port:localhost:port" pair per service port
PORT_FORWARD_ARGS = SERVICE_PORTS.map { |port| ['-L', "#{port}:localhost:#{port}"] }
# GW_HOST (the gateway host) is not shown on the slide
exec(*[%w{ssh -a -C -N -n}, PORT_FORWARD_ARGS, GW_HOST].flatten)
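The port-forwarding script above turns every service port into an ssh "-L" local forward, so that services running in the team's staging environment appear on localhost. As a minimal sketch of what the flattened command looks like, the snippet below builds the same argument list without running ssh. The two-port list, the gateway name gateway.example.com, and the helper name tunnel_command are assumptions made for illustration only; they are not from the talk.

```ruby
# Hypothetical values for illustration; the real gateway host is not given in the slides.
SAMPLE_PORTS = [4001, 8235]
SAMPLE_GATEWAY = "gateway.example.com"

# Build the ssh argument list: one "-L port:localhost:port" pair per port.
# Array#flatten collapses the nested arrays into a single argv-style list.
def tunnel_command(ports, gateway)
  forwards = ports.map { |port| ["-L", "#{port}:localhost:#{port}"] }
  [%w[ssh -a -C -N -n], forwards, gateway].flatten
end

# Print the command instead of exec-ing it, so the sketch has no side effects.
puts tunnel_command(SAMPLE_PORTS, SAMPLE_GATEWAY).join(" ")
```

Printing rather than calling exec keeps the sketch safe to run anywhere; passing the same array to exec with a splat would start the tunnel.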
STAGING DIFFICULTIES:
- Hard to keep all the services up to date
- Maxed out our staging env capacity
- Requires an internet connection for some of the services (e.g. LOSA apps)
The Future
DOCKER
An extension to Linux Containers (LXC)
- Decentralization
- Simple Configurations
- Much lighter than a VM
- Immutable
- Supports services and platforms
ON OWNERSHIP
"code stays much longer than people" - SB
CODE OWNERSHIP
CURRENT APPROACH
- Code Review! Code Review! Code Review!
- Team owns services, not individual developers
- Ownership transfer
DATA OWNERSHIP
WE TRANSITIONED TO MICRO-DBS
A third of the services have their own database:
- MongoDB
- Postgres
- Voldemort
MANAGE MICRO-RELATIONAL DBS
SCHEMA EVOLUTION MANAGER
https://github.com/gilt/schema-evolution-manager
PRINCIPLES OF SCHEMA EVOLUTION MANAGER
- Can manage the schema evolutions in a Git repo
- Schema changes are deployed as tar files
- No rollbacks
- Schema changes are required to be incremental
echo "create table releases (id integer)" > new.sql
sem-add ./new.sql   # Created a git commit
sem-dist            # generates the tar, e.g. schema-ion-cannon-0.0.2.tar.gz
# Scp and untar on your server
cd schema-ion-cannon-0.0.2
sem-apply --host localhost --name ion-cannon --user ion-cannon
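The "incremental" and "no rollbacks" principles can be pictured as a forward-only list of scripts: anything not yet applied runs in order, and nothing is ever undone. The snippet below is a rough sketch of that idea only. It is not how schema-evolution-manager is implemented, and the script names and the helper names pending_scripts and apply_all are hypothetical.

```ruby
# Hypothetical model of forward-only schema changes; not schema-evolution-manager's code.

# Return the scripts that have not been applied yet, in sorted (chronological) order.
def pending_scripts(all_scripts, applied)
  all_scripts.sort.reject { |name| applied.include?(name) }
end

# Apply every pending script in order. There is deliberately no "undo" path:
# a mistake is fixed by adding another script, not by rolling back.
def apply_all(all_scripts, applied)
  pending_scripts(all_scripts, applied).each do |name|
    # A real tool would execute the SQL file here; this sketch only records it.
    applied << name
  end
  applied
end

# Script names are made up for the example; date prefixes make sorting chronological.
scripts = ["20140102-add-index.sql", "20140101-create-releases.sql"]
applied = ["20140101-create-releases.sql"]
puts pending_scripts(scripts, applied).inspect
```

Running apply_all a second time is a no-op, which is what makes this style of change safe to re-run on a server.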
ON MONITORING
THE TOOLS WE USE
graphite / openTSDB
ON DEPLOYMENTS AND TESTING
(FUNCTIONAL/INTEGRATION)
"Testing is HARD" - the dev that sits on your left
THE CHALLENGES THAT WE FACED:
- Hard to execute functional tests between services
- Frustrating to deploy semi-manually (Capistrano)
- Scary to deploy other teams' services
SBT
- Motivation: Scala adoption
- Complex Scala syntax
- Cool features: ~test, shell, console
- Hard to debug
GILT-SBT-BUILD
- Simple config for all the services
- Pulls many plugins: [nexus, testing, RPMs, run scripts, Monitoring, SemVer, ...]
- Custom commands (e.g. 'sbt release')
object Build extends ClientServerCoreProject with Dependencies {
  val name = "svc-sale-activation"
  val coreDeps = ....
  val serverDeps = ...
  val clientDeps = ...

  override val ioncannonTrack = IonCannon.FastTrack
}
ION-CANNON + SBT
- Run functional/Selenium tests on dedicated Env
- Supports Canary releases
- Easy rollbacks
- Integrated health checks
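To make the canary, rollback, and health-check bullets concrete, here is a small sketch of the general canary pattern: deploy to one host, check its health, then either continue to the rest or put the old version back. It illustrates the pattern only and does not describe Ion-Cannon's actual behavior. The host names, version strings, the canary_rollout helper, and the deploy and healthy callables are all assumptions.

```ruby
# Hypothetical canary rollout; this is the generic pattern, not Ion-Cannon's logic.
# `deploy` and `healthy` are caller-supplied callables (assumptions for the sketch).
def canary_rollout(hosts, new_version, old_version, deploy:, healthy:)
  canary, *rest = hosts
  deploy.call(canary, new_version)

  unless healthy.call(canary)
    # Health check failed: restore the previous version on the canary only.
    deploy.call(canary, old_version)
    return :rolled_back
  end

  # Canary looks healthy, so roll the new version out to the remaining hosts.
  rest.each { |host| deploy.call(host, new_version) }
  :released
end

# In-memory stand-ins for a real deployer and a real health endpoint.
deployed = {}
deploy = ->(host, version) { deployed[host] = version }
always_healthy = ->(_host) { true }

puts canary_rollout(%w[app1 app2 app3], "1.1.0", "1.0.0",
                    deploy: deploy, healthy: always_healthy)
```

Because a failed health check touches only the canary host, a bad release never reaches the other hosts, which is what makes the rollback cheap.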
MAIN TAKEAWAYS
- Simplicity - Do you really need it?
- We feel that it was the right choice for us
- MicroServices promise works for most cases
- As of 2014 - You will need to invest in Tools!