building a massively scalable cloud service from the grounds up
Post on 21-Oct-2014
DESCRIPTION
Serving developer binaries isn’t trivial. Such binaries are consumed by tools, and they create a massive request load. Add to that support for metadata, a REST API, storage quotas, stats, repo indexes on demand and global HA distribution, and you’ve got yourself a pretty complicated system to run and manage. This talk will show you how Bintray, JFrog’s social binary distribution service, works. We will speak about how the system segmentation supports massive loads across data centers with stateless vertical scaling; how Grails applications scale and how we tie up different NoSQL technologies such as CouchDB, MongoDB, ElasticSearch & Redis; how we chose between physical and virtual servers and how we manage deployments without service interruption.
TRANSCRIPT
Building a Massively Scalable Cloud Service from the Grounds Up
Yoav Landman
@yoavlandman
github.com/yoav
cee tee oh @ JFrog
What Frog?
So…
Some Numbers (liftoff + 5 months)
Users     7K
Packages  70K
Requests  1.2 B/Month
Requirements
– Download binaries
– Web front
– REST API
– Backend services
We know developers
%new_sexy_lang% community
Not our fault! AWS failed again!
Downloads must…
Web application must…
Backend Services must…
Choose your battles...
Non-Func. Requirements
Requirement  RPS  Availability
Download     10K  Always
Interaction  200  Almost always
Services     10   Most of the time
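For a sense of scale, the 1.2 B requests/month figure from the numbers slide can be converted into an average request rate and compared with the 10K RPS download target. This is a back-of-the-envelope sketch only: it assumes a 30-day month and assumes every monthly request counts against the download tier, neither of which the slides state.

```python
# Assumption: a 30-day month; the slides only say "per month".
SECONDS_PER_MONTH = 30 * 24 * 60 * 60

monthly_requests = 1.2e9  # "Requests 1.2 B/Month" from the slides
target_rps = 10_000       # "Download 10K" from the slides

# Average rate if the load were spread evenly over the month.
average_rps = monthly_requests / SECONDS_PER_MONTH

# How many times larger the target is than that average.
headroom = target_rps / average_rps

print(f"average load: {average_rps:.0f} requests/s")
print(f"target is about {headroom:.0f}x the average")
```

Under these assumptions the average works out to a few hundred requests per second, so the 10K target leaves room for bursts well above the mean.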
Download Server
No Servlets here
Deduplication by Checksum
File A: 46b34
File B: a64ff7
/user-a/repo-z/package-y/file-x
/org-c/repo-m/package-n/file-k
/user-m/repo-w/package-t/file-f
Flat blobs storage
File A: 46b34
File B: a64ff7
Mapping
/user-m/repo-w/package-t/file-f
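The deduplication slides above can be sketched as a toy content-addressed store: file bytes live in a flat blob store keyed by checksum, and a separate mapping points each logical path at a checksum. This is an illustration, not Bintray's implementation. The `DedupStore` class, its method names, and the choice of SHA-1 are all assumptions; the slides do not name a checksum algorithm.

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory model of checksum-based deduplication."""

    def __init__(self):
        self.blobs = {}    # checksum -> bytes (the flat blob storage)
        self.mapping = {}  # logical path -> checksum (the mapping)

    def put(self, path, data):
        # Assumption: SHA-1 stands in for whatever checksum is really used.
        checksum = hashlib.sha1(data).hexdigest()
        # Store the bytes only if this content has not been seen before.
        self.blobs.setdefault(checksum, data)
        self.mapping[path] = checksum
        return checksum

    def get(self, path):
        # Resolve the path to a checksum, then fetch the blob.
        return self.blobs[self.mapping[path]]


store = DedupStore()
store.put("/user-a/repo-z/package-y/file-x", b"same bytes")
store.put("/org-c/repo-m/package-n/file-k", b"same bytes")
store.put("/user-m/repo-w/package-t/file-f", b"other bytes")

# Three paths, but only two distinct blobs are stored.
print(len(store.mapping), len(store.blobs))
```

Keeping the mapping separate from the blobs means that uploading identical content under a new path costs one mapping entry rather than another copy of the file.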
Web Front
Web Framework
Requirements
– Rapid Application Development
– Flexible schema
– Java background
– Stateless
Why don’t you just use...?
Framework                       Why not?
Angular.js, Ember.js, æж.js     Maturity
Wicket                          State
JSF                             Model
Non-Java                        No Java background
Updated Grails to newer minor
Web Front
Data Model
Remember?
Grails means GORM!
GORM MongoDB plugin
Web Front
Search
2 types of search:
– Full-text search
– Structured search
Executive summary
Framework       Why not?
Lucene/Compass  Only embedded, resource guzzler
Solr            Bad Grails integration
Sphinx          No incremental index
vs.
You ask
ElasticSearch answers
Additional Services
Indexes, Statistics, Logs
Also, Redis to the resque
Did they just add a 4th nosql?!
Additional Services
Documentation
DevOps
IaaS vs. SaaS
Leave it to the Pros
SaaS for Download Service
Component     SaaS
Blob storage  SL objectstore
Mapping       Cloudant
SaaS for Web and Services
Component      SaaS
Model          MongoHQ
Grails         N/A
ElasticSearch  N/A
Redis          N/A
Physical vs. Virtual
Remember this?
Virtualization
Pros
– Cheap
– Elastic
– Volatile
Cons
– Overhead
– Tenant, not owner
Development Environment
Remember?
We are liberal
The Solution
Chef What?
Opscode Chef
The Solution
Vagrant Who?
Vagrant
Development
Ops are part of the DevOps
1. Vagrant boots CentOS on VirtualBox
2. Chef installs all DB and service RPMs from a private YUM repo
3. Profit!
High Availability
(And Locality)
Cluster everything
Remember?
CDN for Download Server
GTD for Web Application
Backup
(and Vendor Lock-Out)
Snapshots and replicas
Monitoring
(Servers, State and Logs)
Prevent this:
Going to Production…
Remember?
The Solution
All together now
Conclusions time
– Define criticality
– Embrace the change
– Plan for scale, but be realistic
– Backup everything!
No, thank you!