OpenStack Summit Tokyo 2015

Building a Private Cloud to Efficiently Handle 40 Billion Requests / Day

October 28th, 2015

Pierre Gohon | Sr. Site Reliability Engineer | pierre.gohon@tubemogul.com
Pierre Grandin | Sr. Site Reliability Engineer | pierre.grandin@tubemogul.com

Who are we?

TubeMogul (Nasdaq: TUBE)
● Enterprise software company for digital branding
● Over 27 billion ads served in 2014
● Over 40 billion ad auctions per day in Q3 2015
● Bids processed in less than 50 ms
● Bids served in less than 80 ms (incl. network round trip)
● 5 PB of monthly video traffic served
● 1.6 EB of data stored

Who are we?

Operations Engineering
● Ensure the smooth day-to-day operation of the platform infrastructure
● Provide a cost-effective and cutting-edge infrastructure
● Provide support to dev teams
● Team composed of SREs, SEs and DBAs (US and UA)
● Managing over 2,500 servers (virtual and physical)

Our Infrastructure

Multiple locations with a mix of Public Cloud and On-Premises infrastructure

● 6 AWS Regions (us-east*2, us-west*2, europe, apac)
● Physical servers in Michigan / Arizona (Web/Databases)
● DNS served by third parties (UltraDNS + Dynect)
● External monitoring using Catchpoint
● CDNs to deliver content
● External security audits

We’re not adding complexity!

Before OpenStack: we’re already very “hybrid”…

Why?

● Own your infrastructure stack
● Physical proximity matters (reduced/controlled latency)
● Better infrastructure planning
● Technological transparency
● … $$ !

Project timeline

Where do we stand?

OpenStack challenges - Operational aspect

● DIY?
  ○ Small OPS team
    ■ 12 members in two timezones
    ■ only 3 dedicated to OpenStack
  ○ New challenges
    ■ Internal training
    ■ Little external support (really?) vs AWS
    ■ Managing data centers (servers, network, …)

OpenStack challenges - Application migration aspect

● Are applications AWS-dependent?
  ○ Internal ops tools
  ○ Developers’ applications
  ○ AWS S3, DynamoDB, SNS, SQS, SES, SWF
● Convert developers to the project: we need their support
● OpenStack release cycle (when shall we upgrade to the latest version?)
● Which OpenStack components are really needed?
● How far do we go? (S3 replacement? Network control? Hardware control?)

How? Networking - External connectivity

● Manage our own ASN / IPs (v4/v6)
● Choose “best for needs” transit providers (tier 1)
● Better control of routes to/from our endpoints
● Allow dedicated AWS connections / others
● Allow direct peerings to ad networks
● Want to be accountable for networking issues
● Cost control

How? Networking - Hybrid physical / virtualized

● Applications are already designed for redundancy/cloud
● Circumvent virtualized networking limitations
● Fine-tune baremetal nodes for HAProxy (see the sketch after this list)
● Future equipment is “cloud ready” (Nexus 5K as top-of-rack switch)
  ○ automatic switch configuration
  ○ Cisco software evolutions?
● 1G for admin, X*10G for public?
● Leverage multicast?
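The deck doesn’t list the tunables behind “fine-tune baremetal nodes for HAProxy”; a minimal sketch of the kind of kernel tuning meant here, with illustrative values (not TubeMogul’s actual configuration):

# Illustrative values only, not from the talk.
sysctl -w net.core.somaxconn=65535                   # deeper accept queues for busy listeners
sysctl -w net.ipv4.ip_local_port_range="1024 65535"  # more ephemeral ports for proxied connections
sysctl -w net.core.netdev_max_backlog=250000         # absorb bursts at 1M+ packets/s
sysctl -w fs.file-max=2097152                        # raise the fd ceiling (cf. the “ulimit?” note later)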

[Diagram: network node, compute node and load balancer, attached to the public network and to a VLAN-based private network]

How? Networking - RTT

● Latency from our DC to AWS is 6ms average in US-WEST

rtb-bidder01(rtb):~$ mtr -r -c 50 gw01.us-west-1a.public
HOST: rtb-bidder01               Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 10.0.4.1                  0.0%    50    0.2   0.2   0.1   0.3   0.0
  2.|-- XXX.XXX.XXX.XXX           0.0%    50    0.2   0.3   0.2   2.6   0.3
  3.|-- ae-43.r02.snjsca04.us.bb. 0.0%    50    1.4   1.5   1.2   2.3   0.2
  4.|-- ae-4.r06.plalca01.us.bb.g 0.0%    50    2.0   2.1   1.8   3.4   0.3
  5.|-- ae-1.amazon.plalca01.us.b 0.0%    50   39.2   3.5   1.5  39.2   5.6
  6.|-- 205.251.229.40            0.0%    50    3.5   2.8   2.2   4.9   0.6
  7.|-- 205.251.230.120           0.0%    50    2.1   2.3   2.0   8.5   0.9
  8.|-- ???                      100.0    50    0.0   0.0   0.0   0.0   0.0
  9.|-- ???                      100.0    50    0.0   0.0   0.0   0.0   0.0
 10.|-- ???                      100.0    50    0.0   0.0   0.0   0.0   0.0
 11.|-- 216.182.237.133           0.0%    50    4.0   6.0   2.7  20.2   5.2

How? Keep it simple

● If you are not building a multi-thousand-hypervisor cloud, you don’t need it to be complex
● Simplifies day-to-day operations
● Home-made Puppet catalog
  ○ because fewer lines of code
  ○ because of the learning curve
  ○ because we need to tweak settings (ulimit?)
● No need for Horizon
● No need for shared storage

How? Leverage your knowledge of your infrastructure

● Affinity / anti-affinity rules
  ○ Enforce resiliency using anti-affinity rules
  ○ Improve performance using affinity rules

{"profile": "OpenStack", "cluster": "rtb-hbase", "hostname": "rtb-hbase-region01", "nagios_host": "mgmt01"}

How? Treat your infrastructure as any other engineering project

Infrastructure As Code
● Follow standard development lifecycle
● Repeatable and consistent server provisioning

Continuous Delivery
● Iterate quickly
● Automated code review to improve code quality

Reliability
● Improve production stability
● Enforce better security practices

How? Continuous Delivery

● We already have a lot of automation:
  ○ ~10,000 Puppet deployments last year
  ○ Over 8,500 production deployments via Jenkins last year

Puppet

● On the infrastructure:
  ○ masterless mode for the initial deployment (sketched below)
  ○ master mode once the node is up and running
● On the VMs:
  ○ Puppet run is triggered by cloud-init, directly at boot (sketched below)
  ○ from boot to production-ready: <5 minutes
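The deck gives no commands for the masterless phase; a minimal sketch (the module and manifest paths are placeholders) is simply applying a locally checked-out catalog:

# Masterless bootstrap: apply the catalog present on the node itself,
# no Puppet master involved yet.
puppet apply --modulepath=/etc/puppet/modules /etc/puppet/manifests/site.pp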

see also: http://www.slideshare.net/NicolasBrousse/puppet-camp-paris-2015
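The user-data that triggers the boot-time Puppet run isn’t shown in the deck; one minimal way to get that behavior, assuming a reachable Puppet master (the hostname below is a placeholder), is a plain shell script handed to cloud-init:

#!/bin/bash
# cloud-init executes this once at first boot: run the Puppet agent
# immediately instead of waiting for the first scheduled run.
puppet agent --onetime --no-daemonize --verbose \
    --server puppet.internal.example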

Infrastructure As Code - Code Review

Gerrit, an industry standard: OpenStack, Eclipse, Google, Chromium, WikiMedia, LibreOffice, Spotify, GlusterFS, etc.

● Fine-grained permission rules
● Plugged into LDAP
● Code review per commit
● Stream events
● Integrated with Jenkins, Jira and Hipchat
● Managing about 600 Git repositories

Infrastructure As Code - Gerrit Integration

Infrastructure As Code - Gerrit in Action

Automatic Verify: -1 if the commit doesn’t pass Jenkins code validation
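As a minimal sketch of that per-commit flow (repository and branch are illustrative): a change is pushed to Gerrit’s review ref, Jenkins validates it, and the Verified label is voted accordingly:

# Propose a change for review using Gerrit's refs/for/<branch> convention;
# Jenkins then builds the commit and votes Verified +1 or -1.
git commit -am "haproxy: raise maxconn on SSL listeners"
git push origin HEAD:refs/for/master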

Infrastructure As Code - The Workflow

[Workflow diagram: changes flow through Lab / QA, then to the prod cluster]

Infrastructure As Code - Continuous Delivery with Jenkins

Infrastructure As Code - Team Awareness

Infrastructure As Code - Safe upgrade paths

Easy as 1-2-3:
1. Test your upgrades using Jenkins
2. Deploy the upgrade by pressing a single button*
3. Enjoy the rest of your day

* https://github.com/pgrandin/lcam

fig. 1: N. Brousse, Sr. Director of Operations Engineering, switching our production workload to OpenStack

Get ready for production: monitor everything

Monitor as much as you can?

● Existing monitoring (Nagios, Graphite) still in use
● Specific checks for OpenStack
  ○ check component APIs: performance / availability / operability (see the sketch after this list)
  ○ check resources: ports, failed instances
● Monitoring capacity metrics for all hardware
● SNMP traps for network equipment
● Monitoring is just an extension of our existing monitoring in AWS
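A minimal sketch of what such an API check can look like, assuming a Keystone endpoint on the management host (the URL and thresholds are illustrative, not TubeMogul’s actual plugin):

#!/bin/bash
# Nagios-style plugin: time an OpenStack API endpoint and map the
# latency to OK / WARNING / CRITICAL exit codes.
ENDPOINT="http://mgmt01:5000/v2.0"   # placeholder Keystone URL
WARN=1.0; CRIT=3.0                   # seconds, illustrative thresholds

t=$(curl -s -o /dev/null -m 10 -w '%{time_total}' "$ENDPOINT") \
    || { echo "CRITICAL: $ENDPOINT unreachable"; exit 2; }

if   (( $(echo "$t > $CRIT" | bc -l) )); then echo "CRITICAL: ${t}s"; exit 2
elif (( $(echo "$t > $WARN" | bc -l) )); then echo "WARNING: ${t}s";  exit 1
else echo "OK: API answered in ${t}s"; exit 0; fi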

Monitoring auto-discovery

● New OpenStack nodes are automatically monitored (see the sketch below)
  ○ automatically / upon request
  ○ Nagios detects new hosts (API query)
  ○ Nagios applies component-related checks by role
  ○ graphing is also automatically updated
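The discovery code itself isn’t in the deck; a minimal sketch of the idea, listing instances through the nova CLI and flagging any host Nagios doesn’t know about yet (the Nagios config path is an assumption):

#!/bin/bash
# Compare instances reported by the OpenStack API against existing
# Nagios host definitions and report the unmonitored ones.
nova list --minimal \
  | awk -F'|' 'NR > 3 && NF > 1 { gsub(/ /, "", $3); print $3 }' \
  | while read -r host; do
        grep -rq "host_name[[:space:]]*$host" /etc/nagios3/conf.d/ \
            || echo "not yet monitored: $host"
    done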

Centralized monitoring

Monitoring is graphing

A look in the rearview mirror

Benefits - Transparency / visibility

Discover new, odd or unexpected traffic/activity patterns

Benefits - Tailored Instances

Before (AWS): need an m3.xlarge plus 2 GB of RAM? You had to jump to an m3.2xlarge!

After (OpenStack): create a flavor tailored to the workload:

# nova flavor-create rtb.collector rtb.collector 17408 8 2
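For reference, nova flavor-create’s positional arguments are name, ID, RAM (MB), disk (GB) and vCPUs: this flavor gets 17,408 MB of RAM, i.e. the m3.xlarge’s 15,360 MB plus the extra 2 GB, with an 8 GB disk and 2 vCPUs.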

Benefits - Operational Transparency

The same internal tooling works against both clouds:

AWS:       # cerveza -m noc -- --zone us-east-1a --start demo01
OpenStack: # cerveza -m noc -- --zone tm-sjc-1a --start demo01

Benefits - Efficiency

[Graphs: before / after comparison]

1+ million rx packets/s on only 2 HAProxy load balancers, full SSL

What does not fit?

Downscaling does not really make sense for us: CPUs are online and paid for, so we should use them.

Upscaling has its limits: AWS is refreshing instance types every year…

Sometimes a small added feature can have a huge load impact.

It makes sense to keep the elastic workloads (machine learning, …) in AWS.

What we’ve learnt

● We can be “double hybrids” (AWS + OpenStack + HAProxy on bare metal)
● A dev environment is needed for OpenStack (to try new versions / break things)
● Storage is still a big issue due to our volume (1.6 EB)
● Some stuff may stay “forever” on AWS?
● More dev/ops communication
● OpenStack is flexible
● No need for HA everywhere
● Spikes can be offloaded to AWS (cloud bursting)

Still a lot left to do

Technical aspect
● Need to migrate other AWS regions
● Gain more experience
● Version upgrades
● Continue to adapt our tooling
● Add more alarms for capacity issues
● Different regions, different issues?

Human aspect
● Dev team still thinks in the AWS world (and sometimes OPS too…)

Aftermath

- Ad serving in production since 2015-05
- Bidding traffic in production since 2015-09
- 100% uptime since pre-production (2015-03)

Cost of operation for our current production workload:
- Reduced by a factor of two, including OpEx cost!

Questions?

Pierre Gohon | @pierregohon
Pierre Grandin | @p_grandin
