distributed ci: scaling jenkins on mesos and marathon

39
Distributed CI: Scaling Jenkins on Mesos and Marathon Roger Ignazio – Puppet Labs, Inc. MesosCon 2015 – Seattle, WA

Upload: nguyenmien

Post on 03-Jan-2017

227 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Distributed CI: Scaling Jenkins on Mesos and Marathon

Distributed CI:Scaling Jenkins on

Mesos and Marathon

Roger Ignazio – Puppet Labs, Inc.MesosCon 2015 – Seattle, WA

Page 2: Distributed CI: Scaling Jenkins on Mesos and Marathon

Roger Ignazio

QE Automation Engineer Puppet Labs, Inc.

@rogerignazio

About Me

Page 3: Distributed CI: Scaling Jenkins on Mesos and Marathon

Mesos In Action

mesosinaction.com

Code: cftwmesos

Page 4: Distributed CI: Scaling Jenkins on Mesos and Marathon

Agenda

● Puppet Labs’ testing environment● Conventional methods for scaling Jenkins● Motivations for re-architecting CI● Breaking up the Jenkins monolith● Demo, outcomes, and future work

Page 5: Distributed CI: Scaling Jenkins on Mesos and Marathon

Audience Poll

Page 6: Distributed CI: Scaling Jenkins on Mesos and Marathon

Intro to Mesos

● General-purpose cluster manager● Represent many machines as a single entity● Advertise resources directly to applications

Page 7: Distributed CI: Scaling Jenkins on Mesos and Marathon

Intro to Marathon

● Mesos framework that provides private PaaS○ Manages long-running tasks

● Easily scale apps to N instances● Automatically restarts failed app instances

Page 8: Distributed CI: Scaling Jenkins on Mesos and Marathon

Intro to Jenkins

● Distributed, open source CI tool● Repeatable build/test of software projects● Large community, rich plugin ecosystem

Page 9: Distributed CI: Scaling Jenkins on Mesos and Marathon

Puppet Labs’Testing Environment

Page 10: Distributed CI: Scaling Jenkins on Mesos and Marathon

Testing at Puppet Labs

● 4k to 5k builds/day across 75 platforms● 15 Jenkins clusters

○ Loosely based on team, project, function○ ~ 1,300 executors across ~ 240 build machines

● Tooling: Beaker, vmpooler

Page 11: Distributed CI: Scaling Jenkins on Mesos and Marathon

Testing at Puppet Labs

● Configuration management○ Most job configs and scripts stored in Jenkins○ Infrastructure managed by Puppet

● Reporting○ Jenkins UI ( x15! )○ Clockin, Waylon

Page 12: Distributed CI: Scaling Jenkins on Mesos and Marathon

Conventional Methods for Scaling Jenkins

Page 13: Distributed CI: Scaling Jenkins on Mesos and Marathon

● Two common deployments○ Single Jenkins master with many jobs○ Master per team, project, or function

● Not highly available● Can’t load balance across masters● Static partitioning kills overall utilization

Scaling Jenkins

Page 14: Distributed CI: Scaling Jenkins on Mesos and Marathon

Scaling Jenkins

FOSS Jenkins

. . .Jenkins Slave 1

Jenkins Slave 2

. . .Jenkins Slave N

PE Jenkins

Jenkins Slave 1

Jenkins Slave 2

. . .Jenkins Slave N

Project X Jenkins

Jenkins Slave 1

Jenkins Slave 2

. . .Jenkins Slave N

90% Utilized(Normal)

140% Utilized(40 builds in queue)

0% Utilized(Idle)

Page 15: Distributed CI: Scaling Jenkins on Mesos and Marathon

Scaling Jenkins

What can we do about it?

On-demand resources!

Page 16: Distributed CI: Scaling Jenkins on Mesos and Marathon

Scaling Jenkins

FOSS Jenkins . . .

Mesos Slave 1

Mesos Slave 4

Mesos Slave N

PE Jenkins

Mesos Slave 2

Mesos Slave 5

Project X Jenkins

Mesos Slave 3

Mesos Slave 6

. . .

Mesos Master

Page 17: Distributed CI: Scaling Jenkins on Mesos and Marathon

Motivations forRe-architecting CI

Page 18: Distributed CI: Scaling Jenkins on Mesos and Marathon

User Stories

As a $role,

I { want, need } $something,

so that $outcome

Page 19: Distributed CI: Scaling Jenkins on Mesos and Marathon

User Stories

As a Developer, I want tests to be run against pull requests, so that I have confidence in the code

about to be merged

Page 20: Distributed CI: Scaling Jenkins on Mesos and Marathon

User Stories

As a Developer, I don't want to worry about the underlying infrastructure of the CI system

Page 21: Distributed CI: Scaling Jenkins on Mesos and Marathon

User Stories

As a CI consumer, I want a central location to view all CI activity, so that I don’t have to visit multiple URLs

Page 22: Distributed CI: Scaling Jenkins on Mesos and Marathon

User Stories

As a QE, I want slaves to be on-demand, so that infrastructure resources are used more efficiently

Page 23: Distributed CI: Scaling Jenkins on Mesos and Marathon

Motivation

● Reduce friction in dev workflows● Event (and data)-driven system● Improve reporting and user experience● Scale to meet growing demand

Page 24: Distributed CI: Scaling Jenkins on Mesos and Marathon

Breaking UpThe Monolith

Page 25: Distributed CI: Scaling Jenkins on Mesos and Marathon

The Jenkins MonolithJenkins Master

Web UI HTTP API

Disk

Configuration Management

Job Trigger (polling, cron)

Plugins

Scheduler Build Queue

Remoting (master / slave coordination)

Build Info and Results

Jenkins Slave(Executors)

GitHubProject Repo

Page 26: Distributed CI: Scaling Jenkins on Mesos and Marathon

The Jenkins MonolithJenkins Master

Web UI HTTP API

Disk

Configuration Management

Job Trigger (polling, cron)

Plugins

Scheduler Build Queue

Remoting (master / slave coordination)

Build Info and Results

Jenkins Slave(Executors)

GitHubProject Repo

Devs

Page 27: Distributed CI: Scaling Jenkins on Mesos and Marathon

Breaking Up The Monolith

● Job configurations● Build trigger● Build history

Page 28: Distributed CI: Scaling Jenkins on Mesos and Marathon

GitHubProject Repo

Hook Processor

Jenkins

Mesos Master

Reporting Application HTTP API

Build Info and

Results

Job Config

Breaking Up The Monolith

Mesos Slave(s)

Jenkins Slave(s)

Builds

Page 29: Distributed CI: Scaling Jenkins on Mesos and Marathon

Hook Processor

Jenkins

Mesos Master

Reporting Application HTTP API

Build Info and

Results

Mesos Slave(s)

Jenkins Slave(s)

Builds

Breaking Up The MonolithDevs

Build Info and

Results

GitHubProject RepoJob Config

Page 30: Distributed CI: Scaling Jenkins on Mesos and Marathon

Jenkins on Marathon

● Marathon as a private PaaS● Scale Jenkins masters horizontally● Deploy updates, config changes, plugins● Continuous deployment of CI ?!

Page 31: Distributed CI: Scaling Jenkins on Mesos and Marathon

Jenkins on Marathon Mesos

Marathon Marathon

Jenkins Master 1

Jenkins Master 2

Jenkins Master n

JenkinsSlave(s)

JenkinsSlave(s)

JenkinsSlave(s)

Redis + Elasticsearch, Logstash, Kibana (ELK)

Page 32: Distributed CI: Scaling Jenkins on Mesos and Marathon

GitHub Webhook Processor

Marathon

Jenkins Master

Logstash andElasticsearch

Seed Job

Dynamic Job

Dynamic Job

Webhook Listener

Redis

Job Config

Page 33: Distributed CI: Scaling Jenkins on Mesos and Marathon

Reporting

● Unique IDs for each event (in Redis)● Webhooks and build data (in Elasticsearch)● Query and visualize system activity (in Kibana)● Build our own reporting app?

Page 34: Distributed CI: Scaling Jenkins on Mesos and Marathon

Reporting

Page 35: Distributed CI: Scaling Jenkins on Mesos and Marathon

Demo

rji/mesoscon-2015-demo

Page 36: Distributed CI: Scaling Jenkins on Mesos and Marathon

Outcomesand

Future Work

Page 37: Distributed CI: Scaling Jenkins on Mesos and Marathon

Outcomes

● Single Git-based workflow● Standardized, stateless Jenkins masters● Jenkins slaves provisioned on-demand● Bonus: private PaaS (Marathon)

Page 38: Distributed CI: Scaling Jenkins on Mesos and Marathon

Future Work

● Adoption● Single reporting dashboard (with API)● Intelligent job queueing and throttling● Job DSL plugin abstraction and templates

Page 39: Distributed CI: Scaling Jenkins on Mesos and Marathon

Q & A

@rogerignazio