Improving Operations Efficiency with Puppet
April 17th, 2015
Nicolas Brousse | Sr. Director Of Operations Engineering
Julien Fabre | Site Reliability Engineer
Who are we?
TubeMogul
● Enterprise software company for digital branding
● Over 27 billion ads served in 2014
● Over 30 billion ad auctions per day
● Bid processed in less than 50 ms
● Bid served in less than 80 ms (including network round trip)
● 5 PB of monthly video traffic served
● 1.1 EB of data stored
Operations Engineering
● Ensure the smooth day-to-day operation of the platform infrastructure
● Provide a cost-effective and cutting-edge infrastructure
● Team composed of SREs, SEs and DBAs
● Managing over 2,500 servers (virtual and physical)
Our Infrastructure
Public Cloud On Premises
Multiple locations with a mix of Public Cloud and On Premises
● Java (a lot!) ● MySQL ● Couchbase ● Vertica ● Kafka ● Storm ● Zookeeper, Exhibitor ● Hadoop, HBase, Hive ● Terracotta ● ElasticSearch, Kibana ● LogStash ● PHP, Python, Ruby, Go... ● Apache httpd ● Nagios ● Ganglia
Technology Hoarders
● Graphite ● Memcached ● Puppet ● HAproxy ● OpenStack ● Git and Gerrit ● Gor ● ActiveMQ ● OpenLDAP ● Redis ● Blackbox ● Jenkins, Sonar ● Tomcat ● Jetty (embedded) ● AWS DynamoDB, EC2, S3...
● 2008 - 2010: Use SVN, Bash scripts and custom templates.
● 2010: Managing about 250 instances. Start looking at Puppet.
● 2011: Started with Puppet 0.25 then upgraded to 2.7 by EOY on 400 servers with 2 contributors.
● 2012: 800 servers managed by Puppet. 4 contributors.
● 2013: 1,000 servers managed by Puppet. 6 contributors.
● 2014: 1,500 servers managed by Puppet. Workflow using Git, Gerrit and Jenkins. 9 contributors. Start migration to 3.7.
● 2015: 2,000 servers managed by Puppet. 13 contributors.
Five Years Of Puppet!
● 2,000 nodes
● 225 unique node definitions
● 1 puppetmaster
● 112 Puppet modules
Puppet Stats
● Virtual and Physical Servers Configuration : Master mode
● Building AWS AMI with Packer : Master mode
● Local development environment with Vagrant : Master mode
● OpenStack deployment : Masterless mode
Where and how do we use Puppet?
Code Review?
● Gerrit, an industry standard used by Eclipse, Google, Chromium, OpenStack, Wikimedia, LibreOffice, Spotify, GlusterFS, etc.
● Fine-Grained Permission Rules
● Plugged into LDAP
● Code Review per commit
● Stream Events
● Use GitBlit
● Integrated with Jenkins and Jira
● Managing about 600 Git repositories
A Powerful Gerrit Integration
Gerrit in Action
● 1 job per module
● 1 job for the manifests and hiera data
● 1 job for the Puppet fileserver
● 1 job to deploy
Continuous Delivery with Jenkins
Global Jenkins stats for the past year
● ~10,000 Puppet deployments
● Over 8,500 Production App Deployments
Team Awareness: HipChat Integration with Hubot
Infrastructure As Code
● Follow standard development lifecycle
● Repeatable and consistent server provisioning
Continuous Delivery
● Iterate quickly
● Automated code review to improve code quality
Reliability
● Improve Production Stability
● Enforce Better Security Practices
Puppet Continuous Delivery Workflow: The Vision
The Workflow
The Workflow : Puppet code logic
Puppet environments
● Dedicated node manifests (*.pp)
● Modules deployed by branch with Git submodules
All the data in Hiera
● Try to avoid the params.pp class
● Store everything : module parameters, classes, keys, passwords, ...
Puppet Code Hierarchy
/etc/puppet
├── puppet.conf, hiera.yaml, *.conf
├── hiera
└── environments
    ├── dev
    │   ├── manifests
    │   │   ├── nodes/*.pp
    │   │   └── site.pp
    │   └── modules
    │       ├── activemq
    │       ├── apache
    │       ├── apf
    │       ...
    │       └── zookeeper
    └── production
        ├── manifests
        │   ├── nodes/*.pp
        │   └── site.pp
        └── modules
            ├── activemq
            …
            └── zookeeper
Git submodules, branch dev
Git submodules, branch production
Hiera Configuration
$ cat /etc/puppet/hiera.yaml
---
:backends:
  - eyaml
  - yaml
:yaml:
  :datadir: /etc/puppet/hiera
:eyaml:
  :datadir: /etc/puppet/hiera
  :extension: 'yaml'
  :pkcs7_private_key: /var/lib/puppet/hiera_keys/private_key.pkcs7.pem
  :pkcs7_public_key: /var/lib/puppet/hiera_keys/public_key.pkcs7.pem
:hierarchy:
  - fqdn/%{::fqdn}
  - "%{::zone}/%{::vpc}/%{::hostgroup}"
  - "%{::zone}/%{::vpc}/all"
  - "%{::zone}/%{::hostgroup}"
  - "%{::zone}/all"
  - hostname/%{::hostname}
  - hostgroup/%{::hostgroup}
  - environment/%{::environment}
  - common
:merge_behavior: deeper
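To make the lookup order concrete, here is a minimal Ruby sketch that expands the hierarchy above into the ordered list of data files that would be consulted for one node. The helper candidate_files is hypothetical (it is not part of Hiera's API), the fact values are invented for illustration, and skipping levels whose facts are missing is a simplification of this sketch rather than Hiera's exact behavior.

```ruby
# Hierarchy levels copied from the hiera.yaml shown above.
hierarchy = [
  'fqdn/%{::fqdn}',
  '%{::zone}/%{::vpc}/%{::hostgroup}',
  '%{::zone}/%{::vpc}/all',
  '%{::zone}/%{::hostgroup}',
  '%{::zone}/all',
  'hostname/%{::hostname}',
  'hostgroup/%{::hostgroup}',
  'environment/%{::environment}',
  'common'
]

# Hypothetical helper (not part of Hiera): replace each %{::name} token with
# the matching fact and return the data file paths from highest to lowest
# priority. Levels that reference a missing fact are skipped (simplification).
def candidate_files(hierarchy, facts, datadir = '/etc/puppet/hiera')
  hierarchy.each_with_object([]) do |level, paths|
    missing = false
    expanded = level.gsub(/%\{::(\w+)\}/) do
      value = facts[Regexp.last_match(1)]
      missing = true if value.nil?
      value.to_s
    end
    paths << File.join(datadir, "#{expanded}.yaml") unless missing
  end
end

# Invented fact values for a single node (assumptions, not from the talk).
facts = {
  'fqdn'        => 'bidder01.example.com',
  'hostname'    => 'bidder01',
  'zone'        => 'us-east-1',
  'vpc'         => 'prod',
  'hostgroup'   => 'bidder',
  'environment' => 'production'
}

puts candidate_files(hierarchy, facts)
```

With these facts the list starts at the node-specific fqdn file and ends at common.yaml, so the most specific data is consulted first and common.yaml acts as the fallback.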
Hiera eyaml : github.com/TomPoulton/hiera-eyaml
● Hiera backend
● Easy to use
● Powerful CLI : eyaml edit /etc/puppet/hiera/secrets.yaml
Encrypt Your Secrets
$ cat secret.yaml
---
ec2::access_key_id: ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMIIIBHQIBADAFMAACAQEwDQYJKoZIhvcNAQEBBQAEggEAVIa28OwyaqI5N1TDCvVkBZz3YG+s+Hfzr0lqgcvRCIuJGpq28sQmmuBaQjWY38i86ZSFu0gM6saOHfG64OzVlurO7k/l0CKeL0JfXNaVM4TUqMaN9dSkL5e2vsmpLKrMASawmarqbLYwllTrTe32H4NWxU1e+qWLeUMr9ciBnA3W1Azm4RIo+3bsvgvMfdks....=]
Encrypt Files
Blackbox : github.com/StackExchange/blackbox
● Use GPG to encrypt secret files
● Easy to add/delete team members
● No need to change your Puppet code!
# modules/${module_name}/files/credentials.yaml.gpg
file { '/etc/app/credentials.yaml':
  ensure => 'file',
  owner  => 'root',
  group  => 'root',
  mode   => '0644',
  # Double quotes are required so that ${module_name} is interpolated.
  source => "puppet:///modules/${module_name}/credentials.yaml",
}
The Workflow
The Workflow : bottlenecks
● Only Ops team members can commit (SRE, SE)
● Review and validation is done only by an SRE
● Jenkins will verify the code but will not validate the commit
● Static Puppet environments
● Rely a lot on server hostnames
Flexibility : R10K (github.com/adrienthebo/r10k)!
● Dynamic environments
● No Git submodules anymore! :-)
● Easy to reproduce any environment
● Can use private and Forge Puppet modules
● Can use branches and tags
● Based on Puppetfile
Puppet Workflow Reloaded!
R10K
$ cat Puppetfile
forge "https://forgeapi.puppetlabs.com"

# Forge modules
mod 'pdxcat/collectd'
mod 'puppetlabs/rabbitmq'
mod 'arioch/redis'
mod 'maestrodev/wget'
mod 'puppetlabs/apt'
mod 'puppetlabs/stdlib'

# Tubemogul modules
mod "hosts",
  :git    => 'ssh://<gerrit_host>/puppet/modules/hosts',
  :branch => 'dev'
mod "timezone",
  :git    => 'ssh://<gerrit_host>/puppet/modules/timezone',
  :branch => 'dev'
...
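A Puppetfile is plain Ruby: each forge and mod line is a method call. The sketch below is a hypothetical, heavily simplified reader (parse_puppetfile is not part of r10k) that evaluates such a file and records which modules come from the Forge and which come from Git. The host gerrit.example.com is a placeholder invented for the example.

```ruby
# Hypothetical, simplified Puppetfile reader (not r10k's implementation).
# It evaluates the text as Ruby against an object that only understands
# the two DSL calls used in the Puppetfile above: `forge` and `mod`.
def parse_puppetfile(text)
  result = { forge: nil, modules: [] }
  reader = Object.new

  # forge "<url>" records where Forge modules are downloaded from.
  reader.define_singleton_method(:forge) { |url| result[:forge] = url }

  # mod "<name>" (Forge module) or mod "<name>", :git => ..., :branch => ...
  reader.define_singleton_method(:mod) do |name, args = nil|
    source = args.is_a?(Hash) && args.key?(:git) ? :git : :forge
    result[:modules] << { name: name, source: source, args: args }
  end

  reader.instance_eval(text)
  result
end

# Example input; the Gerrit host name is a placeholder.
puppetfile = <<~'PUPPETFILE'
  forge "https://forgeapi.puppetlabs.com"

  mod 'puppetlabs/stdlib'
  mod "hosts",
    :git    => 'ssh://gerrit.example.com/puppet/modules/hosts',
    :branch => 'dev'
PUPPETFILE

parsed = parse_puppetfile(puppetfile)
parsed[:modules].each { |m| puts "#{m[:name]} (#{m[:source]})" }
```

Because the branch is just a parameter of each mod entry, pointing an environment at another branch or tag is a one-line change in the Puppetfile, which is what replaces the per-branch Git submodules.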
Puppet Workflow Reloaded!
Better code organization : Roles and Profiles
● Represent the business logic : Roles
  o Highest abstraction layer
  o Use Profiles for implementation
● Implement the applications : Profiles
  o Remove potential code duplication
  o Use modules and other Puppet resources
Roles/Profiles Pattern
class role::logs {
  include profile::base
  include profile::logstash::server
  include profile::elasticsearch
}

# Named profile::logstash::server to match the include above and the Hiera keys below.
class profile::logstash::server {
  $version    = hiera('profile::logstash::server::version', '1.4.2')
  $es_host    = hiera('profile::logstash::server::es_host', 'es01')
  $redis_host = hiera('profile::logstash::server::redis_host', 'redis01')

  class { 'logstash':
    package_url  => "https://download.elasticsearch.org/logstash/.../logstash_${version}.deb",
    java_install => true,
  }

  logstash::configfile { 'input_redis':
    content => template('logstash/configfile/logstash.input_redis.conf.erb'),
    order   => 10,
  }

  logstash::configfile { 'output_es':
    content => template('logstash/configfile/logstash.output_es.conf.erb'),
    order   => 30,
  }
}
Do not rely on hostname : nodeless approach
● Facts to guide Puppet
● No node myawesomeserver { } anymore
● Enforce a cluster vision
● site.pp gives the configuration logic
Puppet Workflow Reloaded!
# /etc/puppet/manifests/site.pp
node default {
  if $::ec2_tag_tm_role {
    notify { "Using role : ${::ec2_tag_tm_role}": }
    include "role::${::ec2_tag_tm_role}"
  } else {
    fail('No role found. Nothing to configure.')
  }
}
● Specify tags during the provisioning
● Retrieve tags with AWS Ruby SDK and create facts
● New hierarchy
AWS EC2 tags
$ facter -p | grep ec2_tag
ec2_tag_cluster => rtb-bidder
ec2_tag_nagios_host => mgmt01
ec2_tag_name => bidder
ec2_tag_pupenv => production
ec2_tag_tm_role => rtb::bidder
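Facts like those listed above could be produced by a custom fact along the following lines. This is a sketch under assumptions, not the code used at TubeMogul: the helper names are hypothetical, it assumes the aws-sdk-ec2 gem (the talk does not say which SDK version was used), Facter 2-style flat EC2 facts (ec2_instance_id, ec2_placement_availability_zone), and credentials that allow the instance to describe its own tags.

```ruby
# Sketch of a custom fact file (for example lib/facter/ec2_tags.rb inside a
# module; the path and helper names are assumptions for illustration).

# Turn a hash of EC2 tags into fact names such as ec2_tag_tm_role.
# Tag keys are lowercased and unsafe characters replaced by underscores.
def ec2_tags_to_facts(tags)
  tags.each_with_object({}) do |(key, value), facts|
    name = key.downcase.gsub(/[^a-z0-9_]/, '_')
    facts["ec2_tag_#{name}"] = value
  end
end

# Fetch the tags of one instance through the AWS SDK for Ruby.
# The gem is required lazily so the helper above has no dependency.
def fetch_ec2_tags(instance_id, region)
  require 'aws-sdk-ec2'
  client = Aws::EC2::Client.new(region: region)
  resp = client.describe_tags(
    filters: [{ name: 'resource-id', values: [instance_id] }]
  )
  resp.tags.each_with_object({}) { |tag, h| h[tag.key] = tag.value }
end

# Register the facts only when loaded by Facter on an EC2 instance.
if defined?(Facter)
  instance_id = Facter.value(:ec2_instance_id)
  zone = Facter.value(:ec2_placement_availability_zone)
  if instance_id && zone
    region = zone.chop # "us-east-1a" -> "us-east-1"
    ec2_tags_to_facts(fetch_ec2_tags(instance_id, region)).each do |name, value|
      Facter.add(name) { setcode { value } }
    end
  end
end

# Pure part of the sketch, usable without AWS access:
puts ec2_tags_to_facts('Name' => 'bidder', 'tm_role' => 'rtb::bidder')
```

Keeping the name mapping separate from the AWS call makes the naming rule easy to check on its own, while the tags themselves stay the single place where a node's role and cluster are declared.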
:hierarchy:
  - "%{::zone}/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}"
  - "%{::zone}/%{::ec2_tag_vpc}/all"
  - "%{::zone}/all"
  - vpc/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}
  - vpc/%{::ec2_tag_vpc}/all
  - environment/%{::environment}
  - common
New merging and reviewing rules
● Everyone can commit Puppet code
● Allow everyone to review a Puppet change (+1)
● Allow SE and SRE to validate a Puppet change (+2)
● Auto validation/merging in dev if at least 80% of test (+2)
Next improvements
● Acceptance testing with Beaker and Docker
● Full test provisioning with ServerSpec
● PuppetDB to improve the reporting
● Dedicated Puppet Masters
OpenSource Modules
● tubemogul-aptly ● tubemogul-blackbox ● tubemogul-codedeploy ● tubemogul-gor ● tubemogul-packer ● tubemogul-tmfile ● tubemogul-storm ● tubemogul-kafka
Nicolas Brousse (@orieg)
Julien Fabre (@julien_fabre)