Chef at Etsy

Upload: jonlives

Post on 28-Nov-2014

DESCRIPTION

Slides from my "Chef at Etsy" talk at the London Chef Meetup on Thurs Oct 10th, 2014

TRANSCRIPT

Page 1: Chef at Etsy

Chef at Etsy

Page 2: Chef at Etsy

@jonlives

Jon Cowie

Sr Operations Engineer

Page 3: Chef at Etsy
Page 4: Chef at Etsy

30 Million Members

1 Million Active Shops

Page 5: Chef at Etsy

20 Million Items Listed

60 Million Monthly Unique Visitors

Page 6: Chef at Etsy

We Love Chef!

Page 7: Chef at Etsy

Absorb what is useful.

Discard what is useless.

Page 8: Chef at Etsy

“I am not smart enough to build an ontology … that can encompass all the variations in infrastructure. Nobody is, the world moves too fast.”

Page 9: Chef at Etsy

There is no magic pill.

Page 10: Chef at Etsy

You are the expert.

Page 11: Chef at Etsy

Chef at Etsy

• Chef Server 11.1.4

• ~2000 Nodes

• CentOS, some Mac OS X

Page 12: Chef at Etsy

Beginning of 2010 vs. Today

Page 13: Chef at Etsy

Chef at Etsy

Page 14: Chef at Etsy

Evolution of Chef

Page 15: Chef at Etsy

2010: The Beginning

• ~250 Nodes (Ubuntu & CentOS)

• The first cookbooks

• Out of the box workflow

Page 16: Chef at Etsy

2011: Growth

• ~400 Nodes (CentOS)

• Chef still pretty specialised knowledge

• Handlers added

Page 17: Chef at Etsy

2012: A big year

• ~800 Nodes (CentOS & Mac OS X)

• More in-house Chef expertise

• Workflow tooling

• Debugging tooling

• Monitoring

Page 18: Chef at Etsy

2013: Chef at Etsy

• ~1500 Nodes

• Workflow tooling enhancements

• Feature flags in Chef

• Chef performance - Chef 11 upgrade

Page 19: Chef at Etsy

2014: Chef at Etsy

• ~2000 Nodes

• Consolidation

• CI with Chef

• Omnibus

• Work-in-Progress tooling

Page 20: Chef at Etsy

Patterns & Workflows

Page 21: Chef at Etsy

Cookbook Workflow

Page 22: Chef at Etsy

$> review -r jcowie --cc ops

Page 23: Chef at Etsy

knife-spork

• https://github.com/jonlives/knife-spork

• Workflow tool

• Helps multiple chefs avoid clashing

• Visibility into changes

• Plugins

Page 24: Chef at Etsy

knife-spork

• knife spork bump

• knife spork upload

• Test change

Page 25: Chef at Etsy

Test Change

• https://github.com/jonlives/knife-flip

• knife node flip foo.etsy.com testing

• knife role flip MyRole testing

Page 26: Chef at Etsy

Test Change

• https://github.com/mrtazz/knife-wip

• Uses node tags

<irccat> CHEF: bburry started work cent7 package bugfixing on deploy01.ny5.etsy.com

Page 27: Chef at Etsy

knife-spork

• knife spork bump

• knife spork upload

• Test change

• knife spork promote --remote

• git commit and push
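The bump and promote steps in this workflow can be pictured as plain data manipulation. The following is a minimal Ruby sketch, not knife-spork's actual implementation: `bump_patch` and `promote` are hypothetical helpers, the cookbook name `apache` is made up, and the environment is modeled as a bare hash of cookbook version constraints.

```ruby
# Hypothetical helpers illustrating the bump -> promote idea. This is not
# knife-spork's code, and the environment is modeled as a plain Hash.

# Increment the patch component of a "major.minor.patch" version string.
def bump_patch(version)
  major, minor, patch = version.split(".").map { |part| Integer(part, 10) }
  [major, minor, patch + 1].join(".")
end

# Return a copy of the environment with the cookbook pinned to one exact version.
def promote(environment, cookbook, version)
  constraints = environment.fetch(:cookbook_versions, {})
  environment.merge(cookbook_versions: constraints.merge(cookbook => "= #{version}"))
end

# "apache" is an invented cookbook name used only for the example.
production = { name: "production", cookbook_versions: { "apache" => "= 1.2.3" } }
new_version = bump_patch("1.2.3")
promoted = promote(production, "apache", new_version)

puts new_version
puts promoted[:cookbook_versions]["apache"]
```

Returning a new hash instead of mutating the old one keeps the "before" and "after" environments available for comparison, which is roughly the kind of visibility the announcements in chat provide.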

Page 28: Chef at Etsy

Monitoring & Debugging

Page 29: Chef at Etsy

knife-spork & CI Job

<irccat> CHEF: Jon Cowie uploaded [email protected]
<irccat> CHEF: Jon Cowie promoted [email protected] to production
<snip>
<irccat> Git PUSH -> Sysops/chef
<snip>
<Jenkins> Starting build #5649 for job chef-server-git-sync
<Jenkins> Project chef-server-git-sync build #5649: SUCCESS in 2 min 36 sec: http://ci.etsycorp.com/job/chef-server-git-sync/5649/

Page 30: Chef at Etsy

IRC Handler

<irccat> Chef run failed on officebackup01.office.etsy.com gist failed, see /var/log/chef/client.log on the host
<irccat> Still Failing on dbnest01.ny4.etsy.com since 2 days ago https://github.etsycorp.com/gist/656d8914fbef5a6bd9aa
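Chef report and exception handlers are normally written as subclasses of `Chef::Handler` with a `report` method. The sketch below leaves Chef out so that it runs on its own, and only shows how the two kinds of chat line above might be chosen. The `RunResult` struct, its fields, and the example URLs are assumptions for illustration, not the code in etsy/chef-handlers.

```ruby
# Stand-in for what a handler knows after a run. In a real Chef handler this
# would come from the run status object; the Struct and its fields are
# assumptions made for this sketch.
RunResult = Struct.new(:node_name, :success, :days_failing, :details_url)

# Build the chat line for a finished run; returns nil when the run succeeded.
def irc_message(result)
  return nil if result.success

  if result.days_failing > 0
    "Still Failing on #{result.node_name} since #{result.days_failing} days ago #{result.details_url}"
  else
    "Chef run failed on #{result.node_name} #{result.details_url}"
  end
end

# The URLs are placeholders, not real gists.
fresh = RunResult.new("officebackup01.office.etsy.com", false, 0, "https://gist.example/abc")
stale = RunResult.new("dbnest01.ny4.etsy.com", false, 2, "https://gist.example/def")

puts irc_message(fresh)
puts irc_message(stale)
```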

Page 31: Chef at Etsy

Lastrun Data

• https://github.com/jgoulah/knife-lastrun

• knife node lastrun foo.bar.com

Page 32: Chef at Etsy

Lastrun Data

% knife node lastrun dbnest01.ny4.etsy.com
Status        failed
Elapsed Time  29.055892
Start Time    2014-10-06 12:54:51 +0000
End Time      2014-10-06 12:55:20 +0000
<snip>
Exception
<snip>
Installed package backupd-1.4-1.365657d.el5.centos is newer than candidate package backupd-1.2-1.99ddb8e.el5

Page 33: Chef at Etsy

Dashboards

Page 34: Chef at Etsy

Dashboards

Page 35: Chef at Etsy

Dashboards

Page 36: Chef at Etsy

Monitoring & Debugging

• https://github.com/etsy/chef-handlers

• https://github.com/etsy/dashboard

• https://github.com/jgoulah/knife-lastrun

• https://github.com/bmarini/knife-inspect

Page 37: Chef at Etsy

Feature Flags

Page 38: Chef at Etsy

Downsides of Existing Approach

• Holding cookbook in testing is blocking

• Accidental promotions

• Testing env affects all cookbooks

• “Upgrade” envs often used

• How to make it more “Etsy”?

Page 39: Chef at Etsy

Page 40: Chef at Etsy

chef-whitelist

• https://github.com/etsy/chef-whitelist

• Databag driven

• Cookbook library

• Feature flags!

Page 41: Chef at Etsy

chef-whitelist

{
  "id": "php-5-5-17",
  "patterns": [
    "statsd*.ny5.etsy.com",
    "deploy*.ny5.etsy.com",
    <snip>
  ]
}

Page 42: Chef at Etsy

chef-whitelist

if node.is_in_whitelist? "php-5-5-17"
  package "php-pecl-opcache" do
    action :remove
  end
end
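To make the feature-flag idea concrete, here is a minimal standalone sketch of how a hostname could be checked against the whitelist patterns. It assumes shell-style glob matching via Ruby's `File.fnmatch`; the real chef-whitelist library reads its patterns from a data bag and may match differently. `WHITELISTS`, `in_whitelist?`, and the `web0001` hostname are hypothetical.

```ruby
# Hypothetical re-implementation of the whitelist check, for illustration only.
# The real chef-whitelist library loads its patterns from a data bag.
WHITELISTS = {
  "php-5-5-17" => ["statsd*.ny5.etsy.com", "deploy*.ny5.etsy.com"]
}.freeze

# True when the hostname matches any shell-style pattern in the named whitelist.
# An unknown whitelist id is treated as "feature off".
def in_whitelist?(hostname, whitelist_id, whitelists = WHITELISTS)
  patterns = whitelists.fetch(whitelist_id, [])
  patterns.any? { |pattern| File.fnmatch(pattern, hostname) }
end

puts in_whitelist?("deploy01.ny5.etsy.com", "php-5-5-17") # matches "deploy*"
puts in_whitelist?("web0001.ny5.etsy.com", "php-5-5-17")  # matches nothing
```

Because the flag lives in data instead of in an environment, a change can be rolled out to a handful of hosts by editing the pattern list, without holding the whole cookbook in a testing environment.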

Page 43: Chef at Etsy

Configuration Data

Page 44: Chef at Etsy

Keep cookbooks:

• Simple

• Modular

• Scalable

• Maintainable

Page 45: Chef at Etsy

Environments

• Cookbook version constraints
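As a rough illustration of what a version constraint does, the sketch below picks the newest uploaded cookbook version that satisfies an environment's constraint. It uses RubyGems' `Gem::Requirement` as a stand-in for Chef's own constraint handling, which is an assumption made so the example runs without Chef; `resolve` and the version list are hypothetical.

```ruby
require "rubygems" # Gem::Version and Gem::Requirement ship with Ruby

# Pick the newest available version that satisfies the constraint,
# or nil when nothing matches. Gem::Requirement stands in for Chef's
# own constraint class here.
def resolve(available_versions, constraint)
  requirement = Gem::Requirement.new(constraint)
  matching = available_versions.map { |v| Gem::Version.new(v) }
                               .select { |v| requirement.satisfied_by?(v) }
  best = matching.max
  best && best.to_s
end

uploaded = ["1.2.3", "1.2.4", "1.3.0"]

puts resolve(uploaded, "= 1.2.4")  # exact pin, as an environment would hold
puts resolve(uploaded, "~> 1.2.0") # pessimistic constraint: 1.2.x only
```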

Page 46: Chef at Etsy

Roles

• Group-level config

  • Syslog-ng

  • Iptables

  • Sudoers

Page 47: Chef at Etsy

Roles - iptables

"firewall": {
  "ports": {
    "11211": {
      "subnet_group": "prod_subnets"
    },
    <snip>
  }
}
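One way a cookbook could consume role attributes shaped like this is to expand each port entry into firewall rules. The sketch below is a guess at that idea, not Etsy's iptables cookbook: the contents of `SUBNET_GROUPS`, the TCP-only assumption, and the exact rule format are all invented for illustration.

```ruby
# Hypothetical data: the slide only shows the attribute shape, not the
# subnet groups or the rule format the real cookbook generates.
SUBNET_GROUPS = {
  "prod_subnets" => ["10.0.0.0/24", "10.0.1.0/24"]
}.freeze

firewall = { "ports" => { "11211" => { "subnet_group" => "prod_subnets" } } }

# Expand each port entry into one ACCEPT rule per subnet in its group.
# Assumes TCP; an unknown subnet group raises KeyError instead of silently
# opening nothing.
def iptables_rules(firewall, subnet_groups)
  firewall.fetch("ports", {}).flat_map do |port, options|
    subnet_groups.fetch(options["subnet_group"]).map do |subnet|
      "-A INPUT -p tcp -s #{subnet} --dport #{port} -j ACCEPT"
    end
  end
end

puts iptables_rules(firewall, SUBNET_GROUPS)
```

Keeping the port-to-group mapping in the role and the group-to-subnet mapping elsewhere means a new subnet only has to be added in one place.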

Page 48: Chef at Etsy

Roles - Syslog-ng

"syslog": {
  "web": {
    "web_apache_access_log": {
      "source": "/var/log/httpd/access_log",
      "source_program_override": "APACHEACCESS: ",
      "destination": "/data/syslog/current/web/access.log",
      "destination_filters": [
        "host('^(web0|dlweb)')",
        "match('APACHEACCESS')"
      ]
    }
  }
}

Page 49: Chef at Etsy

Data Bags

• Global / Datacenter specific Config

  • Ganglia

  • Cobbler

  • VOIP

• Data Storage

Page 50: Chef at Etsy

Data Bags - Ganglia

{
  "id": "config_se5",
  "grid_name": "EtsySE5",
  "authority": "http://gangliase5.etsycorp.com",
  "trusted_hosts": <snip>,
  "groups": {
    "Utilities": "239.2.11.71",
    <snip>
  }
  <snip>
}

Page 51: Chef at Etsy

Data Bags - Cobbler

{
  "id": "config_corp",
  "cobbler_server": "corpking02.corp.etsy.com",
  "dns_servers": [ "10.x.x.x", "10.x.x.x" ],
  "dhcp_ranges": {
    "10.100.x.0": {
      "routers": "10.x.x.1",
      "mask": "255.255.255.0",
      "range": "10.x.x.11 10.x.x.250"
    }
  }
}
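A data bag item like this is only data; a cookbook has to turn it into configuration. The sketch below shows one plausible rendering of the `dhcp_ranges` section into ISC dhcpd-style subnet stanzas. The `dhcp_stanzas` helper and the output layout are assumptions, since the slides do not show the real template, and the redacted `x` octets are kept exactly as they appear in the slide.

```ruby
# Hypothetical renderer: turns the "dhcp_ranges" part of the data bag item into
# ISC dhcpd-style subnet stanzas. The real cookbook's template is not shown in
# the slides, so the output layout here is an assumption.
def dhcp_stanzas(dhcp_ranges)
  dhcp_ranges.map do |network, settings|
    [
      "subnet #{network} netmask #{settings.fetch('mask')} {",
      "  option routers #{settings.fetch('routers')};",
      "  range #{settings.fetch('range')};",
      "}"
    ].join("\n")
  end.join("\n\n")
end

# Same shape as the slide's data bag item, with the redacted octets left as-is.
item = {
  "id" => "config_corp",
  "dhcp_ranges" => {
    "10.100.x.0" => {
      "routers" => "10.x.x.1",
      "mask" => "255.255.255.0",
      "range" => "10.x.x.11 10.x.x.250"
    }
  }
}

puts dhcp_stanzas(item["dhcp_ranges"])
```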

Page 52: Chef at Etsy

Write cookbooks you’ll thank yourself for.

Page 53: Chef at Etsy

http://jonliv.es/book

Discount Code: AUTHD

40% off Print

50% off Digital

Page 54: Chef at Etsy

Thanks! Questions?

@jonlives / http://jonliv.es / [email protected]