rubygems - behind the gems

RUBYGEMS ...behind the gems Roland Moriz ~ Moriz GmbH Ruby User Group München 26.10.2010 http://moriz.de /

Upload: roland-m

Post on 15-May-2015


DESCRIPTION

My talk about the current state of the rubygems infrastructure, its problems, and possible solutions. The intention behind this talk is to make people care about a problem and join forces to fix it. It's not about blaming anyone who spends her/his time doing open source work!

TRANSCRIPT

Page 1: Rubygems - behind the gems

RUBYGEMS...behind the gems

Roland Moriz ~ Moriz GmbH

Ruby User Group München 26.10.2010

http://moriz.de/

Page 2: Rubygems - behind the gems

http://moriz.de/ Rubygems behind the gems.

Hello blaaa bla Moriz GmbH bla bla Software Development Services bla bla bla bla Consulting bla blaaaa bla blaaaaaa bla Infrastructure Services bla bla Roland bla bla bla bla professional software development since 1999 bla bla Amazon Marketplace Deutschland bla bla bla Tiscali Games bla bla FIFA WM 2006 bla Yahoo.de bla bla bla bla two billion pageviews bla bla blala Allianz24.de/Allsecur.de bla bla bla Ruby User Group München bla bla blabla http://moriz.de/ bla blaaaba http://rails.io bla http://boot.io blablabla recently hetzner-api gem bla bla bla and the slides will be available @ http://moriz.de/talks/rubygems.

;-)

Page 3: Rubygems - behind the gems


RUBYGEMS MOVING PARTS

rubygems / cli

gemcutter

gem: code / library, app, data, meta

Page 4: Rubygems - behind the gems


RUBYGEMS MOVING PARTS

rubygems / cli: $ gem, require "rubygems"

creation, download, setup, usage

gemcutter: http://rubygems.org/ (and extensions to the rubygems client)

distribution (index building, server)

Page 5: Rubygems - behind the gems


RUBYGEMS FACTS

• used by nearly every ruby project
• the core of the ruby ecosystem
• standard lib (with MRI 1.9.x)
• 17,000+ gem projects
• 81,000+ gem files
• 23 GB+

Page 6: Rubygems - behind the gems

RUBYGEMS FACTS

started at RubyConf 2003 by:

• Rich Kilmer
• Chad Fowler
• David Black
• Paul Brannan
• Jim Weirich

> http://rubyforge.org/projects/rubygems/

Page 7: Rubygems - behind the gems


GEM FACTS

• described by a .gemspec
• gem build my.gemspec

easier ways:

• bundler, jeweler, newgem(?), ...
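
A minimal sketch of what such a .gemspec might contain. The gem name, author and file list are placeholders and not from the talk; saved as my.gemspec, a file like this would be the input to gem build my.gemspec.

```ruby
require "rubygems"

# Placeholder values throughout; only the Gem::Specification API is real.
spec = Gem::Specification.new do |s|
  s.name          = "my"
  s.version       = "0.1.0"
  s.summary       = "Example gem used to illustrate a gemspec"
  s.authors       = ["Jane Doe"]
  s.files         = ["lib/my.rb"]
  s.require_paths = ["lib"]
end

# The full name (name plus version) is what the built .gem file is named after.
puts spec.full_name
```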

Page 8: Rubygems - behind the gems


GEM FACTS

contents:

$ tar xvf rails-3.0.1.gem
x data.tar.gz
x metadata.gz

Page 9: Rubygems - behind the gems


GEM FACTS

metadata.gz > gzipped YAML

data.tar.gz > payload
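
The same two entries can be read from Ruby. This is only a sketch: it uses the tar reader that ships with rubygems plus Zlib, and the helper names gem_entries and metadata_yaml are made up for this example.

```ruby
require "rubygems"
require "rubygems/package"
require "zlib"
require "stringio"

# Read the top-level entries of a .gem, which is a plain tar archive.
# Returns a hash of entry name => raw bytes.
def gem_entries(io)
  entries = {}
  Gem::Package::TarReader.new(io) do |tar|
    tar.each { |entry| entries[entry.full_name] = entry.read }
  end
  entries
end

# metadata.gz is gzipped YAML: return the YAML text (it is not parsed here).
def metadata_yaml(entries)
  Zlib::GzipReader.new(StringIO.new(entries.fetch("metadata.gz"))).read
end
```

Usage, assuming a downloaded gem file in the current directory: File.open("rails-3.0.1.gem", "rb") { |f| puts metadata_yaml(gem_entries(f)) }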

Page 10: Rubygems - behind the gems


GEMCUTTER FACTS

• started in April 2009
• is now rubygems.org (rubygems 1.3.6+)
• replaced rubyforge
• manages uploads & downloads
• rails app using PostgreSQL + rack middleware with sinatra
• by Nick Quaranto (@qrush) of Thoughtbot

> http://github.com/rubygems/gemcutter

Page 11: Rubygems - behind the gems


BIG PICTURE: UPLOAD RELEASE

$ gem release hetzner-api.gemspec

Successfully built RubyGem
Name: hetzner-api
Version: 1.0.0
File: hetzner-api-1.0.0.gem

Pushing gem to RubyGems.org...

gem release

Page 12: Rubygems - behind the gems


BIG PICTURE: UPLOAD RELEASE

cli

rubygems.org (gemcutter)
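
A rough sketch of what the cli hands to rubygems.org on a push. It assumes the POST /api/v1/gems endpoint with the API key in the Authorization header, as described in the rubygems.org API docs; the helper name and the key are placeholders, and the request is only built here, not sent.

```ruby
require "net/http"
require "uri"

# Build (but do not send) a push request for a built .gem file.
# Assumption: endpoint path and Authorization header as in the public API docs.
def build_push_request(gem_bytes, api_key, host = "https://rubygems.org")
  uri = URI.join(host, "/api/v1/gems")
  request = Net::HTTP::Post.new(uri.request_uri)
  request["Authorization"] = api_key
  request["Content-Type"]  = "application/octet-stream"
  request.body = gem_bytes
  [uri, request]
end
```

Sending it would be a separate step (for example with Net::HTTP.start), which is left out so the sketch has no side effects.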

Page 13: Rubygems - behind the gems


BIG PICTURE: UPLOAD RELEASE

cli

rubygems.org

AWS S3

gem file

Page 14: Rubygems - behind the gems


BIG PICTURE: UPLOAD RELEASE

cli

rubygems.org

AWS S3 update specs

> database
> spec files to s3

spec files

Page 15: Rubygems - behind the gems


BIG PICTURE: UPLOAD RELEASE

cli

rubygems.org

AWS S3 update specs

webhooks, rss, ...

http://rubygems.org/pages/api_docs

Page 16: Rubygems - behind the gems


BIG PICTURE: DOWNLOAD

cli

rubygems.org AWS S3

Page 17: Rubygems - behind the gems


SPECS AKA „THE INDEX“

$ sudo gem install rails -V

GET http://gems.rubyforge.org/latest_specs.4.8.gz
302 Found
GET http://production.s3.rubygems.org/latest_specs.4.8.gz
200 OK
GET http://gems.rubyforge.org/quick/Marshal.4.8/rails-3.0.1.gemspec.rz
302 Found
GET http://production.s3.rubygems.org/quick/Marshal.4.8/rails-3.0.1.gemspec.rz
200 OK
GET http://gems.rubyforge.org/specs.4.8.gz
302 Found
GET http://production.s3.rubygems.org/specs.4.8.gz
200 OK
GET http://gems.rubyforge.org/quick/Marshal.4.8/activesupport-3.0.1.gemspec.rz
302 Found
GET http://production.s3.rubygems.org/quick/Marshal.4.8/activesupport-3.0.1.gemspec.rz

...

Page 18: Rubygems - behind the gems


SPECS AKA „THE INDEX“

no specific version => latest

$ sudo gem install rails -V

GET http://gems.rubyforge.org/latest_specs.4.8.gz
302 Found
GET http://production.s3.rubygems.org/latest_specs.4.8.gz
200 OK

...

Page 19: Rubygems - behind the gems

SPECS AKA „THE INDEX“

$ sudo gem install rails -V

GET http://gems.rubyforge.org/latest_specs.4.8.gz

...

Gem.marshal_version
=> "#{Marshal::MAJOR_VERSION}.#{Marshal::MINOR_VERSION}" (the "4.8" in the index file names)
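
To make the naming scheme concrete, here is a small sketch that rebuilds the index URLs from the log. The helper index_urls and its hash keys are invented for this example; it only formats strings and downloads nothing.

```ruby
# The "4.8" in the index file names is Ruby's Marshal format version.
MARSHAL_VERSION = "#{Marshal::MAJOR_VERSION}.#{Marshal::MINOR_VERSION}"

# Build the index URLs seen in the log for one gem release.
# `base` is whatever host serves the index.
def index_urls(base, name, version)
  {
    latest_specs: "#{base}/latest_specs.#{MARSHAL_VERSION}.gz",
    specs:        "#{base}/specs.#{MARSHAL_VERSION}.gz",
    gemspec:      "#{base}/quick/Marshal.#{MARSHAL_VERSION}/#{name}-#{version}.gemspec.rz",
  }
end

index_urls("http://production.s3.rubygems.org", "rails", "3.0.1").each_value do |url|
  puts url
end
```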

Page 20: Rubygems - behind the gems


SPECS AKA „THE INDEX“

irb(main):001:0> x = {}
=> {}

irb(main):002:0> x['farbe'] = 'ananasblau'
=> "ananasblau"

irb(main):003:0> Marshal.dump x
=> "\004\b{\006\"\nfarbe\"\017ananasblau"

etc.

> http://ruby-doc.org/core/classes/Marshal.html
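
The irb session as a small script, with the round trip added. A sketch only; note that the first two bytes of the dump (\004\b above) are the 4.8 format version.

```ruby
x = { "farbe" => "ananasblau" }

dumped = Marshal.dump(x)

# The first two bytes of every dump are the format's major and minor version.
major, minor = dumped.bytes.first(2)

# Marshal.load reverses the dump. Only use it on data you trust.
restored = Marshal.load(dumped)

puts "format #{major}.#{minor}, round trip ok: #{restored == x}"
```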

Page 21: Rubygems - behind the gems


SPECS AKA „THE INDEX“

Page 22: Rubygems - behind the gems


SPECS AKA „THE INDEX“

latest_specs:

lists the latest release number of all gems (~150 KB / 570 KB)

specs:

list of all gem releases (380 KB / 2.2 MB)

latest_specs = Marshal.load open 'latest_specs.4.8'
latest_specs.size
=> 17501

specs = Marshal.load open 'specs.4.8'; specs.size
=> 83490

(there's also a pre-release spec list (remember "gem install rails --pre") and others: see the rubygems source, lib/rubygems/commands/generate_index_command.rb)
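
How latest_specs relates to specs, as a sketch on a made-up miniature index. It assumes each index entry is a [name, version, platform] triple; the three entries and the helper latest_of are invented for illustration.

```ruby
require "rubygems"

# Hypothetical miniature of specs.4.8: [name, version, platform] triples.
specs = [
  ["rails",         Gem::Version.new("3.0.0"), "ruby"],
  ["rails",         Gem::Version.new("3.0.1"), "ruby"],
  ["activesupport", Gem::Version.new("3.0.1"), "ruby"],
]

# Keep only the highest version per gem name and platform.
def latest_of(specs)
  specs.group_by { |name, _version, platform| [name, platform] }
       .map { |_key, tuples| tuples.max_by { |_name, version, _platform| version } }
end

# The index files are just Marshal dumps of such arrays.
index_blob = Marshal.dump(specs)
latest = latest_of(Marshal.load(index_blob))
puts "#{specs.size} releases, #{latest.size} latest entries"
```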

Page 23: Rubygems - behind the gems

SPECS AKA „THE INDEX“

$ sudo gem install rails -V

GET http://gems.rubyforge.org/quick/Marshal.4.8/rails-3.0.1.gemspec.rz
302 Found
GET http://production.s3.rubygems.org/quick/Marshal.4.8/rails-3.0.1.gemspec.rz
200 OK

...

load + parse spec

dependencies

Page 24: Rubygems - behind the gems


SPECS AKA „THE INDEX“

Gem::Specification.new do |s|
  s.authors = ["David Heinemeier Hansson"]
  s.date = Time.utc(2010, 10, 14)
  s.dependencies = [Gem::Dependency.new("activesupport",
    Gem::Requirement.new(["= 3.0.1"]),
    :runtime),
   Gem::Dependency.new("actionpack",
    Gem::Requirement.new(["= 3.0.1"]),
    :runtime),
   Gem::Dependency.new("activerecord",
    Gem::Requirement.new(["= 3.0.1"]),
    :runtime),
   Gem::Dependency.new("activeresource",
    Gem::Requirement.new(["= 3.0.1"]),
    :runtime),
   Gem::Dependency.new("actionmailer",
    Gem::Requirement.new(["= 3.0.1"]),
    :runtime),
   Gem::Dependency.new("railties",
    Gem::Requirement.new(["= 3.0.1"]),
    :runtime),
   Gem::Dependency.new("bundler",
    Gem::Requirement.new(["~> 1.0.0"]),
    :runtime)]
  s.description = "Ruby on Rails is a full-stack web framework optimized for programmer happiness and sustainable productivity. It encourages beautiful code by favoring convention over configuration."
  s.email = "[email protected]"
  s.homepage = "http://www.rubyonrails.org"
  s.name = "rails"
  s.require_paths = ["lib"]
  s.required_ruby_version = Gem::Requirement.new([">= 1.8.7"])
  s.required_rubygems_version = Gem::Requirement.new([">= 1.3.6"])
  s.rubyforge_project = "rails"
  s.rubygems_version = "1.3.7"
  s.specification_version = 3
  s.summary = "Full-stack web application framework."
  s.version = Gem::Version.new("3.0.1")
end

Marshal.load Gem.inflate File.read 'rails-3.0.1.gemspec.rz'
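
The one-liner above can be tried without downloading anything. The sketch below assumes an .rz file is the zlib-deflated Marshal dump of the spec and uses Zlib directly instead of Gem.inflate; the "demo" spec is a stand-in with placeholder values, not the real rails gemspec.

```ruby
require "rubygems"
require "zlib"

# Stand-in spec with placeholder values.
spec = Gem::Specification.new do |s|
  s.name    = "demo"
  s.version = "1.0.0"
  s.summary = "Stand-in for a published gemspec"
  s.authors = ["Jane Doe"]
  s.add_runtime_dependency "activesupport", "= 3.0.1"
end

# Assumption: this mirrors how a quick/Marshal.4.8/*.gemspec.rz file is produced.
rz_bytes = Zlib::Deflate.deflate(Marshal.dump(spec))

# What the client does after the download: inflate, then unmarshal.
loaded = Marshal.load(Zlib::Inflate.inflate(rz_bytes))

loaded.runtime_dependencies.each do |dep|
  puts "#{dep.name} (#{dep.requirement})"
end
```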

Page 25: Rubygems - behind the gems


SPECS AKA „THE INDEX“

$ sudo gem install rails -V

GET http://gems.rubyforge.org/specs.4.8.gz
302 Found
GET http://production.s3.rubygems.org/specs.4.8.gz
200 OK

...

deps with an explicit version requirement => the full spec list is required
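
Why the full list is needed: a requirement such as "= 3.0.1" or "~> 1.0.0" has to be checked against every released version, not just the latest one. A sketch with made-up version lists; best_match is a helper invented for this example.

```ruby
require "rubygems"

# Hypothetical list of released versions, as it could be taken from the full spec list.
available = %w[2.3.9 3.0.0 3.0.1].map { |v| Gem::Version.new(v) }

exact       = Gem::Requirement.new("= 3.0.1")
pessimistic = Gem::Requirement.new("~> 1.0.0")

# Pick the highest version that satisfies the requirement, or nil if none does.
def best_match(requirement, versions)
  versions.select { |v| requirement.satisfied_by?(v) }.max
end

puts best_match(exact, available)
puts pessimistic.satisfied_by?(Gem::Version.new("1.0.3"))
puts pessimistic.satisfied_by?(Gem::Version.new("1.1.0"))
```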

Page 26: Rubygems - behind the gems


SPECS AKA „THE INDEX“

$ sudo gem install rails -V

GET http://gems.rubyforge.org/quick/Marshal.4.8/activesupport-3.0.1.gemspec.rz
302 Found
GET http://production.s3.rubygems.org/quick/Marshal.4.8/activesupport-3.0.1.gemspec.rz

... for each dependency, then download and install the .gem files

Page 27: Rubygems - behind the gems


PROBLEMS: WHAT IF?

cli

rubygems.org

AWS S3

Page 28: Rubygems - behind the gems


PROBLEMS: WHAT IF?

cli

rubygems.org

AWS S3

Temporary Outage:

no new gem releases
no gem downloads (index missing)

! new app deployments?
  new server deployments?

Page 29: Rubygems - behind the gems


PROBLEMS: WHAT IF?

cli

rubygems.org

AWS S3

Fatal Outage, reasons:

• Hardware
• Software (attack, fs corruption)
• Amazon
  • account „deactivation“
  • account deletion
  • S3 data loss
  • S3 bucket account theft/crack
  • Sunny day kills all the clouds.
  • Jeff Bezos‘ new bicy^Segw^Rocket.

Page 30: Rubygems - behind the gems


PROBLEMS: WHAT IF?

cli

rubygems.org

AWS S3

Fatal Outage:

ALL GEMS LOST

Page 31: Rubygems - behind the gems


PROBLEMS: WHAT IF?

cli

rubygems.org

AWS S3

Fatal Outage:

ALL GEMS LOST

try again. ^_^

Page 32: Rubygems - behind the gems


PROBLEMS: MIRRORING

Infrastructure independence to save your business from a rubygems disaster:

> Start your own mirror

Fallback for a rubygems.org disaster?

> Use a public mirror
> Start your own mirror

Page 33: Rubygems - behind the gems


PROBLEMS: PUBLIC MIRRORS

Comprehensive Perl Archive Network

2010-10-25, online since 1995-10-26

7770 MB, 228 mirrors
8463 authors, 18582 modules

228 independent public and free mirrors!

Page 34: Rubygems - behind the gems


PROBLEMS: PUBLIC MIRRORS

Debian Mirror Sites: 445
http://www.debian.org/mirror/list

Page 35: Rubygems - behind the gems


PROBLEMS: PUBLIC MIRRORS

"The Python Package Index is a repository of software for the Python programming language.

There are currently 11801 packages here"

Page 36: Rubygems - behind the gems


PROBLEMS: PUBLIC MIRRORS

Page 37: Rubygems - behind the gems


PROBLEMS: PUBLIC MIRRORS

0 active, public, free mirrors.

lost in migration (rubyforge > gemcutter)

Page 38: Rubygems - behind the gems


PROBLEMS: MIRRORING

Mirroring stuff in rubygems is currently broken:

• "gem mirror" misses some gems & downloads are slow: one gem at a time.
• index building is broken (see #362)
• reliability (#362, too)

http://help.rubygems.org/discussions/problems/362-cant-mirror-rubygems-repo-incorrect-header-check

Gemcutter already lost gems:

http://help.rubygems.org/discussions/problems/212-some-gems-and-specs-missing-that-are-in-the-index
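
One way to notice missing gems is to compare the index with the files a mirror actually holds. A sketch only: it assumes [name, version, platform] index entries and the usual name-version(-platform).gem file naming; the sample data and both helper names are invented.

```ruby
require "rubygems"

# File name of a gem release; non-"ruby" platforms are appended to the name.
def gem_file_name(name, version, platform)
  platform.to_s == "ruby" ? "#{name}-#{version}.gem" : "#{name}-#{version}-#{platform}.gem"
end

# Gem files listed in the index but absent from the mirror.
def missing_gems(index, mirrored_files)
  index.map { |name, version, platform| gem_file_name(name, version, platform) } - mirrored_files
end

# Hypothetical index and mirror contents.
index = [
  ["rails",       Gem::Version.new("3.0.1"), "ruby"],
  ["hetzner-api", Gem::Version.new("1.0.0"), "ruby"],
  ["somegem",     Gem::Version.new("0.1.0"), "java"],
]
mirrored = ["rails-3.0.1.gem"]

puts missing_gems(index, mirrored)
```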

Page 39: Rubygems - behind the gems


PROBLEMS: MIRRORING

There is also no easy way to mirror an S3 bucket:

• no ftp
• no rsync
• no file-list to use with e.g. wget

= you cannot even run a reliable private mirror :-(

Page 40: Rubygems - behind the gems


SOLUTION

Provide rsync on master for sync-ability.
On EC2, Rackspace, does not matter if it's fast...

> NO custom mirroring software!
> most FOSS mirror sites use rsync
> use rsync, ask mirrors, problem solved.

> AWS CloudFront is NOT a solution
> not mirrorable, same vendor SPOFs.

Page 41: Rubygems - behind the gems


SOLUTION

Provide rsync on master for sync-ability.
On EC2, Rackspace, does not matter if it's fast...

Provide a DNS based distribution (GeoDNS)

> a reliable base for (private) mirroring

> speed & latency improvements

> NO custom mirroring software needed!

> saves money (AWS and Rackspace fees)
> make use of the new mirrors!

Page 42: Rubygems - behind the gems


SOLUTION

Why not?

> no „instant deploy“ (real-time mirroring)

> no download stats

Page 43: Rubygems - behind the gems


SOLUTION

Why not?

Rubygems CLI could fall back to the rubygems.org master if a gem version is not on the mirror in use.

It already does if you configure it.
(current downside: it downloads the spec lists from master every time, looks fixable to me)

> no „instant deploy“ (real-time mirroring)

> no download stats
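
The fallback idea as a toy sketch: ask the sources in order, mirror first and master last, and take the first one that has the file. The mirror URL, the file lists and source_for are all hypothetical; this is not how the rubygems client is implemented.

```ruby
# Hypothetical sources in priority order: mirror first, master last.
# Each entry pairs a source URL with the gem files it currently holds.
SOURCES = [
  ["http://mirror.example.org", ["rails-3.0.0.gem"]],
  ["http://rubygems.org",       ["rails-3.0.0.gem", "rails-3.0.1.gem"]],
].freeze

# Return the first source that has the requested gem file, or nil.
def source_for(gem_file, sources = SOURCES)
  match = sources.find { |_url, files| files.include?(gem_file) }
  match && match.first
end

puts source_for("rails-3.0.0.gem")  # served by the mirror
puts source_for("rails-3.0.1.gem")  # not mirrored yet, falls back to master
```

With the real client, the list of sources is managed with gem sources (for example gem sources --add URL).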

Page 44: Rubygems - behind the gems


THINGS WILL FAIL

...just make sure you have a working plan B

AND:

KISS & YAGNI. Keep it simple.
fewer moving parts > fewer things that will break.

Don't over-engineer.

Page 45: Rubygems - behind the gems


HELP

Open source projects need your support.
Gemcutter/Rubygems, too.

Go contribute if you care about your ruby business.

The Gemcutter source is really awesome, a good read for every developer.