Genomes on Rails has_many :sequences

Upload: matt-wood

Post on 24-May-2015

DESCRIPTION

Originally given at RailsConf, this talk outlines how the Wellcome Trust Sanger Institute is using Ruby and Rails as part of their new sequencing platform.

TRANSCRIPT

Page 1: Genomes On Rails

Genomes on Rails has_many :sequences

Page 2: Genomes On Rails

Hello

Page 3: Genomes On Rails

➊ Previously

➋ Production

➌ Process

Page 4: Genomes On Rails

➊ Previously

Page 5: Genomes On Rails

The human genome

15 years to decode

3 billion letters

Page 6: Genomes On Rails

$3 billion

Page 7: Genomes On Rails

$3 billion ++

Page 8: Genomes On Rails

Race for the prize

Page 9: Genomes On Rails
Page 10: Genomes On Rails
Page 11: Genomes On Rails

Open data

Page 12: Genomes On Rails

Open source

Page 13: Genomes On Rails

Perl

Page 14: Genomes On Rails

Lots of Perl

Page 15: Genomes On Rails

Lots of Perl: ~4500 modules

Page 16: Genomes On Rails

Onwards!

Page 17: Genomes On Rails

40 species

Page 18: Genomes On Rails
Page 19: Genomes On Rails
Page 20: Genomes On Rails
Page 21: Genomes On Rails

Map evolutionary space

Page 22: Genomes On Rails

Compare genomes

Page 23: Genomes On Rails

Compare genomes

compare species

Page 24: Genomes On Rails

Compare genomes

compare species

compare individuals

Page 25: Genomes On Rails

More Perl: ~1500 modules

Page 26: Genomes On Rails
Page 27: Genomes On Rails
Page 28: Genomes On Rails
Page 29: Genomes On Rails

Quantum leap!

Page 30: Genomes On Rails

1000 personal genomes

Page 31: Genomes On Rails

1000 personal genomes

beyond 23andMe

Page 32: Genomes On Rails

Hypertension

Page 33: Genomes On Rails

Diabetes

Page 34: Genomes On Rails

Coronary heart disease

Page 35: Genomes On Rails

Bipolar disorder

Page 36: Genomes On Rails

Malaria

Page 37: Genomes On Rails

➋ Production

Page 38: Genomes On Rails

Register projects

Register samples

Sample prep

Sequencing

Analysis

Page 39: Genomes On Rails
Page 40: Genomes On Rails
Page 41: Genomes On Rails
Page 42: Genomes On Rails

Change!

Page 43: Genomes On Rails

Flexible data capture

Page 44: Genomes On Rails

Virtual fields

Page 45: Genomes On Rails

Sample

Name

Organism

Concentration

Page 46: Genomes On Rails

class Sample < ActiveRecord::Base
  has_many :descriptors
  has_many :descriptor_values
end

Page 47: Genomes On Rails

Key value pairs
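
The virtual-fields idea can be sketched in plain Ruby, without ActiveRecord, so that it runs on its own. This is an illustration under assumptions, not the Sanger code: the `Descriptor` and `DescriptorValue` structs and the `set` and `get` helpers are hypothetical names chosen to mirror the two has_many associations on the Sample class, and the sample values are made up.

```ruby
# Hypothetical stand-ins for the descriptors and descriptor_values tables.
Descriptor = Struct.new(:name)
DescriptorValue = Struct.new(:descriptor, :value)

class Sample
  attr_reader :descriptor_values

  def initialize
    @descriptor_values = []
  end

  # Distinct descriptors currently attached to this sample.
  def descriptors
    @descriptor_values.map(&:descriptor).uniq
  end

  # Store a value under a field name, replacing any earlier value.
  def set(name, value)
    existing = @descriptor_values.find { |dv| dv.descriptor.name == name }
    if existing
      existing.value = value
    else
      @descriptor_values << DescriptorValue.new(Descriptor.new(name), value)
    end
  end

  # Look up a field by name; nil when the field was never captured.
  def get(name)
    found = @descriptor_values.find { |dv| dv.descriptor.name == name }
    found && found.value
  end
end

# Made-up example values, for illustration only.
sample = Sample.new
sample.set("Name", "sample-1")
sample.set("Organism", "Homo sapiens")
sample.set("Concentration", "50")
```

Because fields are rows rather than columns, a new field such as Origin can be captured with `sample.set("Origin", ...)` and no schema change.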

Page 48: Genomes On Rails

Faster than you’d think

Page 49: Genomes On Rails
Page 50: Genomes On Rails
Page 51: Genomes On Rails

Change!

Page 52: Genomes On Rails

Sample V1: Name, Organism, Concentration

Sample V2: Name, Organism, Concentration, Origin, Quality metric

Page 53: Genomes On Rails
Page 54: Genomes On Rails
Page 55: Genomes On Rails

Rationalize!

Page 56: Genomes On Rails

Sample V1: Name, Organism, Concentration

Sample V2: Name, Organism, Concentration, Origin, Quality metric

Page 57: Genomes On Rails

Mapping!

Page 58: Genomes On Rails

Sample V1: Name, Organism, Concentration

Sample V3: Name, Species, Concentration, Origin, Quality metric

Origin
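
One way to read the mapping slide is as a rename table between field-set versions. The sketch below assumes that V1's Organism corresponds to V3's Species and that the fields V3 adds are simply empty for older records. The constant names and `map_v1_to_v3` are hypothetical, and the record values are made up.

```ruby
# Hypothetical rename table from V1 field names to V3 field names.
V1_TO_V3 = { "Organism" => "Species" }.freeze

# Fields that V3 adds; V1 records have no value for them, so they start as nil.
V3_NEW_FIELDS = ["Origin", "Quality metric"].freeze

# Return a new hash in V3 terms without modifying the V1 record.
def map_v1_to_v3(v1_fields)
  mapped = v1_fields.each_with_object({}) do |(name, value), out|
    out[V1_TO_V3.fetch(name, name)] = value
  end
  V3_NEW_FIELDS.each { |name| mapped[name] = nil unless mapped.key?(name) }
  mapped
end

# Made-up V1 record, for illustration only.
v1 = { "Name" => "sample-1", "Organism" => "Homo sapiens", "Concentration" => "50" }
v3 = map_v1_to_v3(v1)
```

Keeping the mapping as data means a further version only needs a new table, not new code.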

Page 59: Genomes On Rails

Pipeline management

Page 60: Genomes On Rails

Task 1 Task 2 Task 3

Workflow

Name

Operator

Instrument

Name

Serial number

Kit

Name

Passed
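
The pipeline slide can be read as a workflow made of ordered tasks, each capturing its own fields. The sketch below is one possible interpretation, not the production design: the slide does not say which fields belong to which task, so the assignment of fields to tasks, the `Task` struct and the `Workflow` class are all assumptions.

```ruby
# Hypothetical model: a workflow is an ordered list of tasks, and each task
# declares the fields an operator fills in when running it.
Task = Struct.new(:name, :fields)

class Workflow
  attr_reader :name, :tasks

  def initialize(name, tasks)
    @name = name
    @tasks = tasks
    @completed = {}
  end

  # The first task with no recorded values, or nil once all are done.
  def next_task
    @tasks.find { |task| !@completed.key?(task.name) }
  end

  # Record values for the next task, rejecting incomplete submissions.
  def complete_next(values)
    task = next_task
    raise ArgumentError, "workflow already finished" if task.nil?
    missing = task.fields - values.keys
    raise ArgumentError, "missing fields: #{missing.join(', ')}" unless missing.empty?
    @completed[task.name] = values
    task
  end

  def finished?
    next_task.nil?
  end
end

# The field-to-task assignment here is assumed, for illustration only.
workflow = Workflow.new("Sample prep", [
  Task.new("Task 1", ["Operator"]),
  Task.new("Task 2", ["Instrument name", "Serial number"]),
  Task.new("Task 3", ["Kit name", "Passed"])
])
workflow.complete_next("Operator" => "operator-1")
```

After the call above, `workflow.next_task` is Task 2, and submitting it without a serial number raises an ArgumentError.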

Page 61: Genomes On Rails
Page 62: Genomes On Rails
Page 63: Genomes On Rails
Page 64: Genomes On Rails

Throughput!

Page 65: Genomes On Rails
Page 66: Genomes On Rails

320Tb 450 CPU

Page 67: Genomes On Rails

320Tb 450 CPU Archive

Page 68: Genomes On Rails

75Tb

Page 69: Genomes On Rails
Page 70: Genomes On Rails
Page 71: Genomes On Rails
Page 72: Genomes On Rails
Page 73: Genomes On Rails

pilot study!

Page 74: Genomes On Rails

Multiple apps

Page 75: Genomes On Rails

Multiple instances

Page 76: Genomes On Rails

Loosely coupled

Page 77: Genomes On Rails

Loose coupling is hard

Page 78: Genomes On Rails

Deployment

Page 79: Genomes On Rails

Maintenance

Page 80: Genomes On Rails

Monitoring

Page 81: Genomes On Rails

Hard to maintain separation

Page 82: Genomes On Rails

Support novel science

Page 83: Genomes On Rails

Single code base

Page 84: Genomes On Rails

nginx reverse proxy

Page 85: Genomes On Rails

fair nginx

Page 86: Genomes On Rails

Mongrel

Page 87: Genomes On Rails

Fast deployment

Page 88: Genomes On Rails

Automate everything

Page 89: Genomes On Rails
Page 90: Genomes On Rails

Interoperability!

Play well with others!

Page 91: Genomes On Rails

Legacy databases

Page 92: Genomes On Rails

RESTful services

Page 93: Genomes On Rails

Generate API stubs

Page 94: Genomes On Rails
Page 95: Genomes On Rails

SCALE!

Page 96: Genomes On Rails

Trillionics

Page 97: Genomes On Rails

2X

Page 98: Genomes On Rails

150Tb per week

Page 99: Genomes On Rails

Over 6 months

Page 100: Genomes On Rails

More hardware

Page 101: Genomes On Rails

400 additional nodes

Page 102: Genomes On Rails

additional 360 Tb

Page 103: Genomes On Rails

Towards a Virtual Institute

Page 104: Genomes On Rails

Lots of data

Page 105: Genomes On Rails

Lots of data, lots of people

Page 106: Genomes On Rails

Lots of data, lots of people, lots of compute

Page 107: Genomes On Rails

Lots of data, lots of people, lots of compute, lots of uses

Page 108: Genomes On Rails

Lots of data, lots of people, lots of compute, lots of uses, lots and lots and lots and lots...

Page 109: Genomes On Rails

➌ Process

Page 110: Genomes On Rails

Concept Requirements Development Product

Page 111: Genomes On Rails

Concept Requirements Development Product

takes too long

Page 112: Genomes On Rails

Concept Requirements Development Product

these change

takes too long

Page 113: Genomes On Rails

Concept

What we need Get ready

Development Plan

REVIEW

Page 114: Genomes On Rails

Focused

Page 115: Genomes On Rails

Project owner is key

Page 116: Genomes On Rails

Weekly releases

Page 117: Genomes On Rails

More flexible

Page 118: Genomes On Rails

Less time

Page 119: Genomes On Rails

Better transparency

Page 120: Genomes On Rails

Less software

Page 121: Genomes On Rails

Sequencing informatics

Page 122: Genomes On Rails

Thank you

Page 123: Genomes On Rails

GREENISGOOD.CO.UK