Sequencing: The Next Generation

Surya Saha

Cornell University & Boyce Thompson Institute

[email protected] // Twitter: @SahaSurya

IIT Indore, May 29, 2014

Slides: http://bit.ly/IITIndoreSeq

http://www.acgt.me/blog/2014/3/7/next-generation-sequencing-must-die

Upload: surya-saha

Post on 01-Jul-2015


DESCRIPTION

Talk given at IIT Indore, India on May 29, 2014

TRANSCRIPT

Page 2: Sequencing: The Next Generation

5/29/2014 IIT Indore 2

You are free to:

Copy, share, adapt, or re-mix;

Photograph, film, or broadcast;

Blog, live-blog, or post video of;

This presentation. Provided that:

You attribute the work to its author and

respect the rights and licenses associated

with its components.

Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only ccZero. Social Media Icons adapted with

permission from originals by Christopher Ross. Original images are available under GPL at

http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites

Page 3: Sequencing: The Next Generation

Sequencing over the Ages

1953: DNA structure discovery

1977: Sanger DNA sequencing by chain-terminating inhibitors

1984: Epstein-Barr virus (170 Kb)

1987: ABI 370 sequencer

1995: Haemophilus influenzae (1.83 Mb)

2001: Homo sapiens (3.0 Gb)

2005 to 2007: 454, Solexa, SOLiD

2011 to 2014: Ion Torrent, PacBio, Illumina HiSeq X, MinION

2014: Pinus taeda (24 Gb)

Slide credit: Aureliano Bombarely

Page 4: Sequencing: The Next Generation

It's all about the $£€¥

http://www.genome.gov/sequencingcosts/

Page 5: Sequencing: The Next Generation


First generation sequencing

Page 6: Sequencing: The Next Generation

Sanger method

Frederick Sanger (13 Aug 1918 - 19 Nov 2013)

Won the Nobel Prize in Chemistry in 1958 and 1980.

Published the dideoxy chain termination method, or “Sanger method”, in 1977.

http://dailym.ai/1f1XeTB

Page 7: Sequencing: The Next Generation

Sanger method


http://bit.ly/1g6Cudq

http://bit.ly/1lcQO4J

Page 8: Sequencing: The Next Generation

Maxam-Gilbert method


Page 9: Sequencing: The Next Generation

Maxam-Gilbert method


http://bit.ly/1noY0fu http://bit.ly/1lGvJCA

Page 10: Sequencing: The Next Generation

First generation sequencing

• Very high quality sequences (99.999% accuracy)

• Very low throughput

| Platform | Run Time | Read Length | Reads / Run | Total nucleotides sequenced | Cost / Mb |
| --- | --- | --- | --- | --- | --- |
| Capillary Sequencing (ABI3730xl) | 20 min-3 h | 400-900 bp | 96 or 384 | 1.9-84 Kb | $2400 |

http://bit.ly/1clLps3 http://1.usa.gov/1cLqIRd

Page 11: Sequencing: The Next Generation

Next generation sequencing


Page 14: Sequencing: The Next Generation

454 Pyrosequencing

One purified DNA fragment, to one bead, to one read.


http://bit.ly/1ehwxWN

GS FLX Titanium

http://bit.ly/1ehAcEh

Page 15: Sequencing: The Next Generation

Illumina

Output: 15 Gb | 120 Gb | 1000 Gb | 1800 Gb

Number of Reads: 25 Million | 400 Million | 4 Billion | 6 Billion

Read Length: 2x300 bp | 2x150 bp | 2x125 bp (2x250 update mid-2014) | 2x150 bp

Cost: $99K | $250K | $740K | $10M

Source: Illumina

Page 16: Sequencing: The Next Generation

Illumina


$1000 human genome??
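A rough way to reason about the "$1000 genome" question is to ask how many human genomes fit into one run. The sketch below uses the largest Output value from the Illumina slide (1800 Gb) and the 3.0 Gb human genome size from the timeline slide. The 30x coverage depth is an assumption for illustration and is not stated in the slides, and the function name is hypothetical.

```python
def genomes_per_run(run_output_gb: float, genome_size_gb: float, coverage: float) -> float:
    """Number of genomes a single run can cover at the requested depth.

    coverage = total sequenced bases / genome size, so one genome at
    depth `coverage` needs genome_size_gb * coverage gigabases.
    """
    if run_output_gb <= 0 or genome_size_gb <= 0 or coverage <= 0:
        raise ValueError("all arguments must be positive")
    return run_output_gb / (genome_size_gb * coverage)


LARGEST_RUN_OUTPUT_GB = 1800.0  # largest "Output" value on the Illumina slide
HUMAN_GENOME_GB = 3.0           # Homo sapiens genome size from the timeline slide
ASSUMED_COVERAGE = 30.0         # assumption: a commonly used depth, not from the slides

n = genomes_per_run(LARGEST_RUN_OUTPUT_GB, HUMAN_GENOME_GB, ASSUMED_COVERAGE)
print(f"{n:.0f} genomes per run")  # prints: 20 genomes per run
```

This only counts raw bases. Instrument price, reagents, and analysis are left out, so it is not a cost estimate.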

Page 17: Sequencing: The Next Generation

Illumina

http://1.usa.gov/1fP9ybl

Page 19: Sequencing: The Next Generation

Pacific Biosciences SMRT sequencing

Single Molecule Real Time sequencing


http://bit.ly/1naxgTe

Page 20: Sequencing: The Next Generation

Pacific Biosciences SMRT sequencing: Error correction methods

Hierarchical genome-assembly process (HGAP)

PBJelly (English et al., PLoS ONE, 2012)
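The general idea behind consensus-based error correction is that several noisy reads of the same region can outvote their individual errors. The sketch below is a toy majority vote over reads that are assumed to be already aligned and of equal length. It is not HGAP or PBJelly, which handle alignment, insertions, and deletions, and the function name is hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Column-wise majority vote over pre-aligned, equal-length reads.

    Assumption: reads are already aligned and contain substitutions only.
    Real long-read data is dominated by insertions and deletions, which
    this toy example does not handle.
    """
    if not aligned_reads:
        raise ValueError("need at least one read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must have equal length")

    consensus = []
    for column in zip(*aligned_reads):
        # most_common(1) returns the most frequent base in this column
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Hypothetical reads, each carrying at most one substitution error
reads = ["ACGTAC", "ACGAAC", "ACGTAC", "TCGTAC"]
print(majority_consensus(reads))  # prints: ACGTAC
```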

Page 21: Sequencing: The Next Generation

Pacific Biosciences SMRT sequencing: Read Lengths

http://www.igs.umaryland.edu/labs/grc/

Mean Read Length: 8391 bp

Maximum Subread Length: 24585 bp

Page 22: Sequencing: The Next Generation

Oxford Nanopore


https://www.nanoporetech.com/

• No data yet

• Error model

http://erlichya.tumblr.com/post/66376172948/hands-on-experience-with-oxford-nanopore-minion

Page 23: Sequencing: The Next Generation

Others

• Ion Torrent Proton/PGM

• Nabsys

• SOLiD


Page 24: Sequencing: The Next Generation

Comparison


Page 25: Sequencing: The Next Generation

Next generation sequencing

| Platform | Run Time | Read Length | Quality | Total nucleotides sequenced | Cost / Mb |
| --- | --- | --- | --- | --- | --- |
| 454 Pyrosequencing | 24 h | 700 bp | Q20-Q30 | 0.7 Gb | $10 |
| Illumina MiSeq | 27 h | 2x250 bp | >Q30 | 15 Gb | $0.15 |
| Illumina HiSeq 2500 | 11 days | 2x125 bp | >Q30 | 1000 Gb | $0.05 |
| Ion Torrent | 2 h | 400 bp | >Q20 | 50 Mb-1 Gb | $1 |
| Pacific Biosciences | 2 h | 5.5-8.5 kb | >Q30 consensus, >Q10 single | 400-800 Mb / SMRT cell | $0.33-$1 |

http://bit.ly/1clLps3 http://1.usa.gov/1cLqIRd
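The Cost / Mb column can be turned into a per-project estimate by multiplying by genome size and sequencing depth. The sketch below does this for the 1.83 Mb Haemophilus influenzae genome from the timeline slide. The 50x coverage is an assumption for illustration, the function name is hypothetical, and Pacific Biosciences is left out because the table gives a cost range rather than a single value.

```python
# Cost per megabase in USD, copied from the comparison table
COST_PER_MB_USD = {
    "454 Pyrosequencing": 10.0,
    "Illumina MiSeq": 0.15,
    "Illumina HiSeq 2500": 0.05,
    "Ion Torrent": 1.0,
}


def sequencing_cost_usd(genome_size_mb: float, coverage: float, cost_per_mb: float) -> float:
    """Sequencing cost = megabases needed (genome size x coverage) x cost per Mb."""
    return genome_size_mb * coverage * cost_per_mb


GENOME_SIZE_MB = 1.83    # Haemophilus influenzae, from the timeline slide
ASSUMED_COVERAGE = 50.0  # assumption: illustrative depth, not from the slides

for platform, cost_per_mb in COST_PER_MB_USD.items():
    cost = sequencing_cost_usd(GENOME_SIZE_MB, ASSUMED_COVERAGE, cost_per_mb)
    print(f"{platform}: ${cost:,.2f}")
```

The estimate ignores minimum run sizes: a run costs the same whether or not the genome fills it, so small genomes are usually multiplexed.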

Page 26: Sequencing: The Next Generation

http://omicsmaps.com/

Next Generation Genomics: World Map of High-throughput Sequencers


Page 27: Sequencing: The Next Generation


http://bit.ly/18pfUId

Page 28: Sequencing: The Next Generation


http://bit.ly/18pfUId

Page 29: Sequencing: The Next Generation

Real cost of Sequencing!!

Sboner et al., Genome Biology, 2011

Page 30: Sequencing: The Next Generation

Library Types

Single end

Paired end (PE, 150-800 bp, Fwd:/1, Rev:/2)

Mate pair (MP, 2 Kb to 20 Kb)

[Diagram: forward (F) and reverse (R) read positions for each library type, with 454/Roche and Illumina variants]

Slide credit: Aureliano Bombarely
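The Fwd:/1 and Rev:/2 suffixes on read names are what let software match the two reads of a pair. A minimal sketch of that matching is below. The read IDs and the function name are hypothetical, and newer read-naming schemes that do not use the /1 and /2 suffixes are not handled.

```python
from collections import defaultdict


def pair_reads(read_ids):
    """Match forward (/1) and reverse (/2) read IDs by their shared fragment name.

    Returns (paired, orphans): `paired` maps fragment name to a
    (forward_id, reverse_id) tuple; `orphans` lists IDs without a mate
    or without a /1 or /2 suffix.
    """
    mates = defaultdict(dict)
    orphans = []
    for read_id in read_ids:
        name, sep, suffix = read_id.rpartition("/")
        if sep and suffix in ("1", "2"):
            mates[name][suffix] = read_id
        else:
            orphans.append(read_id)

    paired = {}
    for name, found in mates.items():
        if "1" in found and "2" in found:
            paired[name] = (found["1"], found["2"])
        else:
            orphans.extend(found.values())
    return paired, orphans


# Hypothetical read IDs for illustration
ids = ["frag1/1", "frag1/2", "frag2/1", "single"]
paired, orphans = pair_reads(ids)
print(paired)           # prints: {'frag1': ('frag1/1', 'frag1/2')}
print(sorted(orphans))  # prints: ['frag2/1', 'single']
```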

Page 31: Sequencing: The Next Generation

Implications of Choice of Library

Reads -> Consensus sequence (Contig)

Contigs + pair read information -> Scaffold (or Supercontig), with gaps shown as NNNNN

Scaffolds + genetic information (markers) -> Pseudomolecule (or ultracontig)

Slide credit: Aureliano Bombarely
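The contig-to-scaffold step on this slide can be sketched as joining ordered contigs with runs of N that stand in for the unsequenced gaps. In the sketch below the contig order and gap sizes are taken as given. In practice they are estimated from paired-read information, a step not shown here. The function name and sequences are hypothetical.

```python
def build_scaffold(contigs, gap_sizes):
    """Join ordered contigs into one scaffold, filling each gap with N characters.

    Assumption: contig order, orientation, and gap sizes are already known.
    gap_sizes[i] is the estimated gap between contigs[i] and contigs[i + 1].
    """
    if not contigs:
        raise ValueError("need at least one contig")
    if len(gap_sizes) != len(contigs) - 1:
        raise ValueError("need exactly one gap size between each pair of contigs")

    parts = [contigs[0]]
    for contig, gap in zip(contigs[1:], gap_sizes):
        parts.append("N" * gap)  # unknown bases between neighbouring contigs
        parts.append(contig)
    return "".join(parts)


# Hypothetical contigs separated by an estimated 5 bp gap
print(build_scaffold(["ACGT", "GGCC"], [5]))  # prints: ACGTNNNNNGGCC
```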

Page 32: Sequencing: The Next Generation


Quality control: Encoding

http://bit.ly/N28yUd

Phred score of a base: Q_phred = -10 × log10(e)

where e is the estimated probability that the base call is incorrect
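A short sketch of the Phred formula and of how quality strings are decoded follows. FASTQ files store each score as a single ASCII character: the commonly used offsets are 33 (Sanger and Illumina 1.8+) and 64 (older Illumina pipelines). The function names and the example quality string are hypothetical.

```python
import math


def phred_from_error(e: float) -> float:
    """Q = -10 * log10(e), where e is the probability the base call is wrong."""
    if not 0.0 < e <= 1.0:
        raise ValueError("e must be in (0, 1]")
    return -10.0 * math.log10(e)


def error_from_phred(q: float) -> float:
    """Inverse of the Phred formula: e = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


def decode_quality(qual_string: str, offset: int = 33):
    """Convert a FASTQ quality string to Phred scores.

    offset=33 is Phred+33 encoding; pass offset=64 for Phred+64 data.
    """
    return [ord(char) - offset for char in qual_string]


print(round(phred_from_error(0.001)))  # prints: 30
print(decode_quality("I5!"))           # prints: [40, 20, 0]
```

By the same formula, an error probability of 0.00001 (99.999% accuracy) corresponds to roughly Q50.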

Page 33: Sequencing: The Next Generation

Which technology to use??

• Microbial genomes

• Eukaryotic genomes

• Resequencing genomes

• RNAseq and other XXXseq methods


http://bit.ly/1ko9Kgh

Page 34: Sequencing: The Next Generation

Looking into the Crystal ball

• Desktop sequencing

• Diagnostics in the clinic

• Large scale environmental sequencing of microbes

• But challenges remain...

Page 36: Sequencing: The Next Generation


• Collaborate with student organizations

• Organize workshops and journal clubs

• Attend international meetings

Page 37: Sequencing: The Next Generation

Position available at Solgenomics

Cassavabase project

Plant Breeding + Bioinformatician

● Familiar with breeding

● Programming in Perl, R, SQL, Hadoop

● Linux

● Africa

● Genius

http://www.cassavabase.org/forum/posts.pl?topic_id=9

Page 38: Sequencing: The Next Generation

Thank you!! Questions??

5/29/2014 BTI Plant Bioinformatics Course 2014 38