NGS: the basics

Post on 18-Jun-2020

Page 1: NGS: the basicsbiow.sb-roscoff.fr/ecole_bioinfo/training_material/general_talks/NGS_intro_Thierry...Next generation sequencing . Key: direct sequencing of DNA without ... semiconductor

NGS: the basics

Page 2:

Human genome sequence

June 26th 2000: official announcement of the completion of the draft of the human genome sequence (truly finished in 2004)

Costs:

HGP (Francis Collins): $3 billion, 15 years

Celera (Craig Venter): $200 million, 2 years

Page 3:

2004: two NIH Requests for Applications (RFAs)

Current technologies are able to produce the sequence of a mammalian-sized genome of the desired data quality for $10 to $50 million; the goal of this initiative is to reduce costs by at least two orders of magnitude. It is anticipated that emerging technologies are sufficiently advanced that, with additional investment, it may be possible to achieve proof of principle or even early stage commercialization for genome-scale sequencing within five years.

A parallel RFA solicits grant applications to develop technologies to meet the longer-term goal of achieving four-orders of magnitude cost reduction in about ten years, so that a mammalian-sized genome could be sequenced for approximately $1000.
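The two cost targets quoted above follow directly from the stated starting range. A minimal sketch of the arithmetic, where the function name `reduce_by_orders` is only illustrative:

```python
# Cost range quoted in the RFA for a mammalian-sized genome (USD).
current_low, current_high = 10_000_000, 50_000_000


def reduce_by_orders(cost, orders):
    """Divide a cost by 10**orders (one order of magnitude is a factor of 10)."""
    return cost / 10 ** orders


for label, orders in [("near-term goal", 2), ("longer-term goal", 4)]:
    low = reduce_by_orders(current_low, orders)
    high = reduce_by_orders(current_high, orders)
    print(f"{label}: ${low:,.0f} to ${high:,.0f}")
```

A four-orders reduction from the low end of the range gives the "approximately $1000" genome mentioned in the second RFA.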

Page 4:

Increased efficiency: decreased costs

Exponential cost decrease

Page 5:

Efficient integration of each individual step to slash costs

Page 6:

Massively parallel sequencing

Next generation sequencing

Key: direct sequencing of DNA without the bacterial cloning step

From colonies to polonies

Page 7:

454

Roche GS FLX

Page 8:

454: Library preparation

Page 9:

Clonal amplification of single molecules

Emulsion PCR

Page 10:

454: Sequencing by pyrosequencing

Page 11:

GS FLX throughput (2011-2013)

Up to a million sequences, 700 bp long (up to 1 kb), in 23 hours

Page 12:

454: Game over!

Jonathan Rothberg: “In the sequencing business, one needs to innovate or die. At 454 we were always first; first non-bacterial cloning, first commercialization, first next-gen individual human genome, Neanderthal, mammoth, deep sequencing, cancer sequencing, drug response studies, HIV, metagenomics, first drug target by whole genome sequencing, and many more firsts. Always innovating, always first.”

Page 13:

454: Game over!

In 2007, Roche acquired 454 for $155 million in cash and stock. Rothberg said that when Roche bought 454, the company was "two years ahead of everyone else," but after the purchase, "they lost that lead, no more firsts, no more innovation."

Page 14:

Rothberg strikes back!

Rothberg: "When I woke up and found Roche had bought 454 without me, I had to restart. It cost three years. We had to invent a new scalable way to sequence — ion semiconductor sequencing — and establish a clear path towards both truly low-cost and mobile sequencing." He went on to found Ion Torrent, which was bought by Life Technologies in 2010 for $375 million in cash and stock, and another $350 million based on milestones.

Page 15:

Ion Torrent

Page 16:

Simple Natural Chemistry

Page 17:

Fast Direct Detection

Nucleotides flow sequentially over the Ion semiconductor chip

Direct detection of natural DNA extension

A few seconds per incorporation

[Figure: cross-section of a sensor well. Labels: dNTP, H+, ∆pH, ∆Q, ∆V, sensing layer, sensor plate, silicon substrate (drain, source, bulk), to column receiver]

Rothberg J.M. et al., Nature, doi:10.1038/nature10242
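Because nucleotides are flowed one type at a time, each flow either extends the strand (releasing H+, hence the ∆pH to ∆V chain in the figure) or does nothing, and a homopolymer run is incorporated within a single flow. The sketch below simulates this in an idealized, noise-free way. The flow order `TACG` and the function names are assumptions for illustration, not the instrument's actual settings.

```python
FLOW_ORDER = "TACG"  # hypothetical flow order, for illustration only


def flow_signals(read, n_flows, flow_order=FLOW_ORDER):
    """Return one (nucleotide, incorporations) pair per flow.

    `read` is the strand being synthesized. In this idealized model the
    signal of a flow equals the number of bases incorporated, so a
    homopolymer such as "GG" yields a signal of 2 in a single G flow.
    """
    signals = []
    pos = 0
    for i in range(n_flows):
        nuc = flow_order[i % len(flow_order)]
        count = 0
        while pos < len(read) and read[pos] == nuc:
            count += 1
            pos += 1
        signals.append((nuc, count))
    return signals


def call_bases(signals):
    """Turn per-flow incorporation counts back into a base sequence."""
    return "".join(nuc * count for nuc, count in signals)


signals = flow_signals("TTAGGC", n_flows=8)
print(signals)
print(call_bases(signals))
```

On a real instrument the signal is only approximately proportional to the homopolymer length, which is why long homopolymers are the typical error source of flow-based chemistries.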

Page 18:

Scalable Semiconductor Technology

Wafer Semiconductor Manufacturing

Chip Semiconductor Packaging

Chip Cross Section

Semiconductor Design

Page 19:

The Chip is the Machine™

Scalability Simplicity Speed

Page 20:

Two machines, 5 chips

PGM: 314, 316, 318

Proton: P1, P2?

Page 21:

Ion Torrent Specs

314 Chip: 0.4 to 0.5 million reads, 30 to 100 Mb data

316 Chip: 2 to 3 million reads, 300 to 1000 Mb data

318 Chip: 4 to 5.5 million reads, 0.6 to 2 Gb data

200 bp or 400 bp reads, 2 to 7 hours

Proton P1: 60 to 80 million reads, up to 10 Gb data, 200 bp reads, 2 to 4 hours

Proton P2: long announced, never seen!

Page 22:

Barcode read just before insert with Ion Torrent

[Figure: library construct. Labels: barcoded adapter, barcode, sequencing primer, insert, biotin adapter]
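Since the barcode is read just before the insert, each read starts with its barcode, and samples can be separated by matching that prefix. A minimal sketch, assuming exact matching and made-up barcode sequences and sample names (`BARCODES` and `split_barcode` are hypothetical, not vendor software):

```python
# Hypothetical barcodes and sample names, for illustration only.
BARCODES = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}


def split_barcode(read, barcodes=BARCODES):
    """Return (sample, insert) if the read starts with a known barcode.

    Reads with no exact barcode match are returned as (None, read).
    """
    for barcode, sample in barcodes.items():
        if read.startswith(barcode):
            return sample, read[len(barcode):]
    return None, read


print(split_barcode("ACGTACGGGTTT"))
print(split_barcode("NNNNNNGGGTTT"))
```

Production demultiplexers usually tolerate a mismatch or two in the barcode; exact matching is used here only to keep the sketch short.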

Page 23:

Ion Torrent paired-end sequencing

Page 24:

Illumina Genome Analyzer, HiSeq, MiSeq

(formerly Solexa)

Page 25:

Solexa amplification step

Amplification and sequencing on a solid support

Page 26:

Sequencing by synthesis

CRT: cyclic reversible termination

Page 27:

Sequencing by synthesis

Amplification and sequencing on a solid support

Page 28:

Illumina: Primary data analysis

120 tiles per lane

480 images per lane and cycle:

36nt run = 138,240 images = 945 Gb

2x50nt run = 384,000 images = 1.3 Tb

2x100nt run = 768,000 images = 5.3 Tb
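The image counts above can be reproduced from the per-lane figures. The sketch below assumes 4 images per tile (one per base channel, since 480 / 120 = 4) and 8 lanes per flow cell (since 138,240 / (36 × 480) = 8); both values are inferred from the slide's numbers rather than stated on it.

```python
TILES_PER_LANE = 120
IMAGES_PER_TILE = 4  # assumption: one image per base channel (480 / 120)
LANES = 8            # assumption: inferred from 138,240 / (36 * 480)


def images_per_run(cycles):
    """Total number of images acquired for a run with `cycles` cycles."""
    return cycles * LANES * TILES_PER_LANE * IMAGES_PER_TILE


# A paired-end 2x50 run has 100 cycles, a 2x100 run has 200.
for name, cycles in [("36nt", 36), ("2x50nt", 100), ("2x100nt", 200)]:
    print(f"{name}: {images_per_run(cycles):,} images")
```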

Page 29:

Image analysis (Illumina)

Image registration:

Get image coordinates congruent

Register images between cycles

Cluster identification:

Template of cluster positions created from first five cycles

[Figure: four images labeled A, C, T, G]

Page 30:

Cluster identification

If neighboring clusters have identical sequences during first 5 cycles: one cluster

If neighboring clusters have different sequences during first 5 cycles: two clusters

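As a consequence, barcodes should not be placed in the first bases of the read: neighboring clusters carrying the same barcode would look identical during the first 5 cycles, so the probability of fusing two different clusters would be too high.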
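The rule above can be illustrated with a toy version of the merging step. This is a simplified sketch, not Illumina's actual image-analysis code: the distance threshold `max_dist`, the greedy strategy and the function names are all assumptions.

```python
from math import hypot

PREFIX_CYCLES = 5  # the slide states that the first 5 cycles are used


def merge_clusters(clusters, max_dist=1.5):
    """Greedily fuse neighboring clusters that share their first 5 bases.

    `clusters` is a list of (x, y, sequence) tuples. A cluster is dropped
    (fused into an earlier one) when it lies within `max_dist` of a kept
    cluster and both have the same 5-base prefix.
    """
    kept = []
    for x, y, seq in clusters:
        prefix = seq[:PREFIX_CYCLES]
        fused = any(
            hypot(x - kx, y - ky) <= max_dist and kseq[:PREFIX_CYCLES] == prefix
            for kx, ky, kseq in kept
        )
        if not fused:
            kept.append((x, y, seq))
    return kept


def same_prefix_probability(n_equally_likely_prefixes):
    """Chance that two independent clusters share the same prefix."""
    return 1 / n_equally_likely_prefixes


clusters = [
    (0, 0, "ACGTAGGG"),    # barcode-like prefix ACGTA
    (1, 0, "ACGTATTT"),    # different insert, same prefix: wrongly fused
    (1, 1, "TTTTTGGG"),    # neighbor with a different prefix: kept
    (50, 50, "ACGTACCC"),  # same prefix but far away: kept
]
print(len(merge_clusters(clusters)))
print(same_prefix_probability(4 ** 5))  # random first 5 bases
print(same_prefix_probability(12))      # e.g. 12 barcodes in first position
```

With random inserts two neighbors share their first 5 bases about once in 1024 cases; with a hypothetical set of 12 equally used barcodes in first position the chance rises to 1 in 12.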

Page 31:

Illumina paired-end sequencing

Page 32:

Barcoding with a single index (Illumina)

Page 33:

Barcoding with dual indexing (Illumina)

Page 34:

Illumina-Solexa throughput (end 2013)

Up to 3 billion sequences, up to 2×100 bp long, in 11 days (HiSeq 2000)

Or 0.6 billion, 2×150 bp, in 40 hours (HiSeq 2500)

Or 12-15 million, 2×250 bp, in 39 hours (MiSeq v2)

Or 22-25 million, 2×300 bp, in 65 hours (MiSeq v3)
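These figures translate into total yield per run by multiplying the number of sequences by the bases read per sequence. The sketch below assumes that each "sequence" is one fragment read from both ends, and it uses the upper bound of each quoted range; the dictionary and function names are illustrative only.

```python
# (fragments, bases read per fragment, run time in hours); upper bounds
# taken from the slide. Assumption: each "sequence" is one fragment read
# from both ends, e.g. 2 x 100 bp = 200 bases per fragment.
RUNS = {
    "HiSeq 2000": (3_000_000_000, 2 * 100, 11 * 24),
    "HiSeq 2500": (600_000_000, 2 * 150, 40),
    "MiSeq v3": (25_000_000, 2 * 300, 65),
}


def yield_gb(fragments, bases_per_fragment):
    """Total bases per run, in gigabases."""
    return fragments * bases_per_fragment / 1e9


for name, (fragments, bases, hours) in RUNS.items():
    total = yield_gb(fragments, bases)
    print(f"{name}: {total:.0f} Gb per run, {total / hours:.2f} Gb per hour")
```

The comparison shows the trade-off on the slide: the longest run gives the largest yield, while the shorter runs trade yield for turnaround time or read length.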

Page 35:

SOLiD sequencing

Applied Biosystems

Page 36:

SOLiD (Applied Biosystems): library

Page 37:

SOLiD (Applied Biosystems): library

Emulsion PCR

Page 38:

SOLiD (Applied Biosystems): library

Page 39:

SOLiD (Applied Biosystems): sequencing

Page 40:

SOLiD (Applied Biosystems): sequencing

Page 41:

SOLiD throughput (early 2009)

Up to 0.2 billion sequences, up to 2×60 bp long, in 7 days

Page 42:

Complete Genomics

A human genome for $5,000

Step 1: Fragment tagging

Page 43:

Complete Genomics

A human genome for $5,000

Step 2: Clonal DNA amplification

Page 44:

Complete Genomics

A human genome for $5,000

Step 3: Distribution over a patterned substrate, 1 billion spots per slide

Page 45:

Complete Genomics

A human genome for $5,000

Step 4: Sequencing by ligation

Page 46:

Complete Genomics

A human genome for $5,000

Step 5: Assembly

Page 47:

Complete Genomics

A human genome for $5,000

Cost cutting: small volumes, "simple" equipment

Page 48:

Third generation sequencing

Single molecule sequencing

No PCR amplification

Page 49:

Helicos BioSciences

Single molecule fluorescent sequencing on a flow cell

Page 50:

Helicos

Cyclic reversible termination: a single DNA molecule is extended one base at a time; the blocking fluorescent label is then removed and washed away, and the cycle is repeated
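The cycle described above (extend by one blocked base, image, remove the label, repeat) can be written as a short loop. This is an idealized, error-free sketch of the principle, not Helicos software; the function name `crt_read` and the left-to-right template convention are simplifications for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def crt_read(template, n_cycles):
    """Simulate idealized cyclic reversible termination.

    Each cycle incorporates exactly one blocked, labeled nucleotide
    complementary to the next template base, records which label was
    imaged, then removes label and block so the next cycle can start.
    The template is read left to right for simplicity.
    """
    read = []
    for cycle in range(min(n_cycles, len(template))):
        base = COMPLEMENT[template[cycle]]  # extension stops after one base
        read.append(base)                   # imaging identifies the label
        # label and block are cleaved and washed away before the next cycle
    return "".join(read)


print(crt_read("AAACGT", n_cycles=4))
```

Because the block allows only one incorporation per cycle, a homopolymer such as `AAA` takes three cycles and produces three separate signals.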

Page 51:

Helicos

Improved cyclic reversible termination and single DNA molecule detection

Page 52:

Helicos throughput

Up to 1 billion sequences, 32 bp long on average, in 7 days

Page 53:

Pacific Biosciences

Long single molecule sequencing

Page 54:

Pacific Biosciences

The label is on the phosphate and is detected transiently while the nucleotide is held by a DNA polymerase tethered at the bottom of a zero-mode waveguide (ZMW)

Page 55:

Pacific Biosciences

Thousands of nanoguides concentrate light

The ZMW (zero-mode waveguide) nanostructure provides excitation confinement in the zeptoliter (10⁻²¹ liter) regime
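To see why a zeptoliter-scale observation volume matters, one can estimate how many freely diffusing labeled nucleotides sit in it on average: N = concentration × Avogadro's number × volume. The concentration (1 µM) and volume (20 zL) below are hypothetical round numbers chosen for illustration, not values from the slide.

```python
AVOGADRO = 6.02214076e23  # molecules per mole
ZEPTOLITER = 1e-21        # liters


def mean_molecules(conc_molar, volume_zl):
    """Average number of molecules in `volume_zl` zeptoliters at `conc_molar` mol/L."""
    return conc_molar * AVOGADRO * volume_zl * ZEPTOLITER


# Hypothetical values: 1 micromolar labeled nucleotide, 20 zL volume.
print(f"{mean_molecules(1e-6, 20):.3f} molecules on average")
```

With these assumed values the illuminated volume is empty of free label almost all the time, so a nucleotide held by the polymerase stands out against a low background.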

Page 56:

Pacific Biosciences

Label on the phosphate, not on the base

Page 57:

Pacific Biosciences

Real time detection of incorporation of each base on thousands of molecules

Page 58:

Pacific Biosciences throughput

Each pore: 10 bases/sec

Claim: in 2013, a high quality human genome in 15 minutes
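The claim can be put in perspective by asking how many pores would have to run in parallel at 10 bases/sec. The sketch assumes a haploid human genome of about 3.2 billion bases and treats coverage as a free parameter; both are assumptions added here, not figures from the slide.

```python
GENOME_BP = 3.2e9  # assumption: approximate haploid human genome size


def pores_needed(coverage, minutes=15, bases_per_sec=10, genome_bp=GENOME_BP):
    """Number of pores that must sequence in parallel to reach `coverage`."""
    bases_per_pore = bases_per_sec * minutes * 60
    return coverage * genome_bp / bases_per_pore


for coverage in (1, 30):
    print(f"{coverage}x coverage: about {pores_needed(coverage):,.0f} pores")
```

At 10 bases/sec a single pore reads 9,000 bases in 15 minutes, so even 1x coverage needs several hundred thousand pores working simultaneously and without failure.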

Page 59:

Third or fourth generation sequencing

Single molecules, no fluorophore

Oxford Nanopore Technologies

Page 60:

Oxford Nanopore

Nanopore array chip

Pore across lipid bilayer

Exonuclease

Bases passing through the pore change the electrical conductance across the membrane, which can be measured electrically. A, T, G, C and MeC (methylated cytosine) can be distinguished.
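If each base type produces its own characteristic signal level, base calling reduces in the simplest case to assigning each measurement to the nearest reference level. The sketch below shows that idea only; the levels in `LEVELS` are invented numbers in arbitrary units, not measured values, and real signals are noisy enough to need statistical models.

```python
# Hypothetical reference signal levels (arbitrary units), one per base type.
LEVELS = {"A": 50.0, "C": 42.0, "G": 58.0, "T": 35.0, "MeC": 46.0}


def call_base(measurement, levels=LEVELS):
    """Return the base whose reference level is closest to the measurement."""
    return min(levels, key=lambda base: abs(levels[base] - measurement))


def call_trace(trace, levels=LEVELS):
    """Call one base per measurement in a trace."""
    return [call_base(value, levels) for value in trace]


print(call_trace([49.2, 45.1, 36.0]))
```

The example also shows why methylated cytosine can be read directly: it is simply a fifth level, provided it is well enough separated from unmodified C.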

Page 61:

Oxford Nanopore

Page 62:

There are several more possibilities in the pipeline

BioNanomatrix

VisiGen

Dover Systems

Intelligent Bio-Systems

ZS Genetics

Reveo

LightSpeed Genomics