Genomics and New-Generation Sequencing Based Big-Data Biology


Post on 20-Sep-2020


Page 1:

Genomics and New-Generation Sequencing Based Big-Data Biology

National Institute of Informatics and National Institute of Genetics

Asao Fujiyama (藤山秋佐夫)

Page 2:

[Figure: Volume of publicly released sequence data, top 10 submitters (data source: DDBJ Sequence Read Archive). Vertical axis: cumulative registered bases (Tb), 0 to 60. Horizontal axis: registration year, 2008 to 2012. Series: SC, JGI, BI, BGI, BCM, BCM-HGSC, UMIGS, CSHL, NIG, WUGSC.]

[Figure: Cumulative data production at the National Institute of Genetics. Vertical axis: bases (Gb), 0 to 40,000. Horizontal axis: year and month, 0806 to 1208 (June 2008 to August 2012). Series: GAIIx units 1 to 3, HiSeq units 1 to 6.]

Subaru Telescope output data volume (February 2002): 55.1 GB/night x 365 nights ≈ 20 TB/year

Source: The Astronomical Herald (天文月報), June 2002, pp. 272-277
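
As a rough check of the telescope figure, the per-night volume can be scaled to a year and converted between units. The sketch below is only an illustration: it assumes observations on all 365 nights and decimal (SI) prefixes (1 TB = 1000 GB), and the function names are hypothetical. Under those assumptions, 55.1 GB per night comes to about 20 TB (roughly 0.02 PB) per year.

```python
# Figures taken from the slide: 55.1 GB per night, 365 nights per year.
# Assumption: every night yields data; decimal (SI) prefixes are used.
GB_PER_NIGHT = 55.1
NIGHTS_PER_YEAR = 365


def yearly_volume_gb(gb_per_night: float, nights: int) -> float:
    """Return the yearly data volume in GB."""
    return gb_per_night * nights


def to_decimal_units(gb: float) -> dict:
    """Express a volume given in GB also in TB and PB (1 TB = 1000 GB)."""
    return {"GB": gb, "TB": gb / 1000, "PB": gb / 1_000_000}


if __name__ == "__main__":
    volume = to_decimal_units(yearly_volume_gb(GB_PER_NIGHT, NIGHTS_PER_YEAR))
    print(
        f"{volume['GB']:.1f} GB = {volume['TB']:.2f} TB "
        f"= {volume['PB']:.4f} PB per year"
    )
```

The same helper can be reused to put other yearly volumes on a common scale, provided they are first expressed in bytes rather than bases.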

Page 3:

Evolution of DNA/RNA Sequencing Technology

[Figure: timeline from 1953 to after 2010, with the background fields Chemistry/Physics and Biochemistry/Molecular Biology. Labels in order of appearance:]

1953: Double Helix

1960

1970: DNA Recombination in vitro; RNA World

1980: DNA Sequencing (Maxam & Gilbert, Sanger); Slab gel

1990: PCR; Automated DNA Sequencer; Human Genome Project; ABI 373; ABI 377

2000: Capillary (ABI 3700, ABI 3730xl); 1st NGS; Surface (GS20, FLX, GA, SOLiD, Helicos)

2010: Solid state; FLX Ti; PGM; PacBio RS; HiSeq2000; SOLiD4; MiSeq; ?

Page 4:

[Figure: two sequencing workflows.]

Top row: DNA, fragmentation, DNA library, clone selection, sequencing

1 μg DNA ~ 10^11 molecules

Example reads: ACTGTGTANGT... GATGCTAGTGCGA... AGCTGATGCGTA... AGCTGTTACGTACG... ATCGTTGGCATTG... ATCGTTGGCATTG... TGCTGTATGTCAG...

Bottom row: adapter ligation, immobilization, amplification, sequencing

Flow of genomic DNA sequencing. The top row shows a typical Sanger sequencing process; the bottom row shows the general process of next-generation sequencing.

Page 5:

Advance of the Sequence Productivity (bp/day)

[Figure: vertical axis Log(bp), 0 to 12. Horizontal axis: 1990, 1995, 1998, 2001, 2005, 2011. Series: Sanger Technology (Slab Gel, Capillary) and NGSs (454, GA, SOLiD, FLX, GAIIx, HiSeq).]

Page 6:

Read Length (bp) from different sequencing platforms

[Figure: vertical axis read length (bp), 0 to 900. Labels: Sanger Technology (Slab Gel, Capillary), 454, FLX, FLX Ti, FLX+, GA, GAIIx, ION.]

Page 7:

Next-Generation Genome Analyzers: Inputs and Samples, Outputs

[Figure: diagram linking Inputs and Samples (Biological Processes that Concern Nucleic Acids) through Next-Generation Genome Analyzers and High-Performance Computing to Outputs (Biology, Medicine, Agriculture).]

Sample and process labels: DNA; RNA; dsDNA Complex; RNA Complex; DNA Modification; Methylation; Histones; Nuclear compartment; other; Replication; Recombination; Nucleosomes; Transcriptome; ncRNAs; RNA Modifications/TC Factors; Ancient DNA?; Whatever you think of...

Analyzer output: > 10^6 - 10^8 Sequenced DNA Fragments of 50 to 500 nt

Analysis labels: Genomic Variance; Whole; Targeted; Meta genomics; Environment; Symbiosis; Mutation detection; Sequencing (De novo, Re-seq, Structural variants, SNPs/indels); Fragment count

Page 8:

Page 9:

Illumina GAIIx

Roche/454 GS-FLX

ABI/HITACHI 3730xl

Illumina/Solexa HiSeq-2000

ABI 5500xl

Illumina/Solexa MiSeq

ABI/Ion Torrent Ion PGM

PacBio RS

Page 10:

Sequence data production by next-generation sequencers (cumulative)

[Figure: vertical axis bases (x10^12 bp), 0 to 20. Horizontal axis: month, June 2008 to March 2012. Annotations marking when each instrument came online: GAIIx units 1 and 2, GAIIx unit 3, HiSeq units 1 to 5.]

Page 11:

Equipment at major genome centers

[Figure: scatter plot. Horizontal axis: total number of second- and third-generation sequencers, 0 to 200. Vertical axis: total number of Illumina instruments (HiSeq2000 + GAIIx), 0 to 160. Points: BGI, BI, WU, Sanger, Baylor, Genome Support (ゲノム支援).]

Within Japan there are several hundred units!

Page 12:

Advanced Genomics Center &

Center for Information Biology, Comparative Genomics Laboratory