Next generation DNA sequencing:
technology and applications
Robert Lyle
Department of Medical Genetics
Ullevål University Hospital
DNA methylation
CpG dinucleotides
Histone modifications
acetylation
phosphorylation
methylation
ubiquitination
Epigenetics
Control of gene expression
Broadly...
DNA methylation
Long-term epigenetic silencing of specific sequences
transposons, imprinted genes, pluripotency genes
Histone modifications
Short term, flexible epigenetic control
Epigenetics in health and disease I
Imprinting disorders
Prader-Willi/Angelman syndrome, Beckwith Wiedemann etc.
Uniparental disomy (UPD)
Monogenic disorders
ICF syndrome - involves immunodeficiency
mutations in DNMT3B - DNA methyltransferase
Rett syndrome - mutations in MeCP2
Cancer
complex DNA hypo- and/or hypermethylation
Environmental interactions
Disease susceptibility
Assisted reproductive technologies
Cloning
Epigenetics in health and disease II
Why are identical twins not identical?
Genetic?
Environmental
‘Stochastic’
Epigenetics
Genomic sequence is the same*
*ignoring somatic mutation: SNPs, CNVs
Are identical twins (epi)genetically identical?
Genomic sequence is the same (except for mutations).
Twins epigenetically ‘drifting apart’
Epigenotype changes over time
DNA methylation
The discordance rate among MZ twins is ~50% for many immune-mediated diseases with a large genetic component: psoriasis, asthma, IBD
Understand the basis for this discordance rate
Identify epigenetic differences between twins discordant for immune-mediated diseases
Project
Strategy
Collect monozygotic twins discordant for immune-mediated diseases
Genome-wide epigenetic surveying (GWES)
DNA methylation - bisulphite sequencing
Histone status - chromatin immunoprecipitation
High-throughput sequencing (>> 1 Gb per run)
Twin collection
Protocols
Ethics
Consent
Twin group        Number (pairs)   Status
Control (MZ/DZ)   lots             85 samples
Asthma            122*             consents-contact
Psoriasis         74*              consents-contact
IBD               15*              consents-contact
* discordant
Sample collection
cells, RNA, DNA
AutoMACS Pro automated cell separation
Processed >200 samples to date
Quantifying DNA methylation
genomic DNA
1. bisulphite treat
2. PCR region of interest
3. sequence
methylated     AGCTGTCGATTAGCCG -> AGTTGTCGATTAGTTG (m marks the methylated C, which stays C)
unmethylated   AGCTGTCGATTAGCCG -> AGTTGTTGATTAGTTG
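The conversion on this slide can be mimicked in a few lines. This is a minimal sketch and not part of the original slides. The function names are hypothetical, and it assumes complete conversion of unmethylated C (read as T after PCR), methylated C left intact, and a single strand read without errors.

```python
def bisulphite_convert(seq, methylated_positions=frozenset()):
    """Return the sequence read after bisulphite treatment and PCR.

    Unmethylated C is read as T; C at a methylated position stays C.
    Positions are 0-based indices into `seq` (an assumption of this sketch).
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def cpg_positions(seq):
    """0-based positions of the C in every CpG dinucleotide."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def call_methylation(reference, read):
    """Map each reference CpG position to True if the read kept its C."""
    return {pos: read[pos] == "C" for pos in cpg_positions(reference)}


genomic = "AGCTGTCGATTAGCCG"                      # sequence from the slide
unmethylated = bisulphite_convert(genomic)        # every C converts
methylated = bisulphite_convert(genomic, {6})     # first CpG methylated

print(unmethylated)                               # AGTTGTTGATTAGTTG
print(methylated)                                 # AGTTGTCGATTAGTTG
print(call_methylation(genomic, methylated))      # {6: True, 14: False}
```

Comparing the converted read with the untreated reference at each CpG is what turns a sequence difference (C versus T) into a methylation call.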
Bisulphite sequencing (BiS)
Generally low-throughput - single/several loci
High-throughput genome-wide?
DNA methylation variation
Control MZ and DZ twins
Regions within the major histocompatibility complex
Identify variation
How much variation under genetic control?
60 individuals
190 regions
Patterns of DNA methylation in the MHC
60 unrelated individuals, 190 regions, CD4+ cells
[Figure: methylation level per CpG (MCpG, 0.0 to 1.0) plotted against position (40 to 140) for region uus14]
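The MCpG axis runs from 0.0 to 1.0, which is consistent with a per-CpG fraction of methylated molecules. The sketch below shows one plausible way to derive such a value from bisulphite reads. It is an illustration, not the method used for the figure. It assumes the reads are already aligned to the reference with no insertions or deletions and are the same length as it; the function names and the four example reads are invented.

```python
def cpg_positions(reference):
    """0-based positions of the C in every CpG dinucleotide."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_fraction(reference, reads):
    """Fraction of reads that kept a C at each reference CpG.

    After bisulphite treatment a retained C means methylated and a T means
    unmethylated; any other base at that position is ignored.
    """
    fractions = {}
    for pos in cpg_positions(reference):
        informative = [read[pos] for read in reads if read[pos] in "CT"]
        if informative:
            fractions[pos] = informative.count("C") / len(informative)
    return fractions


reference = "AGCTGTCGATTAGCCG"
reads = [                       # hypothetical bisulphite-converted reads
    "AGTTGTCGATTAGTTG",
    "AGTTGTCGATTAGTTG",
    "AGTTGTCGATTAGTTG",
    "AGTTGTTGATTAGTTG",
]
print(methylation_fraction(reference, reads))   # {6: 0.75, 14: 0.0}
```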
Variation in MZ twins
UUS14, 4 MZ pairs
Variation in region types
Different distribution of DNA methylation?
CpG islands, conserved non-coding, 5’ genes, random
All differences are statistically significant
Sequencing: old and next
LTS: Fragment -> Clone/PCR -> Sequence
Molecules sequenced: 1, 48, 96...
...unless you have a lot of machines
HTS: Fragment -> Array -> Sequence
Massively parallel
Molecules sequenced: 4x10^5 - 1x10^9
...on one machine
Simple, Automated Workflow
1. Library Prep: 6 hours, 3 hours hands-on time
2. Cluster Generation: 5 hours, 30 min. hands-on time (1–8 samples)
3. Sequencing: 2–3 days (single-read), 4–6 days (paired-end), 30 min. hands-on time (1–8 samples)
Solexa (and Helicos)
Fragment DNA
Repair ends/Add A overhang
Ligate adapters
Select ligated DNA
Attach DNA to flow cell
Perform bridge amplification
Generate clusters
Anneal sequencing primer
Extend first base, read, and deblock
Repeat step above to extend strand
Generate base calls
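The "extend, read, deblock, repeat" loop can be pictured with a toy model. The sketch below is only a conceptual illustration, not a description of the instrument's chemistry or software. It assumes a single error-free template, written in the order the polymerase meets it, and one base incorporated per cycle; the function name and the template are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template, cycles):
    """Return the bases called over `cycles` rounds of single-base extension."""
    calls = []
    for position in range(min(cycles, len(template))):
        # Extend: one reversibly blocked, labelled base pairs with the template.
        incorporated = COMPLEMENT[template[position]]
        # Read: the label identifies which base was incorporated.
        calls.append(incorporated)
        # Deblock: removing the block allows the next cycle to add one base;
        # in this toy model that is simply the loop moving on.
    return "".join(calls)


print(sequence_by_synthesis("TCGACAGCTAATCGGC", cycles=8))   # AGCTGTCG
```

Because only one base is added per cycle, the read length equals the number of cycles run, which is why kits are sold by cycle count.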
Company                Illumina            Roche                 ABI (Applied Biosystems)    Helicos Biosciences
Web                    www.illumina.com    www.454.com           www.appliedbiosystems.com   www.helicosbio.com
System                 Solexa              454                   SOLiD                       Helicos
Technology             GenomeAnalyzer 1G   GenomeSequencer FLX   SOLiD Analyzer              HeliScope
Sequencing method      Synthesis           Pyrosequencing        Ligation                    Single molecule, synthesis
Sequence reactions     4E+07               4E+05                 6E+08                       1E+09
Read length, bp        50                  200                   35                          25
Sequence per run, Mb   20000               100                   10000                       50000
System cost            3,321,725           4,999,000             4,501,661                   9,375,000
Cost per run           24,577              62,459                53,141                      119,250
Cost per Mb            1.23                625                   5                           2.39
Sequencing accuracy    0.9994 0.9900

Application features   Illumina   Roche   ABI   Helicos
Read length            3          5       2     1
Sample prep            3          1       1     5
Sample throughput      3          1       3     5
Sequencing accuracy    3          4       5     1
DNA methylation        4          0       3     1
Cost per run           5          3       3     1
Cost per Mb            5          1       3     4
System cost            5          3       4     1
Support cost           5          4       2     1
3 year cost            5          3       3     1
Total system rating for UUS projects
                       41         25      29    21
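Two rows of the comparison can be re-derived from the others, which is a useful check on the table. The sketch below is an illustration and not part of the slides. It assumes that cost per Mb is cost per run divided by sequence per run, and that the total rating is the plain sum of the ten feature scores; the variable names are hypothetical.

```python
platforms = ["Illumina", "Roche", "ABI", "Helicos"]
cost_per_run = [24577, 62459, 53141, 119250]
mb_per_run = [20000, 100, 10000, 50000]

# Cost per Mb = cost per run / sequence per run (Mb).
cost_per_mb = {
    name: cost / mb for name, cost, mb in zip(platforms, cost_per_run, mb_per_run)
}

# Feature scores in column order: Illumina, Roche, ABI, Helicos.
ratings = {
    "Read length":         [3, 5, 2, 1],
    "Sample prep":         [3, 1, 1, 5],
    "Sample throughput":   [3, 1, 3, 5],
    "Sequencing accuracy": [3, 4, 5, 1],
    "DNA methylation":     [4, 0, 3, 1],
    "Cost per run":        [5, 3, 3, 1],
    "Cost per Mb":         [5, 1, 3, 4],
    "System cost":         [5, 3, 4, 1],
    "Support cost":        [5, 4, 2, 1],
    "3 year cost":         [5, 3, 3, 1],
}
totals = [sum(column) for column in zip(*ratings.values())]

for name, total in zip(platforms, totals):
    print(f"{name}: {cost_per_mb[name]:.2f} per Mb, total rating {total}")
```

The sums come out at 41, 25, 29 and 21, matching the totals on the slide, and the per-Mb figures agree with the table to the precision shown there.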
SOLEXA
Genome Analyzer II             2,326,500
Cluster Station                  290,813
Paired-end module                290,813
Shipping/insurance                25,850
iPAR analyzer server             387,750
Total                          3,321,725
DNA Sample Kit (40)               12,350
DNA Sample Oligo Kit (100)         6,625
Cluster Generation Kits (10)      21,875
36 Cycle Sequencing Kit            1,500
Other                                325
PhiX control (10)                    231
Per sample                        24,577
1 year service contract          281,119
150 samples                    7,570,576
3000 Gb                        3,883,963
System                         3,321,725
Support (+2 years)               562,238
Reagents (setup/QC)              983,097
Total                          4,867,060
High-throughput sequencing: system comparison
COST BREAKDOWN
System
Consumables
Support
Total 3 year costs
Purchase cost
454
GenomeSequencer FLX              3,615,500
Data analysis cluster station      310,000
Installation                        33,500
Training                           105,000
Installation/training reagents     185,000
Additional equipment               750,000
Total                            4,999,000
Library preparation kit (10)        18,090
emPCR kit I (16)                     8,000
Sequencing kit (1)                  48,300
Other (1)                           11,850
Per sample                          62,459
1 year service contract            350,000
150 samples                     15,067,850
3000 Gb                          5,699,000
SOLiD
SOLiD System                     4,285,803
UPS                                 50,678
Training                            70,000
PCR system                          95,181
Total                            4,501,661
Library oligos kit (10)
ePCR kit (10)
Bead deposition kit (10)
Bead enrichment kit (10)
Slide kit (24)
Buffer kit (10)
Library Sequencing Kit (8)
Sequencing Probes Kit (8)
Per sample                          53,141
1 year service contract            409,180
150 samples                     13,291,209
3000 Gb                          5,320,021
HELICOS
HeliScope                        9,375,000
Total                            9,375,000
tSMS Sequencing kit                119,250
Per sample                         119,250
1 year service contract          1,007,500
150 samples                     29,277,500
3000 Gb                         11,390,000
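The "150 samples" figures in the four breakdowns appear to follow one pattern: system total, plus two further years of service contract, plus 150 times the per-sample cost. That relation is inferred from the numbers and is not stated on the slides, so the sketch below should be read as a consistency check under that assumption. The function name is hypothetical.

```python
def three_year_cost(system, service_per_year, per_sample,
                    samples=150, extra_service_years=2):
    """System purchase + extra service years + consumables for all samples."""
    return system + extra_service_years * service_per_year + samples * per_sample


# (system total, 1 year service contract, per sample, quoted "150 samples" figure)
figures = {
    "Solexa":  (3321725, 281119, 24577, 7570576),
    "454":     (4999000, 350000, 62459, 15067850),
    "SOLiD":   (4501661, 409180, 53141, 13291209),
    "Helicos": (9375000, 1007500, 119250, 29277500),
}

for name, (system, service, per_sample, quoted) in figures.items():
    estimate = three_year_cost(system, service, per_sample)
    print(f"{name}: estimate {estimate:,}  quoted {quoted:,}  diff {estimate - quoted}")
```

Under this reading the 454 and Helicos figures are reproduced exactly, and the Solexa and SOLiD figures to within a few tens of units, which would be consistent with the per-sample costs having been rounded for display.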
Illumina - 20 Gb single run (~6x human genome)
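The "~6x" figure is simply bases sequenced divided by genome size. A minimal sketch follows, assuming a haploid human genome of roughly 3.1 Gb (a value not given on the slide) and ignoring unmappable or duplicate reads; the names are hypothetical.

```python
def fold_coverage(bases_sequenced, genome_size):
    """Average number of times each genome position is sequenced."""
    return bases_sequenced / genome_size


HUMAN_GENOME_BP = 3.1e9      # assumed approximate haploid genome size
RUN_OUTPUT_BP = 20e9         # 20 Gb per run, from the slide

coverage = fold_coverage(RUN_OUTPUT_BP, HUMAN_GENOME_BP)
print(f"{coverage:.1f}x")    # 6.5x
```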
Applications
Application          Project
Resequencing         whole genome; linkage/association; mutation detection
de novo sequencing   metagenomics; new species
Expression           transcriptome; SAGE; miRNA
Epigenetics          DNA methylation; ChIP
Variation            SNPs; CNVs
Important issues...
EU tender process complete
Data storage (> 1 Tb per run)
Bioinformatics
Core facility
Link to 454 at CEES/UiO?
Illumina GA II
People
Robert Lyle Medical Genetics, UUS Principal investigator
Dag Undlien Medical Genetics, UUS/UiO Principal investigator
Jennifer Harris Epidemiology, NIPH/NIH Investigator
Gregor Gilfillan Medical Genetics, UUS Post-doc
Kristina Gervin Medical Genetics, UUS PhD student
Heidi Nygård Medical Genetics, UUS Nurse/field worker
Ingunn Brandt Epidemiology, NIPH Twins DB
Hanne Akselsen Medical Genetics, UUS Technician
Martin Hamerø Medical Genetics, UUS Technician
Rune Moe Medical Genetics, UUS Technician
Anne Olaug Olsen Dermatology, UUS Clinician psoriasis
Monica Cheng Munthe-Kaas Pediatrics/Medical Genetics, UUS Clinician asthma
Torbjørn Rognes Institute of Bioinformatics, UiO Bioinformatics
Sigve Nakken CMBN, UiO PhD student bioinformatics
Hans-Christian Åsheim Medical Genetics, UUS Immunology
Thore Egeland Medical Genetics, UUS Statistics