Genomics in general, main methods, their comparison 2...
1. Genomics in general, main methods, their comparison
2. Application areas of second generation sequencing 1.
3. Application areas of second generation sequencing 2.
4. Mutagenesis techniques / Third generation sequencing
Genomics is the study of the genomes of organisms.
Milestones:
- Full sequence of the φX174 bacteriophage (5386 bp), 1977, Frederick Sanger
- The first free-living organism to be sequenced was Haemophilus influenzae (1.8 Mb), in 1995, Hamilton Smith
- Shotgun technique, 1998, Celera Genomics
- Fruit fly (Drosophila melanogaster) in 2000
- Human in 2001 (3.3 Gb)
- Human in good quality in 2007 (less than one error in 10,000 bases and all chromosomes assembled)
- Today: more than 1000 prokaryotic genomes, more than 2500 viruses and around 100 eukaryotic genomes (more than half are fungi) are fully sequenced
Genomics
Functional genomics
Personal genomics
Metagenomics
Nutrigenomics
Psychogenomics
Nitrogenomics Hydrogenomics etc.
Pharmacogenomics
Functional genomics
Related to transcriptomics and proteomics
Only 1.5 % of the human genome encodes proteins (ca. 20 000 genes)
Personal genomics
Pharmacogenomics
Nutrigenomics
Psychogenomics (Behavioural genetics)
The role of genes in behaviour.
Metagenomics
Nitrogenomics
Techniques applied in Genomics
General features:
- High-throughput
- Generate huge amounts of data
- Data evaluation is often the main challenge
1. Microarray techniques
2. Sequencing techniques
1. Microarray techniques
An arrayed series of thousands of spots of DNA oligonucleotides, called probes. Probes can be short sections of genes that are used to hybridize a cDNA sample (called the target) under high-stringency conditions. Probe-target hybridization is usually detected and quantified by detection of fluorophore- or chemiluminescence-labeled targets to determine the relative abundance of nucleic acid sequences in the target.
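The quantification step can be illustrated with a small sketch. It assumes a two-channel design in which a sample and a reference are measured on the same probes; the slide does not specify this, and the probe names, intensities and the `floor` parameter are made up for illustration. Relative abundance is expressed as a log2 ratio, a common but not the only convention.

```python
import math

def log2_ratios(sample, reference, floor=1.0):
    """Return log2(sample / reference) per probe.

    `floor` is a hypothetical lower bound that keeps very weak or zero
    intensities from producing a division by zero or log of zero.
    """
    ratios = {}
    for probe, sample_intensity in sample.items():
        reference_intensity = reference[probe]
        ratios[probe] = math.log2(
            max(sample_intensity, floor) / max(reference_intensity, floor)
        )
    return ratios

# Made-up fluorescence intensities for three hypothetical probes.
sample = {"geneA": 800.0, "geneB": 200.0, "geneC": 400.0}
reference = {"geneA": 200.0, "geneB": 800.0, "geneC": 400.0}

ratios = log2_ratios(sample, reference)
for probe, value in ratios.items():
    print(probe, value)
```

A positive value means the sequence is more abundant in the sample than in the reference, a negative value means it is less abundant, and zero means no difference.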
Microarray experiment
Microarray’s weaknesses
Ioannidis et al; Nat Genet 41 (2009)
- arrays require previously known sequences
- examination of similar sequences is almost impossible by arrays (cross-hybridization, specificity issues)
- array reproducibility shows huge fluctuations
2. Sequencing techniques
1. Traditional sequencing: non-high-throughput (not detailed here, see Gabor Rakhely’s lecture)
1.1. Maxam-Gilbert method
1.2. Chain termination method (Sanger method): key principle is the use of dideoxynucleotide triphosphates (ddNTPs) as DNA chain terminators
1.3. Shotgun sequencing (Sanger chemistry)
Chain termination method Shotgun seq
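The chain termination principle can be sketched in a few lines. This is an idealised model, not a description of any instrument: it assumes that each of the four ddNTP reactions yields exactly one fragment for every position where that base occurs in the newly synthesized strand, and that fragments are then ordered by length as on a gel. The function names and the example strand are hypothetical.

```python
def chain_termination_fragments(synthesized):
    """For each ddNTP reaction, list the fragment lengths ending in that base."""
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(synthesized, start=1):
        # A ddNTP of this base terminates the chain at this position.
        lanes[base].append(length)
    return lanes

def read_gel(lanes):
    """Read the sequence from the shortest to the longest fragment."""
    base_by_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_by_length[length] for length in sorted(base_by_length))

strand = "GATTACA"
lanes = chain_termination_fragments(strand)
recovered = read_gel(lanes)
print(lanes)
print(recovered)
```

Each fragment length appears in exactly one lane, which is why sorting by length recovers the synthesized sequence.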
2. High-throughput sequencing NGS (next generation sequencing)
2.1. Second generation sequencing (pyrosequencing, sequencing by ligation), this is the present
454 Life Sciences: pyrosequencing
Illumina: sequencing by synthesis
ABi SOLiD: sequencing by ligation
2.2. Third generation sequencing (nanopore sequencing, single-molecule sequencing), this is the future
Pacific Biosciences
Helicos
Complete Genomics
Etc.
Sanger sequencing
Need for library preparation in a host
• Labour- and time-intensive, expensive
• Toxic regions are not represented
• Host genome contaminations
Low throughput
• strand synthesis and base determination are separated
• need for an electrophoretic step
• high unit cost (cost/bp)
NGS (Next Generation Sequencing)
No need for library preparation in a host
• immobilized template fragments, PCR methods
• labour-, time- and cost-effective
High throughput
• several million sequencing reads per run
• synthesis and sequencing are not separated
No competition, but complementation
Sanger sequencing: long read, low coverage. Use in: de novo sequencing, validation
NGS: short read, huge coverage (especially SOLiD and Illumina). Use in: resequencing, SNP analysis, RNA-Seq
Main characteristics of sequencing generations
Illumina platform
454 FLX
October 2007: SOLiD V 1.0
June 2008: SOLiD V 2.0
August 2010: SOLiD V 4
ABi SOLiD platform
Comparison of NGS technologies
Illumina
454 FLX Technology
Adapter-carrying library DNA (adapters A and B)
Mix DNA library & capture beads (limited dilution)
Create “water-in-oil” emulsion (+ PCR reagents, + emulsion oil): micro-reactors
Perform emulsion PCR
“Break micro-reactors”
Isolate DNA-containing beads
• Generation of millions of clonally amplified sequencing templates on each bead
• No cloning and colony picking
454 FLX Technology
Load beads into PicoTiter™Plate (44 μm wells)
Centrifuge step
Load enzyme beads
454 FLX Technology
Sequencing by synthesis (pyrosequencing)
Reagent flow over the PicoTiterPlate wells
Photons generated are captured by the camera
Sequencing image created
Enzymes needed: DNA polymerase, ATP sulfurylase, luciferase, apyrase
Template: ssDNA
Addition of one of the four dNTPs in each step
454 FLX Technology: Basecalling
Flow order: TACG
Key: TCAG
Signal levels: 1-mer, 2-mer, 3-mer, 4-mer
• Count the photons generated for each “flow”
• Base call using signal thresholds
• Delivery of one nucleotide per flow ensures accurate base calling
Measures the presence or absence of each nucleotide at any given position
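The flow-based base calling described above can be sketched as follows. This is a simplified model under stated assumptions: the read is given as the synthesized strand, the ideal signal of a flow equals the number of identical bases incorporated in that flow, and thresholding is done by rounding to the nearest integer. The function names and the noisy example values are hypothetical; real base callers are more elaborate.

```python
def flowgram(read, flow_order="TACG"):
    """Ideal signal per flow: how many bases are incorporated in that flow."""
    if any(base not in flow_order for base in read):
        raise ValueError("read contains a base that is never flowed")
    signals = []
    position = 0
    flow = 0
    while position < len(read):
        nucleotide = flow_order[flow % len(flow_order)]
        run = 0
        # The polymerase extends through the whole homopolymer run in one flow.
        while position < len(read) and read[position] == nucleotide:
            run += 1
            position += 1
        signals.append((nucleotide, run))
        flow += 1
    return signals

def base_call(signals):
    """Threshold each signal to the nearest whole number of bases."""
    return "".join(nuc * max(0, round(signal)) for nuc, signal in signals)

read = "TCAGGTTA"  # starts with the key TCAG
ideal = flowgram(read)
called = base_call(ideal)

# Hypothetical noisy light intensities for four flows.
noisy = [("T", 1.08), ("A", 0.04), ("C", 0.93), ("G", 2.10)]
noisy_called = base_call(noisy)
print(ideal)
print(called, noisy_called)
```

Flows that give a signal near zero simply contribute no base, which is how presence or absence of each nucleotide is decided at a given position.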
Summary of 454 FLX
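• Read length: 350-450 bases
• Throughput: 400 Mb/slide/run (average bacterial genome size: 5 Mb)
• Homopolymer problem (light intensity is proportional to the number of identical bases incorporated in one flow, so the exact length of long homopolymer runs is hard to call)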
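A short calculation shows why proportional light intensity becomes a problem for long homopolymers. It assumes an ideal signal that is exactly linear in run length, which is a simplification; the function name is hypothetical.

```python
def relative_signal_step(n):
    """Relative signal increase when an n-mer becomes an (n+1)-mer.

    With a signal proportional to run length, (n + 1 - n) / n = 1 / n.
    """
    if n < 1:
        raise ValueError("run length must be at least 1")
    return 1.0 / n

steps = {n: relative_signal_step(n) for n in (1, 2, 4, 8)}
for n, step in steps.items():
    print(f"{n}-mer vs {n + 1}-mer: signal differs by {step:.1%}")
```

Going from one base to two doubles the signal, but an 8-mer and a 9-mer differ by only 12.5%, so the same measurement noise causes more miscalls in longer runs.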
Illumina Technology
Step 1-6
DNA fragmentation
Adaptor ligation
Template amplification
Illumina Technology
Step 7-12
Base determination (sequencing by synthesis, differently labeled nucleotides, laser excitation, fluorescence detection)
Base imaging
Multiple cycles
Illumina Technology
Summary:
• Read length: 35-50 bases
• Throughput: 8 Gb/flowcell/run (1600x coverage on a 5 Mb bacterial genome)
• High accuracy (no homopolymer issue)
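The coverage figure follows from dividing throughput by genome size. The sketch below reproduces it from the numbers on the slides (8 gigabases per run, 5 megabase genome) and applies the same arithmetic to the 454 FLX throughput of 400 megabases. The 454 result is derived here and is not stated in the slides. It assumes all sequenced bases map evenly to the genome, which real runs do not achieve.

```python
MEGABASE = 1_000_000

def mean_coverage(throughput_bases, genome_size_bases):
    """Average number of times each genome position is sequenced."""
    return throughput_bases / genome_size_bases

genome = 5 * MEGABASE  # average bacterial genome size from the slides

illumina_coverage = mean_coverage(8_000 * MEGABASE, genome)
flx_coverage = mean_coverage(400 * MEGABASE, genome)

print(f"Illumina: {illumina_coverage:.0f}x")
print(f"454 FLX: {flx_coverage:.0f}x")
```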
Flow cell
SOLiD™ Chemistry