Harvard MIT DOE GtL Center
Collaborating PIs: Chisholm, Polz, Church, Kolter, Ausubel, Lory
arep.med.harvard.edu
C.Ting
7-Feb-2005 4:10-4:40 PM
2-20 μm
0.6 μm
Molecular Systems Biology
Access is free of charge
Transcriptomics, Proteomics, Metabolomics, Functional genomics, Structural genomics,
Computational biology, Theoretical biology, Mathematical biology, Synthetic biology
www.nature.com/msb/
Harvard MIT DOE Center Projects
Prochlorococcus: Photosynthesis, circadian & cell cycles
Escherichia: Synthetic genomes/proteomes
Vibrio: 4X faster replication than E. coli
Caulobacter: Asymmetric cell & chromosome structure
Pseudomonas: Biofilms

Poster#  Topic                                       Goal#
2.       Leptos, et al. Proteomics                   1
121.     Nguyen, et al. Mass spectrometry XML        1
122.     Nguyen, et al. Gene Regulation              2
67.      Thompson, et al. Vibrio diversity           3
68.      Martiny, et al. Prochlorococcus diversity   3
77.      Sullivan, et al. Cyanophage diversity       1,3
3.       Zhang, et al. Single cell sequencing        1-4
1.       Church, et al. Metabolic fluxes             4
arep.med.harvard.edu
Prochlorococcus 40ºN - 40ºS
Ocean chl a (Aug 1997 –Sept 2000)Provided by the SeaWiFS Project, NASA
Humans consume 2 kW per person = 10^10 kW.
Sunlight hits the earth at 40,000 times that rate (70% ocean).
CO2 370 ppm = 730 x 10^15 g globally, increase ~3 x 10^15 g/yr.
Ocean productivity = ~100 x 10^15 g CO2/yr … due to
Autotrophs: 10^26 Prochlorococcus cells globally (10^8 per liter)
Sequestration v. respiration v. use: heterotrophs (Pelagibacter), phages, predators (Maxillopoda, Malacostraca, herring)
Energy & CO2 Sequestration
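The energy and CO2 figures on this slide can be cross-checked with a few lines of arithmetic. This is a minimal sketch, not part of the talk: it assumes the flattened numbers read as powers of ten (10^10 kW, 10^15 g), and the variable names are illustrative.

```python
# Figures read off the slide. The exponents are an assumed reading of the
# flattened text ("1010 kW" as 10^10 kW, "x1015 g" as x 10^15 g).
HUMAN_POWER_KW = 1e10           # total human energy consumption
PER_PERSON_KW = 2.0             # consumption per person
SUNLIGHT_MULTIPLE = 40_000      # sunlight relative to human consumption
OCEAN_FRACTION = 0.70           # share of sunlight that falls on ocean
ATMOSPHERIC_CO2_G = 730e15      # global atmospheric total quoted on the slide
ANNUAL_INCREASE_G = 3e15        # yearly increase
OCEAN_PRODUCTIVITY_G = 100e15   # g CO2 per year

# Population implied by the two consumption figures.
implied_population = HUMAN_POWER_KW / PER_PERSON_KW

# Sunlight reaching the earth, and the part reaching the ocean.
sunlight_kw = HUMAN_POWER_KW * SUNLIGHT_MULTIPLE
ocean_sunlight_kw = sunlight_kw * OCEAN_FRACTION

# Yearly increase as a fraction of ocean productivity, and the number of
# years of ocean productivity that equals the atmospheric total.
increase_vs_productivity = ANNUAL_INCREASE_G / OCEAN_PRODUCTIVITY_G
years_of_productivity = ATMOSPHERIC_CO2_G / OCEAN_PRODUCTIVITY_G

print(f"implied population:        {implied_population:.1e}")
print(f"sunlight (kW):             {sunlight_kw:.1e}")
print(f"sunlight on ocean (kW):    {ocean_sunlight_kw:.1e}")
print(f"increase / productivity:   {increase_vs_productivity:.2f}")
print(f"atmosphere / productivity: {years_of_productivity:.1f} yr")
```

On these numbers the yearly CO2 increase is a few percent of ocean productivity, which is the comparison the slide sets up.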
http://www.gsfc.nasa.gov/gsfc/service/gallery/fact_sheets/earthsci/terra/earths_energy_balance.htm
http://clear.eawag.ch/models/optionenE.html
http://en.wikipedia.org/wiki/Copepod
Morris et al. Nature 2002 Dec 19-26;420(6917):806-10.
http://hosting.uaa.alaska.edu/mhines/biol468/pages/carbon.html
http://www.aeiveos.com/~bradbury/Papers/PhotosyntheticEfficiency.html
[Figure scale bars: 0.1 mm, 0.1 mm, 6 cm]
Diel (circadian) cycle
Light output for sun-box: 14 hr light – 10 hr dark, 230 μE at peak
Zinser, Lindell, Chisholm, Leptos, Jaffe, Lin, et al.
Diel Expression: All genes
[Figure: normalized expression of all genes vs. time (hours) across Light and Dark periods; legend scale 0.0 to 5.0, labeled "Trust"]
Zinser et al. unpubl.
Glycogen metabolism
Pathway: α-Glc-1P → ADP-Glc → α-1,4-glucosyl-glucan → glycogen (linked to Central Carbon Metabol.)
Genes: glgC, glgA, glgB, glgX, glgP
[Figure: normalized expression (log scale, 0.1 to 10) vs. time (hours, 0 to 48) for glgA, glgB, glgC, glgX, glgP]
Zinser et al. unpubl.
Light regulated Prochlorococcus metabolism
Oxygenic Photosynthesis
psbA = D1
D2
HLIP = High Light Induced Protein
Pc = Plastocyanin
Fd = Ferredoxin
[Figure: electron flow in oxygenic photosynthesis, H2O → O2 at PSII, e- through PSI, producing NADPH]
Core reaction center proteins
Photosynthetic Genes in Phage
Phage genomes shown: Podovirus P-SSP7 (46 kb), Myovirus P-SSM4 (181 kb), Myovirus P-SSM2 (255 kb)
Gene arrangements shown (~500 bp scale bar):
PC, HLIPs, Fd, D1 (spacings 12 kb, 24 kb)
HLIPs, D1, D2 (spacings 6.4 kb, 2.8 kb)
HLIP, D1
Lindell, Sullivan, Chisholm et al. 2004
RNA Responses to Phage
MED4-0682 (60 aa Conserved URF)
Phage SSP7 psbA
MED4 host psbA
Lindell, Sullivan, Zinser, Chisholm
Synthetic - homologous recombination testing of DNA motifs
RNA Ratio (motif- to wild type) for each flanking gene:
1.3 2.4 (1.3 in argR)
1.1 1.3
0.7 2.5
0.2 1.4
1.4 3.5
Bulyk, McGuire, Masuda, Church Genome Res. 14:201–208
Synthetic Genomes & Proteomes. Why?
• Test or engineer cis-DNA/RNA-elements
• Access to any protein (complex) including post-transcriptional modifications
• Affinity agents for the above.
• Protein design, vaccines, solubility screens
• Utility of molecular biology DNA -- RNA -- Protein in vitro "kits" (e.g. PCR -- T7 -- Roche)
Toward these goals design a chassis:
• 115 kbp genome. 150 genes.
• Nearly all 3D structures known.
• Comprehensive functional data.
(PURE) translation utility
Removing tRNA-synthetases, translational release-factors, RNases & proteases
Allows:
Selection of scFvs [antibodies] specific for HBV DNA polymerase using ribosome display. Lee et al. 2004 J Immunol Methods. 284:147
Programming peptidomimetic syntheses by translating genetic codes designed de novo. Forster et al. 2003 PNAS 100:6353
High level cell-free expression & specific labeling of integral membrane proteins. Klammt et al. 2004 Eur J Biochem 271:568
Cell-free translation reconstituted with purified components. Shimizu et al. 2001 Nat Biotechnol. 19:751-5.
Also: membrane incompatible expression & diverse amino-acids (>21)
in vitro genetic codes
[Figure: mRNA (5' to 3') translated in vitro under a redesigned genetic code; codons are assigned to fM, T, N, V, E and to the unnatural amino acids mS, yU, eU; a codon table panel is indexed by second base]
[Figure: time course, 3H-E dpm (0 to 3500) vs. time (30 to 80 min.); labels fM, yU, mS, eU, E]
Forster, et al. (2003) PNAS 100:6353
Zhang et al. (2004) Science. 303:371
80% average yield per unnatural coupling.
eU = 2-amino-4-pentenoic acid
yU = 2-amino-4-pentynoic acid
mS = O-methylserine
gS = O-GlcNAc–serine
bK = biotinyl-lysine
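To see what an 80% average yield per unnatural coupling means for a whole peptide, the per-step yields can be multiplied. This is a rough sketch: it assumes each coupling succeeds independently with the same yield, which the slide does not state, and `full_length_yield` is a name made up for illustration.

```python
def full_length_yield(per_coupling_yield: float, n_couplings: int) -> float:
    """Fraction of product carrying all n unnatural couplings.

    Assumption (not from the slide): couplings are independent and each
    proceeds at the same average yield, so the yields multiply.
    """
    return per_coupling_yield ** n_couplings


# 0.80 is the average yield quoted on the slide.
for n in range(1, 6):
    print(f"{n} unnatural couplings: {full_length_yield(0.80, n):.3f}")
```

Under that assumption, a peptide with three unnatural residues in a row (such as yU, mS, eU) would come out at roughly half the yield of a single coupling.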
Escherichia coli | Mycoplasma | 3D structure
Coliphage φ29 DNA polymerase | + | +
Coliphage P1 Cre recombinase | - | +
>Coliphage Lox/Cre recombinase site | - | +
Coliphage T7 RNA polymerase | + | +
>Coliphage T7 RNA polymerase initiation site | + | +
>Coliphage T7 RNA polymerase termination site | + | +
RNase P RNA | + | -
RNase P protein | + | +
>RNase P site/RNA primer for DNA polymerase | + | +
Small subunit 16S ribosomal RNA | + | +
All 21 small subunit ribosomal proteins (1-21) | + except 1, 21 | +
Large subunit 5S ribosomal RNA | + | +
Large subunit 23S ribosomal RNA | + | +
Large subunit 23S rRNA G2445>m2G methylase: unknown | ? | -
Large subunit 23S rRNA U2449>dihydroU synthetase: unknown | ? | -
Large subunit 23S rRNA U2457>pseudoU synthetase | ? | -
Large subunit 23S rRNA C2498>Cm methylase: unknown | ? | -
Large subunit 23S rRNA A2503>m2A methylase: unknown | ? | -
Large subunit 23S rRNA U2504>pseudoU synthetase | ? | -
All 33 large subunit ribosomal proteins (1-7, 9-11, 13-25, 27-36) | + except 25, 30 | +
Translational initiation factor 1 | + | +
Translational initiation factor 2 | + | +
Translational initiation factor 3 | + | +
Translational elongation factor Tu | + | +
Translational elongation factor Ts | + | +
Translational elongation factor G | + | +
Translational release factor 1 | + | +
Translational release factor 2 | - | +
Translational release factor Gln methylase | + | +
Translational release factor 3 | - | +
Ribosome recycling factor | + | +
33/45 Transfer RNAs (see Fig. 2) | 29/33 | +
tRNA(I) C34>lysidine synthetase | ? | +
tRNA(R) A34>I deaminase | ? | +
tRNA(ASV) U34>cmo5U (=V) synthetase: unknown | - | -
tRNA(R) U34>2sU Cys desulfurase | - | +
tRNA(R) nm5U34 methylase | ? | +
tRNA(R) U34>cmnm5U GTPase | ? | +
tRNA(R) U34>cmnm5U synthetase | ? | +
tRNA(R) cmnm5U34>nm5U,mnm5U synthetase | ? | -
tRNA(R) G37 N1-methylase | + | +
tRNA(RNIKM) A37>t6A N6-threonylcarbamoyl-A synthetase: unknown | + | -
tRNA(CLFSWY) A37>i6A synthetase | - | +
tRNA(CLFSWY) i6A37>s2i6A(ms2i6A) synthetase | - | +
All 22 aminoacyl-tRNA synthetase subunits (20 enzymes) | + except G subunit, Q | + except G subunit
Met-tRNA formyltransferase | + | +
Chaperonin DnaK | + | +
Chaperonin GroEL | + | +
Chaperonin GroES | + | +
Total genes = 150
Forster & Church
Oligos for 150 & 776 synthetic genes (for E. coli minigenome & M. mobile whole genome respectively)
Up to 760K Oligos/Chip
18 Mbp for $700 raw (6-18K genes)
<1K: Oxamer, Electrolytic acid/base
8K: Atactic/Xeotron/Invitrogen, Photo-Generated Acid. Sheng, Zhou, Gulari, Gao (U. Houston)
24K: Agilent, Ink-jet standard reagents
48K: Febit
100K: Metrigen
380K: Nimblegen, Photolabile 5' protection. Nuwaysir, Smith, Albert
Tian, Gong, Church
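The chip pricing above can be turned into a per-base cost and an implied gene size. A back-of-envelope sketch using only the slide's figures (18 Mbp for $700, 6-18K genes); the variable names are illustrative.

```python
CHIP_BP = 18e6          # raw bases per chip, from the slide
CHIP_COST_USD = 700.0   # raw cost per chip, from the slide
GENES_LOW, GENES_HIGH = 6_000, 18_000  # "6-18K genes"

# Raw synthesis cost per base and bases bought per dollar.
cost_per_bp = CHIP_COST_USD / CHIP_BP
bp_per_dollar = CHIP_BP / CHIP_COST_USD

# Gene length implied at each end of the quoted gene range.
bp_per_gene_max = CHIP_BP / GENES_LOW    # fewer genes, so longer genes
bp_per_gene_min = CHIP_BP / GENES_HIGH   # more genes, so shorter genes

print(f"cost per bp:   ${cost_per_bp:.2e}")
print(f"bp per dollar: {bp_per_dollar:,.0f}")
print(f"implied gene size: {bp_per_gene_min:,.0f} to {bp_per_gene_max:,.0f} bp")
```

The quoted range of 6-18K genes corresponds to genes of about 1 to 3 kb at the raw chip capacity.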
Improve DNA Synthesis Cost
Synthesis on chips in pools is 5000X less expensive per oligonucleotide, but amounts are low (1e6 molecules rather than the usual 1e12) & bimolecular kinetics slow with the square of the concentration decrease!
Solution: Amplify the oligos then release them.
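The kinetics point can be made concrete. A bimolecular reaction runs at a rate proportional to the product of the two reactant concentrations, so diluting both by the same factor slows it by the square of that factor. A minimal sketch, assuming the same reaction volume in both cases (the slide does not give volumes); the function name is illustrative.

```python
def relative_bimolecular_rate(chip_molecules: float, usual_molecules: float) -> float:
    """Rate of a bimolecular step relative to the usual amount of oligo.

    Assumes both reactants are diluted by the same factor in the same
    volume, so rate ~ [A][B] scales with the square of the dilution.
    """
    dilution = chip_molecules / usual_molecules
    return dilution ** 2


# 1e6 molecules from a chip vs. the usual 1e12 (figures from the slide).
print(relative_bimolecular_rate(1e6, 1e12))
```

A million-fold drop in amount gives roughly a 1e12-fold slower bimolecular step, which is why amplification comes before assembly.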
10 + 50 + 10 => ss-70-mer (chip)
20-mer PCR primers with restriction sites at the 50mer junctions => ds-90-mer
Release at the restriction sites => ds-50-mer
Tian, Gong, Sheng, Zhou, Gulari, Gao, Church Nature 2004
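The lengths in this scheme add up as follows. This is a sketch under one stated assumption: each 20-mer primer anneals to a 10-nt flank and carries a 10-nt 5' overhang, which would account for the 90-mer. The slide gives only the lengths, so the overhang reading and all names below are illustrative.

```python
FLANK = 10     # flank on each side of the chip oligo (slide: 10 50 10)
PAYLOAD = 50   # central 50-mer that is wanted in the end
PRIMER = 20    # PCR primer length (slide: 20-mer)


def chip_oligo_length() -> int:
    """Single-stranded oligo as synthesized on the chip."""
    return FLANK + PAYLOAD + FLANK


def pcr_product_length() -> int:
    """Double-stranded PCR product.

    Assumption: 10 nt of each primer anneals to the flank and the other
    10 nt is a 5' overhang that extends the product on that side.
    """
    overhang = PRIMER - FLANK
    return chip_oligo_length() + 2 * overhang


def released_length() -> int:
    """Product after cutting at the 50-mer junctions (both primer regions removed)."""
    return pcr_product_length() - 2 * PRIMER


print(chip_oligo_length(), pcr_product_length(), released_length())
```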
Improve DNA Synthesis Accuracy via mismatch selection
Tian & Church
Other mismatch methods: MutS (&H,L)
Computer Aided Design Polymerase Assembly Multiplexing (CAD-PAM)
Moving forward:
1. Tandem, inverted and dispersed repeats (hierarchical assembly, size-selection and/or scaffolding)
2. Reduce mutations (goal <1e-6 errors) to reduce # of intermediates
3. 15kb to 5Mb by homologous recombination (Nick Reppas)
4. Phage integrase site-specific recombination, also for counters.
Stemmer et al. 1995. Gene 164:49-53; Mullis 1986 CSHSQB.
Assembly product lengths (bp): 50, 75, 125, 225, 425, 825 … 100*2^(n-1)
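The listed lengths are what one gets if each assembly round joins two fragments of equal length through a shared 25-bp overlap. The 25-bp overlap is inferred here from the numbers themselves and is not stated on the slide; the function name is illustrative.

```python
def assembly_lengths(start: int = 50, overlap: int = 25, rounds: int = 5) -> list[int]:
    """Fragment length after each round of pairwise overlap assembly.

    Each round joins two fragments of the current length that share
    `overlap` bases, so the new length is 2 * length - overlap.
    The overlap of 25 is an inference from the slide's series.
    """
    lengths = [start]
    for _ in range(rounds):
        lengths.append(2 * lengths[-1] - overlap)
    return lengths


print(assembly_lengths())
```

In closed form the series is 25 + 25*2^n, which the slide rounds to roughly 100*2^(n-1) for the later rounds.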
All 30S-Ribosomal-protein DNAs (codon re-optimized)
[Figure: gel of assembled products; size markers 1.7 kb and 0.3 kb; s19 at 0.3 kb; lanes labeled Nimblegen 95K chip and Atactic <4K chip]
Tian, Gong, Sheng, Zhou, Gulari, Gao, Church
Improving synthesis accuracy
Method                     Bp/error   Ref.
Chip assembly (PAM)        160        1
Hybridization-selection    1,400      1
MutS-gel-shift             10,000     2
MutHLS cleavage            30,000     3 (10X better than PCR)
1. Tian, Church, et al. 2004 Nature 432:1050
2. Carr, Jacobson, et al. 2004 NAR 32:e162
3. Smith & Modrich 1997 PNAS 94:6847
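One way to read the Bp/error column is as the chance that a construct of a given length is entirely error-free. A minimal sketch, assuming errors are independent and uniformly distributed along the sequence (not stated on the slide); the 1,000-bp construct length and the function name are hypothetical choices for illustration.

```python
# Bp/error values from the table above.
BP_PER_ERROR = {
    "Chip assembly (PAM)": 160,
    "Hybridization-selection": 1_400,
    "MutS-gel-shift": 10_000,
    "MutHLS cleavage": 30_000,
}


def error_free_fraction(bp_per_error: float, length_bp: int) -> float:
    """Probability that a construct of `length_bp` has no errors.

    Assumes an independent per-base error probability of 1 / bp_per_error.
    """
    return (1.0 - 1.0 / bp_per_error) ** length_bp


LENGTH_BP = 1_000  # hypothetical construct length, not from the slide
for method, bp_per_error in BP_PER_ERROR.items():
    fraction = error_free_fraction(bp_per_error, LENGTH_BP)
    print(f"{method:<26}{fraction:.3f}")
```

Under this model, raw chip assembly leaves almost no perfect 1-kb constructs, while the mismatch-selection methods leave most of them perfect, which is the practical meaning of the table.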
Extreme mRNA makeover for protein expression in vitro
RS-2, 4, 5, 6, 9, 10, 12, 13, 15, 16, 17, and 21 detectable initially.
RS-1, 3, 7, 8, 11, 14, 18, 19, 20 initially weak or undetectable.
Solution: Iteratively resynthesize all mRNAs with less mRNA structure.
Tian & Church
[Figure: Western blot based on His-tags; lanes 20w, 20m, 17w, 17m, 16w, 16m; 10 kd marker]
W: wild-type
M: modified
Safe Synthetic Biology
Church, G.M. (2004) A synthetic biohazard non-proliferation proposal.
http://arep.med.harvard.edu/SBP/Church_Biohazard04c.doc
1. Monitor oligo synthesis via expansion of Controlled substances, Select Agents, &/or Recombinant DNA
2. Computational tools are available; very small number of reagent, instrument & synthetic gene suppliers at present.
3. System modeling checks for synthetic biology projects
4. Multi-auxotroph, novel genetic code for the host genome, prevents functional transfer of DNA to other cells.
Photosynthetic bacterial genomes (for population genetics & proteomics)
Genomes: MED4, NATL2A, SS120, MIT9313, WH8102
[Figure: phylogenetic tree (scale bar 0.01; node values 89, 97, 92, 72, 71, 100, 78, 66, 84) with groups labeled Marine Synechococcus, high light adapted Prochlorococcus, and low light adapted Prochlorococcus, plus Synechococcus PCC6307. Strains shown: MIT9201, GP2, MIT9107, SAR6, TATL1a, ENATL1, ENATL3, MIT9302, MIT9312, MIT9202, MIT9215, TATL1b, Pac 1, ENATL7, ENATL4, MIT9211, MIT9303, SAR139, WH8112, SAR100, WH8101, WH8012, WH7805, SAR7]
Environmental population genomics (of a ribotype cluster)
• Monthly samples
• Isolate Vibrios
• Identify population as cluster of barcode genes
• Quantification: population is continuously present
• Genomes: almost each genome different in typical sample
• Additional marker gene: highly diverse
[Figure: Hsp60 allelic diversity; number of sequences (0 to 160) vs. % nucleotide similarity (95 to 100)]
Thompson, Polz, et al. (2005) Science
Sequencing single cells
Biome studies focus on single cells because many organisms are hard to grow in the lab, carry multiple DNAs & RNAs per cell, and exchange genome subsets.
(Complementary to Biome shotgun and/or 100 kb BACs)
Many input molecules required to sequence one molecule vs. one molecule sufficient to sequence via many copies of it.
Amplifying DNA from single cells
φ29 real-time amplification
No template control
Affymetrix quantitation of independent amplifications
Prochlorococcus & Escherichia
Zhang, Martiny, Chisholm, Church, unpub.
Polony Bead Sequencing Pipeline
• In vitro libraries via paired tag manipulation
• Bead polonies via emulsion PCR [Dre03]
• Monolayered immobilization in acrylamide
• Enrichment of amplified beads
• SOFTWARE: Images → Tag Sequences; Tag Sequences → Genome
• FISSEQ or "wobble" sequencing
• Epifluorescence Scope with Integrated Flow Cell
Mitra, Shendure, Porreca, Rosenbaum, Church unpub.
Read length needs for population surveys
Paired tags are separated by 1000 +/- 100 bases
[Figure: % of paired K-mers with uniquely assignable location (0% to 100%) vs. length of K-mer reads (8 to 20 bp); curves for Bacterial and Metazoan genomes]
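The shape of such curves can be approximated with a simple random-sequence model. This is a sketch only, and not the calculation behind the plot: it treats the genome as uniformly random sequence (real genomes have repeats), and the genome sizes below are hypothetical round numbers. A spurious placement needs the first tag to match somewhere else (on either strand) and the second tag to match within the 1000 +/- 100 window there.

```python
import math


def unique_pair_probability(genome_bp: float, k: int, window_bp: int = 201) -> float:
    """Approximate chance that a pair of k-mer tags has one genomic location.

    Random-sequence model (an assumption, ignores repeats):
      expected spurious placements
        = 2 strands * genome_bp positions * 4^-k (first tag matches)
          * window_bp * 4^-k (second tag matches within the window)
    and the probability of zero spurious placements is exp(-expected).
    window_bp = 201 corresponds to a spacing of 1000 +/- 100 bases.
    """
    expected_spurious = 2.0 * genome_bp * window_bp / 4.0 ** (2 * k)
    return math.exp(-expected_spurious)


# Hypothetical genome sizes for illustration, not taken from the slide.
GENOMES = {"bacterial (5e6 bp)": 5e6, "metazoan (3e9 bp)": 3e9}

for name, size in GENOMES.items():
    row = ", ".join(f"k={k}: {unique_pair_probability(size, k):.2f}"
                    for k in range(8, 21, 2))
    print(f"{name}: {row}")
```

In this model a larger genome needs a couple more bases per tag to reach the same uniqueness, which is the qualitative gap between the bacterial and metazoan curves.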
Polony Fluorescent In Situ Sequencing Libraries
Greg Porreca, Abraham Rosenbaum
[Figure: 1 to 100 kb genomic fragments converted to library molecules with segments labeled L, M, R; PCR bead, sequencing primers, selector bead]
2x20bp after MmeI (BceAI, AcuI)
Dressman et al PNAS 2003 emulsion
Cleavable dNTP-Fluorophore (& terminators)
Mitra,RD, Shendure,J, Olejnik,J, Olejnik,EK, and Church,GM (2003) Fluorescent in situ Sequencing on Polymerase Colonies. Analyt. Biochem. 320:55-65
Reduce or photo-cleave
Polony-FISSeq: up to 2 billion beads/slide
Cy5 primer (666 nm); Cy3 dNTP (570 nm)
Jay Shendure
Self Organizing Monolayer
High accuracy special case: homopolymers (e.g. AAA, CC, etc.)
• Use "compressed" tags, ACG = ACCG = ACCCG
• Quantitate incorporation
• Reversible terminators
• FRET between adjacent 3' bases
• Wobble primers, CTAGCGAGCTAGNNNNNNNNA
All five of these work.
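The "compressed" tag idea in the first bullet treats any run of the same base as a single base, so reads that differ only in homopolymer length compare as equal. A minimal sketch of that collapsing step; `compress_homopolymers` is an illustrative name, not from the talk.

```python
from itertools import groupby


def compress_homopolymers(seq: str) -> str:
    """Collapse each run of identical bases to one base (ACCCG -> ACG)."""
    return "".join(base for base, _run in groupby(seq))


for tag in ("ACG", "ACCG", "ACCCG"):
    print(tag, "->", compress_homopolymers(tag))
```

Matching compressed reads against a compressed reference removes homopolymer-length errors from the comparison, at the cost of losing the run lengths themselves.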
Polony FISSeq Stats
• Maintenance of amplification fidelity using linear amplification from initial genomic fragment
• # of bases sequenced (total) 23,703,953
• # bases sequenced (unique) 73
• Avg fold coverage 324,711 X
• Pixels used per bead (analysis) ~3.6
• Read Length per primer 14-15 bp
• Insertions 0.5%
• Deletions 0.7%
• Substitutions (raw) 4e-5
• Throughput: 360,000 bp/min
Current capillary sequencing 1400 bp/min (600X speed/cost ratio, ~$5K/1X)
(This may omit: PCR, homopolymer, context errors)
Shendure
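The coverage and throughput figures can be checked against each other directly. A short sketch using only numbers from this slide; the variable names are illustrative.

```python
TOTAL_BASES = 23_703_953      # bases sequenced (total)
UNIQUE_BASES = 73             # bases sequenced (unique)
POLONY_BP_PER_MIN = 360_000   # polony FISSeq throughput
CAPILLARY_BP_PER_MIN = 1_400  # current capillary sequencing

# Average fold coverage is total bases divided by unique bases.
fold_coverage = TOTAL_BASES / UNIQUE_BASES

# Raw speed ratio, before any cost factor is applied.
speed_ratio = POLONY_BP_PER_MIN / CAPILLARY_BP_PER_MIN

print(f"fold coverage: {fold_coverage:,.0f} X")
print(f"speed ratio:   {speed_ratio:.0f} X")
```

The coverage agrees with the slide's 324,711 X. The raw speed ratio is about 257X, so the quoted 600X speed/cost ratio presumably also folds in a cost difference that the slide does not itemize.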
Wobble vs Simple primer sequencing
1 vs 2.5 bp read/cycle of 4 bases
10 vs 14-200 bp reads
3e-3 vs 4e-5 non-homopolymer errors
3e-3 vs 1e-1 homopolymer errors
40 minutes per base tested =
60 hr per 20 cycles (20 hr, if 4 colors)
Harvard MIT DOE Center Projects
Prochlorococcus: Photosynthesis, circadian & cell cycles
Escherichia: Synthetic genomes/proteomes
Vibrio: 4X faster replication than E. coli
Caulobacter: Asymmetric cell & chromosome structure
Pseudomonas: Biofilms

Poster#  Topic                                       Goal#
1.       Church, et al. Metabolic fluxes             4
2.       Leptos, et al. Proteomics                   1
68.      Martiny, et al. Prochlorococcus diversity   3
121.     Nguyen, et al. Mass spectrometry XML        1
122.     Nguyen, et al. Gene Regulation              2
77.      Sullivan, et al. Cyanophages                1,3
67.      Thompson, et al. Vibrio diversity           3
3.       Zhang, et al. Single cell sequencing        1-4
arep.med.harvard.edu