Evidence of selection on genomic GC content in bacteria · 2011. 4. 11.
TRANSCRIPT
-
Evidence of Selection on Genomic GC Content in Bacteria
Falk Hildebrand
Axel Meyer
Adam Eyre-Walker
-
Genomic G+C content
-
Genomic GC content
-
Codons
123
ATA CCC
CTA CCT
Non-synonymous  Synonymous
2-fold: TTT TTC
4-fold: CCT CCC CCA CCG
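GC4, used on later slides, is the GC content at the third position of four-fold degenerate codons such as CCT CCC CCA CCG. A minimal sketch of how it could be computed from a coding sequence, assuming the standard genetic code and a sequence that starts in frame; the function name `gc4` and the prefix set are illustrative, not taken from the talk:

```python
# Four-fold degenerate codon families in the standard genetic code,
# identified by their first two bases (any third base gives the same amino acid).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc4(cds: str) -> float:
    """Fraction of G or C at the third position of four-fold degenerate codons."""
    cds = cds.upper().replace("U", "T")
    gc = total = 0
    # Walk the sequence codon by codon, ignoring any trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            total += 1
            gc += codon[2] in "GC"
    if total == 0:
        raise ValueError("no four-fold degenerate codons found")
    return gc / total


# The four CCN codons from the slide, followed by the 2-fold codons TTT and TTC,
# which are skipped because TT is not a four-fold family.
print(gc4("CCTCCCCCACCGTTTTTC"))  # 0.5
```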
-
Variation
[Figure: two panels, GC3 and GC12; horizontal axes run from 0 to 100]
-
Correlations
[Figure: GC12 and GC3 plotted against each other; both axes run from 0.0 to 1.0]
-
Explanations
• Mutation bias
  • Sueoka (1961) & Freese (1962)
  • Intrinsic and/or extrinsic
• Selection
  • Many authors
• Biased gene conversion
  • Anonymous referees
-
Evidence of selection
• Escherichia coli
  • Mutation pattern
    • 273 GC→AT versus 131 AT→GC
    • Predicted GC content = 0.32
    • Observed GC content = 0.50
    • Observed GC at neutral sites = 0.58
Lynch (2007) Origins of genome architecture
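The predicted GC content of 0.32 is consistent with a simple mutational equilibrium calculation. The sketch below assumes that the counts are converted to per-site rates using the observed GC content of 0.50, so that the equilibrium is v / (u + v); the function name and this normalisation are assumptions, not details given on the slide:

```python
def equilibrium_gc(n_gc_to_at: int, n_at_to_gc: int, gc_content: float) -> float:
    """Equilibrium GC content expected under mutation pressure alone.

    Counts are turned into per-site rates by dividing by the fraction of
    sites able to mutate in each direction (an assumption of this sketch).
    """
    u = n_gc_to_at / gc_content          # GC->AT rate per GC site
    v = n_at_to_gc / (1.0 - gc_content)  # AT->GC rate per AT site
    return v / (u + v)


predicted = equilibrium_gc(273, 131, gc_content=0.50)
print(round(predicted, 2))  # 0.32
```

With equal numbers of GC and AT sites the normalisation cancels, and the prediction reduces to 131 / (131 + 273).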
-
Test of mutation bias
• If GC content is
  • Due to mutation bias alone
  • Stationary
  • And the infinite sites assumption holds
• Then
  • # GC→AT mutations = # AT→GC mutations
-
Orienting mutations
Strain 1  ACT GCT TTG GCT TTA TGG
Strain 2  ACT GCT TTG GCT TTA TGA
Strain 3  ACT GCT TTG GCT TTA TGG
Strain 4  ACT GCT TTC GCT TTA TGA
Strain 5  ACC GCT TTC GCT TTA TGG
Strain 6  ACT GCT TTG GCT TTA TGG
T→C  G→C  G→A
GC→AT = 1   AT→GC = 1
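The orientation in this example can be reproduced by treating the rarer allele at each polymorphic site as the new mutation. That rule is an assumption inferred from the example (it gives T→C, G→C and G→A), and the function names below are illustrative:

```python
from collections import Counter

# The six strains from the slide, with codon spacing removed.
STRAINS = [
    "ACTGCTTTGGCTTTATGG",
    "ACTGCTTTGGCTTTATGA",
    "ACTGCTTTGGCTTTATGG",
    "ACTGCTTTCGCTTTATGA",
    "ACCGCTTTCGCTTTATGG",
    "ACTGCTTTGGCTTTATGG",
]


def orient_mutations(seqs):
    """Return (ancestral, derived) pairs, taking the rarer allele as derived."""
    changes = []
    for column in zip(*seqs):
        counts = Counter(column).most_common()
        # Keep biallelic sites that have a clear majority allele.
        if len(counts) == 2 and counts[0][1] != counts[1][1]:
            changes.append((counts[0][0], counts[1][0]))
    return changes


def count_directions(changes):
    """Count GC->AT and AT->GC changes; G<->C and A<->T changes are ignored."""
    gc_to_at = sum(a in "GC" and d in "AT" for a, d in changes)
    at_to_gc = sum(a in "AT" and d in "GC" for a, d in changes)
    return gc_to_at, at_to_gc


changes = orient_mutations(STRAINS)
print(changes)                    # [('T', 'C'), ('G', 'C'), ('G', 'A')]
print(count_directions(changes))  # (1, 1)
```

The G→C change is polymorphic but does not alter GC content, so it contributes to neither count.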
-
Four-fold synonymous sites
[Figure: genomic GC plotted against GC4]
-
Data
• Popset
  • Keyword “bacteria”
  • 8 or more sequences from same species
  • 149 bacterial species
    • 8 phyla, 15 classes and 77 genera
  • 1 or more genes
  • 10 or more synonymous polymorphisms
  • 4-fold diversity < 0.1
-
Overall result
          No. of SNPs
GC→AT     11045
AT→GC     8309
P
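Under the null hypothesis of the test of mutation bias, the two counts are expected to be equal. One way to check this is a binomial test with p = 0.5, sketched here with a normal approximation. It treats SNPs as independent, which is an assumption, and it is not necessarily the test used in the talk:

```python
import math


def equal_counts_test(n_gc_to_at: int, n_at_to_gc: int):
    """Two-sided normal approximation to a binomial test of p = 0.5."""
    n = n_gc_to_at + n_at_to_gc
    # Under p = 0.5 the count has mean n/2 and variance n/4.
    z = (n_gc_to_at - n / 2) / math.sqrt(n / 4)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z, p = equal_counts_test(11045, 8309)
share = 11045 / (11045 + 8309)
print(f"proportion GC->AT = {share:.3f}, z = {z:.1f}, p = {p:.2e}")
```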
-
Bias versus GC4
Z = GC→AT / (GC→AT + AT→GC)
            No. of species   Z > 0.5   P-value
GC-rich     82               69
-
Phylogenetic distribution
Phylum          Class                 No. of species  GC4 range    Mean Z (GC4 < 0.34)  Mean Z (GC4 > 0.34)
Actinobacteria  Actinobacteria        3               0.64-0.93    no species           0.64
Bacteroidetes   Bacteroidetes         3               0.12-0.46    0.43                 0.36
Chlamydiae+     Chlamydiae            2               0.21-0.30    0.45                 no species
Cyanobacteria   Chroococcales         2               0.38-0.51    no species           0.53
Cyanobacteria   Nostocales            3               0.26-0.31    0.45                 no species
Cyanobacteria   Oscillatoriales       2               0.41         no species           0.38
Cyanobacteria   Stigonemales          1               0.40         no species           0.59
Firmicutes      Bacilli               27              0.085-0.68   0.44                 0.58
Firmicutes      Clostridia            5               0.050-0.28   0.34                 no species
Proteobacteria  Alphaproteobacteria   16              0.099-0.94   0.43                 0.65
Proteobacteria  Betaproteobacteria    6               0.66-0.96    no species           0.67
Proteobacteria  delta/epsilon         6               0.15-0.99    0.49                 0.78
Proteobacteria  Gammaproteobacteria   62              0.095-0.95   0.50                 0.66
Spirochaetes    Spirochaetes          7               0.12-0.60    0.45                 0.54
Tenericutes     Mollicutes            4               0.023-0.24   0.33                 no species
-
Potential problems
• Infinite sites assumption
• Sequencing error
-
Zpred
-
Z-Zpred
            No. of species   Z-Zpred > 0   P-value
GC-rich     82               61
-
Potential problems
• Infinite sites assumption
• Sequencing error
-
Sequencing error
            No. of species   Z > 0.5   P-value
GC-rich     82               60
-
Explanations
• Non-stationary base composition
• Selection for translational efficiency
• Biased gene conversion
• Selection upon base composition
-
Non-stationary base composition
-
Explanations
• Non-stationary base composition
• Selection for translational efficiency
• Biased gene conversion
• Selection upon base composition
-
Selection on codon usage
Amino Acid      Codon   High usage   Low usage
Phenylalanine   UUU     0.22         0.71
                UUC     0.78         0.29
Valine          GUU     0.46         0.36
                GUC     0.09         0.19
                GUA     0.24         0.23
                GUG     0.21         0.23
-
Translational efficiency
            No. of species   Z > 0.5   P-value
GC-rich     31               29
-
Explanations
• Non-stationary base composition
• Selection for translational efficiency
• Biased gene conversion
• Selection upon base composition
-
Biased gene conversion
A T    C G
A G    C T
C G    C G
-
Four gamete test
No recombination    Recombination
G A                 G A
G T                 G T
C A                 C A
                    C T
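The four gamete test flags a pair of biallelic sites when all four allele combinations are present, which under the infinite sites assumption requires recombination. A minimal sketch using the two haplotype sets from the slide; the function names are illustrative:

```python
from itertools import combinations


def four_gamete_violation(haplotypes, i, j):
    """True if all four allele combinations occur at biallelic sites i and j."""
    alleles_i = {h[i] for h in haplotypes}
    alleles_j = {h[j] for h in haplotypes}
    if len(alleles_i) != 2 or len(alleles_j) != 2:
        return False
    return len({(h[i], h[j]) for h in haplotypes}) == 4


def has_recombination_signal(haplotypes):
    """True if any pair of sites in the sample shows all four gametes."""
    n_sites = len(haplotypes[0])
    return any(four_gamete_violation(haplotypes, i, j)
               for i, j in combinations(range(n_sites), 2))


print(has_recombination_signal(["GA", "GT", "CA"]))        # False
print(has_recombination_signal(["GA", "GT", "CA", "CT"]))  # True
```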
-
Biased gene conversion
            No. of species   Z > 0.5   P-value
GC-rich     28               19        0.087

              GC→AT   AT→GC   P-value
No. of SNPs   1079    844
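The P-value of 0.087 for 19 of 28 species matches an exact two-sided sign test against an expectation of one half. The slides do not name the test, so treating it as a sign test is an assumption; the sketch below shows the calculation:

```python
from math import comb


def sign_test_two_sided(successes: int, n: int) -> float:
    """Exact two-sided binomial (sign) test against p = 0.5."""
    # Use the larger tail so the result is the same for k and n - k.
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


print(round(sign_test_two_sided(19, 28), 3))  # 0.087
```

The same function can be applied to the other species counts in the talk, for example `sign_test_two_sided(69, 82)`.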
-
Biased gene conversion
GC ⇌ AT: -w, w
if N_e w >> 1, BGC effective; if N_e w
-
Biased gene conversion
            r / m     p-value
GC4         -0.076    0.67
GC4pred     -0.115    0.52
34 species with estimate of r / m
Vos & Didelot (2009) ISME J.
-
Biased gene conversion
            θ r / m   p-value
GC4         0.039     0.83
GC4pred     -0.031    0.86

$$\pi_s \frac{r}{m} = 2 N_e u \frac{r}{m} = 2 N_e r k$$
-
Explanations
• Non-stationary base composition
• Selection for translational efficiency
• Biased gene conversion
• Selection upon base composition
-
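Selection on GC content

$$H(x) = \frac{J(x)}{\int_0^1 J(x)\,dx}$$

$$J(x) = e^{Sx}\, x^{V-1} (1-x)^{U-1}$$

$$S = 2N_e s, \qquad U = 2N_e \mu (1-f), \qquad V = 2N_e \mu f, \qquad f = \frac{v}{u+v}$$

GC ⇌ AT: mutation rates uµ and vµ; selection +s and -s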
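To see how S shifts base composition in this model, the mean of x under H(x) can be evaluated exactly, because both integrals are Kummer functions: the integral of e^{Sx} x^{a-1} (1-x)^{b-a-1} over [0, 1] equals B(a, b-a) · 1F1(a; b; S). The sketch below assumes x is the frequency of GC at a site; the parameter values are hypothetical and not taken from the talk:

```python
def kummer_m(a: float, b: float, z: float, terms: int = 500) -> float:
    """Confluent hypergeometric function 1F1(a; b; z) from its power series."""
    total = term = 1.0
    for k in range(terms):
        # Ratio of consecutive series terms: (a + k) / (b + k) * z / (k + 1).
        term *= (a + k) / (b + k) * z / (k + 1)
        total += term
        if abs(term) < 1e-15 * abs(total):
            break
    return total


def mean_gc_frequency(S: float, U: float, V: float) -> float:
    """Mean of x under H(x), with J(x) = exp(S x) x^(V-1) (1-x)^(U-1)."""
    # B(V+1, U) / B(V, U) = V / (U + V); the Kummer ratio carries the effect of S.
    return V / (U + V) * kummer_m(V + 1, U + V + 1, S) / kummer_m(V, U + V, S)


theta, f = 0.1, 0.3  # hypothetical values of 2*Ne*mu and f = v / (u + v)
U, V = theta * (1 - f), theta * f
for S in (0.0, 1.0, 2.0, 4.0):
    print(S, round(mean_gc_frequency(S, U, V), 3))
```

At S = 0 the mean equals f, the mutational equilibrium, and it rises as S increases. The power series is only suitable for moderate |S|; large values would need a more careful evaluation of 1F1.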
-
Selection on GC content
[Figure: Zpred (0.6 to 1.0) plotted against S (2 to 10)]
-
Selection on GC4
[Figure: Zobs and Zpred; both axes run from 0.2 to 1.0]
-
Selection on GC4
[Figure: Zobs and Zpred; both axes run from 0.2 to 1.0]
f = α + β GC4
f = 0.2 + 0.35 GC4
-
Selection on GC4
f = α + β GC4
f = 0.2 + 0.35 GC4
[Figure: f (0.25 to 0.55) plotted against GC4 (0.2 to 1.0)]
-
Selection on genomic GC
[Figure: genomic GC plotted against GC4]
-
Correlates
• Genome size
  • positive correlation
• Lifestyle
  • higher GC in free living
• Aerobiosis
  • higher GC in aerobic
• Nitrogen utilization
  • higher GC amongst N fixers
• Temperature
  • higher amongst thermophiles?
-
Environmental meta-genomics
Foerstner et al. (2005) EMBO Reports
-
Environmental meta-genomics
-
Thanks
Falk Hildebrand
Axel Meyer
-
Mitochondrial DNA
• 488 animal datasets
Group       Percentage
Mammals     30
Birds       12
Fish        23
Insects     16
Crustacea   7
Molluscs    12
-
Mitochondrial DNA
[Figure: horizontal axis GC content, 0.0 to 1.0; vertical axis 1 to 4]
-
Z versus GC4
[Figure: Z (0.2 to 1.0) plotted against GC content (0.2 to 1.0)]
-
Z-Zpred versus GC4
[Figure: Z-Zpred plotted against GC4 (0.1 to 0.6); r = 0.11, p = 0.02]
-
Evidence of selection II
• Phylogenetic analyses
  • Mycobacterium leprae (Lynch 2007)
  • Escherichia coli (Balbi et al. 2009)
  • 5 pathogenic bacteria (Hershberg and Petrov 2010)
  • Excess of GC→AT