How do Replication and Transcription Change Genomes?
Andrey Grigoriev
Director, Center for Computational and Integrative Biology Rutgers University
What are we going to do?
• Observe effects of fundamental processes
• Estimate their relative contribution
• Link them to genome features
• Analyze nucleotide composition
How do Replication and Transcription Change Genomes?
Well, do they?
Replication and Transcription
• textbook view
faithful reproduction machinery
• basis for selection
parental DNA fitness advantages
Replication and Transcription
• paradox
both systematically change genomes
which they faithfully reproduce
• and they leave traces
What is in the sequence?
• The usual – coding, regulatory regions, exons, introns, RNAs, etc.
• Biases in nucleotide composition
– Traces of organism's "lifestyle"
– Links to genome features
Counting nucleotides: GC Skew
$s_w = \frac{[G]-[C]}{[G]+[C]}$
• Short sequence interval (window) w
• Relative excess of G vs C, in the range [-1; 1]
• Plot vs position, as % of genome length [0; 100]
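The window-by-window calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the plots: the function names, the toy sequence, and the choice to give a window with no G or C a skew of 0 are all assumptions made here.

```python
def gc_skew(window: str) -> float:
    """Relative excess of G over C in one window: ([G]-[C])/([G]+[C])."""
    seq = window.upper()
    g, c = seq.count("G"), seq.count("C")
    # Assumption: a window containing no G or C gets skew 0 (avoids division by zero).
    return (g - c) / (g + c) if g + c else 0.0


def windowed_skew(genome: str, w: int) -> list[tuple[float, float]]:
    """Return (window midpoint in % of genome length, skew) for adjacent windows of size w."""
    length = len(genome)
    points = []
    for start in range(0, length - w + 1, w):
        midpoint_pct = 100.0 * (start + w / 2) / length
        points.append((midpoint_pct, gc_skew(genome[start:start + w])))
    return points


# Hypothetical 40-base toy sequence: G-rich first half, C-rich second half.
toy = "GGGGC" * 4 + "CCCCG" * 4
for pos, skew in windowed_skew(toy, 10):
    print(f"{pos:5.1f}%  skew = {skew:+.2f}")
```

On a real genome the window size w would be far larger than 10, and the sign of the skew, rather than its exact value, is what the plots are read for.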
[Figure: skew plots for Simian virus 40 and Haemophilus influenzae; x-axis: position, % genome length]
Cumulative Skew Diagrams
$s_w = \frac{[G]-[C]}{[G]+[C]}$
$S = \sum_{W} s_w \, \frac{w}{L}$
for W adjacent windows of size $w \ll L$
S is an integral of the skew function
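As a rough sketch of the cumulative diagram, the running sum of window skews, each scaled by w/L, can be computed as follows. The helper names and the toy sequence are assumptions for illustration only.

```python
from itertools import accumulate


def gc_skew(window: str) -> float:
    """([G]-[C])/([G]+[C]) for one window; 0 if the window has no G or C (assumption)."""
    seq = window.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def cumulative_skew(genome: str, w: int) -> list[float]:
    """Running sum of s_w * w / L over adjacent windows of size w."""
    length = len(genome)
    skews = [gc_skew(genome[i:i + w]) for i in range(0, length - w + 1, w)]
    return list(accumulate(s * w / length for s in skews))


# Hypothetical toy sequence: the curve rises over the G-rich half, then falls.
toy = "GGGGC" * 4 + "CCCCG" * 4
print([round(value, 3) for value in cumulative_skew(toy, 10)])
```

Because each term is weighted by w/L, the curve approximates the integral of the skew function over the fraction of the genome covered so far.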
[Figure: Simian virus 40; x-axis: position, % genome length; marked: replication origin (ori), replication terminus (ter)]
[Figure: Haemophilus influenzae; x-axis: position, % genome length; marked: replication origin (ori), replication terminus (ter)]
Genome of Escherichia coli
[Figure: x-axis: position, % genome length; marked: Origin, Terminus]
Genome of Bacillus subtilis
[Figure: x-axis: position, % genome length]
Genome of Borrelia burgdorferi
[Figure: x-axis: position, % genome length]
Cumulative Skew Diagrams
• Now widely used to predict ori and ter in novel and less studied microbial genomes
• Predictions confirmed experimentally
• Constant skews over half-genomes
• ori→ter: G>C; ter→ori: G<C
• Strand properties change at ori and ter
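One way to turn the sign convention above into a prediction is to take the extremes of the cumulative curve: if G>C from ori to ter, the curve should bottom out near ori and peak near ter. The sketch below assumes exactly that; the function name and the toy sequence are hypothetical, and real genomes would need a much larger window and some care with noisy curves.

```python
def predict_ori_ter(genome: str, w: int) -> tuple[float, float]:
    """Return (ori, ter) as % of genome length, taken as the minimum and
    maximum of the cumulative GC skew curve (assumes G>C from ori to ter)."""
    length = len(genome)
    total = 0.0
    curve = []  # (window end position, cumulative skew)
    for start in range(0, length - w + 1, w):
        window = genome[start:start + w].upper()
        g, c = window.count("G"), window.count("C")
        skew = (g - c) / (g + c) if g + c else 0.0
        total += skew * w / length
        curve.append((start + w, total))
    ori_pos = min(curve, key=lambda point: point[1])[0]
    ter_pos = max(curve, key=lambda point: point[1])[0]
    return 100.0 * ori_pos / length, 100.0 * ter_pos / length


# Hypothetical toy genome: C-rich up to 25%, G-rich from 25% to 75%, C-rich after.
toy = "CCCCG" * 4 + "GGGGC" * 8 + "CCCCG" * 4
ori, ter = predict_ori_ter(toy, 10)
print(f"predicted ori at {ori:.0f}%, ter at {ter:.0f}%")
```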
Causes: Selection vs. Mutation
• Properties of encoded proteins
• Regulatory sequences
• Most pronounced in 3rd codon position
• Suggests mutation, not selection pressure
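To check where in the codon the bias sits, the skew can be computed separately for the 1st, 2nd and 3rd codon positions of a coding sequence. A minimal sketch, assuming an in-frame sequence; the function name and the 12-base example are made up for illustration.

```python
def skew_by_codon_position(cds: str) -> list[float]:
    """GC skew at codon positions 1, 2 and 3 of an in-frame coding sequence."""
    cds = cds.upper()
    result = []
    for offset in range(3):
        bases = cds[offset::3]  # every third base, starting at this codon position
        g, c = bases.count("G"), bases.count("C")
        # Assumption: skew is 0 when a position has no G or C at all.
        result.append((g - c) / (g + c) if g + c else 0.0)
    return result


# Hypothetical in-frame fragment: codons GCG GCG CGG ACG.
example = "GCGGCGCGGACG"
for position, skew in enumerate(skew_by_codon_position(example), start=1):
    print(f"codon position {position}: skew = {skew:+.2f}")
```

The 3rd position is the least constrained by the encoded protein, which is why a bias concentrated there points to mutation rather than selection.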
[Diagram: Transcription and Replication. Labels: DNA single-stranded, not protected; template DNA; mRNA synthesis; continuous DNA synthesis; discontinuous DNA synthesis]
Most Consistent Explanation
• spontaneous deamination of C or 5-methylcytosine (5-MeC)
– by far the most frequent mutation (rates rise over 100-fold when DNA is single-stranded)
– the mutated base is fixed during the next round of replication
– depletion of cytosines vs guanines
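The effect on skew can be illustrated with a toy simulation: starting from a random strand, convert a small fraction of its cytosines to thymine and compare the skew before and after. The 5% rate, the random starting sequence and the function names are arbitrary assumptions, not measured values.

```python
import random


def gc_skew(seq: str) -> float:
    """([G]-[C])/([G]+[C]); 0 if the sequence has no G or C (assumption)."""
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def simulate_deamination(seq: str, rate: float, rng: random.Random) -> str:
    """Toy model: each C on the exposed strand becomes T with probability `rate`."""
    return "".join("T" if base == "C" and rng.random() < rate else base for base in seq)


rng = random.Random(0)  # fixed seed so the run is repeatable
strand = "".join(rng.choice("ACGT") for _ in range(10_000))
exposed = simulate_deamination(strand, rate=0.05, rng=rng)

print(f"skew before: {gc_skew(strand):+.4f}")
print(f"skew after:  {gc_skew(exposed):+.4f}")
```

G counts are untouched while C counts drop, so the skew of the exposed strand can only move toward G>C.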
Cytosine Deamination
• Deamination turns cytosine into uracil, and 5-methylcytosine into thymine
[Figure: structures of Cytosine, Uracil, Thymine]
Replication
• Leading strand exposed in replication bubble, generation after generation
• Unusual replication models consistent with the single-strand hypothesis
– adenovirus
– mitochondria
[Figure: Adenovirus; x-axis: position, % genome length; marked: replication origins]
Replication or Transcription
• Leading-lagging switch at ori and ter
• Consistent with replication models
• Transcription often colinear with replication
• Direction often changes at ori and ter
Replication vs. Transcription
[Figure: HPV-16; x-axis: position, % genome length]
Replication vs. Transcription
• Comparable contribution to skew
• [G]=900, [C]=690 where both run in the same direction: additive effect on skew
• [G]=758, [C]=773 where they run in opposite directions: effects cancel each other out
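Plugging the counts above into the skew formula makes the contrast concrete. The helper name is an assumption; the counts are the ones given on the slide.

```python
def skew_from_counts(g: int, c: int) -> float:
    """GC skew from raw base counts: ([G]-[C])/([G]+[C])."""
    return (g - c) / (g + c)


same_direction = skew_from_counts(900, 690)      # (900-690)/(900+690) = 210/1590
opposite_direction = skew_from_counts(758, 773)  # (758-773)/(758+773) = -15/1531

print(f"same direction:     {same_direction:+.3f}")
print(f"opposite direction: {opposite_direction:+.3f}")
```

The first value is about +0.13, a clear excess of G, while the second is about -0.01, which is essentially no skew.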
Genome of Bacillus subtilis
[Figure: x-axis: position, % genome length]
Diagrams are "jagged"
• Sequence constraints – amino acid composition, regulatory sequences, etc.
• Sequence inversions – swap strands and change the skew to its opposite between the borders of the inversion
• Horizontal transfer between species
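The inversion effect listed above is easy to reproduce: replacing a segment with its reverse complement swaps G and C counts inside it, so the local skew changes sign while the flanks stay as they were. A small sketch with hypothetical names and a toy sequence:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_skew(seq: str) -> float:
    """([G]-[C])/([G]+[C]); 0 if the sequence has no G or C (assumption)."""
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def invert(genome: str, start: int, end: int) -> str:
    """Replace genome[start:end] with its reverse complement (an inversion)."""
    segment = genome[start:end]
    return genome[:start] + segment.translate(COMPLEMENT)[::-1] + genome[end:]


# Hypothetical G-rich toy sequence with the middle third inverted.
genome = "GGGGC" * 6
inverted = invert(genome, 10, 20)

print(f"segment skew before inversion: {gc_skew(genome[10:20]):+.2f}")
print(f"segment skew after inversion:  {gc_skew(inverted[10:20]):+.2f}")
```

In a cumulative diagram such a segment shows up as a stretch running against the overall trend, which is one source of the jagged look.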
Inversion
[Diagram: segment order A B C D becomes A C B D; strand ends labelled 5' and 3']
Rearrangements in two sequenced strains of Helicobacter pylori
Colored areas under the curve correspond to inversions and translocations
cagPAI – pathogenicity island (likely horizontal transfer)
Conclusions
• Analyze nucleotide composition
• Observe effects of fundamental processes
• Link them to genome features
• Estimate their relative contribution
• Start asking your own questions