Post on 20-Dec-2015
Comparative Biology with focus on 8 examples
•Comparative Biology
•The Domain of Comparative Biology
•The purpose of Comparative Biology
•Co-modeling in Comparative Biology
•Examples of Stochastic Comparative Modeling
•Shape Evolution
•Protein Structure Evolution
•Movement Evolution
•RNA Secondary Structure Evolution
•Genome Structure Evolution
•Gene Frequencies in Populations
•Pattern Evolution
•Stemmatology: Manuscript Evolution
Comparative Biology
[Figure: observable sequences (ATTGCGTATATAT….CAG) linked through an unobservable evolutionary path to their Most Recent Common Ancestor (?). Arrow: Time Direction. Parameters: time, rates, selection.]
Key Questions:
•Which phylogeny?
•Which ancestral states?
•Which process?
Key Generalisations:
•Homologous objects
•Co-modelling
•Genealogical Structures?
Comparative Biology: Evolutionary Models
Object | Type | Reference
Nucleotides/Amino Acids/Codons | CTFS (continuous time, finite states) | Jukes-Cantor 69 + 500 others
Continuous Quantities | CTNS (continuous time, continuous states) | Felsenstein 68 + 50 others
Sequences | CTUS (continuous time, countable states) | Thorne, Kishino, Felsenstein, 91 + 40 others
Gene Structure | Matching | DeGroot, 07
Genome Structure | CTCS | MM Miklos,
Population | Brownian Motion/Diffusion | Fisher, Wright, Haldane, Kimura, ….
Structure: RNA | SCFG-model like | Holmes, I. 06 + few others
Structure: Protein | non-evolutionary: extreme variety | Lesk, A; Taylor, W.
Networks | CTCS | Snijders, T (sociological networks)
Networks: Metabolic Pathways | CTFS | Mithani, 2009a,b
Networks: Protein Interaction | CTCS | Stumpf, Wiuf, Ideker
Networks: Regulatory Pathways | CTCS | Quayle and Bullock, 06, Teichmann
Networks: Signal Transduction | CTCS | Soyer et al., 06
Macromolecular Assemblies | ? |
Motors | ? |
Shape | - (non-evolutionary models) | Dryden and Mardia, 1998, Bookstein,
Patterns | - (non-evolutionary models) | Turing, 52;
Tissue/Organs/Skeleton/…. | - (non-evolutionary models) | Grenander,
Dynamics: MD movements of proteins | - | Biggins 05, Munz 10,
Dynamics: Locomotion | - |
Culture | analogues to genetic models | Cavalli-Sforza & Feldman, 83
Manuscripts (stemmatology) | analogous to sequence models | Chris J Howe, http://www.cs.helsinki.fi/u/ttonteri/casc/
Language: Vocabulary | “Infinite Allele Model” (CTCS) | Swadesh, 52, Sankoff, 72, Gray & Atkinson, 2003
Language: Grammar | | Dunn 05
Language: Phonetics | | Bouchard-Côté 2007
Language: Semantics | | Sankoff, 70
Phenotype | Brownian Motion/Diffusion |
Dynamical Systems | - |
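As an illustration of the simplest CTFS entry in the table, the Jukes-Cantor 69 model has closed-form transition probabilities. The sketch below assumes the usual parameterisation in which `rate` is the expected number of substitutions per site per unit time; the function and parameter names are illustrative and do not come from the slides.

```python
import math

def jc69_probability(t, rate=1.0, same=True):
    """JC69 probability of observing a given nucleotide after time t.

    rate: expected substitutions per site per unit time (assumed name).
    same: True for the starting nucleotide, False for one specific other.
    """
    decay = math.exp(-4.0 * rate * t / 3.0)
    if same:
        return 0.25 + 0.75 * decay   # still (or again) the starting nucleotide
    return 0.25 - 0.25 * decay       # one particular different nucleotide

# One row of the 4x4 transition matrix: one "same" entry, three "different".
t = 0.3
row_sum = jc69_probability(t) + 3 * jc69_probability(t, same=False)
```

For long times both probabilities approach 1/4, the uniform equilibrium distribution of the model.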
Co-Modelling and Conditional Modelling
[Figure: grid of Observable / Unobservable combinations for the two co-modelled objects, with an RNA base-pairing example.]
Goldman, Thorne & Jones, 96
Knudsen.., 99
Eddy & co.
Meyer and Durbin 02; Pedersen …, 03; Siepel & Haussler 03
Pedersen, Meyer, Forsberg…, Simmonds 2004a,b
McCauley ….
Firth & Brown
i. P(Sequence | Structure)
ii. P(Structure)
$$P(\mathrm{Structure} \mid \mathrm{Sequence}) = \frac{P(\mathrm{Sequence} \mid \mathrm{Structure})\, P(\mathrm{Structure})}{P(\mathrm{Sequence})}$$
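Bayes' rule combines the two ingredients, P(Sequence | Structure) and P(Structure), into a posterior over structures. A minimal sketch for a finite set of candidate structures follows; the structure names and all numbers are invented purely for illustration.

```python
def posterior_over_structures(likelihood, prior):
    """Return P(structure | sequence) for each candidate structure.

    likelihood[s] = P(sequence | s), prior[s] = P(s).
    """
    joint = {s: likelihood[s] * prior[s] for s in prior}
    evidence = sum(joint.values())          # P(sequence), summed over structures
    return {s: value / evidence for s, value in joint.items()}

# Hypothetical numbers: two candidate structures for one observed sequence.
prior = {"hairpin": 0.2, "unpaired": 0.8}
likelihood = {"hairpin": 0.03, "unpaired": 0.01}
posterior = posterior_over_structures(likelihood, prior)
```

With these made-up inputs the hairpin has posterior probability 3/7, even though its prior was only 0.2, because the sequence is three times as likely under it.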
• Conditional Modelling
Needs: Footprinting - Signals (Blanchette)
AGGTATATAATGCG..... Pcoding{ATG-->GTG}
or
AGCCATTTAGTGCG..... Pnon-coding{ATG-->GTG}
The Purpose of Comparative Biology
To describe evolution:
• Make realistic model (pass goodness-of-fit (GOF) test)
• Estimate Parameters
• Make statements about the path of evolution: ancestral analysis
Biological Questions:
• Rate of Evolution
• Heterogeneity: Time, State Space
• Selection
• Co-Evolution of different components within a level
• Dependence among different levels (co-modelling)
Most of these questions have not been addressed beyond the sequence level:
• Primarily due to lack of data
• Secondarily due to lack of models
• Analyse homologous pairs or sets
• What is the equilibrium distribution
• Integrate over histories
Population Gene Frequencies
$X_t$ is a diffusion with $\mu(x) = 0$ and $\sigma^2(x) = x(1-x)$
E. Thompson (1975) Human Evolutionary Trees. CUP
Reaction Coefficients:
• Continuous Time Continuous States Markov Process - specifically Diffusion.
• For instance Ornstein-Uhlenbeck, which has a Gaussian equilibrium distribution
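The Gaussian equilibrium of the Ornstein-Uhlenbeck process can be checked by simulation. The sketch below assumes the parameterisation dX = -theta (X - mean) dt + sigma dW, whose stationary variance is sigma^2 / (2 theta); the parameter names and values are illustrative, not taken from the slides.

```python
import math
import random

def ou_stationary_variance(theta, sigma):
    """Variance of the Gaussian equilibrium distribution."""
    return sigma ** 2 / (2.0 * theta)

def simulate_ou(x0, mean, theta, sigma, dt, n_steps, rng):
    """Sample an OU path on a regular time grid using the exact transition law."""
    decay = math.exp(-theta * dt)
    step_sd = math.sqrt(ou_stationary_variance(theta, sigma) * (1.0 - decay ** 2))
    path = [x0]
    for _ in range(n_steps):
        path.append(mean + (path[-1] - mean) * decay + rng.gauss(0.0, step_sd))
    return path

path = simulate_ou(x0=0.0, mean=0.0, theta=2.0, sigma=1.0,
                   dt=1.0, n_steps=20000, rng=random.Random(1))
sample_mean = sum(path) / len(path)
sample_var = sum((x - sample_mean) ** 2 for x in path) / len(path)
```

The sample variance of a long path should be close to `ou_stationary_variance(2.0, 1.0)`, which is 0.25.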
Genome Structure Evolution
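[Figure: a chromosome drawn as an ordered series of genes 1 2 3 … k.]
• Evolutionary events: Duplication, Inversion, Transposition, Deletion
[Figure: each event drawn on a short gene series, e.g. duplication 1 → 1 1 and deletion 1 2 3 → 1 3.]
• Extensions:
• Directions of Genes Unknown
• A set of chromosomes related by a phylogeny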
• Inference Principles
• Shortest Path (Parsimony)
• Sum over paths with probabilities (ML)
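The shortest-path (parsimony) principle can be sketched for the simplest case: gene orders with unknown gene directions, inversions as the only event, and a breadth-first search over all orders. This brute-force search is only feasible for a handful of genes and is an illustration, not the method used for the reconstructions on these slides.

```python
from collections import deque

def inversion_neighbours(order):
    """All gene orders reachable by reversing one segment of length >= 2."""
    n = len(order)
    for i in range(n):
        for j in range(i + 2, n + 1):
            yield order[:i] + order[i:j][::-1] + order[j:]

def inversion_distance(start, target):
    """Minimum number of inversions turning start into target (BFS)."""
    start, target = tuple(start), tuple(target)
    steps = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == target:
            return steps[current]
        for nxt in inversion_neighbours(current):
            if nxt not in steps:
                steps[nxt] = steps[current] + 1
                queue.append(nxt)
    raise ValueError("target is not a rearrangement of start")

distance = inversion_distance((1, 2, 3), (2, 3, 1))
```

The maximum-likelihood alternative replaces the minimum over paths with a probability-weighted sum over all paths through the same graph.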
Genome Structure Evolution
• Full graph for 5 genes
• Genomic reconstruction for human, mouse and rat.
Stemmatology: Evolution of Manuscripts
Phylogenetics of Medieval Manuscripts by Christopher Howe
Ashmole 59 | Buryed at Caane thus seythe the Croniculer
Digby 186 | Beryed att Cane & thus says the cronyclere
BL Ad 31042 | Beryed at caene so seyth the cronyclere
Lansd. 762 | Buried at cane this saith the croneclere
de Worde | And is buried at Cane as the Cronycle sayes
R. Wyer | And buryed at cane as the Cronycle sayes
Phylogeny of “Canterbury Tales”: Howe et al., 2001
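Manuscript witnesses can be compared in the same way as sequences. As a rough sketch, the block below computes word-level edit distances between the six readings quoted on this slide; using case-folded whole words and unit costs is an assumption made here for illustration, not the procedure of Howe et al.

```python
from itertools import combinations

def edit_distance(a, b):
    """Levenshtein distance between two sequences (here: lists of words)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (item_a != item_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]

def words(text):
    """Case-folded word list of one reading."""
    return text.lower().split()

witnesses = {
    "Ashmole 59": "Buryed at Caane thus seythe the Croniculer",
    "Digby 186": "Beryed att Cane & thus says the cronyclere",
    "BL Ad 31042": "Beryed at caene so seyth the cronyclere",
    "Lansd. 762": "Buried at cane this saith the croneclere",
    "de Worde": "And is buried at Cane as the Cronycle sayes",
    "R. Wyer": "And buryed at cane as the Cronycle sayes",
}

# Pairwise distance table, the raw input for a distance-based tree method.
distances = {
    (m, n): edit_distance(words(witnesses[m]), words(witnesses[n]))
    for m, n in combinations(witnesses, 2)
}
```

Spelling variants such as "buried" and "buryed" count as full substitutions under this scheme; a character-level distance would treat them as near matches.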
RNA Structure Evolution
Tree Representations of RNA Structure
How Do RNA Folding Algorithms Work? S.R. Eddy. Nature Biotechnology, 22:1457-1458, 2004.
Average complexity of the Jiang-Wang-Zhang pairwise tree alignment algorithm and of a RNA secondary structure alignment algorithm. Claire Herrbach, Alain Denise and Serge Dulucq
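One common tree representation of an RNA secondary structure can be sketched from dot-bracket notation: each base pair becomes an internal node and each unpaired base a leaf. The nested-list encoding below is an assumption chosen for brevity, not the representation used in the cited papers.

```python
def dot_bracket_to_tree(structure):
    """Convert dot-bracket notation into nested lists.

    A base pair "(...)" becomes a list holding whatever it encloses;
    an unpaired base "." becomes the leaf ".".
    """
    root = []
    stack = [root]
    for symbol in structure:
        if symbol == "(":
            child = []
            stack[-1].append(child)   # attach the new pair to its parent
            stack.append(child)       # and descend into it
        elif symbol == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            stack.pop()
        elif symbol == ".":
            stack[-1].append(".")
        else:
            raise ValueError("unexpected symbol: %r" % symbol)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return root

tree = dot_bracket_to_tree("((..).)")
```

Tree edit and tree alignment algorithms then operate on such ordered trees, with insertion, deletion and relabelling of nodes as the basic operations.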
A Tree Distance Pairwise Edit Algorithm
Basic Edit Operations
[Figure: two known structures, -globin and Myoglobin, joined by an unknown evolutionary path (?).]
• 300 amino acid changes
• 800 nucleotide changes
• 1 structural change
• 1.4 Gyr
1. Given Structure what are the possible events that could happen?
2. What are their probabilities? Old fashioned substitution + indel process with bias.
Bias: Folding(Sequence Structure) & Fitness of Structure
3. Summation over all paths.
Protein Structure Evolution
Trajectories between two Secondary Structures
HQYWYWLLATIVVAWMCMHSGHPPMCWFFWFLLIVICFYYRKKNQEDDNERPMTSG
QYYWWWFCTNSPPHYHRQDEEDNKRRKLWWAFFCCVFIIAILLMVAGSTGVMMLMP
[Figure: 1D, 2D and 3D structure. Stepping-stone structures S1, S2, S3, …, Sk, …, Sn; each 1 structure corresponds to a set of sequences.]
• Space of Protein Structures is large and complicated – both continuous and discrete
• Approximated by a series of stepping stones and a continuous time Markov chain
• Observation: two structures with sequence and secondary structure information
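The stepping-stone approximation can be sketched as a continuous time Markov chain on structures S1, …, Sn in which only neighbouring structures exchange. The block below assumes a simple linear chain with one shared rate (both are illustrative choices) and computes transition probabilities by uniformization, which is literally a sum over paths grouped by their number of jumps.

```python
import math

def chain_rate_matrix(n_states, rate):
    """Rate matrix of a linear chain: each structure moves to its neighbours."""
    q = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        for j in (i - 1, i + 1):
            if 0 <= j < n_states:
                q[i][j] = rate
                q[i][i] -= rate
    return q

def transition_matrix(q, t, tol=1e-12, max_jumps=10000):
    """P(t) = sum_k Poisson(k; lam*t) * R^k with R = I + Q/lam (uniformization)."""
    n = len(q)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    lam = max(-q[i][i] for i in range(n))
    if lam == 0.0 or t == 0.0:
        return identity
    jump = [[identity[i][j] + q[i][j] / lam for j in range(n)] for i in range(n)]
    power = identity                      # R^k, starting at k = 0
    weight = math.exp(-lam * t)           # Poisson probability of k jumps
    total = [[weight * power[i][j] for j in range(n)] for i in range(n)]
    covered = weight
    for k in range(1, max_jumps + 1):
        if 1.0 - covered <= tol:
            break
        power = [[sum(power[i][m] * jump[m][j] for m in range(n))
                  for j in range(n)] for i in range(n)]
        weight *= lam * t / k
        for i in range(n):
            for j in range(n):
                total[i][j] += weight * power[i][j]
        covered += weight
    return total

two_state = transition_matrix(chain_rate_matrix(2, 1.0), 0.5)
four_state = transition_matrix(chain_rate_matrix(4, 1.0), 2.0)
```

For very large `lam * t` the starting weight underflows, so a production version would use a scaled matrix exponential instead.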
The Evolution/Comparison of Molecular Movements
Molecular Movements of Homologous Proteins are themselves homologous
The full problem: 2 times 1000 atoms observed at 10^6 time points.
Reductions:
i. Only α-carbons: 100 space points
ii. Only correlated pairwise movements: 1 dimensional summary for each aa pair
Dynamic Fingerprint Matrix (DFM)
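The second reduction, one number per amino acid pair, can be illustrated with a plain correlation matrix. The sketch below assumes each residue is summarised by a one-dimensional displacement per time frame and uses Pearson correlation as the pairwise summary; this is a stand-in chosen for illustration and is not claimed to be the definition of the DFM.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0   # constant series: correlation undefined, reported as 0 here
    return cov / math.sqrt(var_x * var_y)

def pairwise_correlation_matrix(displacements):
    """displacements[r][t]: displacement of residue r at time frame t."""
    n = len(displacements)
    return [[pearson(displacements[i], displacements[j]) for j in range(n)]
            for i in range(n)]

# Toy trajectory: residues 0 and 1 move together, residue 2 moves against them.
toy = [[0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 6.0], [3.0, 2.0, 1.0, 0.0]]
matrix = pairwise_correlation_matrix(toy)
```

Two homologous proteins then yield two such matrices, which can be compared entry by entry once their residues have been aligned.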
The Phylogenetic Turing Patterns I
http://www.stats.ox.ac.uk/__data/assets/file/0015/3327/brooks.pdf
Stripes: p small Spots: p large
The Phylogenetic Turing Patterns II
Reaction-Diffusion Equations:
Analysis Tasks:
1. Choose Class of Mechanisms
2. Observe Empirical Patterns
3. Choose Closest set of Turing Patterns T1, T2, .., Tk
4. Choose parameters p1, p2, .., pk (sets?) behind T1, ..
Evolutionary Modelling Tasks:
1. p(t1) - p(t2) ~ N(0, (t1 - t2))
2. Non-overlapping intervals have independent increments
I.e. Brownian Motion
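Brownian motion of a pattern parameter p along a phylogeny can be simulated branch by branch, because increments on non-overlapping intervals are independent. In the sketch below the tree encoding `(name, branch_length, children)`, the tip names and the scale parameter `sigma` are all assumptions made for illustration; `sigma = 1` corresponds to the variance (t1 - t2) stated above.

```python
import random

def simulate_brownian_on_tree(node, parent_value, sigma, rng):
    """Return {tip name: simulated parameter value} for the subtree at node.

    node is (name, branch_length, children); the value at the end of a branch
    is the parent's value plus a N(0, sigma^2 * branch_length) increment.
    """
    name, branch_length, children = node
    value = parent_value + rng.gauss(0.0, sigma * math_sqrt(branch_length))
    if not children:
        return {name: value}
    tips = {}
    for child in children:
        tips.update(simulate_brownian_on_tree(child, value, sigma, rng))
    return tips

def math_sqrt(x):
    """Square root via exponentiation, keeping the block free of extra imports."""
    return x ** 0.5

# Hypothetical three-species tree; B and C share an internal branch.
tree = ("root", 0.0, [
    ("A", 1.0, []),
    ("inner", 0.5, [("B", 0.5, []), ("C", 0.5, [])]),
])
tips = simulate_brownian_on_tree(tree, parent_value=2.0, sigma=1.0,
                                 rng=random.Random(7))
frozen = simulate_brownian_on_tree(tree, parent_value=2.0, sigma=0.0,
                                   rng=random.Random(7))
```

Under this model the covariance between two tips is proportional to the length of the path they share from the root, so B and C are expected to be more similar to each other than either is to A.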
Scientific Motivation:
1. Is there evolutionary information on pattern mechanisms?
2. How do patterns evolve?
Shapes and Shape Evolution
Gunz (2009) Early modern human diversity suggests subdivided population structure and a complex out-of-Africa scenario
Comparison of cranial ontogenetic trajectories among great apes and humans. Philipp Mitteroecker et al.
Evolutionary Morphing. David F. Wiley http://graphics.idav.ucdavis.edu/research/projects/EvoMorph
• Landmarks
• Semilandmarks
Summary
•Comparative Biology
•The Domain of Comparative Biology
•The purpose of Comparative Biology
•Co-modeling in Comparative Biology
•Examples of Stochastic Comparative Modeling
•Shape Evolution
•Protein Structure Evolution
•Movement Evolution
•RNA Secondary Structure Evolution
•Genome Structure Evolution
•Gene Frequencies in Populations
•Pattern Evolution
•Stemmatology: Manuscript Evolution