Human Evolution Timelines
Source: Jobling, Hurles & Tyler-Smith (2004) Human Evolutionary Genetics.
Research History Evolution History
The Data
Genetic: Allele Frequencies, SNPs, Haplotypes
Non-genetic: Language, culture, pets, pathogens, ...
The Dynamics
Mutation, selection, recombination,
The Genealogical Structure
Phylogeny, Ancestral Recombination Graph, Pedigree
Human Evolution
Relationship to the great Apes, Ancestral Population of Human/Chimp Ancestor, Out of Africa Ancestral Population Structure, Selection, Migrations & Age of Alleles.
Genealogies
Iceland
Models of Pedigrees
Languages & Pathogens
Populations & Basic Genealogical Structures
Now
Parents
Grand parents
Pedigree: Trace the ancestry of individuals
Phylogeny: Trace the ancestry of sequence points.
ARG: Trace the ancestry of sequences
Other genealogical structures are possible: networks, language merging, population splitting.
Recombination
1 meiosis
Lander et al. (2001) “Initial sequencing and analysis of the human genome” Nature 409:860-921. + Kong, A. et al. (2002) “A high resolution recombination map of the human genome” Nature Genetics
Recombination and gene conversion:
•Total haploid map length: males 25.9 M (Morgans), females 44.6 M.
•Gene conversions are 1-2 orders of magnitude more frequent. Tract length 300-2000 bp.
Mutations and Mutation Rates
1 mitosis or generation
Average number of mitoses
Per male generation: ~35 at age 15, ~150 at age 20 (15:35 .. 20:150)
Per female generation: ~24
Crow,JF (2000) “The Origins, Patterns and Implications of Human Spontaneous Mutation” Nature Review Genetics 1.1.40-47 + Strachan and Read (2004) chapter 11 +Jobling, Hurles and Tyler-Smith (2004) chapter 2
• Single nucleotide substitutions: ~10^-7
• Microsatellites (~100.000): ~10^-2
• Small insertions/deletions: ~10^-8
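The two points quoted for the male germline (15:35 and 20:150) can be joined by a straight line to give a rough count of mitoses at other ages. The sketch below does only that; extrapolating the line beyond age 20, and the function name itself, are assumptions for illustration and not part of the slide.

```python
def male_germline_divisions(age, age1=15, div1=35, age2=20, div2=150):
    """Mitoses behind a sperm produced at `age`, by linear interpolation
    through the two slide values (age 15: 35, age 20: 150).
    Using the same line beyond age 20 is an assumption."""
    per_year = (div2 - div1) / (age2 - age1)  # 23 divisions per year
    return div1 + per_year * (age - age1)


FEMALE_DIVISIONS = 24  # slide value, taken as independent of age

for age in (15, 20, 30, 40):
    n = male_germline_divisions(age)
    print(age, n, round(n / FEMALE_DIVISIONS, 1))
```

The last column is the male-to-female ratio of mitoses per generation, which is one way to read the excess of male-derived mutations discussed by Crow (2000).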
A A A C C A A A C C A A A C C
Selection: Positive & Negative
Coalescent Issues
1. The number of genetic ancestors
2. When gene-trees differ from species trees
3. Out of Africa
4. Ages of Alleles
5. Allele Gradients
6. Number of Genetic Ancestors
7. Selective Sweeps
Human History Levels: Physical, Cultural & Genealogical
The physical population size, N(t), and the effective population size, Ne(t), are separate concepts.
i. N(t) can mainly be studied by historical/archaeological means,
ii. Ne(t) can be studied genealogically, for instance by tracing the ancestries of DNA sequences.
Main departures from simplest Population Genetical Models:
A. Long epochs of exponential growth at increasing rates
B. Bottlenecks & small populations.
C. Migrations & Geographical subdivisions
Our relationship to the great apes (from Nei, 2003)
Figure: tree of Humans, Chimp, Pygmy Chimp, Gorilla, Orangutan.
Split times marked on the tree: 13 Myr, 7 Myr, 5.5 Myr, 1 Myr.
Ancestral Population of Human and Chimp
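Figure: species tree of Human (H), Chimp (C) and Gorilla (G); time axis: now, 5 Myr, 7 Myr.
$P(\text{GeneTree} \neq \text{SpeciesTree}) = \frac{2}{3}\, e^{-t/(2N_e)}$, where t is the time between the two speciation events, in generations.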
Example: Chen & Li (2001) 53 triads: 31 (H,C), 10 (H,G) & 11 (C,G)
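As a worked illustration, the standard coalescent result P(gene tree differs from species tree) = (2/3) exp(-t / (2 Ne)) can be inverted to get a rough ancestral Ne from the triad counts above. The sketch uses the slide's counts and its 5 Myr and 7 Myr split times; the 20-year generation time and all function names are assumptions made here, so the resulting number is only indicative.

```python
import math


def p_incongruent(t_generations, ne):
    """Probability that a gene tree differs from the species tree."""
    return (2.0 / 3.0) * math.exp(-t_generations / (2.0 * ne))


def ne_from_incongruence(p, t_generations):
    """Invert p = (2/3) exp(-t / (2 Ne)) for Ne."""
    return -t_generations / (2.0 * math.log(1.5 * p))


# Triad counts as quoted on the slide (Chen & Li, 2001).
hc, hg, cg = 31, 10, 11
p_obs = (hg + cg) / (hc + hg + cg)  # fraction of incongruent gene trees

YEARS_BETWEEN_SPLITS = 7e6 - 5e6    # slide: 7 Myr and 5 Myr
GENERATION_YEARS = 20.0             # assumption, not from the slide
t = YEARS_BETWEEN_SPLITS / GENERATION_YEARS

ne_hat = ne_from_incongruence(p_obs, t)
print(round(p_obs, 3), round(ne_hat))
```

With these inputs the estimate is on the order of 10^5, much larger than the effective size usually quoted for present-day humans; a different generation time rescales it proportionally.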
Out-of-Africa and different degrees of replacements
Figure: three scenarios, each showing Asia, Africa and Europe: total replacement, no replacement, partial replacement.
Time depths marked in each panel: 80-130 Kyr and 1-1.2 Myr.
Example: Takahata (2001) found data could be explained by total replacement.
Allele Frequencies and Principal Components (Cavalli-Sforza, 2001)
1. Agriculture 6-10 Kyr
2. Greek Colonisation 3 Kyr
3. Retraction of the Basques
4. Uralic People
5. Horse domestication
•Allele frequencies for different localities are subjected to a smoothing procedure.
•Principal components are found and projected on geographical maps.
•Strongly criticized (Sokal et al.): even data with no geographical structure will “look like” geographical structure, the gradients carry no timing, ...
Figure: time slices of a population of size N (axes: population, time).
All positions have found a common ancestor
All positions have found a common ancestor on one sequence
S– number of Segments E(S) = 1 +
Number of genetic ancestors to the Human Genome
Figure: ancestry of a sequence (axes: sequence, time), with recombination (R) and coalescence (C) events marked.
Statements about number of ancestors are much harder to make.
Wiuf conjectured ~/ln()
Simulations
A randomly picked ancestor (ancestral material comes in batteries!):
Figure: ancestral segments along chromosome 1 at three scales: 260 Mb (segments 0 to 52.000), 7.5 Mb (35x zoom, segments 6890 to 8360) and 30 kb (250x zoom).
Parameters used: 4Ne = 20.000; chromosome 1: 263 Mb, 263 cM.
Chromosome 1: segments 52.000, ancestors 6.800.
All chromosomes: ancestors 86.000. Physical population: 1.3-5.0 million.
Applications to Human Genome (Wiuf and Hein, 97)
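The quoted parameters can be checked with a line of arithmetic: 4Ne = 20.000 multiplied by the map length of chromosome 1 in Morgans gives the scaled recombination rate, which comes out close to the 52.000 segments reported. Whether the segment count on the slide was obtained exactly this way is an assumption; the function name is illustrative.

```python
def scaled_recombination_rate(four_ne, map_length_cm):
    """rho = 4 * Ne * r, with r the map length in Morgans."""
    return four_ne * (map_length_cm / 100.0)


FOUR_NE = 20_000         # slide: 4Ne = 20.000
CHR1_CM = 263.0          # slide: chromosome 1 is 263 cM
SEGMENTS_CHR1 = 52_000   # slide: segments on chromosome 1
ANCESTORS_CHR1 = 6_800   # slide: ancestors carrying those segments

rho = scaled_recombination_rate(FOUR_NE, CHR1_CM)
print(rho, SEGMENTS_CHR1 / ANCESTORS_CHR1)
```

The second number is the average count of segments per ancestor, roughly 7 to 8, in line with the remark that ancestral material comes in batteries.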
Many sampled alleles relative to Ne (Wakeley 03, Pitman, Schweinsberg)
1. Simultaneous events
2. Multifurcations
3. Underestimation of coalescent rates
Cystic Fibrosis (Wiuf 2000)
ΔF508: possibly maintained by heterosis (1.023), through higher resistance to Salmonella infections.
Data:
1. Frequency of the ΔF508 allele: .022.
2. Inter variability in 1.705 individuals, 46 variable positions.
3. Model of human demography.
Model parameters: mutation rate, heterosis advantage and an exponential growth model of human population expansion.
The age of ΔF508 is estimated to be
Pedigree Issues
i. Icelandic Pedigree
ii. Theoretical Models
Icelandic: http://www.decode.com + Helgason, A. et al. (2003 June) “A population-wide coalescent analysis of Icelandic matrilineal and patrilineal genealogies: Evidence for a faster evolutionary rate of mtDNA lineages than Y-chromosomes” American Journal of Human Genetics.
Chinese: http://demography.anu.edu.au/People/Staff/zhongwei.html
Mormons: http://genealogy-mormons.com/
Burke’s British Peerage: http://www.burkes-peerage.net/sites/wars/sitepages/home.asp
Quebec French: Heyer and Tremblay, 1998 PNAS
Figure: pedigree linking an ancestor cohort (born 1848-1892) to a contemporary cohort (born 1972-2002), with lineage counts on the branches; axis: year.
Icelandic Genealogies Helgason, 2003
Total Genealogy
Males only, Females only
Of the 276,00 Icelanders (June 2002), the 131,060 born after 1972 were traced back.
Figure (pie charts): ancestral cohort born 1848-1892 and descendant cohort born after 1972, matrilines and patrilines. Labels: N = 31,817; N = 31,659; N = 66,910; N = 64,150; 77.9% / 22.1%; 8.3% / 91.7%; 86.2% / 13.8%; 73.9% / 26.1%; g = 4.3; g = 3.8.
Figure (pie charts): ancestral cohort born 1698-1742 and descendant cohort born after 1972. Patrilines: N = 18,023 (10.3% / 89.7%), N = 66,910 (29.3% / 70.7%). Matrilines: N = 20,443 (6.6% / 93.4%), N = 64,150 (61.8% / 38.2%). g = 7.9; g = 8.8.
Icelandic Genealogies (Helgason, 2003)
Figure: backtraceable ancestors to the 1972 cohort. Histograms of percent of ancestors against number of descendants, for matrilines and patrilines (one pair of panels up to 100 descendants, one pair up to 375).
Icelandic Genealogies (Helgason, 2003)
Figure: average number of offspring against birth year of parent (1700-2000), patrilines and matrilines.
Figure: age of parent (years, 20-45) against birth year of individual (1700-2000), patrilines and matrilines.
Variation in annual offspring number is greater for females than for males, due to the shorter generation time.
Positive correlation in fertility between parent-offspring.
Finding Ancestral Individuals. Joe Chang, Dec. 1999, Adv. Appl. Prob.
Finding (Great)^k Grandparents.
Let T_N be the number of generations back to the first time somebody was everybody’s ancestor, in a population of size N. Chang’s result: T_N / log2(N) → 1 in probability as N → ∞.
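Chang's result can be explored with a small simulation. The sketch below assumes a constant population of N individuals in which every individual picks two parents uniformly at random from the previous generation (the two may coincide), and counts generations until some individual is an ancestor of the whole present population. Function and parameter names are illustrative.

```python
import math
import random


def generations_to_first_common_ancestor(n, rng, max_generations=10_000):
    """Generations back until one individual is an ancestor of all n
    present-day individuals, under random choice of two parents."""
    full = (1 << n) - 1
    # Bit i of a mask is set if present-day individual i is a descendant.
    masks = [1 << i for i in range(n)]
    for g in range(1, max_generations + 1):
        parent_masks = [0] * n
        for child_mask in masks:
            for _ in range(2):  # two parents, chosen independently
                parent_masks[rng.randrange(n)] |= child_mask
        if any(m == full for m in parent_masks):
            return g
        masks = parent_masks
    raise RuntimeError("no common ancestor within max_generations")


rng = random.Random(1)
n = 256
t = generations_to_first_common_ancestor(n, rng)
print(t, t / math.log2(n))
```

For moderate N the ratio printed at the end is typically somewhat above 1; the theorem concerns the limit of large N.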
Figure: generations back in time, from NOW (0) to 11.
Combining Ancestral Individuals and the Coalescent Wiuf & Hein, 2000.
Unify the two processes:
Sample more individuals.
Let each have p parents (p possibly stochastic, >= 1).
Result: a discontinuity at p = 1.
For p > 1, replace log2 by logp.
Comment: genetic ancestors are a vanishing subset of the genealogical ancestors.
Finding Common Ancestors.
Figure: generations back in time from NOW.
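Derrida
Recursion: $w_{\alpha}^{\gamma}(G+1) = \frac{1}{2} \sum_{\gamma' \,\text{children of}\, \gamma} w_{\alpha}^{\gamma'}(G)$
Initialization: $w_{\alpha}^{\gamma}(0) = \delta_{\alpha,\gamma}$
α: individual; γ: ancestor in the tree; w: weight, the probability that a uniformly random path leads to γ. G and G+1 denote successive generations back in time.
Offspring distribution for a marriage: $p_k$; the expected number of offspring is m.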
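The weight recursion, in which an ancestor's weight one generation further back is half the summed weights of its children, can be run on a toy pedigree. The sketch assumes a constant population of n individuals in which every child picks a mother and a father uniformly at random; this population model and all names are assumptions for illustration, not Derrida's offspring distribution.

```python
import random


def ancestor_weights(n, generations, rng, focal=0):
    """Weights w(G) of every individual in generation G as an ancestor of
    the focal individual; w(0) is the indicator of the focal individual."""
    weights = [0.0] * n
    weights[focal] = 1.0
    history = [weights]
    for _ in range(generations):
        nxt = [0.0] * n
        for child in range(n):
            mother, father = rng.randrange(n), rng.randrange(n)
            # Summing half of each child's weight into its parents is the
            # same as w_gamma(G+1) = 1/2 * sum over children of gamma.
            nxt[mother] += 0.5 * weights[child]
            nxt[father] += 0.5 * weights[child]
        history.append(nxt)
        weights = nxt
    return history


history = ancestor_weights(n=50, generations=12, rng=random.Random(3))
for g, w in enumerate(history):
    print(g, sum(1 for x in w if x > 0), round(sum(w), 6))
```

Each printed row gives the generation, the number of individuals with non-zero weight and the total weight, which stays at 1 because every path is followed with probability 1/2 at each step.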
Kammerle 89: Pair Moran Model
A pair of children are born – they choose parents randomly.
A pair is erased and the children pair take their place.
A. The stationary distribution of number of ancestors to present population is hypergeometric:
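$\alpha_i = \binom{N}{i}\binom{N}{i-1} \Big/ \binom{2N}{N+1}, \quad i = 1, \ldots, N$
B. If $P(R_N = i) = \alpha_i$, $i = 1, \ldots, N$, then $(R_N - N/2)/\sqrt{N} \to N(0, 1/8)$ as $N \to \infty$.
C. Let $\hat{R}_N(t)$ be the centred and scaled number of ancestors and $S_N(t)$ its time-rescaled version. Then $S_N(t)$ converges to an Ornstein-Uhlenbeck process with infinitesimal variance $\sigma^2(y) = 1/4$ and infinitesimal drift $\mu(y) = -y$.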
Non-Contributing Ancestors (Kevin Donnelly, 1983, TPB)
Figure: the genome as chromosomes 1 .. 22, x, y, traced back without and with recombination.
Generation: 0, 1, .., k. Ancestors: 1, 2, .., 2^k.
No recombination: 46 packets in every generation.
Recombination: <≈ 72 + 46 packets after one generation, <≈ k*72 + 46 packets after k generations.
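The bound of about k*72 + 46 packets grows linearly in k while the number of ancestor slots, 2^k, doubles every generation, so beyond some generation not every ancestor can have contributed genetic material. A minimal sketch of that comparison follows; it takes the slide's bound at face value and ignores that the same person can fill several slots in a real pedigree. Function names are illustrative.

```python
def ancestor_slots(k):
    """Number of ancestor positions k generations back."""
    return 2 ** k


def max_packets(k, crossovers_per_generation=72, chromosomes=46):
    """Slide's approximate upper bound on genome packets k generations back."""
    return k * crossovers_per_generation + chromosomes


def first_generation_with_forced_noncontributors():
    """Smallest k at which ancestor slots outnumber the packet bound."""
    k = 0
    while ancestor_slots(k) <= max_packets(k):
        k += 1
    return k


k_star = first_generation_with_forced_noncontributors()
print(k_star, ancestor_slots(k_star), max_packets(k_star))
```

From that generation on, the share of ancestors that can be genetic ancestors is at most max_packets(k) / 2^k, which shrinks rapidly.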
Non-Contributing Ancestors (Yun Song, pers. comm., 2003; Kevin Donnelly, 1983)
Figure (from Yun Song): number of ancestors back in time: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512.
The probability
1. that there is any non-contributing ancestor
2. that a randomly chosen ancestor is non-contributing
Pedigree Inference
Three Processes
1. Choosing Parents
2. Recombination
3. The Mutational Process
From Yun Song
Prior on Pedigrees
Mother Father
Posterior on Pedigrees
Elston-Stewart (1971) -Temporal Peeling Algorithm
Lander-Green (1987) - Genotype Scanning Algorithm
Probability of data given pedigree
Inheritable phenomena
Genetic Material
Sequences
“Allele Frequencies”
Language
Culture
Pathogens
Pests
Pets
Morphological Characters
Pathogen phylogenies Falush 2003
Helicobacter pylori is transmitted from mother to child.
Falush et al. sequenced 8 genes from 370 strains from 27 populations: 3850 nucleotides each.
5 ancestral populations: East Asia, Euro1, Euro2, Afr1, Afr2.
STRUCTURE assigns each polymorphism to an ancestral population.
American Indians are grouped as Asian, showing that H. pylori infection is ancient.
Diversity of H. pylori is 50 times larger than that of humans.
Much recombination, i.e. positions can be treated as independent.
A. Maori are East Asian.
B. Inuit are Euro1 + Euro2.
C. South African: Afr2.
D. English
Cavalli-Sforza: Language Trees
Cavalli-Sforza (1997) Genes, Peoples and Languages. PNAS 94:7719-24
Principle of Comparison.
Loss of cognates (“homologous” words)
Syntax Comparison.
Sound use.
Reconstruction (dependent on interpretation): stretches back 2.000-6.000 years, depending on criteria.
Historical Linguistics
William Jones (1786) observes similarities between Sanskrit, Greek & Latin
Swadesh (1952) makes one of the first glottochronological studies
Kruskal, Dyer & Black (1971) large successful investigation.
Principles:
Distance - Swadesh’s rule: 20% lost per millennium.
Parsimony
Compatibility
Likelihood
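The distance principle can be turned into a simple clock. If each language independently keeps 80% of its core vocabulary per millennium (the 20% loss rate quoted above), two languages that split t millennia ago are expected to share a fraction 0.8^(2t) of cognates, which can be solved for t. The sketch below assumes exactly that constant, independent loss; the function name is illustrative.

```python
import math


def divergence_time_millennia(shared_cognate_fraction,
                              retention_per_millennium=0.8):
    """Time since two languages split, in millennia, assuming each lineage
    independently retains a fixed fraction of cognates per millennium."""
    if not 0.0 < shared_cognate_fraction <= 1.0:
        raise ValueError("shared_cognate_fraction must be in (0, 1]")
    return math.log(shared_cognate_fraction) / (
        2.0 * math.log(retention_per_millennium))


for shared in (0.9, 0.64, 0.3):
    print(shared, round(divergence_time_millennia(shared), 2))
```

The estimate is only as good as the clock assumption, which is the first of the criticisms listed for this approach.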
Criticisms:
Word Loss is not clocklike
Languages merge and borrow, giving non-tree-like structure
Not much research goes into this area.
Homo sapiens sapiens 100-70 Kyr
African
Khoisan
Niger-Kordofanian
Nilo-Saharan
Afro-Asiatic
Kartvelian
Dravidian
Indo-European
Uralic
Altaic
Eskimo-Aleut
Chukchi-Kamchatkan
Amerind
Na-Dene
Sino-Tibetan
Caucasian/Basque/Burushaski
Austronesian
Daic
Miao-Yao
Austro-Asiatic
Indo-Pacific
Australian
Pacific
Austric
Asian 70-50 Kyr
Eurasian 60-40 Kyr
Austro-Tai
Congo-Saharan
Eurasiatic 20-10 Kyr
Eurasian/American 40-20 Kyr
Dene-Caucasian 40-20 Kyr
Global Phylogeny (Cavalli-Sforza, 2001)
Ruhlen, 1994
Afghan
Baluchi
Persian
Ossetic
Bengali
Hindi
Punjabi
Marathi
Nepali
Kashmiri
Singhalese
Breton
Welsh
Irish
Bulgarian
Macedonian
Serbo Croatian
Belorussian
Ukrainian
Polish
Czech
Russian
Slovenian
Latvian
Lithuanian
French
Walloon
Italian
Ladin
Portuguese
Spanish
Sardinian
Rumanian
Danish
Swedish
Riksmaal
Faroese
Icelandic
Dutch
German
Frisian
English
Greek
Armenian
Albanian
Groups: Germanic, Celtic, Romance, Slavic
Time axis: 9000, 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1000
Indo-European Language Trees Dyen, Kruskal & Black, 1992
Piazza, Cavalli-Sforza, 2001
Germanic Language Trees (from Embleton, 1986)
English
Pennsylvanian German
Dutch
Yiddish
German
Hamburg Lower Saxony
Danish
Swedish
Icelandic
Norwegian
Faroese
Tok Pisin
Afrikaans
Frisian
Flanders
TrS
Figure labels (node dates): 1803, 1540, 1051, 1842, 246, 194, 1051, 1239, 1476, 1558, 1051, 1423, 1668, 1234, 1025.
The Coalescent & Human Evolution (11.6.04)
Human History
Methodological Problems: Reconstructing haplotypes, defining haplotype blocks + HapMap.
Relationship to the great Apes, Ancestral Population of Human/Chimp Ancestor, Out of Africa, The Neanderthal.
Human Population Growth, Ancestral Population Structure, Selection, Migrations & Age of Alleles.
SNPs Haplotypes, Recombination Hotspots & Haplotype Blocks.
Individual Stories: Mitochondria, Y, autosomal chromosomes & alleles.
Empirical Genealogies
Iceland
Other Genealogical Issues: Genealogical Ancestors, Genetic and Non-Contributing Ancestors
Heritable Characters
Languages
Associated Animals, Plants & Pathogens
Surnames
Morphological Characters
The Role of Coalescent Theory