![Page 1: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/1.jpg)
Phylogenetic models and MCMC methods for the reconstruction of language history
Robin J. Ryder
CEREMADE – Paris Dauphine / CREST – INSEE
Joint work with Geoff K. Nicholls at the Department of Statistics, University of Oxford
www.slideshare.net/robinryder
![Page 2: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/2.jpg)
Carles li reis, nostre emper[er]e magnes
Set anz tuz pleins ad estet en Espaigne :
Tresqu’en la mer cunquist la tere altaigne.
N’i ad castel ki devant lui remaigne ;
Mur ne citet n’i est remes a fraindre,
Fors Sarraguce, ki est en une muntaigne.
Chanson de Roland, 1r (11th century)
![Page 3: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/3.jpg)
La plus commune façon d'amollir les coeurs de ceux qu'on a offensez, lors qu'ayant la vengeance en main, ils nous tiennent à leur mercy, c'est de les esmouvoir par submission à commiseration et à pitié.
Montaigne, Essais, I, 1 (1580)
![Page 4: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/4.jpg)
Tes yeux sont si profonds qu'en me penchant pour boire
J'ai vu tous les soleils y venir se mirer
S'y jeter à mourir tous les désespérés
Tes yeux sont si profonds que j'y perds la mémoire
Aragon, Les Yeux d'Elsa (1942)
![Page 5: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/5.jpg)
Et la piaule swingue au son du ghetto, on tape à la porte
Chill c'est trop fort ! baisse le son merde ! j'connais
A chaque fois c'est pareil tant pis il faut qu'ça pète
Et profite en traître des nouveaux albums qu'Rod m'achète
Akhénaton, Juste une pression (2005)
![Page 6: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/6.jpg)
What to expect
Description of the data
Model of language diversification
MCMC for phylogenetic trees
Synthetic studies
Analysis of two data sets
![Page 7: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/7.jpg)
Indo-European languages
![Page 9: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/9.jpg)
Language diversification
Languages change in a way comparable to biological species.
Similarities between languages indicate that they may be cousins.
Most common model: phylogenetic tree
![Page 10: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/10.jpg)
![Page 11: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/11.jpg)
Questions
Topology
Internal ages
Age of the root: 6000-6500 BP or 8000-9500 BP?
(BP = Before Present)
![Page 12: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/12.jpg)
Core vocabulary
100 or 200 meanings, present in almost all languages: bird, hand, to eat, red...
Borrowing is possible (non-tree-like change), but:
“Easy” to detect
Uncommon
Does not introduce systematic bias
![Page 13: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/13.jpg)
Data coding
Old English: stierfþ
Old High German: stirbit, touwit
Avestan: miriiete
Old Church Slavonic: umĭretŭ
Latin: moritur
Oscan: ?
Cognacy classes:
1. {stierfþ, stirbit}
2. {touwit}
3. {miriiete, umĭretŭ, moritur}
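The coding above can be sketched as a small binary matrix, one column per cognacy class and one row per language. This is only an illustration: the names `words`, `cognacy_classes` and `code_language` are hypothetical, and the use of "?" for a language with no attested word (Oscan here) is an assumption about how missing data might be represented.

```python
# Hypothetical sketch: turn the word lists for one meaning into binary traits.
words = {
    "Old English": ["stierfþ"],
    "Old High German": ["stirbit", "touwit"],
    "Avestan": ["miriiete"],
    "Old Church Slavonic": ["umĭretŭ"],
    "Latin": ["moritur"],
    "Oscan": None,  # no attested word: coded as missing (assumption)
}

cognacy_classes = [
    {"stierfþ", "stirbit"},
    {"touwit"},
    {"miriiete", "umĭretŭ", "moritur"},
]


def code_language(forms, classes):
    """Return one entry per cognacy class: 1 (present), 0 (absent) or "?" (missing)."""
    if forms is None:
        return ["?"] * len(classes)
    return [1 if any(form in cls for form in forms) else 0 for cls in classes]


matrix = {lang: code_language(forms, cognacy_classes) for lang, forms in words.items()}

for lang, row in matrix.items():
    print(f"{lang:20s} {row}")
```

Note that a language can display several traits for the same meaning: Old High German has a 1 in both the first and the second class.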
![Page 14: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/14.jpg)
Constraints
Constraints on parts of the topology
Constraints on some internal ages
We use these constraints to infer rates and other ages
![Page 15: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/15.jpg)
![Page 16: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/16.jpg)
Description of the model (1)
Traits are born at rate λ
Trait instances die at rate μ
λ and μ are constants
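A minimal sketch of this birth-death process along a single branch, assuming a standard Gillespie-style simulation. The function name `evolve_branch` and the numerical rates in the usage note are illustrative choices, not values from the talk. New traits arise at total rate λ, and each existing trait instance dies independently at rate μ.

```python
import random


def evolve_branch(n_traits, duration, lam, mu, rng):
    """Simulate trait births (rate lam) and deaths (rate mu per trait) on one branch.

    Returns (old, new): how many of the initial traits survive, and how many
    traits born on the branch are still alive at its end. Requires lam > 0.
    """
    old, new = n_traits, 0
    t = 0.0
    while True:
        n = old + new
        total_rate = lam + mu * n
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > duration:
            return old, new
        if rng.random() < lam / total_rate:
            new += 1  # birth of a new trait
        elif rng.random() < old / n:
            old -= 1  # death of one of the inherited traits
        else:
            new -= 1  # death of a trait born on this branch
```

For example, `evolve_branch(20, 500.0, 0.02, 0.001, random.Random(1))` follows a language that starts with 20 traits for 500 time units. With constant rates, the expected number of traits settles around λ/μ.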
![Page 17: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/17.jpg)
Description of the model (2)
Catastrophes occur at rate ρ
At a catastrophe, each trait dies with probability κ and Poisson(ν) traits are born.
λ/μ = ν/κ: the number of traits is constant on average.
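The balance condition can be checked with a short derivation. It assumes that, between catastrophes, the number of traits N in a language is at the stationary distribution of the birth-death process, which is Poisson with mean λ/μ (a standard result for constant birth rate λ and per-trait death rate μ). The symbols N, N' and N'' are notation introduced here, not taken from the slides.

```latex
\begin{align*}
N &\sim \mathrm{Poisson}\!\left(\frac{\lambda}{\mu}\right)
  && \text{just before the catastrophe} \\
N' &\sim \mathrm{Poisson}\!\left((1-\kappa)\,\frac{\lambda}{\mu}\right)
  && \text{after each trait dies independently with probability } \kappa \\
N'' = N' + \mathrm{Poisson}(\nu) &\sim \mathrm{Poisson}\!\left((1-\kappa)\,\frac{\lambda}{\mu} + \nu\right)
  && \text{after the new traits are born} \\
(1-\kappa)\,\frac{\lambda}{\mu} + \nu = \frac{\lambda}{\mu}
  &\iff \nu = \kappa\,\frac{\lambda}{\mu}
  \iff \frac{\lambda}{\mu} = \frac{\nu}{\kappa}
\end{align*}
```

Under that assumption, a catastrophe leaves the distribution of the number of traits unchanged exactly when λ/μ = ν/κ.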
![Page 18: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/18.jpg)
Description of the model (3)
Observation model: each data point (0s and 1s) is missing with probability ξ
Some traits are not observed and are therefore deleted from the data
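A rough sketch of this observation step, under two assumptions that are not spelled out in the slides: missing entries are written "?", and a trait counts as "not observed" when no language shows a 1 for it after masking. The function name `observe` is hypothetical.

```python
import random


def observe(matrix, xi, rng):
    """Mask each entry with probability xi, then drop traits never observed as 1.

    `matrix` is a list of rows (one per language) of 0/1 values.
    """
    masked = [["?" if rng.random() < xi else value for value in row] for row in matrix]
    n_traits = len(matrix[0]) if matrix else 0
    # Keep only the columns in which at least one language displays the trait.
    keep = [j for j in range(n_traits) if any(row[j] == 1 for row in masked)]
    return [[row[j] for j in keep] for row in masked]
```

With `xi = 0` nothing is masked and only all-zero columns disappear, so `observe([[1, 0, 0], [0, 0, 1]], 0.0, random.Random(0))` returns `[[1, 0], [0, 1]]`. Because dropped traits never reach the data, the likelihood has to account for this selection.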
![Page 19: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/19.jpg)
Registration process
![Page 23: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/23.jpg)
Posterior distribution
![Page 24: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/24.jpg)
Likelihood calculations
![Page 25: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/25.jpg)
Prior distribution on trees
Our main focus is on the root age
We would like the marginal prior on the root age to be (approximately) uniform over (say) 5000-15000 BP
![Page 26: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/26.jpg)
MCMC moves
Random walk on the parameters
Various moves on the tree (Drummond et al., 2002)
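To illustrate the random-walk move on a scalar parameter, here is a minimal Metropolis sketch on the log of a death rate μ. The target is a toy stand-in, not the posterior used in the talk: it assumes that k out of n traits survive a time span t, each with probability exp(-μt), with a flat prior on log μ. All names and numbers are hypothetical, and the tree moves are not shown.

```python
import math
import random


def log_target(theta, k, n, t):
    """Toy log-posterior for theta = log(mu), flat prior on theta."""
    p = math.exp(-math.exp(theta) * t)  # survival probability of one trait
    if p <= 0.0 or p >= 1.0:
        return -math.inf
    return k * math.log(p) + (n - k) * math.log1p(-p)


def random_walk_metropolis(k, n, t, n_iter, step, rng, theta0=math.log(1e-4)):
    """Gaussian random walk on theta; returns the sampled values of mu."""
    theta = theta0
    logp = log_target(theta, k, n, t)
    samples = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        logp_proposal = log_target(proposal, k, n, t)
        # Symmetric proposal: accept with probability min(1, target ratio).
        if rng.random() < math.exp(min(0.0, logp_proposal - logp)):
            theta, logp = proposal, logp_proposal
        samples.append(math.exp(theta))
    return samples
```

For instance, `random_walk_metropolis(k=60, n=100, t=1000.0, n_iter=6000, step=0.3, rng=random.Random(0))` produces a chain for μ. Walking on log μ keeps the rate positive without needing a boundary check.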
![Page 46: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/46.jpg)
Checking mixing and convergence
Auto-correlations
Need statistics on the tree
Length of the tree
Root age
Presence/Absence of a few subtrees
![Page 47: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/47.jpg)
Synthetic data
True tree, ~40 words/language
Consensus tree
![Page 48: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/48.jpg)
Synthetic data (2)
Death rate (μ)
![Page 49: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/49.jpg)
Influence of borrowing
True tree, ~40 words/language
Borrowing: 10%
Consensus tree
![Page 50: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/50.jpg)
Influence of borrowing (2)
Consensus tree
True tree, ~40 words/language
Borrowing: 50%
![Page 51: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/51.jpg)
Influence of borrowing (3)
Topology is reconstructed correctly
Dates are underestimated for high levels of borrowing
Root age
Death rate (μ)
Borrowing: 50%
![Page 52: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/52.jpg)
Detecting borrowing
Confirmed: hardly any borrowing!
![Page 53: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/53.jpg)
Data used
Indo-European languages
Core vocabulary (Swadesh 100 or 200)
Two independent data sets
Dyen et al. (1997): 87 languages, mostly modern
Ringe et al. (2002): 24 languages, mostly ancient
![Page 54: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/54.jpg)
Constraints
![Page 55: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/55.jpg)
Cross-validation
![Page 68: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/68.jpg)
Root age
![Page 69: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/69.jpg)
Conclusions
Strong support for the Anatolian hypothesis: root age around 8000 BP. No support for the Kurgan hypothesis.
Applicable to a variety of linguistic and cultural data sets
TraitLab: it's free!
![Page 70: Phylogenetic models and MCMC methods for the reconstruction of language history](https://reader034.vdocuments.site/reader034/viewer/2022052700/55a04f3e1a28abad578b45cd/html5/thumbnails/70.jpg)
Questions
otázky
spørgsmåler
vragen
questions
Fragen
domande
pytania
questões
întrebări
вопросы
vprašanja
preguntes
preguntas
frågor
vrae
spurningar
quaestiones
ερωτήσεις
въпроси
kesses
spørsmåler
kláusimai
запитанні
سوال
प्रश्न
cwestiwnau