The paternal tree of humanity Doron M Behar, MD, PhD Family Tree DNA, Houston, Texas The 12th Genetic Genealogy Conference for Family Tree DNA Group Administrators November 11-13, 2016


We are establishing it!

Together!!!

One for all and all for One


It is commonly thought that human genetic diversity in non-African populations was shaped primarily by an out-of-Africa dispersal 50-100 thousand yr ago (kya). Here, we present a study of 456 geographically diverse high-coverage Y chromosome sequences, including 299 newly reported samples. Applying ancient DNA calibration, we date the Y-chromosomal most recent common ancestor (MRCA) in Africa at 254 (95% CI 192-307) kya and detect a cluster of major non-African founder haplogroups in a narrow time interval at 47-52 kya, consistent with a rapid initial colonization model of Eurasia and Oceania after the out-of-Africa bottleneck. In contrast to demographic reconstructions based on mtDNA, we infer a second strong bottleneck in Y-chromosome lineages dating to the last 10 ky. We hypothesize that this bottleneck is caused by cultural changes affecting variance of reproductive success among males.

The Horowitzes
Migration path ~1400 CE
Genealogy from 1450 CE
YESHAYA HOROVSKY ISH HOROWICE

5 individuals

Horowitz genealogy
Documented
Molecular

Written vs molecular genealogy

Written      Molecular
1450 = 566 ybp   546 ybp
1615 = 401 ybp   399 ybp
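The written "ybp" (years before present) figures follow from simple subtraction. The sketch below is only an illustration. It assumes the reference year is 2016, the year of the conference and the value that reproduces 566 and 401. It also assumes the two molecular estimates pair with the two written dates in the order shown. The function name is made up for this example.

```python
REFERENCE_YEAR = 2016  # assumption: inferred from "1450 = 566 ybp"


def years_before_present(calendar_year, reference_year=REFERENCE_YEAR):
    """Convert a calendar year (CE) into years before present (ybp)."""
    return reference_year - calendar_year


# Written genealogy dates converted to ybp.
written = {year: years_before_present(year) for year in (1450, 1615)}

# Molecular estimates quoted on the slide; pairing with the years is assumed.
molecular = {1450: 546, 1615: 399}

for year in written:
    gap = written[year] - molecular[year]
    print(f"{year}: written {written[year]} ybp, "
          f"molecular {molecular[year]} ybp, difference {gap} years")
```

Under these assumptions the written and molecular figures differ by 20 and 2 years respectively.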

This is what you need, Right?!

Good, 'cause we are building it!

Whole Y Chromosome
60M bps long
Karmin et al: We exclude all of Chr Y outside 10.8-Mb sequence
>5x unique coverage
FTDNA: Around 11.5 to 12.5 million base-pairs of reliably mapped positions of non-recombining Y chromosome

Haplogroup Q

Koryaks

A fraction of the tree

162 variants = ~35,000 ybp

Q3

Count of mutations
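To make the arithmetic behind "162 variants = ~35,000 ybp" explicit, here is a rough sketch. It assumes a strictly linear molecular clock and derives the years-per-variant figure from that single data point only. Real dating relies on calibrated mutation rates, the size of the callable region, and confidence intervals, none of which are modelled here. Both function names are illustrative.

```python
def implied_years_per_variant(variant_count, age_ybp):
    """Average number of years per accumulated variant on a branch."""
    return age_ybp / variant_count


def estimate_age_ybp(variant_count, years_per_variant):
    """Linear-clock age estimate: variants multiplied by years per variant."""
    return variant_count * years_per_variant


# Figures from the slide: 162 variants on the Q3 branch, ~35,000 ybp.
rate = implied_years_per_variant(162, 35_000)
print(f"Implied rate: about {rate:.0f} years per variant")

# Applying the same rate back to the same count recovers the slide's age.
print(f"Age for 162 variants: about {estimate_age_ybp(162, rate):.0f} ybp")
```

The implied figure is roughly 216 years per variant within the sequenced region. It should not be reused for a capture of a different size.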

Actual positions

Position   New_HG   Marker    ISOGG name
2692142    Q3       M378_eq   F711 L612
2713850    Q3       M378_eq   F713
2794289    Q3       M378_eq   #N/A
2806676    Q3       M378_eq   #N/A
4113324    Q3       M378_eq   #N/A
6679787    Q3       M378_eq   F803
6718686    Q3       M378_eq   F815
6746675    Q3       M378_eq   #N/A
6753100    Q3       M378_eq   Y1038
6986250    Q3       M378_eq   #N/A

Whole Y Chromosome
60M bps long
Karmin et al: We exclude all of Chr Y outside 10.8-Mb sequence
>5x unique coverage
FTDNA: Around 11.5 to 12.5 million base-pairs of reliably mapped positions of non-recombining Y chromosome

In which Capture

Message No 1:

Know the capture you are choosing!

Intra-platform performance

Inter-platform performance

Genotyping platforms:
Complete Genomics
Illumina
5 samples were run in both platforms
The overlapping region is 6M bp
Are we identifying the same variants?
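One way to ask "are we identifying the same variants?" is to compare the calls two platforms make for the same sample inside the overlapping region. The sketch below is a generic set comparison, not the pipeline used for the talk. The function name, the dictionary layout, and the toy positions are all made up for illustration.

```python
def concordance(calls_a, calls_b):
    """Compare two {position: allele} dictionaries for the same sample.

    Both inputs are assumed to be restricted to the overlapping region.
    """
    shared = calls_a.keys() & calls_b.keys()
    agreeing = sum(1 for pos in shared if calls_a[pos] == calls_b[pos])
    return {
        "shared": len(shared),
        "agreeing": agreeing,
        "discordant": len(shared) - agreeing,
        "only_a": len(calls_a.keys() - calls_b.keys()),
        "only_b": len(calls_b.keys() - calls_a.keys()),
        # Fraction of shared positions with identical alleles.
        "rate": agreeing / len(shared) if shared else None,
    }


# Toy data only: these positions and alleles are invented.
platform_a = {100: "A", 200: "T", 300: "G", 400: "C"}
platform_b = {100: "A", 200: "T", 300: "C", 500: "G"}

result = concordance(platform_a, platform_b)
print(result)
```

Positions called by only one platform are reported separately. A missing call is not the same thing as a disagreement.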

Inter-platform performance

Message No 2:

Different platforms are overall concordant!

Nomenclature

What is a reference genome?

The reference genome does not represent the ancestral genome!

The reference genome represents a haploid mosaic of different DNA sequences from different donors. For example, GRCh37, the Genome Reference Consortium human genome (build 37), is derived from thirteen anonymous volunteers from Buffalo, New York.

Accordingly, the Y chromosome sequence is an assembly of a few haplogroups.

Root vs Reference

Genome builds

Release name      Date of release   Equivalent UCSC version
GRCh38            Dec 2013          hg38
GRCh37            Feb 2009          hg19
NCBI Build 36.1   Mar 2006          hg18
NCBI Build 35     May 2004          hg17
NCBI Build 34     Jul 2003          hg16

The same variant can be in different positions in different genome builds.
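Because a coordinate only means something together with its build, it can help to carry the build name alongside every position. The sketch below is a minimal illustration using the release names from the genome builds slide. The `Locus` class and both helper functions are hypothetical, not part of any FTDNA tool.

```python
from dataclasses import dataclass

# Release name -> equivalent UCSC version, as listed on the slide.
UCSC_EQUIVALENT = {
    "GRCh38": "hg38",
    "GRCh37": "hg19",
    "NCBI Build 36.1": "hg18",
    "NCBI Build 35": "hg17",
    "NCBI Build 34": "hg16",
}
_RELEASE_BY_UCSC = {ucsc: release for release, ucsc in UCSC_EQUIVALENT.items()}


def canonical_build(name):
    """Accept either naming scheme and return the release name."""
    if name in UCSC_EQUIVALENT:
        return name
    if name in _RELEASE_BY_UCSC:
        return _RELEASE_BY_UCSC[name]
    raise ValueError(f"unknown genome build: {name!r}")


@dataclass(frozen=True)
class Locus:
    build: str
    position: int


def same_position(a, b):
    """Compare two loci, refusing to compare across different builds."""
    if canonical_build(a.build) != canonical_build(b.build):
        raise ValueError("positions from different builds are not comparable")
    return a.position == b.position


print(canonical_build("hg19"))
print(same_position(Locus("GRCh38", 13357844), Locus("hg38", 13357844)))
```

Comparing `Locus("GRCh38", 13357844)` with `Locus("GRCh37", 13357844)` raises an error rather than returning `True`. The same number in two builds does not identify the same site.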

Same variant, different name

Message No 3:

Speak the language!


Whole Y sequencing

Whole Y Sequencing

GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATGTCTGCACAGCCGCTTTCCACACAGACATCATAACAAAAAATTTCCACCAGAGCCGGAGCACCTTATGTCGCAGTATCTGTCTTTGATTCCTGCCTCATCGAGCCGGAGCACCTTATGTCGCAGTATCTGTCTTTGATTCCTGCCTCATCGTCTGCACAGCCGCTTTCCACACAGACATCATAACAAAAAATTTCCACCATAACTAAGCTATACTAACCCCAGGGTTGGTCAATTTCGTGCCAGCCACCGCACCCCCACGGGAAACAGCAGTGATTAACCTTTAGCAATAAACGAAAGTTGACAAGCATCAAGCACGCAGCAATGCAGCTCAAAACGCTTAGCCTAGCCAAGCATCCCCGTTCCAGTGAGTTCACCCTCTAAATCACCACGATCAAAAGGTAGGTTTGGTCCTAGCCTTTCTATTAGCTCTTAGTAAGATTACACATGCAAGCAATACACTGAAAATGTTTAGACGGGCTCACATCACCCCATAAACAAAAACCAAACCCCAAAGACACCCCCCACAGTTTATGTAGCTTACCTCCTCAACCCATCCTACCCAGCACACACACACCGCTGCTAACCCCATACCCCGAACCTATTTTCCCCTCCCACTCCCATACTACTAATCTCATCAATACAACCCTCGTTATCTTTTGGCGGTATGCACTTTTAACAGTCACCCCCCAACTAACACATAAACCCCAAAAACAAAGAACCCTAACACCAGCCTAACCAGATTTCAAATTAACCCCCCCTCCCCCCGCTTCTGGCCACAGCACTTAAACACATCTCTGCCCGGTCACACGATTAACCCAAGTCAATAGAAGCCGGCGTAAAGAGTGTTTTAGATCACCCCCTCCCCAATAAAGCTAAAACTCACCTGAGTTGTAAAAAAC

Whole Y Sequencing

Now, we have to align
Y chromosome reference

Alignment
Y chromosome reference

Coverage=3
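"Coverage=3" means that three aligned reads overlap a given reference position. The sketch below counts depth per position. It assumes the reads are already aligned, have no insertions or deletions, and are given as 0-based start offsets. The function name and the three short reads are illustrative; the reads are cut from the first ten bases of the sequence shown earlier.

```python
def coverage_per_position(reference_length, reads):
    """Count how many reads overlap each reference position.

    reads: iterable of (start, sequence) pairs, with start 0-based.
    """
    depth = [0] * reference_length
    for start, sequence in reads:
        end = min(start + len(sequence), reference_length)
        for pos in range(max(start, 0), end):
            depth[pos] += 1
    return depth


reference = "GATCACAGGT"
reads = [(0, "GATCAC"), (2, "TCACAG"), (4, "ACAGGT")]

depth = coverage_per_position(len(reference), reads)
print(depth)       # [1, 1, 2, 2, 3, 3, 2, 2, 1, 1]
print(max(depth))  # 3
```

Only positions 4 and 5 reach a coverage of 3 here. The ends of the region are covered by a single read each.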

REF                 G
ALT                 A
Genotype            HOM
depth               60
qual_base_calling   214 max
qual_mapping        60 max
qual_genotype       99 max

Quality control per variant
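A per-variant QC step can be pictured as a set of threshold checks on the fields shown for each call (genotype, depth, and the three quality scores). The sketch below is only that, a picture. Every threshold is a hypothetical placeholder, not a value used by FTDNA. The requirement that the genotype be "HOM" is an assumption based on the Y chromosome being haploid.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    ref: str
    alt: str
    genotype: str
    depth: int
    qual_base_calling: float
    qual_mapping: float
    qual_genotype: float


# Hypothetical thresholds, chosen only for illustration.
MIN_DEPTH = 10
MIN_QUAL_BASE_CALLING = 30
MIN_QUAL_MAPPING = 30
MIN_QUAL_GENOTYPE = 30


def qc_failures(call):
    """Return a list of reasons the call fails QC (empty if it passes)."""
    failures = []
    if call.genotype != "HOM":
        failures.append("genotype is not homozygous")
    if call.depth < MIN_DEPTH:
        failures.append("depth too low")
    if call.qual_base_calling < MIN_QUAL_BASE_CALLING:
        failures.append("base calling quality too low")
    if call.qual_mapping < MIN_QUAL_MAPPING:
        failures.append("mapping quality too low")
    if call.qual_genotype < MIN_QUAL_GENOTYPE:
        failures.append("genotype quality too low")
    return failures


# The example call from the slide: G>A, HOM, depth 60, qualities 214/60/99.
slide_call = VariantCall("G", "A", "HOM", 60, 214, 60, 99)
print(qc_failures(slide_call))  # []
```

Returning the list of reasons, rather than a bare pass or fail, makes it easier to see why a position inside the capture was dropped.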

Intra-platform performance

Message No 4:

Not all positions within the capture will pass QC!


Pipeline for Whole Y analysis

VCF (Variant Call Format)

Variant   GRCh38 position   Reference   Derived
P305      2842113           G           A
L1085     2922685           T           C
Z4762     3953196           A           C
V171      5030624           C           G
M523      6885478           A           G
M522      7305102           G           A
M578      7334662           C           T
L666      7702775           G           A
V221      7721262           G           T
M2308     7822141           A           T
F1154     8513272           T           C
F1206     8572376           C           T
F1329     8720990           C           T
P143      12077161          G           A
M168      12702062          C           T
P97       12774339          G           T
P108      13314368          C           T
M231      13357844          G           A
M214      13360045          T           C
M213      13414871          T           C
L1130     14549130          T           G
P14       15286718          C           T
V168      15835792          G           A
L729      17319728          A           C
F549      17401190          C           T
F3163     19069977          G           A
M9        19568371          C           G
M42       19704954          A           T
M89       19755427          C           T
L1155     20029380          G           C
L735      20977731          G           T
M526      21389038          A           C
F650      21455120          G           A

VCF (Variant Call Format)

Variant   GRCh38 position   Reference   Derived
P305      2842113           G           A
L1085     2922685           T           C
Z4762     3953196           A           C
V171      5030624           C           G
M523      6885478           A           G
M522      7305102
M578      7334662           C           T
L666      7702775           G           A
V221      7721262           G           T
M2308     7822141           A           T
F1154     8513272           T           C
F1206     8572376           C           T
F1329     8720990           C           T
P143      12077161          G           A
M168      12702062          C           T
P97       12774339          G           T
P108      13314368          C           T
M231      13357844          G           A
M214      13360045          T           C
M213      13414871          T           C
L1130     14549130          T           G
P14       15286718          C           T
V168      15835792          G           A
L729      17319728          A           C
F549      17401190          C           T
F3163     19069977          G           A
M9        19568371          C           G
M42       19704954          A           T
M89       19755427          C           T
L1155     20029380          G           C
L735      20977731          G           T
M526      21389038          A           C
F650      21455120          G           A

The position failed, nothing to worry about

The position did not fail and shows the reference, which means it is a private back mutation!
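The two cases above, a failed position versus a position that was read and shows the reference allele, can be told apart mechanically for any marker that lies on the tester's expected path. The sketch below is an illustration. It assumes a missing call is represented as `None`. The function name and the returned labels are made up.

```python
def interpret_expected_marker(observed, reference, derived):
    """Interpret one marker that the tester is expected to carry as derived.

    observed is the allele read for the tester, or None if the position
    did not pass QC (no call).
    """
    if observed is None:
        return "failed position"
    if observed == derived:
        return "derived as expected"
    if observed == reference:
        return "possible back mutation"
    return "unexpected allele"


# M522 at 7305102 is listed with reference G and derived A.
print(interpret_expected_marker(None, "G", "A"))  # failed position
print(interpret_expected_marker("A", "G", "A"))   # derived as expected
print(interpret_expected_marker("G", "G", "A"))   # possible back mutation
```

The third case is labelled "possible" because a single reference read at an expected-derived position would normally be checked against the per-variant quality values first.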

Haplogroup labeling

A0'1'2'3'4-L1085*(xV148,V168)
A1'2'3'4-V168*(xM31,P108)
A2'3'4-P108*(xL419,M42)
A4-M42*(xM60,M168)
CDEF-M168*(xM145,P143)
CF-P143*(xM130,M89)
F-M89*(xF1329)
GHIJKLT-F1329*(xM201,M578)
HIJKLT-M578*(xL901,M522)
IJKLT-M522*(xM429,M9)
KLT-M9*(xM526,P326)
K-M526*(xP331,F549)
X-F549*(xM214)
NO-M214*(xM231,M175)
N-M231*(xY6503,L735)
N-L735*(xF2930,L729)
N-L729*(xL666,M46)
N-L666*(xF1154,P43)
N-F1154
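Each label above has a regular shape: a haplogroup name, a hyphen, the defining marker, an optional `*`, and an optional `(x...)` list. The sketch below splits a label into those parts. It reads `*` as "paragroup" and `(xA,B)` as "excluding markers A and B", which is the usual convention but is not spelled out on the slide. The class and function names are illustrative.

```python
import re
from typing import NamedTuple, Tuple


class HaplogroupLabel(NamedTuple):
    haplogroup: str
    marker: str
    is_paragroup: bool
    excluded: Tuple[str, ...]


_LABEL = re.compile(
    r"(?P<haplogroup>[^-]+)-(?P<marker>\w+)"
    r"(?P<star>\*)?(?:\(x(?P<excluded>[^)]*)\))?"
)


def parse_label(label):
    """Split a label such as "N-L666*(xF1154,P43)" into its parts."""
    match = _LABEL.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"unrecognised haplogroup label: {label!r}")
    raw = match["excluded"]
    excluded = tuple(m.strip() for m in raw.split(",")) if raw else ()
    return HaplogroupLabel(
        match["haplogroup"], match["marker"], match["star"] is not None, excluded
    )


print(parse_label("N-L666*(xF1154,P43)"))
print(parse_label("N-F1154"))
```

The terminal label `N-F1154` has no star and no exclusion list, so it parses with `is_paragroup=False` and an empty `excluded` tuple.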

Private mutations

A0'1'2'3'4-L1085*(xV148,V168)
A1'2'3'4-V168*(xM31,P108)
A2'3'4-P108*(xL419,M42)
A4-M42*(xM60,M168)
CDEF-M168*(xM145,P143)
CF-P143*(xM130,M89)
F-M89*(xF1329)
GHIJKLT-F1329*(xM201,M578)
HIJKLT-M578*(xL901,M522)
IJKLT-M522*(xM429,M9)
KLT-M9*(xM526,P326)
K-M526*(xP331,F549)
X-F549*(xM214)
NO-M214*(xM231,M175)
N-M231*(xY6503,L735)
N-L735*(xF2930,L729)
N-L729*(xL666,M46)
N-L666*(xF1154,P43)
N-F1154

Private mutations:
g.2654329C>T
g.4448652G>A
g.7598733A>G

Establishing a new branch

John Smith
N-M231, N-L735, N-F1206, N-F1154, N-F3163
g.2654329C>T, g.4448652G>A, g.7598733A>G

Mike Smith
N-M231, N-L735, N-F1206, N-F1154, N-F3163
g.3447764C>T, g.6853865A>G

John Smith
N-M231, N-L735, N-F1206, N-F1154, N-F3163
g.2654329C>T, g.4448652G>A, g.7598733A>G

Mike Smith
g.3447764C>T, g.6853865A>G

~1400 ybp

Establishing a new branch

John Smith
N-M231, N-L735, N-F1206, N-F1154, N-F3163
g.2654329C>T, g.4448652G>A, g.7598733A>G

Mike Smith
N-M231, N-L735, N-F1206, N-F1154, N-F3163
g.3447764C>T, g.4448652G>A, g.6853865A>G

John Smith
N-M231, N-L735, N-F1206, N-F1154, N-F3163
g.2654329C>T, g.7598733A>G

Mike Smith
g.3447764C>T, g.6853865A>G

g.4448652G>A

~1000 ybp
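The John Smith and Mike Smith example boils down to a set intersection. A private mutation found in both testers stops being private and defines a new branch below N-F3163, while the rest stay private to each man. The sketch below reproduces that split with the mutation names from the slides. The function name is illustrative and the age estimates are not computed.

```python
def split_private_mutations(tester_a, tester_b):
    """Separate shared mutations (a new branch) from truly private ones."""
    shared = tester_a & tester_b
    return shared, tester_a - shared, tester_b - shared


# Case 1: no overlap, so both testers stay directly under N-F3163.
john = {"g.2654329C>T", "g.4448652G>A", "g.7598733A>G"}
mike_no_overlap = {"g.3447764C>T", "g.6853865A>G"}
print(split_private_mutations(john, mike_no_overlap)[0])  # set()

# Case 2: Mike also carries g.4448652G>A, which becomes a new branch.
mike = {"g.3447764C>T", "g.4448652G>A", "g.6853865A>G"}
new_branch, john_private, mike_private = split_private_mutations(john, mike)
print(sorted(new_branch))    # ['g.4448652G>A']
print(sorted(john_private))  # ['g.2654329C>T', 'g.7598733A>G']
print(sorted(mike_private))  # ['g.3447764C>T', 'g.6853865A>G']
```

With more than two testers, the same idea extends to grouping testers by each shared mutation. That extension is not shown here.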

Message No 5:

Help is on the way! These features will be released during 2017!

Acknowledgements

Tartu, Estonian Biocentre: Lauri Saag, Monika Karmin, Hovhannes Sahakyan, Ene Metspalu, Mait Metspalu, Siiri Rootsi, Richard Villems
Genealogical peers
All Big Y friends
Family Tree DNA: Connie Bormans, Luisa Fernanda Sanchez, Brent Manning, Elliott Greenspan, Bennett Greenspan