Using genetics to study human history and natural selection

David Reich, Harvard Medical School Department of Genetics, Broad Institute

Post on 28-Jan-2016

TRANSCRIPT

Page 1: Using genetics to study human history and natural selection

Using genetics to study human history and natural selection

David Reich

Harvard Medical School, Department of Genetics

Broad Institute

Page 2: Using genetics to study human history and natural selection

[Figure: a long stretch of raw genomic DNA sequence (a, c, g, t)]

Page 3: Using genetics to study human history and natural selection

A 2-part talk:

Section 1: How human history affects human genetic variation

Section 2: Detecting selection by the pattern of genetic variation and finding disease genes

Page 4: Using genetics to study human history and natural selection

How does human history affect genetic variation?

A genome-wide survey of Linkage Disequilibrium

Section 1

Linkage disequilibrium is a phenomenon whereby genetic variants are associated: people who have one tend to have a second as well
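The definition above can be made concrete with a minimal sketch that computes the standard two-site measures D, D' and r^2 from a list of observed haplotypes. The function name `ld_stats`, the allele labels, and the ten-chromosome sample are illustrative assumptions, not data or code from the talk.

```python
from collections import Counter

def ld_stats(haplotypes, a="A", b="B"):
    """Return (D, D', r^2) for two biallelic sites.

    haplotypes: list of 2-character strings, one per chromosome,
    giving the allele at site 1 followed by the allele at site 2.
    a, b: the alleles at site 1 and site 2 whose association is measured.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_ab = counts[a + b] / n  # frequency of chromosomes carrying both alleles
    p_a = sum(c for h, c in counts.items() if h[0] == a) / n
    p_b = sum(c for h, c in counts.items() if h[1] == b) / n

    d = p_ab - p_a * p_b  # excess over the frequency expected under independence
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0  # D scaled to the range [-1, 1]
    var = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / var if var > 0 else 0.0
    return d, d_prime, r2

# Made-up sample of 10 chromosomes (hypothetical, for illustration only).
sample = ["AB"] * 4 + ["ab"] * 4 + ["Ab", "aB"]
d, d_prime, r2 = ld_stats(sample)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

The plots later in the talk report |D'|, the absolute value of the second quantity returned here.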

Page 5: Using genetics to study human history and natural selection

Linkage Disequilibrium Explained

Variations in Chromosomes Within a Population

Common Ancestor

Emergence of Variations Over Time

time present

Disease Mutation

Section 1

Page 6: Using genetics to study human history and natural selection

What Determines Extent of LD?

[Diagram labels: Disease-Causing Mutation; 2,000 gens. ago; 1,000 gens. ago; Time = present]

Section 1
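One way to see why the age of a mutation matters is the textbook decay of disequilibrium under recombination, D_t = D_0 (1 - c)^t, where c is the per-generation recombination fraction between the two sites. The sketch below applies it to the two ages on this slide. The function name and the uniform rate of 1e-8 recombinations per base pair per generation are assumptions made for illustration, not values from the talk.

```python
def ld_fraction_remaining(generations, distance_bp, recomb_per_bp=1e-8):
    """Fraction of the initial disequilibrium D expected to remain.

    Uses D_t = D_0 * (1 - c)**t with c = distance_bp * recomb_per_bp,
    capped at 0.5 (the value for unlinked sites). The default rate is
    an assumed genome-wide average, not a figure from the talk.
    """
    c = min(0.5, distance_bp * recomb_per_bp)
    return (1.0 - c) ** generations

for gens in (1000, 2000):
    for kb in (5, 20, 80, 160):
        frac = ld_fraction_remaining(gens, kb * 1000)
        print(f"{gens} generations, {kb} kb: {frac:.2f} of the original D remains")
```

Under these assumptions the older mutation retains association over a shorter distance, because it has had twice as many generations in which recombination could break it down.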

Page 7: Using genetics to study human history and natural selection

How Far Does Association (LD) Extend Between Neighboring Common Sites?

[Figure: distance scale from 0 kb to 160 kb (marks at 5 kb, 10 kb, 20 kb, 40 kb, 80 kb), with the range of uncertainty indicated]

Section 1

• Theoretical: 3-8 kb

Page 8: Using genetics to study human history and natural selection

Strategy for Assessing Extent of LD

• 19 regions

• 44 Caucasian samples from Utah

• a great deal of DNA sequencing per sample

[Diagram: distance from core single nucleotide polymorphism (SNP), on a scale from 0 kb to 160 kb with marks at 5, 10, 20, 40 and 80 kb]

Section 1
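The survey design on this slide amounts to measuring LD between a core SNP and SNPs at fixed distances, then pooling across regions. A minimal sketch of the pooling step follows. The function name and the input values are hypothetical, and averaging |D'| per distance is an assumption about how the curves were summarized.

```python
from collections import defaultdict

def mean_abs_dprime_by_distance(measurements):
    """Average |D'| at each distance.

    measurements: iterable of (distance_kb, d_prime) pairs, one per
    SNP pair, pooled across all surveyed regions.
    Returns a dict mapping distance_kb to mean |D'|, sorted by distance.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance_kb, d_prime in measurements:
        sums[distance_kb] += abs(d_prime)
        counts[distance_kb] += 1
    return {d: sums[d] / counts[d] for d in sorted(sums)}

# Made-up D' values for illustration only.
pairs = [(5, 1.0), (5, -0.5), (80, 0.2)]
print(mean_abs_dprime_by_distance(pairs))
```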

Page 9: Using genetics to study human history and natural selection

[Figure: linkage disequilibrium |D'| (y-axis, 0 to 1) versus distance between SNPs (x-axis: 1 kb, 5 kb, 10 kb, 20 kb, 40 kb, 80 kb, 160 kb, unlinked); series: Data, Previous Theoretical Prediction]

Section 1

Page 10: Using genetics to study human history and natural selection

A Genome-Wide Assessment of Linkage Disequilibrium

Disease Gene Mapping

Human history

Section 1

Page 11: Using genetics to study human history and natural selection

MYSTERY: What explains the long-range LD?

Section 1

Important event in population history?

Page 12: Using genetics to study human history and natural selection

Positive Control: 48 Swedes

Identical pattern to Utah

[Figure: linkage disequilibrium D' (y-axis, 0 to 1) versus distance between SNPs (x-axis: 5 kb, 10 kb, 20 kb, 40 kb, 80 kb, 160 kb); series: Utah LD Curve, Sweden LD, Sweden LD With Sign of D' set by Utah]

Section 1

Page 13: Using genetics to study human history and natural selection

96 Nigerians (Yoruba)

Much Less LD

Associations in Africans a SUBSET of those in Caucasians

MUST be influenced by population history

[Figure: linkage disequilibrium D' (y-axis, -0.2 to 1) versus distance between SNPs (x-axis: 5 kb, 10 kb, 20 kb, 40 kb, 80 kb, 160 kb); series: Utah LD Curve, Nigeria LD, Nigeria LD with sign of D' set by Utah]

Section 1

Page 14: Using genetics to study human history and natural selection

Confirmation of less LD in Africans from Direct DNA Sequencing

[Figure: mean |D'| (y-axis, 0 to 1) versus distance (x-axis: 500 bp, 5 kb, 10 kb, 20 kb, 40 kb, 80 kb, 160 kb); series: Nigerian, Utah; numeric labels on the plot, as extracted: 101, 313, 67, 56, 83, 9816, 174, 86 6, 48, 4, 6320]

Anna Di Rienzo also shows this pattern

Section 1

Page 15: Using genetics to study human history and natural selection

More evidence from genotyping ~5,000 SNPs (Gabriel et al. 2002)

[Figure: mean |D'| (y-axis, 0 to 1) versus distance in bp (x-axis, 0 to 150,000); series: Caucasian, African-American, Asian, Yoruban]

K. Kidd, J. Kidd, Sarah Tishkoff also show this

Section 1

Page 16: Using genetics to study human history and natural selection

Explanation: Bottleneck or ‘Founder Effect’ in History of North Europeans

What was this event?

(1) Out of Africa?

(2) Founding of Europe?

• likely <10 founding chromosomes ~100,000 years ago

[Diagram labels: Ancestral Population; North Europeans; Yoruba Ancestors]

Section 1

Page 17: Using genetics to study human history and natural selection

Open Mysteries

Section 1

• what caused the bottleneck event? The “Out of Africa” migration?

• how many people were involved? When did it occur?

• can we better understand when the founder event occurred, and how many people were involved?

Page 18: Using genetics to study human history and natural selection

Acknowledgements for Section 1

Collaborators: Michele Cargill, Stacey Bolk, James Ireland, Pardis C. Sabeti, Daniel J. Richter, Thomas Lavery, Rose Kouyoumjian, Shelli F. Farhadian, Ryk Ward, Eric S. Lander

Samples: Leif Groop, Richard Cooper, Charles Rotimi

Page 19: Using genetics to study human history and natural selection

Using Long-Range Linkage Disequilibrium to Detect Positive Selection in the Genome

Section 2

Page 20: Using genetics to study human history and natural selection

Overview

1. The difficulty of detecting genomic regions affected by natural selection

2. The long-range haplotype test

3. Results for two genes: G6PD and CD40 ligand

Section 2

Page 21: Using genetics to study human history and natural selection

Existing formal tests for selection

DNA sequence analysis: weak

• Tajima’s D

• HKA test

• McDonald and Kreitman

• Fu and Li’s D

• Ka/Ks ratio

Genotyping-based tests: not general at present

Section 2

Page 22: Using genetics to study human history and natural selection

Our test is based on the relationship between allele frequency and extent of linkage disequilibrium

No selection

Young alleles: • low frequency • long-range LD

Old alleles: • low or high frequency • short-range LD

Positive Selection

Young alleles: • high frequency • long-range LD

Section 2

Page 23: Using genetics to study human history and natural selection

The signal of selection

[Figure: linkage disequilibrium (homozygosity) on the y-axis versus frequency on the x-axis; labeled: Neutrality, Positive Selection]

Section 2

Page 24: Using genetics to study human history and natural selection

Paradigm of the Core Region

[Diagram: a gene with Core Haplotypes numbered 1 to 5]

Section 2

Page 25: Using genetics to study human history and natural selection

Long-range multi-SNP haplotypes

[Diagram: core haplotypes 1 to 5 at the gene; six SNPs (C/T, A/G, A/G, C/T, C/T, C/T) labeled as core markers and long-range markers; decay of LD with distance from the gene]

Section 2

Page 26: Using genetics to study human history and natural selection

Long-range multi-SNP haplotypes

Decay of homozygosity (probability, at any distance, that any two haplotypes that start out the same have all the same SNP genotypes)

[Diagram: core haplotype 3 at the gene, with core markers and long-range markers (C/T, A/G, A/G, C/T, C/T, C/T); homozygosity values shown: 100%, 75%, 35%, 18%]

Section 2
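The homozygosity defined on this slide can be sketched as the chance that two randomly chosen chromosomes carrying the same core haplotype are still identical at every marker out to a given distance. The code below is a minimal illustration under stated assumptions: haplotypes are written as strings with the core markers first and the long-range markers following in order of increasing distance on one side only, and the function name and example strings are hypothetical.

```python
from collections import Counter
from math import comb

def ehh(haplotypes, core, n_markers):
    """Extended haplotype homozygosity for one core haplotype.

    haplotypes: list of strings, core markers first, then long-range
    markers ordered by increasing distance from the core.
    core: the core haplotype to examine.
    n_markers: how many long-range markers to extend over.
    Returns the probability that two distinct carriers of `core`,
    chosen at random, match at all markers up to that point.
    """
    core_len = len(core)
    carriers = [h for h in haplotypes if h[:core_len] == core]
    n = len(carriers)
    if n < 2:
        raise ValueError("need at least two chromosomes carrying the core haplotype")
    extended = Counter(h[:core_len + n_markers] for h in carriers)
    # Count identical pairs within each extended haplotype class.
    return sum(comb(c, 2) for c in extended.values()) / comb(n, 2)

# Made-up haplotypes: 2 core markers followed by 2 long-range markers.
haps = ["CTAG", "CTAG", "CTAC", "CTGG", "GTAG"]
for k in range(3):
    print(k, round(ehh(haps, "CT", k), 3))
```

By construction the value starts at 1 at the core and can only fall as more markers are added, which is the decay pictured on the slide.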

Page 27: Using genetics to study human history and natural selection

Two genes associated with malaria resistance

G6PD (1960’s)

• well established association to malaria resistance

• selection demonstrated in 2001 by Tishkoff et al.

CD40 ligand (2002):

• Recent association by Sabeti et al.

• involved in immune regulation

Section 2

Page 28: Using genetics to study human history and natural selection

Experimental Design

G6PD (11 SNPs in core, 14 at long distances)

[Diagram: markers from -480 kb to +220 kb around G6PD, telomere marked]

CD40 ligand (7 SNPs in core, 14 at long distances)

[Diagram: markers from -180 kb to +520 kb around TNFSF5, telomere marked]

Section 2

Page 29: Using genetics to study human history and natural selection

Experimental Design

DNA samples from 231 African men

• Yoruba (Nigeria)

• Beni (Nigeria)

• Shona (Zimbabwe)

Perfect phase (X chromosome)

Section 2

Page 30: Using genetics to study human history and natural selection

Core haplotypes

G6PD: core haplotypes 1 to 9; Africans (230), non-Africans (95); “A-” protective haplotype marked

CD40 ligand: core haplotypes 1 to 6; Africans (231), non-Africans (91)

[Table residue, per-haplotype counts as extracted: "38 72 428281441 5", "46113 17", "591 97830 1", "77 21 7 7"]

Section 2

Page 31: Using genetics to study human history and natural selection

G6PD: long-range haplotype diversity

G6PD-corehap1 G6PD-corehap6

G6PD-corehap3 G6PD-corehap7

G6PD-corehap4 G6PD-corehap8

G6PD-corehap5 G6PD-corehap

G6PD-corehap8: “A-” protective haplotype

Section 2

Page 32: Using genetics to study human history and natural selection

G6PD: homozygosity vs. distance

[Figure: EHH (y-axis) versus distance from the core region in kb (x-axis)]

Section 2

Page 33: Using genetics to study human history and natural selection

G6PD: computer simulation vs. data

[Figure: relative EHH (y-axis) versus core haplotype frequency (x-axis), simulation and data]

Core haplotype 8: P << 0.0008

Section 2
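A comparison of data against computer simulation of this kind is commonly summarized as an empirical P-value: the share of simulated haplotypes that look at least as extreme as the observed one. The sketch below assumes that simulated haplotypes are compared only with others of similar frequency and that "extreme" means a relative EHH at least as large. The function name, the frequency window, and the numbers are hypothetical, and the talk does not state that its P-values were computed this way.

```python
def empirical_p_value(obs_freq, obs_rel_ehh, simulated, freq_window=0.05):
    """Fraction of frequency-matched simulated haplotypes whose
    relative EHH is at least as large as the observed value.

    simulated: list of (frequency, relative_ehh) pairs from neutral
    simulations. freq_window is an assumed matching tolerance.
    """
    matched = [r for f, r in simulated if abs(f - obs_freq) <= freq_window]
    if not matched:
        raise ValueError("no simulated haplotypes in the frequency window")
    exceed = sum(1 for r in matched if r >= obs_rel_ehh)
    return exceed / len(matched)

# Made-up simulation output for illustration only.
sims = [(0.18, 1.0), (0.20, 2.0), (0.22, 5.0), (0.50, 9.0)]
print(empirical_p_value(0.20, 4.0, sims))
```

With no simulated value exceeding the observation, this estimate is zero, and the result can only be bounded above by roughly one over the number of matched simulations.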

Page 34: Using genetics to study human history and natural selection

G6PD: P-values from simulation

[Figure: P-value (y-axis) versus distance from the core region in kb (x-axis)]

Section 2

Page 35: Using genetics to study human history and natural selection

G6PD also stands out in comparison to 7 control regions

[Figure: relative EHH (y-axis) versus core haplotype frequency (x-axis)]

Section 2

Page 36: Using genetics to study human history and natural selection

CD40 ligand: long-range haplotype diversity

corehap1 corehap4

corehap2 corehap5

corehap3

corehap4

Section 2

Page 37: Using genetics to study human history and natural selection

CD40 ligand: homozygosity vs. distance

[Figure: EHH (y-axis) versus distance from the core region in kb (x-axis)]

Section 2

Page 38: Using genetics to study human history and natural selection

CD40 ligand: computer simulation vs. data

[Figure: relative EHH (y-axis) versus core haplotype frequency (x-axis), simulation and data]

Core haplotype 4: P << 0.0011

Section 2

Page 39: Using genetics to study human history and natural selection

CD40 ligand: P-values from simulation

[Figure: P-value (y-axis) versus distance from the core region in kb (x-axis)]

Section 2

Page 40: Using genetics to study human history and natural selection

CD40 ligand also stands out in comparison to 7 control regions

[Figure: relative EHH (y-axis) versus core haplotype frequency (x-axis)]

Section 2

Page 41: Using genetics to study human history and natural selection

Malaria resistance arose in the last 10,000 years in Africa

~2,500 years ago for G6PD

~6,500 years ago for CD40 ligand

Long-range linkage disequilibrium also gives a direct estimate of the date

Section 2
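The slide does not show how the dates were derived. One simple model, assumed here for illustration, is that a chromosome escapes recombination within genetic distance r of the core for g generations with probability about e^(-rg), so two chromosomes both remain intact with probability about e^(-2rg). Setting that equal to the observed homozygosity gives g = -ln(EHH) / (2r). The function names, the 25-year generation time, and the input values below are all hypothetical.

```python
import math

def generations_since_origin(ehh, genetic_distance_morgans):
    """Solve EHH = exp(-2 * r * g) for g (assumed model, see text)."""
    if not 0.0 < ehh <= 1.0:
        raise ValueError("EHH must be in (0, 1]")
    return -math.log(ehh) / (2.0 * genetic_distance_morgans)

def years_since_origin(ehh, genetic_distance_morgans, years_per_generation=25):
    """Convert generations to years with an assumed generation time."""
    return generations_since_origin(ehh, genetic_distance_morgans) * years_per_generation

# Made-up inputs: homozygosity of 0.5 at a genetic distance of 0.4 cM.
print(round(years_since_origin(0.5, 0.004)))
```

The estimate is sensitive to the recombination rate assumed for the region, so any date obtained this way carries that uncertainty.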

Page 42: Using genetics to study human history and natural selection

Traditional tests fail to detect the effect

• Tajima’s D

• HKA test

• McDonald and Kreitman

• Fu and Li’s D

• Ka/Ks ratio

Not significant in our data.

The long-range haplotype test is a powerful way to detect selection in the last 10,000 years

Section 2

Page 43: Using genetics to study human history and natural selection

Conclusions: Powerful general approach for detecting selection

[Diagram with elements numbered 1 to 4]

Section 2

Page 44: Using genetics to study human history and natural selection

Conclusions: Powerful general approach for detecting selection

[Diagram with elements numbered 1 to 5]

Section 2

Page 45: Using genetics to study human history and natural selection

Conclusions: Powerful general approach for detecting selection

Screen the genome for Positive Selection

[Diagram with elements numbered 1 to 4]

Section 2

Page 46: Using genetics to study human history and natural selection

Conclusions: Genome-wide screen for natural selection

We can find disease genes without patients!

Section 2

Page 47: Using genetics to study human history and natural selection

What’s coming…

Section 2

1. Generalization of the long-range haplotype test

2. Application of the approach genome-wide

• Haplotype map data set

• Disease gene screen data sets

Page 48: Using genetics to study human history and natural selection

Acknowledgements for Section 2

Pardis C. Sabeti, John Higgins, Haninah Z.P. Levine, Daniel J. Richter, Stephen F. Schaffner, Stacey Gabriel, Jill V. Platko, Nicholas J. Patterson, Gavin J. McDonald, Hans C. Ackerman, Sarah J. Campbell, David Altshuler, Richard Cooper, Ryk Ward, Eric S. Lander

Page 49: Using genetics to study human history and natural selection

Note

The 3rd section of the talk is not included here because it presents data that have not yet been published.