lecture 13: population structure october 8, 2012
![Page 1: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/1.jpg)
Lecture 13: Population Structure
October 8, 2012
![Page 2: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/2.jpg)
Last Time
Effective population size calculations
Historical importance of drift: shifting balance or noise?
Population structure
![Page 3: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/3.jpg)
Today
Course feedback
The F-Statistics
Sample calculations of FST
Defining populations on genetic criteria
![Page 4: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/4.jpg)
Midterm Course Evaluations
Based on five responses: It’s not too late to have an impact!
Lectures are generally OK
Labs are valuable, but better organization and more feedback are needed
Difficulty level is OK
Book is awful
![Page 5: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/5.jpg)
F-Coefficients
Quantification of the structure of genetic variation in populations: population structure
Partition variation to the Total Population (T), Subpopulations (S), and Individuals (I)
![Page 6: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/6.jpg)
F-Coefficients
Combine different sources of reduction in expected heterozygosity into one equation:
$$1 - F_{IT} = (1 - F_{IS})(1 - F_{ST})$$
FIT: Overall deviation from H-W expectations
FIS: Deviation due to inbreeding within populations
FST: Deviation due to subpopulation differentiation
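A quick numerical illustration of how the two sources of heterozygote deficit combine through 1 - FIT = (1 - FIS)(1 - FST). This is a minimal sketch: the function name `f_it` and the input values are hypothetical, chosen only for illustration.

```python
def f_it(f_is: float, f_st: float) -> float:
    """Overall deviation F_IT implied by 1 - F_IT = (1 - F_IS)(1 - F_ST)."""
    return 1.0 - (1.0 - f_is) * (1.0 - f_st)


# Hypothetical values: modest inbreeding within subpopulations,
# moderate differentiation among subpopulations.
overall = f_it(f_is=0.10, f_st=0.20)
print(round(overall, 2))  # 0.28
```

Note that the result (0.28) is smaller than the simple sum FIS + FST = 0.30, because the two effects combine multiplicatively rather than additively.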
![Page 7: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/7.jpg)
F-Coefficients and IBD
View F-statistics as probability of Identity by Descent for different samples
$$1 - F_{IT} = (1 - F_{IS})(1 - F_{ST})$$
FIT: Overall probability of IBD
FST: Probability of IBD for 2 individuals in a subpopulation
FIS: Probability of IBD within an individual
![Page 8: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/8.jpg)
F-Statistics Can Measure Departures from Expected Heterozygosity Due to Wahlund Effect
$$F_{IS} = \frac{H_S - H_I}{H_S}$$
$$F_{IT} = \frac{H_T - H_I}{H_T}$$
$$F_{ST} = \frac{H_T - H_S}{H_T}$$
where
HT is the average expected heterozygosity in the total population
HI is observed heterozygosity within a subpopulation
HS is the average expected heterozygosity in subpopulations
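The heterozygosity-based definitions (FIS = (HS - HI)/HS, FIT = (HT - HI)/HT, FST = (HT - HS)/HT) are consistent with the multiplicative relationship among the F-coefficients. A short derivation, using only those three definitions:

```latex
\begin{aligned}
1 - F_{IS} &= \frac{H_I}{H_S}, \qquad 1 - F_{ST} = \frac{H_S}{H_T}, \qquad 1 - F_{IT} = \frac{H_I}{H_T} \\
(1 - F_{IS})(1 - F_{ST}) &= \frac{H_I}{H_S}\cdot\frac{H_S}{H_T} = \frac{H_I}{H_T} = 1 - F_{IT}
\end{aligned}
```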
![Page 9: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/9.jpg)
Calculating FST
Recessive allele for flower color
B2B2 = white; B1B1 and B1B2 = dark pink
Subpopulation 1: White: 10, Dark: 10
F(white) = 10/20 = 0.5
F(B2)1 = q1 = √0.5 = 0.707
p1 = 1 - 0.707 = 0.293
Subpopulation 2: White: 2, Dark: 18
F(white) = 2/20 = 0.1
F(B2)2 = q2 = √0.1 = 0.32
p2 = 1 - 0.32 = 0.68
![Page 10: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/10.jpg)
Calculating FST
Calculate Average HE of Subpopulations (HS)
For 2 subpopulations:
HS = Σ2piqi/2 = (2(0.707)(0.293) + 2(0.32)(0.68))/2
HS = 0.425
Calculate Average HE for Merged Subpopulations (HT):
F(white) = 12/40 = 0.3
q = √0.3 = 0.55; p = 0.45
HT = 2pq = 2(0.55)(0.45)
HT = 0.495
![Page 11: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/11.jpg)
Bottom Line:
FST = (HT - HS)/HT = (0.495 - 0.425)/0.495 = 0.14
14% of the total variation in flower color alleles is due to variation among populations
AND
Expected heterozygosity rises from HS = 0.425 to HT = 0.495 when subpopulations are merged, a difference equal to 14% of HT (Wahlund Effect)
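The flower color calculation can be reproduced in a few lines of Python. This is a sketch under the same assumptions as the slides (Hardy-Weinberg proportions within each subpopulation, so q is the square root of the white-phenotype frequency). The function names are hypothetical.

```python
from math import sqrt


def recessive_allele_freq(n_recessive: int, n_total: int) -> float:
    """Estimate q as sqrt(recessive phenotype frequency), assuming H-W proportions."""
    return sqrt(n_recessive / n_total)


def expected_het(q: float) -> float:
    """Expected heterozygosity 2pq for a two-allele locus."""
    return 2.0 * (1.0 - q) * q


# (white, total) phenotype counts for the two subpopulations in the example
counts = [(10, 20), (2, 20)]

# HS: average expected heterozygosity across subpopulations
q_sub = [recessive_allele_freq(white, total) for white, total in counts]
h_s = sum(expected_het(q) for q in q_sub) / len(q_sub)

# HT: pool the phenotype counts first, then estimate q and 2pq
white_all = sum(white for white, _ in counts)
total_all = sum(total for _, total in counts)
h_t = expected_het(recessive_allele_freq(white_all, total_all))

f_st = (h_t - h_s) / h_t
print(f"HS = {h_s:.3f}, HT = {h_t:.3f}, FST = {f_st:.3f}")
```

Without rounding the intermediate allele frequencies, FST comes out near 0.146 rather than 0.14; the small difference from the hand calculation is due only to rounding q and p to two or three decimals.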
![Page 12: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/12.jpg)
Nei's Gene Diversity: GST
Nei's generalization of FST to multiple, multiallelic loci
$$G_{ST} = \frac{D_{ST}}{H_T}$$
$$D_{ST} = H_T - H_S$$
$$H_S = \frac{1}{m}\sum_{i=1}^{m}\left(1 - \sum_{j=1}^{n} p_j^2\right)$$
Where HS is mean HE of m subpopulations, calculated for n alleles with frequency of pj
For HT, pj is the mean frequency of allele j over all subpopulations
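A minimal single-locus sketch of GST in Python. The slide does not show the HT formula, so this assumes the usual form, HT = 1 - Σ(mean pj)², with unweighted mean allele frequencies across subpopulations. The function names and the example frequencies are hypothetical.

```python
def gene_diversity(freqs):
    """Expected heterozygosity 1 - sum(p_j^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def g_st(subpop_freqs):
    """subpop_freqs: list of m lists, each holding n allele frequencies at one locus."""
    m = len(subpop_freqs)
    n = len(subpop_freqs[0])
    # HS: mean gene diversity within subpopulations
    h_s = sum(gene_diversity(f) for f in subpop_freqs) / m
    # Assumption: HT is computed from the unweighted mean frequency of each allele
    mean_freqs = [sum(f[j] for f in subpop_freqs) / m for j in range(n)]
    h_t = gene_diversity(mean_freqs)
    d_st = h_t - h_s
    return d_st / h_t


# Hypothetical three-allele locus sampled in two subpopulations
example = [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]
print(round(g_st(example), 3))
```

For several loci, one common convention is to average HS and HT across loci before taking the ratio, rather than averaging per-locus GST values.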
![Page 13: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/13.jpg)
Unbiased Estimate of FST
Weir and Cockerham's (1984) Theta
Compensates for sampling error, which can cause large biases in FST or GST (e.g., if sample represents different proportions of populations)
Calculated in terms of correlation coefficients
Calculated by FSTAT software:
http://www2.unil.ch/popgen/softwares/fstat.htm
Goudet, J. (1995). "FSTAT (Version 1.2): A computer program to calculate F-statistics." Journal of Heredity 86(6): 485-486.
Often simply referred to as FST in the literature
Weir, B.S. and C.C. Cockerham. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370.
![Page 14: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/14.jpg)
Linanthus parryae population structure
Annual plant in Mojave desert is classic example of migration vs drift
Allele for blue flower color is recessive
Use F-statistics to partition variation among regions, subpopulations, and individuals
FST can be calculated for any hierarchy:
FRT: Variation due to differentiation of regions
FSR: Variation due to differentiation among subpopulations within regions
Schemske and Bierzychudek 2007 Evolution
![Page 15: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/15.jpg)
Linanthus parryae population structure
![Page 16: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/16.jpg)
Hartl and Clark 2007
$$F_{SR} = \frac{H_R - H_S}{H_R}$$
$$F_{RT} = \frac{H_T - H_R}{H_T}$$
$$F_{ST} = \frac{H_T - H_S}{H_T}$$
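A small sketch of the three-level hierarchy (subpopulations within regions within the total), computing FSR, FRT, and FST from expected heterozygosities at each level. The function name and the heterozygosity values are hypothetical, not taken from the Linanthus data.

```python
def hierarchical_f(h_s: float, h_r: float, h_t: float) -> dict:
    """F-statistics for subpopulations (S) within regions (R) within the total (T)."""
    return {
        "F_SR": (h_r - h_s) / h_r,  # subpopulations within regions
        "F_RT": (h_t - h_r) / h_t,  # regions within the total
        "F_ST": (h_t - h_s) / h_t,  # subpopulations within the total
    }


# Hypothetical average expected heterozygosities at each level
f = hierarchical_f(h_s=0.30, h_r=0.40, h_t=0.50)
for name, value in f.items():
    print(f"{name} = {value:.2f}")

# The levels combine multiplicatively: (1 - F_SR)(1 - F_RT) = 1 - F_ST
combined = (1.0 - f["F_SR"]) * (1.0 - f["F_RT"])
print(abs(combined - (1.0 - f["F_ST"])) < 1e-12)
```

With these inputs, FSR = 0.25, FRT = 0.20, and FST = 0.40, and the final check confirms that the hierarchical levels compose the same way FIS and FST do.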
![Page 17: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/17.jpg)
FST as Variance Partitioning
Think of FST as proportion of genetic variation partitioned among populations
$$F_{ST} = \frac{V(q)}{\bar{p}\,\bar{q}}$$
where
V(q) is variance of q across subpopulations
Denominator is maximum amount of variance that could occur among subpopulations
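For a two-allele locus with equally weighted subpopulations, the variance form and the heterozygosity form of FST give the same number. The sketch below checks this numerically. It assumes V(q) is the population variance (dividing by the number of subpopulations) and that the denominator uses mean allele frequencies; the function names and the q values are hypothetical.

```python
from statistics import fmean, pvariance


def f_st_from_variance(q_values):
    """F_ST = V(q) / (mean p * mean q), using the population variance of q."""
    q_bar = fmean(q_values)
    p_bar = 1.0 - q_bar
    return pvariance(q_values) / (p_bar * q_bar)


def f_st_from_heterozygosity(q_values):
    """F_ST = (HT - HS) / HT, with HT computed from the mean allele frequency."""
    h_s = fmean([2.0 * q * (1.0 - q) for q in q_values])
    q_bar = fmean(q_values)
    h_t = 2.0 * q_bar * (1.0 - q_bar)
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in two subpopulations
qs = [0.2, 0.8]
print(round(f_st_from_variance(qs), 2))        # 0.36
print(round(f_st_from_heterozygosity(qs), 2))  # 0.36
```

The agreement follows from HT - HS = 2V(q) when HT is based on the mean allele frequency.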
![Page 18: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/18.jpg)
Analysis of Molecular Variance (AMOVA)
Analogous to Analysis of Variance (ANOVA)
Use pairwise genetic distances as ‘response’
Test significance using permutations
Partition genetic diversity into different hierarchical levels, including regions, subpopulations, individuals
Many types of marker data can be used
Method of choice for dominant markers, sequence data, and SNPs
![Page 19: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/19.jpg)
Phi Statistics from AMOVA
http://www.bioss.ac.uk/smart/unix/mamova/slides/frames.htm
$$\Phi_{CT} = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_b^2 + \sigma_c^2}$$
Correlation of random pairs of haplotypes drawn from a region relative to pairs drawn from the whole population (FRT)
$$\Phi_{SC} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_c^2}$$
Correlation of random pairs of haplotypes drawn from an individual subpopulation relative to pairs drawn from a region (FSR)
$$\Phi_{ST} = \frac{\sigma_a^2 + \sigma_b^2}{\sigma_a^2 + \sigma_b^2 + \sigma_c^2}$$
Correlation of random pairs of haplotypes drawn from an individual subpopulation relative to pairs drawn from the whole population (FST)
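Once AMOVA has produced the variance components, the Phi statistics are simple ratios. The sketch below assumes the usual labeling (a = among regions, b = among subpopulations within regions, c = within subpopulations). The function name and the component values are hypothetical.

```python
def phi_statistics(var_a: float, var_b: float, var_c: float) -> dict:
    """Phi statistics from AMOVA variance components.

    Assumed labeling: var_a = among regions, var_b = among subpopulations
    within regions, var_c = within subpopulations.
    """
    total = var_a + var_b + var_c
    return {
        "phi_CT": var_a / total,
        "phi_SC": var_b / (var_b + var_c),
        "phi_ST": (var_a + var_b) / total,
    }


# Hypothetical variance components
phi = phi_statistics(var_a=0.5, var_b=1.0, var_c=8.5)
for name, value in phi.items():
    print(f"{name} = {value:.3f}")
```

As with the hierarchical F-statistics, these values satisfy (1 - phi_ST) = (1 - phi_SC)(1 - phi_CT).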
![Page 20: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/20.jpg)
What if you don’t know how your samples are organized into populations (i.e., you don’t know how many source populations you have)?
What if reference samples aren’t from a single population? What if they are offspring from parents coming from different source populations (admixture)?
![Page 21: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/21.jpg)
What’s a population anyway?
![Page 22: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/22.jpg)
Defining populations on genetic criteria
Assume subpopulations are at Hardy-Weinberg Equilibrium and linkage equilibrium
Probabilistically ‘assign’ individuals to populations to minimize departures from equilibrium
Can allow for admixture (individuals with different proportions of each population) and geographic information
Bayesian approach using Markov chain Monte Carlo (MCMC) method to explore parameter space
Implemented in STRUCTURE program:
http://pritch.bsd.uchicago.edu/structure.html
Londo and Schaal 2007 Mol Ecol 16:4523
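The core idea of genetic assignment can be illustrated with a toy calculation: given known allele frequencies in each candidate population, the probability of a multilocus genotype follows from Hardy-Weinberg proportions within loci and multiplication across loci (linkage equilibrium). This is only a sketch of that likelihood, not the STRUCTURE algorithm, which estimates the allele frequencies and memberships jointly by MCMC. All names, populations, and frequencies here are hypothetical, and the sketch assumes equal prior weight per population and nonzero allele frequencies.

```python
from math import exp, log


def genotype_log_likelihood(genotype, allele_freqs):
    """ln P(genotype | population) under HWE and linkage equilibrium.

    genotype: list of (allele, allele) tuples, one per locus.
    allele_freqs: list of dicts (allele -> frequency), one per locus.
    """
    total = 0.0
    for (a1, a2), freqs in zip(genotype, allele_freqs):
        p1, p2 = freqs[a1], freqs[a2]
        # HWE: homozygote p^2, heterozygote 2pq
        prob = p1 * p2 if a1 == a2 else 2.0 * p1 * p2
        total += log(prob)  # independent loci: probabilities multiply
    return total


def assignment_probabilities(genotype, populations):
    """Posterior membership probabilities, assuming equal prior per population."""
    log_liks = {name: genotype_log_likelihood(genotype, freqs)
                for name, freqs in populations.items()}
    top = max(log_liks.values())  # subtract the max for numerical stability
    weights = {name: exp(ll - top) for name, ll in log_liks.items()}
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}


# Hypothetical allele frequencies at two loci in two populations
populations = {
    "pop1": [{"A": 0.9, "a": 0.1}, {"B": 0.8, "b": 0.2}],
    "pop2": [{"A": 0.2, "a": 0.8}, {"B": 0.3, "b": 0.7}],
}
individual = [("A", "A"), ("B", "b")]
probs = assignment_probabilities(individual, populations)
print({name: round(p, 3) for name, p in probs.items()})
```

In this example the individual is assigned mostly to pop1, because its genotype is far more probable under that population's allele frequencies.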
![Page 23: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/23.jpg)
Example: Taita Thrush data*
Three main sampling locations in Kenya
Low migration rates (radio-tagging study)
155 individuals, genotyped at 7 microsatellite loci
Slide courtesy of Jonathan Pritchard
![Page 24: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/24.jpg)
![Page 25: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/25.jpg)
![Page 26: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/26.jpg)
![Page 27: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/27.jpg)
![Page 28: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/28.jpg)
![Page 29: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/29.jpg)
Estimating K
Structure is run separately at different values of K. The program computes a statistic that measures the fit of each value of K (sort of a penalized likelihood); this can be used to help select K.
Taita thrush data

| Assumed value of K | Posterior probability of K |
|---|---|
| 1 | ~0 |
| 2 | ~0 |
| 3 | 0.993 |
| 4 | 0.007 |
| 5 | 0.00005 |
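One way to turn per-K fit statistics into posterior probabilities is to treat each as an estimate of ln P(data | K), assume a uniform prior over K, and normalize. The sketch below does this with a numerically stable normalization. The log-likelihood values are hypothetical and are not the actual output for the Taita thrush data.

```python
from math import exp


def posterior_over_k(log_liks: dict) -> dict:
    """Posterior P(K | data) from ln P(data | K), assuming a uniform prior on K."""
    top = max(log_liks.values())  # subtract the max to avoid underflow
    weights = {k: exp(ll - top) for k, ll in log_liks.items()}
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}


# Hypothetical estimates of ln P(data | K) from separate runs at K = 1..5
log_liks = {1: -3050.0, 2: -2950.0, 3: -2900.0, 4: -2905.0, 5: -2910.0}
posterior = posterior_over_k(log_liks)
best_k = max(posterior, key=posterior.get)
print(best_k, {k: round(p, 5) for k, p in posterior.items()})
```

Because the weights are exponentials of log-likelihood differences, a gap of only a few log units between neighboring values of K is enough to concentrate nearly all the posterior probability on one value.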
![Page 30: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/30.jpg)
Another method for inference of K
The ΔK method of Evanno et al. (2005, Mol. Ecol. 14: 2611-2620):
Eckert, Population Structure, 5-Aug-2008
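A sketch of the ΔK calculation as it is commonly stated: the mean absolute second-order change in ln P(data | K) across replicate runs, divided by the standard deviation of ln P(data | K) at that K. Pairing replicate runs by index across values of K is an assumption of this sketch (implementations differ in whether they average before or after taking the absolute value). The function name and run values are hypothetical.

```python
from statistics import fmean, stdev


def delta_k(log_liks_by_k: dict) -> dict:
    """Delta K for each K that has both neighbors K-1 and K+1.

    log_liks_by_k: K -> list of ln P(data | K) from replicate runs
    (same number of runs for every K; runs must not all be identical).
    """
    result = {}
    for k in sorted(log_liks_by_k):
        if k - 1 not in log_liks_by_k or k + 1 not in log_liks_by_k:
            continue  # second difference is undefined at the ends
        runs = zip(log_liks_by_k[k - 1], log_liks_by_k[k], log_liks_by_k[k + 1])
        second_diffs = [abs(nxt - 2.0 * cur + prev) for prev, cur, nxt in runs]
        result[k] = fmean(second_diffs) / stdev(log_liks_by_k[k])
    return result


# Hypothetical ln P(data | K) from three replicate runs at each K
runs = {
    1: [-3050.0, -3051.0, -3049.0],
    2: [-2950.0, -2952.0, -2948.0],
    3: [-2900.0, -2901.0, -2899.0],
    4: [-2905.0, -2910.0, -2900.0],
    5: [-2910.0, -2920.0, -2900.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(best_k, {k: round(v, 2) for k, v in dk.items()})
```

ΔK cannot be evaluated at the smallest or largest K tested, so the range of K should extend beyond the values of interest.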
![Page 31: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/31.jpg)
Inferred population structure
Each individual is a thin vertical line that is partitioned into K colored segments according to its membership coefficients in K clusters.
Figure labels: Africans, Europeans, MidEast, Cent/S Asia, Asia, Oceania, America
Rosenberg et al. 2002 Science 298: 2381-2385
![Page 32: Lecture 13: Population Structure October 8, 2012](https://reader036.vdocuments.site/reader036/viewer/2022062423/5697bf891a28abf838c8a32b/html5/thumbnails/32.jpg)
Inferred population structure – regions
Rosenberg et al. 2002 Science 298: 2381-2385