non-parametric linkage analysis
DESCRIPTION
Non-parametric Linkage Analysis. Topics: IBD vs IBS; Affected Sib Pair (ASP) Method; Affected Pedigree Member (APM) Method; TDT; Homozygosity Mapping; Case Study 1. 2006. 12. 3, Haseong Kim, BIBS, SNU. Reference: 2006 Asian Institute in Statistical Genetics and Genomics.
TRANSCRIPT
Non-parametric Linkage Analysis
– IBD vs IBS
– Affected Sib Pair (ASP) Method
– Affected Pedigree Member (APM) Method
– TDT
– Homozygosity Mapping
– Case Study 1
2006. 12. 3 Haseong Kim, BIBS, SNU
Reference : 2006 Asian Institute in Statistical Genetics and Genomics
Non-parametric Methods: IBD vs IBS
• IBD: Identity By Descent
– You can tell whether or not alleles in two or more individuals have been inherited from a common ancestor
• IBS: Identity By State
– You can only tell whether or not alleles in two or more individuals are the same
IBD vs IBS
[Figure: example pedigrees with marker genotypes, listing for each pair of relatives the number of alleles shared IBD and the number shared IBS]
Sib-Pair Analysis
• Test for excess sharing of alleles (IBD) in affected sib-pairs
• On average, siblings share 50% of their genes in common
• At a given locus, siblings can only share 2, 1, or 0 alleles IBD, with probabilities 2: 1/4, 1: 2/4, 0: 1/4
Parents: Aa x Bb
Offspring genotypes: AB, Ab, aB, ab
Number of alleles shared IBD by each pair of sibs:
      AB  Ab  aB  ab
AB     2   1   1   0
Ab     1   2   0   1
aB     1   0   2   1
ab     0   1   1   2
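The sharing table can be reproduced by enumeration. The following is a minimal Python sketch, assuming (as in the table) that all four parental alleles are distinguishable, so that matching alleles are necessarily IBD. The names `father`, `mother`, and `ibd_shared` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Parental alleles are labelled so that all four are distinguishable,
# which makes IBD sharing equal to the number of matching alleles.
father = ("A", "a")
mother = ("B", "b")

# Each child receives one paternal and one maternal allele.
children = list(product(father, mother))

def ibd_shared(sib1, sib2):
    """Count alleles shared IBD: paternal match plus maternal match."""
    return int(sib1[0] == sib2[0]) + int(sib1[1] == sib2[1])

# Enumerate all 16 equally likely ordered sib pairs.
counts = Counter(ibd_shared(s1, s2) for s1, s2 in product(children, repeat=2))
total = sum(counts.values())
ibd_distribution = {k: Fraction(v, total) for k, v in counts.items()}

for k in (2, 1, 0):
    print(k, ibd_distribution[k])
```

The enumeration gives 1/4, 1/2, 1/4 for sharing 2, 1, 0 alleles, which is the null distribution used in the tests that follow.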
Sib-Pair Analysis: Different Approaches
• Simple Counting
• Regression
• Maximum Likelihood
Sib-Pair
• HLA sharing (Cox & Spielman, 1989)
Alleles shared IBD     2      1      0      Total
Diabetes (observed)    81     46     10     137
Expected value         34.25  68.5   34.25  137
\chi^2 = \sum \frac{(O - E)^2}{E}, with 2 degrees of freedom (df = 2)
H0: No linkage between marker and disease
H0: Sib-pairs share 0, 1, or 2 alleles IBD in proportions of 0.25, 0.5, 0.25 at the marker locus.
H0: (Z0, Z1, Z2) = (0.25, 0.5, 0.25)
HA: Linkage
HA: Affected sib-pairs share 2 alleles IBD more often than expected (> 1/4)
HA: (Z0, Z1, Z2) ≠ (0.25, 0.5, 0.25)
- Reference : Park, Ji Wan -
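As a worked example of the counting test, the sketch below applies the chi-square statistic to the HLA sharing counts quoted above (81, 46, 10 pairs sharing 2, 1, 0 alleles). It uses only the Python standard library. For 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2), so no statistics package is assumed.

```python
import math

# Observed counts of affected sib pairs sharing 2, 1, 0 alleles IBD
# (the diabetes row of the HLA sharing table).
observed = [81, 46, 10]
null_proportions = [0.25, 0.5, 0.25]

n_pairs = sum(observed)
expected = [n_pairs * p for p in null_proportions]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For 2 degrees of freedom the chi-square upper tail is exactly exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print(f"expected = {expected}")
print(f"chi-square = {chi_square:.2f}, p = {p_value:.2e}")
```

The statistic comes out at roughly 88.4 on 2 df, so H0 of no linkage is rejected for these data.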
The insulin gene and susceptibility to IDDM
Nancy J. Cox, Richard S. Spielman
• The association between insulin-dependent diabetes mellitus (IDDM) and an allele of a restriction fragment length polymorphism (RFLP) 5’ to the coding region of the insulin gene has raised the possibility that variation in the vicinity of the insulin gene confers susceptibility to IDDM.
• To test this hypothesis, the distribution of insulin gene sharing in affected sib pairs (ASPs) from the Genetic Analysis Workshop 5 (GAW5) families has been compared with that expected on the basis of random assortment.
• There is no deviation from random expectation in insulin gene sharing among 95 ASPs from families fully informative for the insulin gene.
• This is also true when insulin gene sharing is conditioned on HLA sharing, on the particular HLA DR types in ASPs, or on the parents' insulin allele classes.
• These results thus provide no evidence that variation at or near the insulin gene confers susceptibility to IDDM.
• However, we also used computer simulation to investigate how the insulin gene region could contribute susceptibility to IDDM without yielding evidence for distortion in insulin gene sharing in a sample comparable to that of GAW5.
• We found that various levels of insulin gene involvement in IDDM could generate a population association between the insulin gene RFLP and IDDM comparable to that reported in the literature, without producing significant distortion in insulin gene sharing of ASPs.
Sib-Pair: Quantitative Trait
• Difference in trait value between sibs varies inversely with the proportion of shared susceptibility genes.
• Can be tested using linear regression
• Haseman-Elston approach regresses the squared trait difference on the proportion of marker alleles shared IBD
- Reference : Park, Ji Wan -
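Original H-E:
E[(X_{1j} - X_{2j})^2 \mid M_j] = 2(1 - r)\sigma^2 - 2(\pi_j - \tfrac{1}{2})\sigma_a^2
where X_{1j}, X_{2j}: trait values of the two sibs in pair j; M_j: marker data; \pi_j: proportion of marker alleles shared IBD; e: error; r: Corr(X_{1j}, X_{2j}); \sigma^2: variance of phenotype; \sigma_a^2: additive variance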
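The regression step of the Haseman-Elston approach can be sketched as an ordinary least-squares fit of squared sib-pair trait differences on the proportion of marker alleles shared IBD. The data below are hypothetical and chosen only to illustrate the calculation; the function name `haseman_elston_fit` is not from any library.

```python
def haseman_elston_fit(ibd_proportions, squared_differences):
    """Ordinary least squares of squared trait differences on IBD proportion.

    Returns (intercept, slope). A clearly negative slope is the pattern
    expected under linkage.
    """
    n = len(ibd_proportions)
    mean_x = sum(ibd_proportions) / n
    mean_y = sum(squared_differences) / n
    sxx = sum((x - mean_x) ** 2 for x in ibd_proportions)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(ibd_proportions, squared_differences))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical sib pairs: IBD proportion at the marker and squared trait difference.
pi_hat = [0.0, 0.5, 0.5, 1.0, 0.0, 1.0]
sq_diff = [4.1, 3.2, 2.8, 1.9, 3.9, 2.1]

intercept, slope = haseman_elston_fit(pi_hat, sq_diff)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

In practice the slope would be tested against zero with a one-sided test, since only a negative slope supports linkage; that step is omitted here.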
Sib-Pair Analysis: Maximum Likelihood Method
• Likelihood ratio: the likelihood that sibs share more than half their alleles IBD, divided by the likelihood that sibs share half their alleles IBD
L(IBD > 0.5 | Sib-Pairs)
-------------------------------
L(IBD = 0.5 | Sib-Pairs)
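One way to make this ratio concrete is a log10 likelihood ratio for fully informative affected sib pairs. The sketch below is an illustration under stated assumptions, not the method of the slides: the alternative simply plugs in the observed sharing proportions as estimates (an unconstrained maximum), and the null uses (0.25, 0.5, 0.25). The function name `asp_lod` is hypothetical.

```python
import math

def asp_lod(counts_0_1_2):
    """Log10 likelihood ratio for fully informative affected sib pairs.

    counts_0_1_2: numbers of pairs sharing 0, 1, 2 alleles IBD.
    Alternative: observed sharing proportions (assumption of this sketch).
    Null: (0.25, 0.5, 0.25).
    """
    null = (0.25, 0.5, 0.25)
    total = sum(counts_0_1_2)
    lod = 0.0
    for n, z0 in zip(counts_0_1_2, null):
        if n > 0:  # a zero count contributes nothing to the log-likelihood
            lod += n * math.log10((n / total) / z0)
    return lod

# HLA sharing counts from the counting example: 10, 46, 81 pairs share 0, 1, 2 alleles.
print(round(asp_lod([10, 46, 81]), 2))
```

Counts that match the null proportions exactly give a value of 0, and larger values indicate stronger evidence of excess sharing.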
Sib-Pair Analysis
• Parameters needed for each locus
None!
Sib-Pairs
• Pros
– Easy to collect
– Underlying genetics need not be known
• Cons
– Restricted family structure
– Need to know (or estimate) IBD status
– Can be sensitive to outliers (H-E)
Sib-Pair
• Can be used when inheritance is unknown, but
• Sibs are not always available !
• IBD is not always calculable !
Affected – Pedigree – Member
• Since the likelihood of an affected relative-set cannot be written in terms of the IBD probabilities of an affected sib-pair, the extension of likelihood methods requires the introduction of additional parameters.
– Assume a ‘Mendelian model’ with ‘genetic parameters’ such as p, f0, f1, f2, theta (allele frequency, penetrance parameters, recombination fraction).
• Tests for excess sharing of alleles (IBS)
• Expected level of sharing is dependent on relationship
– Sibs: 0.50
– Uncle/niece: 0.25
– Grandparent/grandchild: 0.25
– First cousins: 0.125
• Parameters needed for each locus
– Pedigree structure
– Marker allele frequencies
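The expected sharing values listed for each relationship can be checked by path counting: each chain of n meioses linking two relatives through a common ancestor contributes (1/2)^n. The sketch below assumes outbred pedigrees (no inbreeding loops); the helper `expected_sharing` and the path lists are illustrative.

```python
from fractions import Fraction

def expected_sharing(path_lengths):
    """Expected proportion of alleles shared IBD by two relatives.

    path_lengths: number of meioses along each independent path that
    connects the two relatives through a common ancestor.
    Each path of length n contributes (1/2) ** n.
    """
    return sum(Fraction(1, 2) ** n for n in path_lengths)

relationships = {
    "Sibs": [2, 2],                 # via the mother and via the father
    "Uncle/niece": [3, 3],          # via each of the niece's shared grandparents
    "Grandparent/grandchild": [2],  # single direct line of descent
    "First cousins": [4, 4],        # via each shared grandparent
}

for name, paths in relationships.items():
    print(f"{name}: {float(expected_sharing(paths))}")
```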
Affected – Pedigree – Member
• Weeks and Lange, 1988
For one relative with alleles (A_i, A_j) and another with alleles (A_k, A_l):
Z = \frac{1}{4} [ \delta(A_i, A_k) f(A_i) + \delta(A_i, A_l) f(A_i) + \delta(A_j, A_k) f(A_j) + \delta(A_j, A_l) f(A_j) ]
where \delta(X, Y) = 1 if X and Y are IBS and 0 if X and Y are not IBS, and f(X) is a weight (e.g. p^{-1} or p^{-1/2}, with p the frequency of allele X), since the sharing of a rare allele is more ‘significant’ than the sharing of a common allele.
The Z values of all affected relative pairs in a pedigree are added to give a total measure of allele-sharing among affected members of the pedigree
T = \frac{\sum_m W_m (Z_m - M_m)}{\left( \sum_m W_m^2 V_m \right)^{1/2}}
where, for pedigree m, r_m is the number of affected relatives, Z_m is the measure of allele-sharing, M_m and V_m are the theoretical mean and variance of Z_m, and W_m = (r_m - 1)^{1/2} / V_m^{1/2}
T is asymptotically standard normal and provides a test for linkage based on excessive IBS allele-sharing - Ref : Pak Sham p112 -
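The pairwise sharing score and the combined statistic T can be sketched in Python as follows. This is an illustrative sketch only. The theoretical means and variances M_m and V_m are taken as given inputs, because deriving them requires the pedigree structure and allele frequencies. The function names and the allele frequencies in the usage lines are hypothetical.

```python
import math

def apm_pair_z(genotype1, genotype2, weight=lambda allele: 1.0):
    """IBS sharing score for two relatives with genotypes (A_i, A_j), (A_k, A_l)."""
    (a_i, a_j), (a_k, a_l) = genotype1, genotype2

    def delta(x, y):
        # 1 if the two alleles are identical by state, else 0.
        return 1.0 if x == y else 0.0

    return 0.25 * (delta(a_i, a_k) * weight(a_i) + delta(a_i, a_l) * weight(a_i)
                   + delta(a_j, a_k) * weight(a_j) + delta(a_j, a_l) * weight(a_j))

def apm_t(z, mean, var, n_affected):
    """Combine per-pedigree scores with W_m = sqrt(r_m - 1) / sqrt(V_m)."""
    weights = [math.sqrt(r - 1) / math.sqrt(v) for r, v in zip(n_affected, var)]
    numerator = sum(w * (zm - mm) for w, zm, mm in zip(weights, z, mean))
    denominator = math.sqrt(sum(w * w * v for w, v in zip(weights, var)))
    return numerator / denominator

# Hypothetical allele frequencies, used for the 1/p weighting.
freqs = {1: 0.1, 2: 0.5, 3: 0.4}
print(apm_pair_z((1, 2), (1, 3)))                           # unweighted score
print(apm_pair_z((1, 2), (1, 3), lambda a: 1 / freqs[a]))   # rare allele 1 upweighted
```

With the 1/p weight, sharing the rare allele 1 counts ten times as much as it does unweighted, which is the intent of the weighting.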
Affected – Pedigree – Member
• Pros
– Can use affected relatives other than sibs
– Can be used when underlying genetics unknown
• Cons
– Loss of power compared to IBD
– Sensitivity to marker allele frequency estimates
Linkage vs. Association Studies
• Linkage Studies
– Look for excess sharing of genomic regions defined by marker loci
– Sharing occurs within families
– LINKAGE pertains to loci. It tells us how close the marker locus is to the disease/trait locus.
• Association Studies
– Look for excess sharing of alleles at a single locus
– Sharing occurs between unrelated individuals
– ASSOCIATION pertains to alleles. It tells us how a PARTICULAR allele at a marker locus is co-inherited with the allele predisposing to high risk of the disease.
• ASSOCIATION exists within much smaller distances on the genome compared to LINKAGE.
• ASSOCIATION is a more powerful tool for mapping genes, but will give significant results ONLY IF the marker locus is VERY CLOSE to the disease locus.
Example of the TDT
[Pedigree: both parents have genotype 12; the affected offspring has genotype 11]
• Trios (parents and an offspring) such that at least one parent is heterozygous and the offspring is affected.
• Genotypes of both parents and the affected offspring.
• H0: no linkage or no association vs H1: linkage and association
                             Not transmitted
                             Allele 1    Not Allele 1
Transmitted   Allele 1          a             b
              Not Allele 1      c             d
• a and d refer to transmissions from homozygous parents. These contain information on association but not on linkage.
• b and c refer to transmissions from heterozygous parents. These contain information on both linkage and association.
• TDT would be able to detect linkage only in the presence of allelic association.
• TDT protects against population stratification because, in a case-control framework, the transmitted allele acts as a case and the non-transmitted allele acts as a control. - Ref : Saurabh Ghosh -
Test statistic = (b - c)^2 / (b + c) ~ \chi^2(1)
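The TDT statistic depends only on the two off-diagonal counts b and c. A minimal Python sketch, with hypothetical counts in the usage line, is given below. The p-value uses the identity that the upper tail of a chi-square with 1 df equals erfc(sqrt(x/2)), so only the standard library is needed.

```python
import math

def tdt(b, c):
    """TDT statistic and p-value from heterozygous-parent transmissions.

    b: allele 1 transmitted, other allele not transmitted
    c: other allele transmitted, allele 1 not transmitted
    """
    if b + c == 0:
        raise ValueError("no transmissions from heterozygous parents")
    statistic = (b - c) ** 2 / (b + c)
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: allele 1 transmitted 30 times and not transmitted 10 times.
statistic, p_value = tdt(30, 10)
print(f"TDT = {statistic:.2f}, p = {p_value:.4f}")
```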
Problem
• Gene 1: black (B) > brown (b); Gene 2: full color (F) > chinchilla (f)
Parents: ???? x ????
Offspring:
– 31 brown, chinchilla
– 35 black, full
– 16 brown, full
– 19 black, chinchilla
Recombination frequency between Gene 1 and Gene 2: ??%
Problem
• Gene 1: black (B) > brown (b); Gene 2: full color (F) > chinchilla (f)
Parents: ?b?f x ?b?f
Offspring and possible genotypes:
– 31 brown, chinchilla: bbff
– 35 black, full: BBFF, BBFf, BbFF, BbFf
– 16 brown, full: bbFF, bbFf
– 19 black, chinchilla: BBff, Bbff
black / brown = 54/47 ≈ 1.15; full / chinchilla = 51/50 = 1.02
Possible parental crosses:
M \ F   BbFf        Bbff        bbFf        bbff
BbFf    BbFf-BbFf   BbFf-Bbff   BbFf-bbFf   BbFf-bbff
Bbff                Bbff-Bbff   bbFf-Bbff   bbff-Bbff
bbFf                            bbFf-bbFf   bbFf-bbff
bbff                                        bbff-bbff
Candidate crosses:
a. BbFf x bbff
b. bbFf x Bbff
Recombination frequency between Gene 1 and Gene 2: (16 + 19) / 101 = 0.3465 ≈ 34.7%
a. BbFf x bbff
Parental phenotypes:
– 31 brown, chinchilla: bbff
– 35 black, full: BbFf
Recombinant phenotypes:
– 16 brown, full: bbFf
– 19 black, chinchilla: Bbff
Parental fraction: (31 + 35) / 101 = 0.6534; recombinant fraction: 1 - 0.6534 = 0.3466 ≈ 34.7% between Gene 1 and Gene 2
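The arithmetic for cross a can be written out as a short Python sketch. It assumes, as in cross a, that brown/chinchilla and black/full are the parental classes and the other two classes are recombinants.

```python
# Offspring counts from the problem, keyed by (coat color, color intensity).
offspring_counts = {
    ("brown", "chinchilla"): 31,
    ("black", "full"): 35,
    ("brown", "full"): 16,
    ("black", "chinchilla"): 19,
}

# Under cross a (BbFf x bbff) these two classes carry the parental combinations.
parental_classes = {("brown", "chinchilla"), ("black", "full")}

total = sum(offspring_counts.values())
recombinants = sum(n for phenotype, n in offspring_counts.items()
                   if phenotype not in parental_classes)
recombination_fraction = recombinants / total

print(f"{recombinants}/{total} = {recombination_fraction:.4f}")
```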
b. bbFf x Bbff
Recombinant phenotypes:
– 31 brown, chinchilla: bbff
– 35 black, full: BbFf
Parental phenotypes:
– 16 brown, full: bbFf
– 19 black, chinchilla: Bbff
Case 1: Study of a Single Family
• Proband: 14/M
• Chief complaints
– Bilateral sensorineural hearing loss (prelingual)
– Impaired vision (since 10 years of age)
– Distal muscle weakness
• Family history: positive and suggestive of X-linked recessive inheritance
• NCV & EMG
– Mixed sensorimotor polyneuropathy; consistent with HMSN type 2
• Type of CMT
• Genetic Workup
• Linkage analysis
• Summary
– The present family represents an X-linked recessive CMT with hearing loss and visual impairment without mutations in the GJB1 gene
– Linkage analysis revealed that the disease is linked to a 10-cM interval flanked by DXS990 and DXS8010 on chr Xq21.33 (LOD score: 3.6)