Post on 03-Jan-2016

Page 1: Power in QTL linkage analysis

Power in QTL linkage analysis

Shaun Purcell & Pak Sham

SGDP, IoP, London, UK

F:\pshaun\power.ppt

Page 2: Power in QTL linkage analysis

Power primer

Statistics (e.g. chi-squared, z-score) are continuous measures of support for a certain hypothesis

Test statistic

YES OR NO decision-making : significance testing

Inevitably leads to two types of mistake :

false positive (YES instead of NO) (Type I)

false negative (NO instead of YES) (Type II)

NO / YES

Page 3: Power in QTL linkage analysis

Hypothesis testing

Null hypothesis : no effect

A ‘significant’ result means that we can reject the null hypothesis

A ‘nonsignificant’ result means that we cannot reject the null hypothesis

Page 4: Power in QTL linkage analysis

Statistical significance

The ‘p-value’

The probability of a false positive error if the null were in fact true

Typically, we are willing to incorrectly reject the null 5% or 1% of the time (Type I error)

Page 5: Power in QTL linkage analysis

Misunderstandings

p - VALUES

that the p value is the probability of the null hypothesis being true

that small p values (highly significant results) mean large and important effects

NULL HYPOTHESIS

that nonrejection of the null implies its truth

Page 6: Power in QTL linkage analysis

Limitations

IF A RESULT IS SIGNIFICANT

leads to the conclusion that the null is false

BUT, this may be trivial

IF A RESULT IS NONSIGNIFICANT

leads only to the conclusion that it cannot be concluded that the null is false

Page 7: Power in QTL linkage analysis

Alternate hypothesis

Neyman & Pearson (1928)

ALTERNATE HYPOTHESIS

specifies a precise, non-null state of affairs with associated risk of error

Page 8: Power in QTL linkage analysis

[Figure: P(T) against T, showing the sampling distribution if H0 were true and the sampling distribution if HA were true, with the critical value marked]

Page 9: Power in QTL linkage analysis

             Rejection of H0             Nonrejection of H0

H0 true      Type I error at rate α      Nonsignificant result

HA true      Significant result          Type II error at rate β
             POWER = (1 - β)

Page 10: Power in QTL linkage analysis

Power

The probability of rejection of a false null hypothesis

depends on

- the significance criterion (α)

- the sample size (N)

- the effect size (NCP)

“The probability of detecting a given effect size in a population from a sample of size N, using significance criterion α”

Page 11: Power in QTL linkage analysis

Impact of alpha

[Figure: P(T) against T, with the critical value marked]

Page 12: Power in QTL linkage analysis

Impact of effect size, N

[Figure: P(T) against T, with the critical value marked]

Page 13: Power in QTL linkage analysis

Applications

POWER SURVEYS / META-ANALYSES

- low power undermines the confidence that can be placed in statistically significant results

INTERPRETING NONSIGNIFICANT RESULTS

- nonsignificant results only meaningful if power is high

EXPERIMENTAL DESIGN

- avoiding false positives vs. dealing with false negatives

MAGNITUDE VS. SIGNIFICANCE

- highly significant ≠ very important

Page 14: Power in QTL linkage analysis

Practical Exercise 1

Calculation of power for simple case-control association study.

DATA : allele frequency of “A” allele for cases and controls

TEST : 2-by-2 contingency table : chi-squared (1 degree of freedom)

Page 15: Power in QTL linkage analysis

Step 1 : determine expected chi-squared

Hypothetical allele frequencies

Cases P(A) = 0.68

Controls P(A) = 0.54

Sample 150 cases, 150 controls

Excel spreadsheet : faculty drive:\pshaun\chisq.xls

Chi-squared statistic = 12.36
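Step 1 can be sketched in a few lines of Python. This is only an illustration, not the contents of chisq.xls: the function name `allelic_chisq` is hypothetical, and it assumes the spreadsheet builds the 2-by-2 table from allele counts (two alleles per individual), an assumption that reproduces the 12.36 quoted above.

```python
def allelic_chisq(p_case, p_control, n_case, n_control):
    """Expected Pearson chi-squared (1 df) for a 2-by-2 table of allele counts.

    Assumption: every individual contributes two alleles to the table.
    """
    a = 2 * n_case * p_case                # "A" alleles in cases
    b = 2 * n_case * (1 - p_case)          # other alleles in cases
    c = 2 * n_control * p_control          # "A" alleles in controls
    d = 2 * n_control * (1 - p_control)    # other alleles in controls
    n = a + b + c + d
    # Standard shortcut formula for a 2-by-2 table
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


if __name__ == "__main__":
    print(round(allelic_chisq(0.68, 0.54, 150, 150), 2))
```

Because the statistic scales linearly with the number of alleles, doubling both groups doubles the expected chi-squared.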

Page 16: Power in QTL linkage analysis

Step 2. Determine the critical value for a given type I error rate, α

- inverse central chi-squared distribution

[Figure: P(T) against T, with the critical value marked]

Page 17: Power in QTL linkage analysis

http://workshop.colorado.edu/~pshaun/gpc/pdf.html

df = 1 , NCP = 0

α        X

0.05     3.84146

0.01     6.63489

0.001    10.82754

Page 18: Power in QTL linkage analysis

Step 3. Determine the power for a given critical value and non-centrality parameter

- non-central chi-squared distribution

[Figure: P(T) against T, with the critical value marked]

Page 19: Power in QTL linkage analysis

Determining power

df = 1 , NCP = 12.36

α        X          Power

0.05     3.84146    0.94

0.01     6.6349     0.83

0.001    10.827     0.59
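Steps 2 and 3 can be reproduced without the online calculator. The sketch below is restricted to 1 degree of freedom, where a non-central chi-squared variable is the square of a standard normal shifted by the square root of the NCP, so only the Python standard library is needed. The function names are illustrative and not part of any tool mentioned in the slides.

```python
from statistics import NormalDist

_Z = NormalDist()  # standard normal


def critical_value_1df(alpha):
    """Critical value of the central chi-squared distribution with 1 df."""
    return _Z.inv_cdf(1 - alpha / 2) ** 2


def power_1df(ncp, alpha):
    """P(non-central chi-squared with 1 df and given NCP > critical value)."""
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    shift = ncp ** 0.5
    # (Z + shift)^2 > z_crit^2  <=>  Z > z_crit - shift  or  Z < -z_crit - shift
    return (1 - _Z.cdf(z_crit - shift)) + _Z.cdf(-z_crit - shift)


if __name__ == "__main__":
    for alpha in (0.05, 0.01, 0.001):
        print(alpha, round(critical_value_1df(alpha), 5),
              round(power_1df(12.36, alpha), 2))
```

For more than 1 df this shortcut does not apply and a general non-central chi-squared routine would be needed.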

Page 20: Power in QTL linkage analysis

Exercises

Using the spreadsheet and the chi-squared calculator, what is power (for the 3 levels of alpha)

1. … if the sample size were 300 for each group?

2. … if allele frequencies were 0.24 and 0.18 for 750 cases and 750 controls?

Page 21: Power in QTL linkage analysis

Answers

1. NCP = 24.72

α        Power

0.05     1.00

0.01     0.99

0.001    0.95

2. NCP = 16.27

α        Power

0.05     0.98

0.01     0.93

0.001    0.77

nb. Stata : di 1-nchi(df,NCP,invchi(df,α))

Page 22: Power in QTL linkage analysis

QTL linkage

POWER

Type I errors

Type II errors

Sample N

Effect Size

Allele frequencies

Genetic values

Variance explained

Page 23: Power in QTL linkage analysis

Power of tests

For chi-squared tests on large samples, power is determined by non-centrality parameter (λ) and degrees of freedom (df)

$$\lambda = E(2\ln L_1 - 2\ln L_0) = E(2\ln L_1) - E(2\ln L_0)$$

where expectations are taken at asymptotic values of maximum likelihood estimates (MLE) under an assumed true model

Page 24: Power in QTL linkage analysis

Linkage test

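H0

$$\Sigma_{N,ij} = \begin{cases} V_A + V_D + V_S + V_N & \text{for } i = j \\ V_A/2 + V_D/4 + V_S & \text{for } i \neq j \end{cases}$$

HA

$$\Sigma_{L,ij} = \begin{cases} V_A + V_D + V_S + V_N & \text{for } i = j \\ \hat{\pi} V_A + z V_D + V_S & \text{for } i \neq j \end{cases}$$

$$-2\ln L = \ln|\Sigma| + x'\Sigma^{-1}x$$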

Page 25: Power in QTL linkage analysis

Expected log likelihood under H0

$$E(-2\ln L_0) = \ln|\Sigma_N| + E(x'\Sigma_N^{-1}x) = \ln|\Sigma_N| + s$$

Expectation of the quadratic product is simply s, the sibship size

(note: standardised trait)

Page 26: Power in QTL linkage analysis

Expected log likelihood under HA

$$E(-2\ln L_1) = E(\ln|\Sigma_L|) + E(x'\Sigma_L^{-1}x) = E(\ln|\Sigma_L|) + s = \sum_{i=1}^{m} P_i \ln|\Sigma_i| + s$$

Page 27: Power in QTL linkage analysis

Linkage test

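$$\lambda = \ln|\Sigma_N| - \sum_{i=1}^{m} P_i \ln|\Sigma_i|$$

For sib-pairs under complete marker information

$$\lambda = \ln|\Sigma_N| - \left[ \tfrac{1}{4}\ln|\Sigma_{\pi=0}| + \tfrac{1}{2}\ln|\Sigma_{\pi=1/2}| + \tfrac{1}{4}\ln|\Sigma_{\pi=1}| \right]$$

Expected NCP

$$\lambda_L = \ln(1 - r_S^2) - \left[ \tfrac{1}{4}\ln(1 - r_0^2) + \tfrac{1}{2}\ln(1 - r_{1/2}^2) + \tfrac{1}{4}\ln(1 - r_1^2) \right]$$

Determinant of 2-by-2 standardised covariance matrix = 1 - r²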

Page 28: Power in QTL linkage analysis

Approximation of NCP

$$NCP \approx \frac{s(s-1)}{2}\,\frac{(1+r^2)}{(1-r^2)^2}\,\mathrm{Var}(r_{\hat{\pi}})$$

$$= \frac{s(s-1)}{2}\,\frac{(1+r^2)}{(1-r^2)^2}\left[ V_A^2\,\mathrm{Var}(\pi) + V_D^2\,\mathrm{Var}(z) + 2 V_A V_D\,\mathrm{Cov}(\pi, z) \right]$$

NCP per sib pair is proportional to

- the # of pairs in the sibship (large sibships are powerful)

- the square of the additive QTL variance (decreases rapidly for QTL of v. small effect)

- the sibling correlation (structure of residual variance is important)
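A minimal sketch of the approximation, under stated assumptions: the sib-pair moments Var(π) = 1/8, Var(z) = 3/16 and Cov(π, z) = 1/8 are the standard values for IBD sharing in full sibs and are supplied here rather than taken from the slide, the cross term is taken with coefficient 2, and the function name `approx_ncp` is hypothetical.

```python
def approx_ncp(s, r, v_a, v_d=0.0):
    """Approximate linkage NCP for a sibship of size s.

    s   : sibship size
    r   : overall sibling correlation
    v_a : additive QTL variance (proportion of trait variance)
    v_d : dominance QTL variance (proportion of trait variance)
    """
    # Moments of IBD sharing for full sib pairs (assumed standard values)
    var_pi, var_z, cov_pi_z = 1 / 8, 3 / 16, 1 / 8
    var_r = (v_a ** 2 * var_pi + v_d ** 2 * var_z
             + 2 * v_a * v_d * cov_pi_z)
    n_pairs = s * (s - 1) / 2
    return n_pairs * (1 + r ** 2) / (1 - r ** 2) ** 2 * var_r


if __name__ == "__main__":
    # Additive QTL explaining half the variance, no other familial variance
    print(approx_ncp(s=2, r=0.25, v_a=0.5))
```

The three bullet points above can be read directly off the function: the result is linear in the number of pairs, quadratic in the additive QTL variance when there is no dominance, and increases with the sibling correlation r.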

Page 29: Power in QTL linkage analysis

QTL linkage

POWER

Type I errors

Type II errors

Sample N

Effect Size

Allele frequencies

Genetic values

Variance explained

Marker vs functional variant

Recombination fraction

Page 30: Power in QTL linkage analysis

Incomplete linkage

The previous calculations assumed analysis was performed at the QTL.

- imagine that the test locus is not the QTL but is linked to it.

Calculate sib-pair IBD distribution at the QTL, conditional on IBD at test locus,

- a function of recombination fraction

Page 31: Power in QTL linkage analysis

ψ = θ² + (1-θ)²

                      at QTL
at M       0             1/2               1

0          ψ²            2ψ(1-ψ)           (1-ψ)²

1/2        ψ(1-ψ)        1 - 2ψ(1-ψ)       ψ(1-ψ)

1          (1-ψ)²        2ψ(1-ψ)           ψ²

Page 32: Power in QTL linkage analysis

Use conditional probabilities to calculate the sib correlation conditional on IBD sharing at the test marker. For example : for IBD 0 at marker :

at QTL            0        1/2              1

P(M=0 | QTL)      ψ²       2ψ(1-ψ)          (1-ψ)²

r                 VS       VA / 2 + VS      VA + VD + VS

C0 = ψ² VS + 2ψ(1-ψ) (VA / 2 + VS) + (1-ψ)² (VA + VD + VS)

Page 33: Power in QTL linkage analysis

The noncentrality parameter per sib pair is then given by

$$\lambda_L = \ln(1 - r_S^2) - \left[ \tfrac{1}{4}\ln(1 - C_0^2) + \tfrac{1}{2}\ln(1 - C_{1/2}^2) + \tfrac{1}{4}\ln(1 - C_1^2) \right]$$
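The conditional correlations and the per-pair NCP can be sketched as follows. This is an illustrative reconstruction, not the GPC source: function and argument names are hypothetical, the trait is assumed standardised, and the required NCP treats the test as a 1 df chi-squared using a normal approximation that ignores the negligible far tail. With an additive QTL variance of 0.5 and no shared residual variance (an assumption; the slides do not state the residual structure), it gives about 265 pairs at θ = 0 and about 666 pairs at θ = 0.1 for 90% power at α = 0.05.

```python
from math import log
from statistics import NormalDist


def sib_pair_ncp(v_a, v_d=0.0, v_s=0.0, theta=0.0):
    """Expected linkage NCP per sib pair at recombination fraction theta."""
    psi = theta ** 2 + (1 - theta) ** 2
    # Sib correlation given IBD 0, 1/2, 1 at the QTL
    r_qtl = [v_s, v_a / 2 + v_s, v_a + v_d + v_s]
    # P(IBD at QTL | IBD at marker); rows are marker IBD 0, 1/2, 1
    cond = [
        [psi ** 2, 2 * psi * (1 - psi), (1 - psi) ** 2],
        [psi * (1 - psi), 1 - 2 * psi * (1 - psi), psi * (1 - psi)],
        [(1 - psi) ** 2, 2 * psi * (1 - psi), psi ** 2],
    ]
    # Sib correlation conditional on IBD at the marker
    c = [sum(p * r for p, r in zip(row, r_qtl)) for row in cond]
    r_s = v_a / 2 + v_d / 4 + v_s          # unconditional sib correlation
    weights = [0.25, 0.5, 0.25]            # P(IBD = 0, 1/2, 1)
    return log(1 - r_s ** 2) - sum(
        w * log(1 - ci ** 2) for w, ci in zip(weights, c))


def pairs_required(ncp_per_pair, alpha=0.05, power=0.9):
    """Approximate number of pairs for a 1 df chi-squared test."""
    z = NormalDist()
    needed = (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2
    return needed / ncp_per_pair


if __name__ == "__main__":
    for theta in (0.0, 0.1):
        print(theta, round(pairs_required(sib_pair_ncp(0.5, theta=theta))))
```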

Page 34: Power in QTL linkage analysis

If the QTL is additive, then attenuation of the NCP is by a factor of (1-2θ)⁴

= square of the correlation between the proportions of alleles IBD at two loci with recombination fraction θ

Page 35: Power in QTL linkage analysis

Effect of incomplete linkage

[Figure: attenuation in NCP (0 to 1) against recombination fraction (0 to 0.5)]

Page 36: Power in QTL linkage analysis

Effect of incomplete linkage

[Figure: attenuation in NCP (0 to 1) against map distance (0 to 100 cM)]

Page 37: Power in QTL linkage analysis

Comparison to H-E

Amos & Elston (1989) H-E regression

- 90% power (at significance level 0.05)

- QTL variance 0.5

- marker and major gene are completely linked

320 sib pairs

778 sib pairs if θ = 0.1

Page 38: Power in QTL linkage analysis

GPC input parameters

Proportions of variance

additive QTL variance

dominance QTL variance

residual variance (shared / nonshared)

Recombination fraction ( 0 - 0.5 )

Sample size & Sibship size ( 2 - 5 )

Type I error rate

Type II error rate

Page 39: Power in QTL linkage analysis

GPC output parameters

Expected sibling correlations

- by IBD status at the QTL

- by IBD status at the marker

Expected NCP per sibship

Power

- at different levels of alpha given sample size

Sample size

- for specified power at different levels of alpha given power

Page 40: Power in QTL linkage analysis

From GPC

Modelling additive effects only

                     Sibships      Individuals

Pairs                265 (320)     530

Pairs (θ = 0.1)      666 (778)     1332

Trios (θ = 0.1)      220           660

Quads (θ = 0.1)      110           440

Quints (θ = 0.1)     67            335

Page 41: Power in QTL linkage analysis

Practical Exercise 2

What is the effect on power to detect linkage of :

1. QTL variance?

2. residual sibling correlation?

3. marker-QTL recombination fraction?

Page 42: Power in QTL linkage analysis

Pairs required (θ=0, p=0.05, power=0.8)

[Figure: N (0 to 300000) against QTL variance (0 to 0.5)]

Page 43: Power in QTL linkage analysis

Pairs required (θ=0, p=0.05, power=0.8)

[Figure: log N (1 to 1000000) against QTL variance (0 to 0.5)]

Page 44: Power in QTL linkage analysis

Effect of residual correlation

QTL additive effects account for 10% trait variance

Sample size required for 80% power (α=0.05)

No dominance

θ = 0.1

A residual correlation 0.35

B residual correlation 0.50

C residual correlation 0.65

Page 45: Power in QTL linkage analysis

Individuals required

[Figure: individuals required (0 to 25000) for Pairs, Trios, Quads and Quints, at r=0.35, r=0.5 and r=0.65]

Page 46: Power in QTL linkage analysis

Selective genotyping

Unselected

ASP

Extreme Discordant

Maximally Dissimilar

Proband Selection

EDAC

Mahalanobis Distance

Page 47: Power in QTL linkage analysis

Selective genotyping

The power calculations so far assume an unselected population.

- calculate expected NCP per sibship

If we have a sample with trait scores

- calculate expected NCP for each sibship conditional on trait values

- this quantity can be used to rank order the sample for genotyping

Page 48: Power in QTL linkage analysis

Sibship informativeness : sib pairs

[Figure: sibship NCP (0 to 1.6) as a function of Sib 1 trait and Sib 2 trait (each -4 to 4)]

Page 49: Power in QTL linkage analysis

Sibship informativeness : sib pairs

[Figure: three panels of sibship NCP (0 to 2) as a function of Sib 1 trait and Sib 2 trait (each -4 to 4), labelled : unequal allele frequencies; dominance; rare recessive]

Page 50: Power in QTL linkage analysis

Selective genotyping

ASP   MaxD   PS   ED   EDAC   MDis   SEL B   SEL T

p      d/a

.5     0

.1     0

.25    0

.1     1

.25    1

.5     1

.75    1

.9     1

15.82   17.10   15.45   16.88   15.76   18.89   27.64   43.16

Page 51: Power in QTL linkage analysis

Impact of selection

Page 52: Power in QTL linkage analysis

QTL linkage

POWER

Type I errors

Type II errors

Sample N

Effect Size

Allele frequencies

Genetic values

Variance explained

PIC

Locus informativeness

Marker vs functional variant

Recombination fraction

Page 53: Power in QTL linkage analysis

Indices of marker informativeness:

Markers should be highly polymorphic

- alleles inherited from different sources are likely to be distinguishable

Heterozygosity (H)

Polymorphism Information Content (PIC)

- measure number and frequency of alleles at a locus

Page 54: Power in QTL linkage analysis

Heterozygosity

n = number of alleles,

pi = frequency of the ith allele.

H = probability that an individual is heterozygous

$$H = 1 - \sum_{i=1}^{n} p_i^2$$

Page 55: Power in QTL linkage analysis

Heterozygosity

Allele    Frequency
1         0.20
2         0.35
3         0.05
4         0.40

Genotype  Frequency
11        0.04
12        0.14
13        0.02
14        0.16
22        0.1225
23        0.035
24        0.28
33        0.0025
34        0.04
44        0.16

Heterozygous genotypes only : 12 0.14, 13 0.02, 14 0.16, 23 0.035, 24 0.28, 34 0.04

H = 0.675

$$H = 1 - \sum_{i=1}^{n} p_i^2 = 1 - (0.2^2 + 0.35^2 + 0.05^2 + 0.40^2) = 0.675$$

Page 56: Power in QTL linkage analysis

Polymorphism information content

IF a parent is heterozygous, their gametes will usually be informative.

BUT if both parents & child are heterozygous for the same genotype, origins of child’s alleles are ambiguous

IF C = the probability of this occurring,

PIC = H - C

Page 57: Power in QTL linkage analysis

Polymorphism information content

$$PIC = H - C = 1 - \sum_{i=1}^{n} p_i^2 - \sum_{i=1}^{n} \sum_{j=i+1}^{n} 2 p_i^2 p_j^2$$
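Both indices are easy to compute from a list of allele frequencies. The sketch below uses hypothetical function names and takes C as the double sum of 2 p_i² p_j² over distinct allele pairs, as in the PIC definition; the four-allele example from the heterozygosity slide gives H = 0.675.

```python
def heterozygosity(freqs):
    """H = 1 - sum of squared allele frequencies."""
    return 1 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content, PIC = H - C."""
    n = len(freqs)
    # C: summed over each unordered pair of distinct alleles
    c = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
            for i in range(n) for j in range(i + 1, n))
    return heterozygosity(freqs) - c


if __name__ == "__main__":
    freqs = [0.20, 0.35, 0.05, 0.40]
    print(round(heterozygosity(freqs), 3), round(pic(freqs), 3))
```

PIC is never larger than H, and the gap shrinks as the number of roughly equifrequent alleles grows.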

Page 58: Power in QTL linkage analysis

Possible IBD configurations given parental genotypes

Configuration   Parental Mating Type   z      π̂      Probability

1               Hom × Hom              1/4    1/2    (1-H)²

2               Hom × Het              0      1/4    H(1-H)

3               Hom × Het              1/2    3/4    H(1-H)

4               Het × Het              0      1/2    H² / 2

5               Het × Het              0      0      (H² - C)/4

6               Het × Het              1      1      (H² - C)/4

7               Het × Het              1/2    1/2    C/2

Page 59: Power in QTL linkage analysis

PIC & NCP for linkage

From the table of possible IBD configurations given parental genotypes,

$$\mathrm{Var}(\hat{\pi}) = PIC \cdot \mathrm{Var}(\pi)$$

Therefore, NCP is attenuated in proportion to PIC
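The variance result can be checked numerically from the configuration table. The sketch assumes that the π̂ values for configurations 1 to 7 are 1/2, 1/4, 3/4, 1/2, 0, 1 and 1/2 (the second numeric column of the table, whose header did not survive extraction) and that Var(π) = 1/8 for sib pairs; the function name is hypothetical.

```python
def pi_hat_moments(h, c):
    """Total probability, mean and variance of pi-hat over the 7 configurations."""
    configs = [
        (0.50, (1 - h) ** 2),        # 1: Hom x Hom
        (0.25, h * (1 - h)),         # 2: Hom x Het
        (0.75, h * (1 - h)),         # 3: Hom x Het
        (0.50, h * h / 2),           # 4: Het x Het
        (0.00, (h * h - c) / 4),     # 5: Het x Het
        (1.00, (h * h - c) / 4),     # 6: Het x Het
        (0.50, c / 2),               # 7: Het x Het
    ]
    total = sum(p for _, p in configs)
    mean = sum(x * p for x, p in configs)
    var = sum(p * (x - mean) ** 2 for x, p in configs)
    return total, mean, var


if __name__ == "__main__":
    h, c = 0.675, 0.0634125          # four-allele example: PIC = h - c
    total, mean, var = pi_hat_moments(h, c)
    print(total, mean, var, (h - c) / 8)
```

The probabilities sum to 1, the mean of π̂ is 1/2, and the variance equals (H - C)/8, that is PIC times 1/8.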

Page 60: Power in QTL linkage analysis

QTL linkage

POWER

Type I errors

Type II errors

Sample N

Effect Size

Allele frequencies

Genetic values

Variance explained

PIC

Locus informativeness

Recombination fraction

Marker vs functional variant

Marker density

MPIC

Multipoint

Page 61: Power in QTL linkage analysis

Multipoint IBD

Estimates IBD sharing at any arbitrary point along a chromosomal region, using all available marker information on a chromosome simultaneously.

Page 62: Power in QTL linkage analysis

π, π̂, and PIC

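$$\mathrm{Var}(\hat{\pi}_1) = \frac{PIC_1}{8}$$

$$\mathrm{Cov}(\pi_1, \hat{\pi}_1) = \frac{PIC_1}{8}$$

$$\mathrm{Cov}(\hat{\pi}_1, \hat{\pi}_2) = \frac{(1-2\theta)^2\, PIC_1\, PIC_2}{8}$$

$$\mathrm{Cov}(\hat{\pi}_1, \pi_t) = \frac{(1-2\theta)^2\, PIC_1}{8}$$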

Page 63: Power in QTL linkage analysis

Markers M1 to M5, spaced 5 cM apart

Allele frequencies

M1 : 0.1 0.2 0.7

M2 : 0.2 0.2 0.2 0.2 0.2

M3 : 0.1 0.1 0.1 0.1 0.1 0.1 0.2 0.2

M4 : 0.2 0.1 0.1 0.2 0.2 0.2

M5 : 0.2 0.2 0.2 0.2 0.2

1. Calculate PIC for each marker

PIC 0.41 0.77 0.84 0.79 0.77

2. Convert map distances into recombination fractions

Haldane map function (m = map distance in Morgans)

$$\theta = \frac{1 - e^{-2m}}{2}$$

5 cM --> θ = 0.04758

Page 64: Power in QTL linkage analysis

3. Calculate covariance matrix between pi-hat at markers

$$\mathrm{Cov}(\hat{\pi}_i, \hat{\pi}_j) = \frac{(1-2\theta)^2\, PIC_i\, PIC_j}{8}$$

$$\mathrm{Var}(\hat{\pi}_1) = \frac{PIC_1}{8}$$

ΣMM =

        M1      M2      M3      M4      M5
M1      0.051   0.032   0.035   0.033   0.032
M2      0.032   0.096   0.066   0.062   0.061
M3      0.035   0.066   0.105   0.068   0.066
M4      0.033   0.062   0.068   0.099   0.062
M5      0.032   0.061   0.066   0.062   0.096

Page 65: Power in QTL linkage analysis

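4. Consider each multipoint position

At each position along the chromosome, calculate covariance between trait locus and each of the markers

[Diagram: markers M1 to M5 at 5 cM spacing; test position at 10, 15, 20, 25 and 30 cM from M1, M2, M3, M4 and M5 respectively]

          M1      M2      M3      M4      M5
MD (cM)   10      15      20      25      30
RF        0.091   0.130   0.165   0.197   0.226
PIC       0.41    0.77    0.84    0.79    0.77

$$\mathrm{Cov}(\hat{\pi}_i, \pi_t) = \frac{(1-2\theta_i)^2\, PIC_i}{8}$$

ΣMT = ( 0.0344  0.0528  0.0472  0.0363  0.0290 )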

Page 66: Power in QTL linkage analysis

Fulker et al multipoint

If π̂ is a vector of single marker IBD estimates then a multipoint IBD estimate at test position t is given by :

$$\hat{\pi}_t = \Sigma_{MT}'\, \Sigma_{MM}^{-1}\, \hat{\boldsymbol{\pi}}$$

Conditional on π̂, the variance of π at the test position is reduced by a quantity which can be thought of as a multipoint PIC

Page 67: Power in QTL linkage analysis

5. Calculate MPIC

$$MPIC = \Sigma_{MT}'\, \Sigma_{MM}^{-1}\, \Sigma_{MT}$$

[Figure: MPIC (%) (0 to 100) against map distance (-10 to 30 cM)]
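Steps 2 to 5 can be put together in a short sketch. It is an illustration under stated assumptions rather than the original implementation: function names are hypothetical, the Haldane map function is used for every pair of loci, each marker pair uses its own map distance, and the quadratic form is multiplied by 8 (that is, divided by Var(π) = 1/8) so that a single marker placed at the test position returns its own PIC.

```python
import math


def shrink(d_cm):
    """(1 - 2*theta)^2 for a map distance in cM, using the Haldane function."""
    theta = 0.5 * (1.0 - math.exp(-2.0 * abs(d_cm) / 100.0))
    return (1.0 - 2.0 * theta) ** 2


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def marker_trait_cov(positions_cm, pics, t_cm):
    """Covariance of each marker pi-hat with true pi at position t."""
    return [shrink(p - t_cm) * pic / 8.0 for p, pic in zip(positions_cm, pics)]


def mpic(positions_cm, pics, t_cm):
    """Multipoint PIC at position t, as a proportion of Var(pi) = 1/8."""
    n = len(pics)
    mm = [[pics[i] / 8.0 if i == j
           else shrink(positions_cm[i] - positions_cm[j]) * pics[i] * pics[j] / 8.0
           for j in range(n)] for i in range(n)]
    mt = marker_trait_cov(positions_cm, pics, t_cm)
    weights = solve(mm, mt)                 # Sigma_MM^{-1} Sigma_MT
    return 8.0 * sum(w * c for w, c in zip(weights, mt))


if __name__ == "__main__":
    pos = [0, 5, 10, 15, 20]                # marker positions in cM
    pics = [0.41, 0.77, 0.84, 0.79, 0.77]
    print([round(c, 4) for c in marker_trait_cov(pos, pics, -10)])
    print(mpic(pos, pics, -10))
```

With the five markers above and a test position 10 cM to the left of M1, `marker_trait_cov` reproduces the ΣMT values shown for step 4.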

Page 68: Power in QTL linkage analysis

10 cM map

[Figure: multipoint PIC (%) (20 to 100) against position of trait locus (-10 to 110 cM), for a 10 cM map]

Page 69: Power in QTL linkage analysis

5 cM map

[Figure: multipoint PIC (%) (20 to 100) against position of trait locus (-10 to 110 cM), comparing 5 cM and 10 cM maps]

Page 70: Power in QTL linkage analysis

Exclusion mapping

Exclusion : support for the hypothesis that a QTL of at least a certain effect is absent at that position

Normally, the LRT compares the likelihood at the MLE and the null

In exclusion mapping, the LRT compares the likelihood of a fixed effect size against the null and therefore can be negative

Page 71: Power in QTL linkage analysis

Conclusions

Factors influencing power

QTL variance

Sib correlation

Sibship size

Marker informativeness

Marker density

Phenotypic selection