Breast/Ovarian Family

[Figure: pedigree of a breast/ovarian cancer family. Labels shown: 43; B 32; O 59; 55, 48, 51, 54; PO 45; PO 45; 32, 39; +, +; 22; † 57; † 49.]

Inherited predisposition

More BRCA-like genes

Rare, moderately strong variants

Common genetic variation

Role of normal genetic variation in determining individual risk.

How useful is this information in selection for screening and prevention?

How do we find the genes?

Breast cancer as an example

Evidence that genetic variation affects risk

Measure of variation = familial clustering

Risk in close blood relative compared to risk in population as a whole = roughly 2-fold.

Is family clustering genetic?

Incidence, % per year:

MZ twin                          1.31
DZ twin                          0.5
Mother/sister                    0.36
Patient’s contralateral breast   0.66

(Peto & Mack, Nat Genet 26, 411 (2000))

How much genetic predisposition is there? How is it distributed?

Determines potential for discriminating individual risks

                    OBS    EXP     Excess
Population          177    106     71
BRCA1/2 mutation    13     1.47    11.5

Fraction of excess familial clustering attributable to BRCA1/2 = 15-20%
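The 15-20% figure can be checked directly from the observed and expected counts. This is a minimal sketch, assuming the first row of counts (177, 106, 71) refers to the population as a whole and the second (13, 1.47, 11.5) to BRCA1/2 mutation carriers; the variable names are illustrative and not from the slides.

```python
# Counts from the slide (row mapping is an assumption: population, then BRCA1/2 carriers).
obs_population, exp_population = 177, 106
obs_brca, exp_brca = 13, 1.47

# Excess = observed minus expected familial cases.
excess_population = obs_population - exp_population
excess_brca = obs_brca - exp_brca

# Share of the total excess that is explained by BRCA1/2.
fraction_brca = excess_brca / excess_population
print(f"Excess (population): {excess_population}")
print(f"Excess (BRCA1/2):    {excess_brca:.2f}")
print(f"Fraction attributable to BRCA1/2: {fraction_brca:.1%}")
```

The ratio comes out at about 16%, inside the quoted 15-20% range.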

Familial clustering of breast cancer

[Figure: risk to 1° relative of case, on a scale from 1 to 2. Excess familial risk: roughly 15-20% due to BRCA1/2. Other genes shown: ATM, Chk-2, Ha-ras, PTEN.]

What sort of genes may account for familial risk apart from BRCA1/2?

Common low-penetrant genes (relative risk 1.5):

Allele freq.   XsFRR   Number
1%             0.25    350
10%            2.3     35
30%            5.3     16

BRCA3 etc; BRCA1, 2 (relative risk 10):

Allele freq.   XsFRR   Number
0.2%           16      5

Patterns of breast cancer in families

1500 cases, population based; BRCA1/2 excluded

What model fits best?

Best fit = combined result of several factors, individually of small effect = log-normal distribution of risk in population.

[Figure: distribution of genotypes in population and cases by genotype risk, SD = 1.2. x-axis: relative risk (log scale, 0.01 to 100); y-axis: 0.000 to 0.040; curves: cases, population.]

Proportion of population and cases above specified risk: SD = 1.2

[Figure: x-axis: risk of breast cancer by age 70 (0% to 80%); y-axis: proportion above given risk (x), 0% to 100%; curves: cases, population. Marked values: 88%, 10%, 12%, 46%, 3%.]

Effects of normal genetic variation on breast cancer risks

Population                   10%       50%
Cancers                      46%       12%
Individual risk by age 70    > 1 : 8   < 1 : 30
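The proportions on these slides are what a log-normal risk model predicts. This is a minimal sketch, assuming that log relative risk is normally distributed in the population with standard deviation SD on the natural-log scale. Under that assumption the distribution of log risk among cases has the same shape, shifted upward by SD squared. The function name is hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1


def share_of_cases_in_top(population_fraction: float, sd: float) -> float:
    """Share of all cases arising in the highest-risk `population_fraction`.

    Assumes log relative risk ~ Normal(mu, sd^2) in the population, so that
    log relative risk among cases ~ Normal(mu + sd^2, sd^2).
    """
    # Standardized risk threshold that cuts off the top fraction of the population.
    z_cut = _STD_NORMAL.inv_cdf(1.0 - population_fraction)
    # Among cases the standardized log risk is shifted upward by sd.
    return 1.0 - _STD_NORMAL.cdf(z_cut - sd)


for sd in (1.2, 0.8, 0.3):
    top10 = share_of_cases_in_top(0.10, sd)
    bottom50 = 1.0 - share_of_cases_in_top(0.50, sd)
    print(f"SD = {sd}: top 10% of population -> {top10:.0%} of cases; "
          f"bottom 50% -> {bottom50:.0%} of cases")
```

With SD = 1.2 the sketch gives roughly 47% of cases in the top 10% of the population and about 12% in the bottom half, close to the 46% and 12% quoted. Smaller SD values give much less separation between cases and population.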


[Figure: proportion of population and cases above given risk (x), SD = 0.8. x-axis: 0% to 80%; y-axis: 0% to 100%; curves: cases, population. Marked values: 80%, 31%, 10%, 11%, 4%.]


[Figure: proportion of population and cases above given risk (x), SD = 0.3. x-axis: 0% to 80%; y-axis: 0% to 100%; curves: cases, population.]


Gail model of breast cancer risk: Nurses' Health Study analysis

Excellent prediction of breast cancer incidence in specified population.

Poor prediction of risk to individual.

2.8-fold difference in risk between upper and lower deciles.

Cut-off for tamoxifen use defined 33% of population with 44% of cases.

(Rockhill, JNCI 93, 358 (2001))

- find genes
- interactions
- validation

[Figure: 40x risk; 1/5, 1/5.]


How to find the genes?

Association studies

direct (e.g. arg / cys)

indirect: linkage disequilibrium between marker (C / T) and disease allele

Problems: recombination; origins at different times; multiple origins

Common variant : common disease

Rare variants

Candidate genes

Estrogen synthesis and degradation; ER

Cell cycle checkpoints

DNA repair

TGFβ pathway

IGF pathway

Carcinogen metabolism

Sample sets

Initial: 2000 cases, 2000 controls

Confirmatory: 2000 cases, 2000 controls

Cases - population based, East Anglia: simple epidemiology data, survival; paraffin blocks

Controls - EPIC cohort, East Anglia: extensive epidemiological data, follow-up, serum, mammography, bone density, etc

(Antoniou & Easton, submitted)

Power

[Figure: sample size required (0 to 6000) against allele frequency (0 to 0.35), with one curve for each percentage of polygenic variance explained: 1%, 2%, 5%, 10%. 90% power, p = 10^-4, multiplicative model.]
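The dependence of sample size on allele frequency can be illustrated with a standard two-proportion calculation. This is a rough sketch only, and not the calculation behind the figure: the slide parameterises its curves by percentage of polygenic variance explained, whereas this sketch takes a per-allele relative risk directly. It assumes a multiplicative model, a rare disease (so controls have the population allele frequency) and equal numbers of cases and controls. The function name and the example relative risk of 1.3 are illustrative assumptions.

```python
import math
from statistics import NormalDist


def cases_needed(allele_freq, per_allele_rr, alpha=1e-4, power=0.90):
    """Approximate number of cases (with an equal number of controls)
    for a two-sided comparison of allele frequencies."""
    std_normal = NormalDist()
    p_controls = allele_freq
    # Multiplicative model, rare disease: expected allele frequency in cases.
    p_cases = per_allele_rr * allele_freq / (per_allele_rr * allele_freq + 1.0 - allele_freq)
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = std_normal.inv_cdf(power)
    p_mean = (p_controls + p_cases) / 2.0
    null_sd = math.sqrt(2.0 * p_mean * (1.0 - p_mean))
    alt_sd = math.sqrt(p_controls * (1.0 - p_controls) + p_cases * (1.0 - p_cases))
    # Number of alleles needed in each group, then two alleles per person.
    alleles_per_group = ((z_alpha * null_sd + z_power * alt_sd) / (p_cases - p_controls)) ** 2
    return math.ceil(alleles_per_group / 2.0)


for freq in (0.05, 0.10, 0.30):
    print(f"allele frequency {freq:.2f}: ~{cases_needed(freq, per_allele_rr=1.3)} cases")
```

The sketch is undefined for a relative risk of exactly 1, where there is no difference to detect.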

Provisional positive associations : breast cancer

98 snps, 47 candidate genes

Gene     Freq    OR     PAF     Risk Br Ca to age 70 (5.7%)   Fraction of excess RR
TGFβ     14%     1.25   2.9%    6.8%                          0.2%
BRCA2    7%      1.31   2.1%    7.4%                          0.3%
XRCC3    15%     1.34   4.4%    7.4%                          0.5%
ER       20%     1.27   5%      6.8%                          0.5%
Chk2     0.5%    2.4    0.6%    16%                           0.5%
Total                                                         ~2.0%

BRCA2 N372H association with breast cancer risk

[Figure: forest plot, odds-ratio axis 0.1 to 10 (log scale). Rows: Joint NN, Joint NH, Joint HH, UK set 1 HH, UK set 2 HH, UK set 3 HH, HDB HH, Finns HH. p = 0.02.]

OR breast cancer, CYP17 t -34 c (cc vs. tt)

[Figure: forest plot, odds-ratio axis 0.1 to 100 (log scale). Studies: Tee et al., in prep.; Feigelson et al. 2001; Haiman et al. 1999; Mitrunen et al. 2000; Kristensen et al. 1999; Spurdle et al. 2000; Miyoshi et al. 2000; Kuligina et al. 2000; Hamajima et al. 2000; Huang et al. 1999; Helzlsouer et al. 1998; Weston et al. 1998; Weston et al. 1998; Young et al. 1999; Bergman-Jungestrom et al. 1999. N values shown: 226, 230, 3133, 310, 744, 1081.]

Conclusion: This SNP has no main effect on breast cancer risk!

(Ye & Parry, 2002, Mutagenesis 17:119-126)

Why a p value of p = 0.01 is not persuasive

                                                     True association   False association
Prior probability of result
(snp causing 1% of FRR, 100,000 snps in genome)      1/1000             999/1000
Probability given result has p = 0.01                99/100             1/100
Product                                              99/100,000         999/100,000

Assuming random choice of ‘candidate’ gene, only ~10% of results at p = 0.01 are true (~50% at p = 0.001).
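The ~10% and ~50% figures follow from Bayes' rule applied to the numbers on the slide. This is a minimal sketch, assuming a prior of 1/1000 that a randomly chosen SNP is truly associated and a probability of 99/100 that a true association reaches the threshold. The slide states the 99/100 only for p = 0.01; reusing it for p = 0.001 is an assumption made here. The function name is hypothetical.

```python
def posterior_true(prior: float, power: float, alpha: float) -> float:
    """Probability that a result significant at level `alpha` is a true association."""
    true_positive = prior * power            # true association and significant
    false_positive = (1.0 - prior) * alpha   # no association but significant by chance
    return true_positive / (true_positive + false_positive)


for alpha in (0.01, 0.001):
    share_true = posterior_true(prior=1 / 1000, power=0.99, alpha=alpha)
    print(f"p = {alpha}: about {share_true:.0%} of significant results are true")
```

At p = 0.01 the posterior is 99 / (99 + 999), about 9%. Tightening the threshold tenfold cuts the false positives tenfold and brings the posterior to about one half.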

[Figure: p-value (log scale, 0.001 to 1.00) for each SNP (0 to 100); curves: observed, chance; reference line at 0.05.]

Summary of results: 96 snps, 47 genes; ~2000 cases, 2000 controls

p = 0.01/0.0004 for comparison of distributions

[Figure: % of excess FRR explained, against relative risk (0.5, 1, 1.3, 2).]

Some reasons why human association studies may be difficult

Inappropriate genetic models, e.g. rare/multiple alleles

Regulatory vs coding polymorphisms

Numbers : inadequate statistical power

Genetic background effects; interactions: weak ‘main effect’, high-order interactions; ‘null’ result = balance of susceptible and resistant on different BG

Phenotypic heterogeneity, e.g. ER+/ER-; histology

Cancer/no cancer endpoint lacks power

Intermediate phenotypes

Serum estradiol and CYP19 exon 10 t>c 3’UTR

[Figure: serum estradiol (axis 10 to 20) by genotype: tt, tc, cc. P homogeneity = 0.0005; P trend < 0.0001.]

Serum SHBG and SHBG exon 8 g>a or D356N

[Figure: serum SHBG (axis 20 to 60) by genotype: gg, ga, aa. P homogeneity = 0.006; P trend = 0.006.]

(Ponder, Dowsett labs; EPIC; unpublished)

Implications for breast cancer risk

2-fold increase in estradiol: 30% increase in risk of breast cancer

tt genotype of CYP19 c>t associated with 14% increase in estradiol: equivalent to 1.04-fold increase in breast cancer risk
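The step from a 14% rise in estradiol to a 1.04-fold risk can be reproduced under a simple scaling assumption. The slide does not say which scaling was used, so this sketch shows two plausible ones: a linear interpolation, which matches the quoted 1.04, and a log-linear dose-response, which gives a slightly larger value. The variable names are illustrative.

```python
import math

rr_per_doubling = 1.30   # slide: 2-fold increase in estradiol -> 30% increase in risk
estradiol_ratio = 1.14   # slide: tt genotype -> 14% increase in estradiol

# Linear interpolation: a 14% rise is treated as 14% of the way to a doubling.
rr_linear = 1.0 + (estradiol_ratio - 1.0) * (rr_per_doubling - 1.0)

# Log-linear alternative: risk multiplies by 1.30 for each doubling of estradiol.
rr_loglinear = rr_per_doubling ** math.log2(estradiol_ratio)

print(f"Linear scaling:     {rr_linear:.3f}-fold")
print(f"Log-linear scaling: {rr_loglinear:.3f}-fold")
```

Either way the effect of the single genotype on breast cancer risk is small, about 4-5%.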

Where next?

Empirical vs candidate approaches

Snp genotyping now ~17c/genotype : ? screen 600 “enriched” cases/600 controls vs 1150 coding snps

~$240,000

Candidate gene approaches

Candidates from cell biology

Epidemiology

Regulatory variants

Quantitative phenotypes

Leads from mouse models

Mouse/human collaborations

1. Candidate susceptibility genes/regions

mapped in susceptible/resistant crosses

refined by amplicons/deletions in tumours

allele-specific differences in expression/somatic change (easier in mouse because extended haplotypes)

loci involved in control of gene regulation

loci influencing intermediate phenotypes: set up large cross and score multiple phenotypes

How tightly should the region be defined?

Say 5 genes

First pass = find all coding region snps at >5%

Construct haplotypes, select minimum snp set = ? 30 snps

Genotype 30 snps in 2000 cases/2000 controls = 120,000 genotypes

Genotyping cost ~$20,000 @ 17c/genotype

BUT : currently requires ~1000 snps at a time
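The genotype counts and costs quoted in these slides are simple products. This is a minimal sketch of the arithmetic, using the 17c per genotype figure from the slides; the function name is hypothetical, and the cost is held in whole cents to avoid floating-point rounding.

```python
COST_CENTS_PER_GENOTYPE = 17  # slides: ~17c per genotype


def genotyping_cost(n_snps, n_cases, n_controls, cents=COST_CENTS_PER_GENOTYPE):
    """Return (number of genotypes, total cost in dollars)."""
    n_genotypes = n_snps * (n_cases + n_controls)
    return n_genotypes, n_genotypes * cents / 100


# Candidate region: 30 snps in 2000 cases / 2000 controls.
print(genotyping_cost(30, 2000, 2000))    # (120000, 20400.0)

# Empirical screen: 1150 coding snps in 600 cases / 600 controls.
print(genotyping_cost(1150, 600, 600))    # (1380000, 234600.0)
```

These match the rounded figures of ~$20,000 and ~$240,000.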

300 kb


2. Interactions

Identification of interacting loci potentially approachable in mouse

Develop and evaluate programmes to search for higher order interactions; ? applicability to man


3. Stages of cancer development

? Distinguish loci that influence multiplicity, latency, progression, invasion, metastasis, and resistance to these

? Loci that affect treatment response


4. “End game” - which is the active gene, snp?

strain comparisons of variants

dissection of complex QTLs

transgenic models

“‘Risk factor’ analysis will facilitate environmental modification, screening and therapeutic management of people before they develop symptoms”

(Bell, BMJ 1998)

“Differences in social structure, lifestyle and environment account for much larger proportions of disease than genetic differences ... Those who make medical and scientific policies ... would do well to see beyond the hype”

(Holtzman & Marteau, NEJM 2000)

A new horizon in medicine?

Strangeways Research Laboratories - University of Cambridge

Bruce Ponder, Doug Easton, Paul Pharoah, Antonis Antoniou, Alison Dunning, Mitul Shah, Fabienne Lesueur, Julian Lipscombe, Bettina Kuschel, Annika Auranen, Nick Day; EPIC, Katie Healey, Craig Luccarini, Jenny He, Louise Tee, Gary Dew

UCSF: Allan Balmain, Mandy Toland, Joe Gray, Mark Sternlicht

NCI: Kent Hunter

Biochemistry, Cambridge: Jim Metcalfe

Cancer Research UK; MRC

TGFβ

SNPs: t/c at -509; Pro/Leu at codon 10

Haplotypes (-509, codon 10) and frequencies: t P, 0.25; c P, 0.11; c L, 0.60

PP vs LL OR 1.25 (1.1 - 1.4) p = 0.01

tt vs cc OR 1.30 (1.1 - 1.5) p = 0.01

Which SNP is the functional variant?

[Figure: odds ratio (log scale, 0.1 to 10) by combined genotype: cc LeuLeu, cc LeuPro, ct LeuPro, cc ProPro, ct ProPro, tt ProPro.]

Pro10 homozygotes have increased risk regardless of c-509t genotype

TGFβ in vitro secretion

[Figure: TGFβ1 (ng/ml, axis 0 to 4), shown as a time course (0, 6, 12, 18 hours) and at end point, for Leu10, Pro10 and ratio P:L. Conditions: untransfected cells; CMV-E + βgal; CMV-L; CMV-L + CMV-E; CMV-L + βgal; CMV-P; CMV-P + CMV-E; CMV-P + βgal.]

(Metcalfe, Ponder labs, 2002)

Funnel plot for TGFβ L10P: OR (PP vs. LL)

[Figure: funnel plot, odds-ratio axis 0.1 to 10 (log scale).]

Study             N
Frei              238
Ziv et al. **     3075
Hishido et al.    404
Finn              939
HDB               875
ABC               4517

** Cohort study: 146 cases, 2929 controls
