Cell Line Models for Genome Wide Association Mapping in Cancer Drug Response Alison Motsinger-Reif, PhD Bioinformatics Research Center Department of Statistics

Post on 06-Jun-2020

Page 1:

Cell Line Models for Genome Wide Association Mapping in Cancer Drug Response
Alison Motsinger-Reif, PhD
Bioinformatics Research Center
Department of Statistics

Page 2:

Introduction

• Understanding variability in individual response to drug/chemical exposure is a key goal of pharmacogenomics and toxicogenomics

• Goals of gene mapping:
– Find efficient predictors of response (efficacy, toxicity, potency, etc.)
– Dissect the underlying mechanisms of differential response

Page 3:

Challenges in Dose Response Genetics

• Study design limitations
– Clinical trials
– Rarely have family data
– Limited sample size
– Limited replication opportunities…

• Limits ability to test basic genetic assumptions
– Are these traits heritable?
– Is this actually a genetics problem?

Page 4:

Challenges in Dose Response Genetics

• Study design limitations
– Clinical trials
– Limited number of
– Rarely have family data
– Limited sample size, replication opportunities…

• Limits ability to test basic genetic assumptions
– Are these traits heritable?
– Is this actually a genetics problem?

Page 5:

Challenges in Dose Response Genetics

• Study design limitations
– Clinical trials
– Limited number of
– Rarely have family data
– Limited sample size, replication opportunities…

• Limits ability to test basic genetic assumptions
– Are these traits heritable?
– Is this actually a genetics problem?

High-throughput in vitro assays of dose response can help assess the heritability of dose response and support well-powered linkage and association analyses.

Page 6:

Current Uses of the Model

• Cytotoxicity mapping for chemotherapy
– Cytotoxics
– Monoclonal antibodies

• Evaluation of methods for capturing dose response associations

• Use of high throughput methodology for chemical exposure

Page 7:

Assay Methodology

• Alamar blue viability assays
• 6 point dose response curves
• Immortalized lymphoblastoid cell lines (LCLs)

Page 8:

Use of the Model

• We are using this model to interrogate genetic predictors of drug response for 45 chemotherapy drugs

• Heritability assessed with family-based samples
– CEPH cell lines

• Mapping in unrelated cohorts
– CHORI cohort
– 1000 Genomes

Page 9:

Use of the Model

• We are using this model to interrogate genetic predictors of drug response for 45 chemotherapy drugs

• Heritability assessed with family-based samples
– CEPH cell lines

• Mapping in unrelated cohorts
– CHORI cohort
– 1000 Genomes

Many methods challenges arise along the way…

Page 10:

Variation in Cellular Sensitivity

• Typical dose response curves

Page 11:

Heritability Calculations

• Variance components analysis as implemented in MERLIN 1.1.2
– http://www.sph.umich.edu/csg/abecasis/Merlin/index.html

• h² of the growth rate for each vehicle was calculated

• h² was adjusted for the growth rate in the appropriate vehicle by using growth rate as a covariate
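
As a hedged illustration of what a variance components estimate of h² means, the bullets above can be written out as follows. The notation is an assumption made for exposition and is not taken from the slides or from the MERLIN documentation: σ_g² is the additive polygenic variance, σ_e² is the residual variance, φ_ij is the kinship coefficient between relatives i and j, and the vehicle growth rate enters as a fixed covariate.

```latex
y_i = \mu + \beta\,\mathrm{growth}_i + g_i + e_i, \qquad
\mathrm{Cov}(g_i, g_j) = 2\,\phi_{ij}\,\sigma_g^2, \qquad
h^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}
```

Under this sketch, "adjusting for growth rate" means that h² is the share of the variance remaining after the covariate is accounted for, rather than the share of the raw phenotypic variance.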

Page 12:

GWAS Study

• Genome-wide association studies (GWAS) for drugs with highly heritable response

• Children’s Hospital of Oakland Research Institute (CHORI) population-based cohorts used to generate cell lines
– 520 samples

• 650K SNP-chip data available for mapping
– Simulation experiments to prepare for association mapping
– Imputed to ~2 million variants

Page 13:

Association Mapping

• Previous studies have fit dose response curves and then run simple association tests of genotype against summary values from the fit:
– EC50/IC50
– Hill slope

• These choices make LOTS of assumptions
– Assumptions about how associations may be happening
– Need methods that don’t make these assumptions
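
To make the concern above concrete, here is a minimal sketch assuming a standard four-parameter Hill curve. The function name `hill`, the concentration series, and all parameter values are hypothetical and are not taken from the study. It shows two genotype groups that share the same EC50 yet respond differently at most concentrations, so a test on EC50 alone would see nothing.

```python
def hill(conc, top, bottom, ec50, slope):
    """Four-parameter Hill curve: viability as a function of concentration."""
    return bottom + (top - bottom) / (1.0 + (conc / ec50) ** slope)

# Hypothetical six-point dose series (arbitrary units), mirroring a 6-point assay.
concentrations = [0.01, 0.1, 0.3, 1.0, 3.0, 10.0]

# Two hypothetical genotype groups: identical EC50, different Hill slopes.
curve_a = [hill(c, 1.0, 0.0, 1.0, 1.0) for c in concentrations]
curve_b = [hill(c, 1.0, 0.0, 1.0, 3.0) for c in concentrations]

# The groups agree at the EC50 but diverge elsewhere along the curve.
largest_gap = max(abs(a - b) for a, b in zip(curve_a, curve_b))
print(f"largest viability gap between groups: {largest_gap:.3f}")
```

A slope-only test would have the mirror-image blind spot, which is the motivation for modeling the whole response vector.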

Page 14:

Complicated Response Curves

Differences between phenotypes could be manifested in many ways.

Page 15:

Modeling Robustly

• The vector of responses across concentrations was modeled jointly using multivariate analysis of covariance (MANCOVA):

$$y_{ij} = \alpha + \mu_i + X_{ij}\beta + E_{ij}, \qquad E_{ij} \overset{\text{iid}}{\sim} N(0, \Sigma),$$

where
– $y_{ij}$ is the vector of responses for the $j$th LCL with genotype $i$,
– $X_{ij}$ contains confounding covariates,
– $\mu_i$ is the vector of effects due to genotype $i$.

• Minimal modeling assumptions
– No assumptions made about the form of dose-response curves or how these curves vary between genotypes
– The assumption of multivariate normality seems reasonable in real data [4]

Page 16:

Simulation Study Results

• MANCOVA has the most power to detect real signals (top) and is most robust for Hill slope alternatives (bottom)

Page 17:

MAGWAS

• Multivariate Analysis of covariance Genome-Wide Analysis Association Software
• Designed for GWAS having multivariate responses
• Allows for incorporation of covariates
• Command line based, platform independent
• Accepts data in PLINK format
• Computationally efficient
– “typical” GWAS in 2-20 minutes

Page 18:

Association Results

Page 19:

Drug Families

Page 20:

Association by Drug Family

• Each dose response curve was summarized by the mean viability across drug concentrations

• MANCOVA was used to jointly model the mean viabilities across drug families

• Information is combined across drug family

• Small differences can become detectable, even if not present for each drug individually
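
The summarization step above can be sketched as follows, assuming each drug's normalized viabilities are held in an array with one row per LCL and one column per concentration. The function name `family_response_matrix` and the simulated values are hypothetical; the result is the multivariate response that a family-level MANCOVA would take in place of the six-concentration vector.

```python
import numpy as np

def family_response_matrix(viability, family):
    """Stack per-drug mean viabilities into an (n_cell_lines, n_drugs) matrix."""
    return np.column_stack([viability[drug].mean(axis=1) for drug in family])

# Simulated stand-in data: 5 LCLs, 6 concentrations per drug.
rng = np.random.default_rng(1)
n_lcl, n_conc = 5, 6
viability = {
    "carboplatin": rng.uniform(0.2, 1.0, size=(n_lcl, n_conc)),
    "oxaliplatin": rng.uniform(0.2, 1.0, size=(n_lcl, n_conc)),
}

# One row per LCL, one column per drug in the (platinum) family.
Y_family = family_response_matrix(viability, ["carboplatin", "oxaliplatin"])
print(Y_family.shape)
```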

Page 21:

Association by Drug Family

• Locus rs11639947 is associated (p < 10⁻⁶) with response to the alkylating class (temozolomide and mitomycin), and is located upstream of NIN1/RPN12 binding protein 1 homolog (NOB1)

• Polymorphisms on NOB1 have been found to be associated with myelotoxicity in malignant glioma patients treated with temozolomide

     Drug          Chrom.  rsID        −log10(p)  Gene(s) nearby
 1   Carboplatin   5       rs1982901   6.23       None
 2   Cytarabine    3       rs12637988  6.21       MED12L / P2RY12
 3   Daunorubicin  9       rs7867736   6.34       None
 4   Etoposide     22      rs2076112   6.47       PLA2G6
 5   Fluorouracil  7       rs2270311   6.23       CHN2
 6   Fluorouracil  15      rs10152957  6.05       MEGF11
 7   Gemcitabine   2       rs4851774   6.26       FHL2
 8   Gemcitabine   3       rs513659    6.31       None
 9   Idarubicin    2       rs7582313   6.87       None
10   Mitoxantrone  10      rs7068798   6.05       C10orf67 (d)
11   Oxaliplatin   8       rs2897377   6.07       CSMD1
12   Oxaliplatin   17      rs1808918   6.01       GNA13 (d)
13   Paclitaxel    1       rs1338990   6.33       None
14   Paclitaxel    4       rs306005    6.47       SPATA5
15   Paclitaxel    5       rs31878     6.86       None
16   Temozolomide  10      rs531572    15.48      MGMT
17   Teniposide    22      rs8138023   6.30       NUP50 (u)
18   Topotecan     6       rs11966294  6.29       DDO

Table 4: Single nucleotide polymorphisms (SNPs) most associated with drug response, for each drug separately. The markers (u) and (d), superscripts in the original, indicate that genes are located within 100 kbp upstream or downstream of the SNP, respectively.

     Drug Class            Chrom.  rsID        −log10(p)  Gene(s) nearby
 1   DNA Alkylating Agents 2       rs7581424   6.13       HDAC4 (u)
 2   DNA Alkylating Agents 16      rs11639947  6.55       NOB1 (d) / WWP2 (d)
 3                                                        NQO1 (u) / NFAT5 (u)
 4   Platinum Agents       10      rs10821910  7.84       C10orf107
 5   TK Inhibitors         10      rs10762827  7.95       None

Table 5: Single nucleotide polymorphisms (SNPs) most associated with drug response, for drug family. The markers (u) and (d), superscripts in the original, indicate that genes are located within 100 kbp upstream or downstream of the SNP, respectively. TK stands for tyrosine kinase.

     Drug Class              Drugs
 1   Nucleosides             gemcitabine, cytarabine, cladribine, fludarabine, azacitidine
 2   TK inhibitors           dasatinib, sunitinib
 3   Tubulin binding agents  docetaxel, paclitaxel, vinblastine, vincristine, vinorelbine
 4   DNA alkylating agents   mitomycin, temozolomide
 5   Platinum agents         carboplatin, oxaliplatin
 6   Anthracyclines          daunorubicin, doxorubicin, epirubicin, idarubicin, mitoxantrone
 7   Fluoropyrimidines       floxuridine, 5-fluorouracil
 8   Epipodophyllotoxins     etoposide, teniposide

Table 6: Drug family membership for 25 anticancer agents. TK stands for tyrosine kinase.

Page 22:

Drug Clustering

• Can LCLs be used to predict drug families?
• Distance metrics between each pair of drugs were calculated from their vectors of viabilities
– Yai and Ybi are viabilities for the ith LCL for drugs a and b
– Xi is the matrix of covariates
• Distance between drugs a and b was estimated as one minus the average partial r-squared for a regressed on b and b regressed on a.

the vector of normalized responses jointly provides more information than a single summary measure, such as half-maximal inhibitory concentrations (IC50). Simulation studies have shown this method to be both robust to differences in dose-response profiles between genotypes and powerful in the detection of real biological signals [6, 5]. The model used in association for any drug d at an SNP s is:

$$Y_{ij} = X_{ij}\beta + \mu_i + e_{ij}, \qquad e_{ij} \sim N_p(0, \Sigma), \tag{2}$$

where $Y_{ij}$ is the vector of normalized responses (across the six concentrations for d) for the jth individual having genotype i on s, $X_{ij}$ is the matrix of covariates for the first two PCs, temperature, growth rate and experimental batch, and $\mu_i$ is the vector of parameters modeling the effects of genotype i of s on d. Also, $N_p(0, \Sigma)$ is the multivariate normal distribution for vectors of length p = 6, with mean 0 and covariance matrix $\Sigma$. The significance of estimates for $\mu_i$ was assessed using Pillai’s trace [15]. All computations were performed using the software program MAGWAS [8]. Because association tests rely on large sample asymptotic theory, only those loci which had at least 20 individuals in each genotype group were kept for association mapping. This left 1,278,133 SNPs for all drugs except 5-fluorouracil (971,593) and nilotinib (783,013).
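
As a rough illustration of the test described above, the sketch below computes Pillai’s trace for the genotype effect at a single SNP with NumPy, together with the 20-per-group count filter. It is not MAGWAS and does not reflect its interface: the function names `pillai_trace` and `passes_count_filter`, the added intercept column, and the simulated data are all assumptions, and the conversion of the statistic to a p-value is left out.

```python
import numpy as np

def passes_count_filter(genotype, min_per_group=20):
    """Keep a SNP only if every observed genotype group is large enough."""
    _, counts = np.unique(genotype, return_counts=True)
    return bool(len(counts) >= 2 and counts.min() >= min_per_group)

def pillai_trace(Y, genotype, covariates):
    """Pillai's trace for the genotype effect, adjusting for covariates."""
    Y = np.asarray(Y, dtype=float)
    genotype = np.asarray(genotype)
    levels = np.unique(genotype)
    if len(levels) < 2:
        raise ValueError("need at least two genotype groups")
    n = Y.shape[0]
    # Reduced design: intercept plus confounding covariates only.
    X_reduced = np.column_stack([np.ones(n), covariates])
    # Full design: add one indicator column per non-reference genotype.
    dummies = [(genotype == g).astype(float) for g in levels[1:]]
    X_full = np.column_stack([X_reduced] + dummies)

    def residual_sscp(X):
        beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
        resid = Y - X @ beta
        return resid.T @ resid

    E = residual_sscp(X_full)         # error sums of squares and cross products
    H = residual_sscp(X_reduced) - E  # hypothesis (genotype) SSCP
    return float(np.trace(H @ np.linalg.inv(H + E)))

# Simulated stand-in data: 300 LCLs, 6 concentrations, 2 covariates.
rng = np.random.default_rng(0)
n, p = 300, 6
genotype = rng.integers(0, 3, size=n)
covariates = rng.normal(size=(n, 2))
Y_null = rng.normal(size=(n, p)) + covariates @ rng.normal(size=(2, p))
Y_alt = Y_null + 0.8 * genotype[:, None]  # genotype shifts every concentration

v_null = pillai_trace(Y_null, genotype, covariates)
v_alt = pillai_trace(Y_alt, genotype, covariates)
print(f"Pillai's trace, no effect: {v_null:.3f}; with effect: {v_alt:.3f}")
```

Because the statistic is built from the full residual cross-product matrices, it responds to a genotype difference at any concentration without assuming a curve shape.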

Association tests were also performed for each drug family, as described in Table 6. For this, the mean normalized viability across each dose-response curve was calculated for every LCL and every drug. The model used in association for any drug family d at an SNP s therefore also uses Equation 2, where Yij now represents the vector of mean normalized viabilities (across all drugs in family d) for the jth individual having genotype i on s. All other variables from Equation 2 remain the same, and p now equals the number of drugs in family d.

Drug clustering

Distance metrics between each pair of drugs were calculated from their vectors of normalized viabilities, as in the association study. Specifically, the distance between drugs a and b was calculated by first fitting:

$$Y_{ai} = Y_{bi}\beta + X_i\gamma,$$

where $Y_{ai}$ and $Y_{bi}$ are the vectors of normalized viabilities for the ith LCL for drugs a and b, respectively, and $X_i$ is the same corresponding matrix of covariates used in association mapping (the first two PCs, experimental batch, temperature and growth rate). The coefficient of partial determination (partial r-squared) of $Y_{bi}$ in predicting $Y_{ai}$ after controlling for $X_i$ was calculated for all possible pairs (a, b). Distance between drugs a and b was estimated as one minus the average partial r-squared for a regressed on b and b regressed on a. With this metric, it was not possible to include both 5-fluorouracil and nilotinib, since each cell line was exposed to exactly one of these agents. For this reason, and because nilotinib had lower laboratory replicability (see Table 2), nilotinib was removed from clustering. All other (28) drugs were clustered using the distance metric described above, with no a priori knowledge of drug family. Clustering was performed on the matrix of pairwise distances using the “hclust” function with the argument “method=ward” from the R statistical package [14].
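
The distance described above can be sketched in a few lines of NumPy. This is an illustration under simplifying assumptions, not the authors' code: each drug is reduced to one summary viability per LCL, an intercept column is added to the covariates, and the names `partial_r2` and `drug_distance` and the simulated data are hypothetical.

```python
import numpy as np

def _sse(X, y):
    """Residual sum of squares from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def partial_r2(y_a, y_b, X):
    """Share of y_a's covariate-adjusted variance explained by adding y_b."""
    n = len(y_a)
    reduced = np.column_stack([np.ones(n), X])  # covariates only
    full = np.column_stack([reduced, y_b])      # covariates plus the other drug
    sse_reduced = _sse(reduced, y_a)
    return (sse_reduced - _sse(full, y_a)) / sse_reduced

def drug_distance(y_a, y_b, X):
    """One minus the average of the two directed partial r-squared values."""
    return 1.0 - 0.5 * (partial_r2(y_a, y_b, X) + partial_r2(y_b, y_a, X))

# Simulated stand-in data: drugs a and b share a response component, c does not.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
shared = rng.normal(size=n)
drug_a = shared + 0.3 * rng.normal(size=n) + X @ np.array([0.5, -0.2])
drug_b = shared + 0.3 * rng.normal(size=n) + X @ np.array([0.1, 0.4])
drug_c = rng.normal(size=n)

d_ab = drug_distance(drug_a, drug_b, X)
d_ac = drug_distance(drug_a, drug_c, X)
print(f"distance a-b: {d_ab:.3f}, distance a-c: {d_ac:.3f}")
```

Averaging the two directed values makes the measure symmetric, which a distance matrix passed to hierarchical clustering requires.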

Page 23:

Empirical support for Drug Clustering

Page 24:

MGMT and Temozolomide

• Proof of concept that LCLs can identify clinically significant genes in cancer drug efficacy.

• Manhattan plot for Temozolomide
– The large red peak is for locus rs477693, located in the gene coding for MGMT (O6-methylguanine–DNA methyltransferase), a protein known to be associated with Temozolomide efficacy [Hegi et al., 2005].

Page 25:

Gene Expression and MGMT

• MGMT repairs DNA that has been damaged (methylated), helping prevent cell death

• MGMT expression is also associated with rs531572

Page 26:

Clinical Validation

• Moffitt Cancer Center clinical trial
– 437 patients with high grade glioma
– 318 on standard of care (SOC)
• SOC
– Resection plus radiation plus temozolomide

Group         N    Deaths  Additive HR (95% CI)*  p-value  Genotypic HR (95% CI)*  p-value
all patients  437  375     0.93 (0.80, 1.09)      0.369    0.88 (0.70, 1.11)       0.293
SOC, male     200  171     0.79 (0.63, 0.99)      0.040    0.73 (0.52, 1.01)       0.057
SOC, female   118  94      1.14 (0.85, 1.53)      0.395    1.20 (0.75, 1.95)       0.448

• Evaluated 7 SNPs in linkage disequilibrium with this hit
• Looked at overall survival
• rs477692 – top hit

Page 27:

Monoclonal Antibodies

• Used the LCL model for testing a new class of drugs (anti-CD20):
– Rituximab
– Ofatumumab

• Used the CEPH pedigrees for linkage analysis
– Found 2 large peaks for follow-up
– Chr 3, Chr 12
– Overlapped for both drugs

Page 28:

Monoclonal Antibodies

• To narrow down genes
– Gene expression data
– 57 CEPH cell lines with available expression
– For genes in the region,

• One Gene: CBLB
– CBLB encodes an E3 ubiquitin ligase
– Involved in T-cell and B-cell receptor downregulation
– CBLB loss provokes autoimmunity via loss of autoregulatory mechanisms

Page 29:

Functional Validation

• Rituximab’s target is CD20
• Tested whether knocking down CBLB changes CD20 expression

[Figure: immunofluorescence assay showing CD20 localization]
CD20 gene expression is not altered by CBLB knockdown

Page 30:

Lessons from the LCL models

• LCLs are a promising approach for dose response mapping:
– Allow for research that is not possible with human subjects
– High throughput means that QC, both genotypic and phenotypic, is important
– Typical association methods may not capture the full array of potential differential response
– Support for known drug/chemical classes
– Dose response models seem as “complex” as complex trait mapping always is…

Page 31:

Other Methods Development Challenges Along the Way

• Dose response modeling
– EADRM
– Beam A, Motsinger-Reif A. Beyond IC50s: Towards Robust Statistical Methods for in vitro Association Studies. J Pharmacogenomics Pharmacoproteomics. 2014 Mar 1;5(1):1000121.
– Beam AL, Motsinger-Reif AA. Optimization of nonlinear dose- and concentration-response models utilizing evolutionary computation. Dose Response. 2011;9(3):387-409.

• Extending approaches for accurate permutation testing
– Che R, Jack JR, Motsinger-Reif AA, Brown CC. An adaptive permutation approach for genome-wide association study: evaluation and recommendations for use. BioData Min. 2014 Jun 14;7:9. doi: 10.1186/1756-0381-7-9. eCollection 2014.
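
For readers unfamiliar with the idea, one generic form of adaptive permutation testing is sketched below. It is an early-stopping scheme assumed here for illustration and is not claimed to be the procedure evaluated in the paper cited above; the function names, the stopping threshold `min_exceed`, and the toy data are all hypothetical. Permutation stops as soon as enough permuted statistics reach the observed one, so clearly null markers cost few permutations while promising markers receive the full budget.

```python
import random

def mean_difference(x, y):
    """Absolute difference in mean phenotype between genotype groups 1 and 0."""
    g1 = [yi for xi, yi in zip(x, y) if xi == 1]
    g0 = [yi for xi, yi in zip(x, y) if xi == 0]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))

def adaptive_permutation_p(x, y, statistic, max_perms=2000, min_exceed=10, seed=0):
    """Permutation p-value with early stopping; returns (p, permutations used)."""
    rng = random.Random(seed)
    observed = statistic(x, y)
    y_perm = list(y)
    exceed = 0
    used = 0
    for used in range(1, max_perms + 1):
        rng.shuffle(y_perm)
        if statistic(x, y_perm) >= observed:
            exceed += 1
            if exceed >= min_exceed:
                break  # clearly not significant: stop spending permutations
    return (exceed + 1) / (used + 1), used

# Toy data: 40 samples, two genotype groups of 20.
x = [0, 1] * 20
y_null = [float(i // 2) for i in range(40)]                  # identical groups
y_signal = [10.0 * xi + (i % 3) for i, xi in enumerate(x)]   # strong group shift

p_null, used_null = adaptive_permutation_p(x, y_null, mean_difference)
p_signal, used_signal = adaptive_permutation_p(x, y_signal, mean_difference)
print(f"null: p={p_null:.3f} after {used_null} permutations")
print(f"signal: p={p_signal:.5f} after {used_signal} permutations")
```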

Page 32:

Current Work

• Tyrosine Kinase Inhibitors
• Continued follow up of top hits
• Evaluating the model for PD1K inhibitors
• Exploring analysis methods to combine results across drugs
– Pathway analysis
– Cross-heritability

• Additional methods to build more complex models
– Gene-gene, gene-drug interactions
– Genomic prediction approaches and Bayesian variable selection
– Advances in permutation testing implementations

Page 33:

Current Work

• Drug combinations
– Chemotherapies are rarely given alone
– Modeling mixtures is a real challenge
– Evaluating methods for quantifying synergy
• Add inference to Chou-Talalay method
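
For context, the point estimate that the Chou-Talalay method produces can be sketched as below, assuming the standard median-effect equation. The function names and all parameter values are hypothetical, and the sketch yields only the combination index itself, with none of the statistical inference that the last bullet proposes to add.

```python
def dose_for_effect(fa, dm, m):
    """Median-effect equation solved for dose: D = Dm * (fa / (1 - fa)) ** (1 / m)."""
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)

def combination_index(d1, d2, fa, dm1, m1, dm2, m2):
    """Combination index: < 1 suggests synergy, 1 additivity, > 1 antagonism."""
    return d1 / dose_for_effect(fa, dm1, m1) + d2 / dose_for_effect(fa, dm2, m2)

# Hypothetical single-agent fits: median-effect doses 2 and 4, slopes of 1.
# The mixture (0.5 of drug 1 plus 1.0 of drug 2) is taken to kill 50% of cells.
ci_synergy = combination_index(d1=0.5, d2=1.0, fa=0.5, dm1=2.0, m1=1.0, dm2=4.0, m2=1.0)
# A mixture needing half of each median-effect dose would be exactly additive.
ci_additive = combination_index(d1=1.0, d2=2.0, fa=0.5, dm1=2.0, m1=1.0, dm2=4.0, m2=1.0)
print(ci_synergy, ci_additive)
```

Here `fa` is the fraction of cells affected by the mixture, and each denominator is the dose of that drug alone that would produce the same effect.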

Page 34:

Synergy in the LCL Model

• Pilot Study
• 8 different drugs / 6 concentrations
• 7 combinations of mixtures tested
• 123 LCLs
• Contains 45 trios (a set of parents and single child)
• Hypothesis: synergy/antagonism is quantifiable in vitro and heritable

Page 35:

Synergy in the LCL Model

Page 36:

Synergy in the LCL Model

Page 37:

Genetic Etiology of Synergy

Page 38:

Summary

• In vitro assays can be used to assess the genetic component of dose response traits and to perform well-powered GWAS.

• Such new models take careful consideration and experimentation with new statistical approaches to answer biological questions.

[Figure labels: Biology; Methods Development]

Page 39:

Acknowledgments

NCSU: Daniel Rotroff, Kyle Roell, John Jack, Chad Brown, Fred Wright, Paul Gallins, Yihiu Zhou, David Reif

UNC Chapel Hill: Tammy Havener, Tim Wiltshire, Eric Peters, Michael Wagner, Kristy Richards, Paul Gallins, Nour Abdo

Moffitt: Howard McLeod, Kathleen Egan

CHORI: Ron Krauss, Marisa Wong-Medina

Funding: National Cancer Institute: R01 CA161608