Article

In Silico Prescription of Anticancer Drugs to Cohorts of 28 Tumor Types Reveals Targeting Opportunities

Highlights

- Driver genes are comprehensively identified across a large pan-cancer cohort
- In silico prescription links approved or experimental targeted therapies to patients
- Up to 73.3% of patients could benefit from agents in clinical stages
- 80 therapeutically unexploited targetable cancer driver genes are identified

Authors

Carlota Rubio-Perez, David Tamborero, ..., Abel Gonzalez-Perez, Nuria Lopez-Bigas

Correspondence

[email protected]

In Brief

Using a large pan-cancer cohort, Rubio-Perez et al. develop an in silico drug prescription strategy based on driver alterations in each tumor and their druggability options and use it to identify druggable targets and promising repurposing opportunities.

Rubio-Perez et al., 2015, Cancer Cell 27, 382–396, March 9, 2015 ©2015 Elsevier Inc.

http://dx.doi.org/10.1016/j.ccell.2015.02.007



Cancer Cell

Article

In Silico Prescription of Anticancer Drugs to Cohorts of 28 Tumor Types Reveals Targeting Opportunities

Carlota Rubio-Perez,1,4 David Tamborero,1,4 Michael P. Schroeder,1 Albert A. Antolín,2 Jordi Deu-Pons,1 Christian Perez-Llamas,1 Jordi Mestres,2 Abel Gonzalez-Perez,1,5 and Nuria Lopez-Bigas1,3,5,*

1Biomedical Genomics Lab, Research Program on Biomedical Informatics, IMIM Hospital del Mar Medical Research Institute and Universitat Pompeu Fabra, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain

2Systems Pharmacology, Research Program on Biomedical Informatics, IMIM Hospital del Mar Medical Research Institute and Universitat Pompeu Fabra, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain

3Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluís Companys 23, 08010 Barcelona, Spain

4Co-first author

5Co-senior author

*Correspondence: [email protected]

http://dx.doi.org/10.1016/j.ccell.2015.02.007

SUMMARY

Large efforts dedicated to detect somatic alterations across tumor genomes/exomes are expected to produce significant improvements in precision cancer medicine. However, high inter-tumor heterogeneity is a major obstacle to developing and applying therapeutic targeted agents to treat most cancer patients. Here, we offer a comprehensive assessment of the scope of targeted therapeutic agents in a large pan-cancer cohort. We developed an in silico prescription strategy based on identification of the driver alterations in each tumor and their druggability options. Although relatively few tumors are tractable by approved agents following clinical guidelines (5.9%), up to 40.2% could benefit from different repurposing options, and up to 73.3% considering treatments currently under clinical investigation. We also identified 80 therapeutically targetable cancer genes.

INTRODUCTION

Recent advances in DNA sequencing technologies provide unprecedented capacity to comprehensively identify the alterations, genes, and pathways involved in the tumorigenic process, raising the hope of extending targeted therapies against the drivers of cancer from a few successful examples to a broader personalized medicine strategy (Garraway and Lander, 2013; Stratton, 2011). However, large-scale studies confronted with the high degree of inter-tumor heterogeneity have uncovered long catalogs of cancer driver genes (Davoli et al., 2013; Kandoth et al., 2013; Lawrence et al., 2014; Tamborero et al., 2013a; Vogelstein et al., 2013). The goal of using a tailored approach to treat all tumor patients may thus require developing a vast arsenal of anticancer targeted drugs. In addition, advances in our ability to precisely assign the most effective targeted therapy to each patient based on the genomic events driving the tumor are urgently needed. The present study offers a comprehensive assessment of the potential scope of targeted drugs in a large pan-cancer cohort. To pursue this goal, we developed a three-step in silico drug prescription strategy (Figure 1), which included identifying the driver events acting across the cohort, collecting all therapeutic agents targeting them, and connecting each patient to all targeted therapies that could benefit them, thus producing a landscape of the scope of targeted therapeutic agents in the cohort.

Significance

The development of therapies targeting altered driver proteins holds the promise of selectively and efficiently eliminating cancer cells. Nevertheless, the extent of applicability of available anticancer targeted agents is unclear. Exploiting a large pan-cancer therapeutic landscape covering 6,792 tumors, we found that the prescription of approved therapeutic agents following their clinical guidelines could provide treatment only to a small fraction of tumor patients. Nevertheless, we also uncovered promising repurposing opportunities and found that agents in clinical trials could benefit an important fraction of cancer patients upon approval. Furthermore, we identified additional target genes for therapeutic development.

Figure 1. Graphical Summary of the Approach

The first step consists of identifying genomic driver events acting in the tumor cohort. Step two involves finding the drugs targeting the driver protein products, and step three uses the generated information to in silico prescribe drugs to patients on the basis of their genomic driver events. This approach, applied to all patients in the cohort, provides the therapeutic landscape of cancer drivers shown in the middle panel.

RESULTS

Step 1: Identifying Genes Driving Tumorigenesis in 28 Cancer Types

The first step of the personalized in silico drug prescription strategy consists of identifying all the alterations driving tumorigenesis in a patient's tumor. To implement this step, we first collected and analyzed somatic mutations (single-nucleotide variants and short insertions and deletions), copy-number alterations (CNAs), fusion genes, and RNA sequencing (RNA-seq) expression data in 4,068 tumors from 16 studies of The Cancer Genome Atlas (TCGA), representing the same number of tumor types (hereafter, the core cohort). In addition, we collected somatic mutations for 2,724 additional tumors included in 32 exome sequencing studies, so that the full cohort covers 28 cancer types.

The following six sections of results describe this catalog of cancer driver genes in each tumor type (475 genes in total that drive tumorigenesis via mutations, CNAs, or gene fusions), as well as their mode of action and whether their driver mutations are present in the majority of the tumor clonal population. All this information is integrated in the Cancer Drivers Database, available to researchers via our IntOGen web discovery platform (http://www.intogen.org; Gonzalez-Perez et al., 2013a).

We Detect 459 Mutational Driver Genes in the 6,792 Tumors of the Entire Cohort

The identification of mutational drivers is challenging due to both the high heterogeneity in the pattern of somatic mutations observed across tumor samples and the long tail of cancer genes mutated at very low frequency. We applied three methods of analysis designed to detect complementary signals of positive selection in the pattern of somatic mutations in genes across samples (Gonzalez-Perez and Lopez-Bigas, 2012; Lawrence et al., 2013; Tamborero et al., 2013b). The assumption is that mutations in certain genes provide a selective advantage to tumor cells that subsequently grow and proliferate faster; tracing the signals left by the selection across a cohort of tumors identifies the genes driving tumorigenesis (Gonzalez-Perez et al., 2013b). Furthermore, the combination of methods detecting complementary positive selection signals produces a more comprehensive and reliable list of cancer drivers (Tamborero et al., 2013a).

In the 48 cohorts under analysis (Table S1, part A), we identified 459 mutational cancer driver genes acting in one or more tumor types, supporting recent reports that many new cancer genes are yet to be identified (Lawrence et al., 2014; Vogelstein et al., 2013) (Figure 2A). The list contains genes previously identified as cancer drivers (Davoli et al., 2013; Futreal et al., 2004; Kandoth et al., 2013; Lawrence et al., 2014; Tamborero et al., 2013a; Vogelstein et al., 2013), as well as many novel driver candidates (Figures 2B, S1A, and S1B and Table S1, part D), some of which have been previously linked to tumor emergence, progression, or metastasis through an array of different alterations (e.g., NCOR2, MAP3K4, PIK3CB, HDAC9, and NEDD4L). All of the identified drivers behaved similarly in several features in which cancer genes are known to deviate from other genes, such as their abnormally high connectivity in interaction networks, significant enrichment for rare germline variants, and low baseline tolerance to germline variants with functional impact (Figure S1C). Most of these cancer drivers exhibit low mutational frequency in the cohort; 63% were mutated in less than 1% of samples (Table S1, part B). Although some cancer genes drive tumorigenesis across several tumor types, others appear to be specific to certain malignancies (Figure 2, Table S1, part C). Likewise, the number of genes able to drive tumorigenesis varies widely between tumor types, ranging from the long catalogs detected in cutaneous melanoma and breast carcinoma (250 and 184, respectively) to the short lists identified in pilocytic astrocytoma and acute lymphocytic leukemia (2 and 12, respectively) (Figure 2A).

[Figure 2, panels B to E: images not reproduced. Recoverable labels: classical processes (cell cycle and DNA damage, angiogenesis, apoptosis, proliferation, cell adhesion); emerging processes (chromatin regulatory factors, splicing factors, UPS, the ubiquitin proteolysis system); amplified and deleted; 475 drivers divided into mutational drivers, CNA drivers, and driver fusion genes. Axes: number of mutated samples; number of samples.]

Figure 2, panel A:

Tumor type | Tumor type description | Projects | Samples | Mutational driver genes
ALL | Acute lymphocytic leukemia | 3 | 122 | 12
AML | Acute myeloid leukemia | 1 | 196 | 32
BLCA | Bladder carcinoma | 1 | 98 | 156
BRCA | Breast carcinoma | 6 | 1148 | 184
CLL | Chronic lymphocytic leukemia | 2 | 290 | 38
CM | Cutaneous melanoma | 2 | 369 | 250
COREAD | Colorectal adenocarcinoma | 2 | 229 | 95
DLBC | Diffuse large B cell lymphoma | 1 | 23 | 10
ESCA | Esophageal carcinoma | 1 | 146 | 98
GBM | Glioblastoma multiforme | 2 | 379 | 75
HC | Hepatocarcinoma | 2 | 90 | 30
HNSC | Head and neck squamous cell carcinoma | 2 | 375 | 167
LGG | Lower grade glioma | 1 | 169 | 50
LUAD | Lung adenocarcinoma | 2 | 391 | 181
LUSC | Lung squamous cell carcinoma | 1 | 174 | 147
MB | Medulloblastoma | 2 | 210 | 24
MM | Multiple myeloma | 1 | 69 | 18
NB | Neuroblastoma | 1 | 210 | 27
NSCLC | Non small cell lung carcinoma | 1 | 31 | 11
OV | Serous ovarian adenocarcinoma | 1 | 316 | 83
PA | Pilocytic astrocytoma | 2 | 101 | 2
PAAD | Pancreas adenocarcinoma | 3 | 214 | 21
PRAD | Prostate adenocarcinoma | 1 | 243 | 88
RCCC | Renal clear cell carcinoma | 1 | 417 | 105
SCLC | Small cell lung carcinoma | 2 | 69 | 61
STAD | Stomach adenocarcinoma | 2 | 161 | 175
THCA | Thyroid carcinoma | 1 | 322 | 32
UCEC | Uterine corpus endometrioid carcinoma | 1 | 230 | 149
Total | | 48 | 6792 | 459

Figure 2. The Cancer Drivers Database

(A) Table summarizing the number of cohorts and samples analyzed for each tumor type and the number of mutational drivers detected in each.

(B) Frequency of somatic mutations of 177 selected mutational drivers of eight cancer-related biological processes. The top panel contains graphs of processes long known to be involved in cancer; the bottom panel is dedicated to regulatory processes only recently linked to tumorigenesis. Colors in row-stacked histogram follow the legend in (A).

(C and D) Frequency of alterations in (C) CNA and (D) fusion drivers across the core cohort.

(E) Distribution of drivers according to their type of oncogenic alterations.

See also Figure S1 and Table S1.

A Quarter of Mutational Drivers Are Involved in Cell-Regulatory Mechanisms

Most of the identified cancer drivers are involved in well-known cancer pathways (Figure 2B) (Hanahan and Weinberg, 2011; Vogelstein et al., 2013). Within manually selected modules of genes acting in the same biological pathway, these drivers tend to be mutated in a mutually exclusive pattern (Figure S1D and the Supplemental Experimental Procedures). Interestingly, about 25% of the drivers participate in general cell-regulatory processes, which are emerging cancer pathways. Three types of such regulators are widely represented among cancer drivers: chromatin regulatory factors (CRFs), splicing and mRNA processing regulators (SMRs), and members of the ubiquitin-mediated proteolysis system (UPSs). Furthermore, some of the cancer drivers belonging to these groups appear to be mutated at relatively high frequency, such as MLL2, PBRM1, and ARID1A among CRFs, SF3B1 and DDX3X among SMRs, and VHL and FBXW7 among UPSs. With the exception of PBRM1 and VHL, which are specifically linked to the oncogenic process in renal clear cell carcinoma, cancer genes of these three functional groups drive tumorigenesis in several malignancies (Figure 2B). Their role in malignant transformation is most likely mediated by the misregulation they cause in specific cancer genes that are directly linked to oncogenic processes (Brooks et al., 2014; Ferreira et al., 2013; Malhotra et al., 2010; Przychodzen et al., 2013; Romero and Sanchez-Cespedes, 2014).

The Mode of Action of the 459 Mutational Driver Genes

Driver mutations, and by extension the genes that bear them, can be classified into three groups: (1) loss of function (LoF), typical of tumor-suppressor genes whose disrupted function favors tumorigenesis; (2) gain of function (GoF), typical of oncogenes that, when abnormally activated, provide an advantage to the cell; and (3) switch of function (SoF), which provides a new function to the encoded protein that promotes malignant transformation. Identifying the mode of action of drivers is crucial to exploring the opportunities to therapeutically target drivers because, in principle, only proteins with activating (Act) mutations (GoF plus SoF) can be directly targeted with molecules able to inhibit their function. In contrast, LoF genes should be targeted through other strategies, such as the inhibition of a functionally connected gene (Oza et al., 2011), the restoration of their abolished function (Khoo et al., 2014), and the induction of synthetic lethality like the paradigmatic PARP inhibition for tumors with BRCA1/2 mutations (Fong et al., 2009).

We used a random forest classifier, OncodriveROLE, to identify the mode of action of cancer drivers (Schroeder et al., 2014) based on mutation and CNA gene patterns. We found that 169 (36.8%) of the 459 cancer drivers identified in our multi-tumor cohort are Act, while 207 (45.1%) are LoF. The pattern in the remaining 83 (18.1%) drivers was not clear enough to achieve unambiguous classification (Supplemental Experimental Procedures, Figure 3A, and Table S2, part A).

Figure 3. Ancillary Information in the Cancer Drivers Database

(A) Mode of action of mutational driver genes.

(B) Major mutational drivers detected in datasets of the core cohort. Each dot represents a gene placed on the y axis according to the median proportion of the clonal population carrying protein-affecting mutations in it. Orange dots show the major drivers in each cancer type.

(C) Distribution of the number of altered drivers across all samples of the core cohort (left) and across each of its individual datasets (right).

See also Figure S2 and Table S2.

A Limited Number of Drivers Clonally Dominate the Tumors

Not all driver genes are equally important in the course of tumorigenesis. Tumors may be more addicted to mutations in certain drivers, which provide basic capabilities to cancer cells (such as the evasion of apoptosis or uncontrolled proliferation), than to others, which add capacities to the basic tumorigenic phenotype that allow the cells to further progress and migrate (Vogelstein et al., 2013). We hypothesized that mutations in the former genes, which we call major cancer drivers, should dominate the clonal population of these tumors. We found that mutations in 73 of the driver genes were biased toward large clonal frequencies across one or more of the 16 tumor sample cohorts in which this analysis was possible (Figure 3B and Table S2, part B). Major cancer drivers should include both genes responsible for the onset of tumorigenesis and genes harboring mutations that confer such powerful growth advantages to tumor cells that, when sequenced, the clones containing them have outgrown the founder clone. Although many major cancer drivers (45) are already well-established cancer genes and are therefore annotated in the Cancer Gene Census (CGC), others are involved in core cellular regulatory processes related to more recently described cancer processes, such as CRFs, SMRs, and UPSs. Overall, these major cancer drivers, grouped in several functional modules, appear to alter few cellular pathways upon mutation (Figure S2A). These groups of genes therefore constitute interesting targets for anticancer therapies.

We Identify 38 Drivers Acting via CNAs or Fusions in the Core Cohort

To complete the landscape of driver genes acting in the cohort, we also considered those actionable drivers acting via amplifications, deletions, or gene fusions in the 4,068 tumor samples from 16 cancer types in which these data were available (i.e., the core cohort). In each of these datasets, we considered as candidate CNA drivers (1) mutational driver genes in the corresponding tumor type and (2) manually selected targeted genes located within recurrently amplified or deleted genomic segments identified by the GISTIC analysis of the corresponding tumor type (Mermel et al., 2011). In both cases, we only considered as CNA drivers those genes that exhibited CNAs with coherent expression changes, i.e., overexpressed Act drivers and underexpressed LoF drivers (Supplemental Experimental Procedures). As a result, we identified 29 targeted genes putatively driving tumorigenesis in one or more of these malignancies either via amplification (15) or deletion (14) (Figure 2C and Table S1, part E). Nine of these genes were driving cancer exclusively via CNAs: AURKA, CCND2, CCND3, CCNE1, CDK6, CDKN2B, IGF1R, MDM2, and MDM4. The other 20 genes were also mutational drivers (e.g., PTEN and CDKN2A) (Figure 2E).

Finally, we collected a list of ten genes known to drive tumorigenesis upon fusions by manually curating the literature (Figure 2D and Table S1, part F). Only highly reliable fusion events that kept both partners in frame were taken into account for each sample.
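The coherence criterion for CNA drivers described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `CnaCandidate` type, the expression labels, and the pairing of amplification with Act drivers and deletion with LoF drivers are assumptions made for the example, and the candidate values are toy data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CnaCandidate:
    gene: str
    mode: str        # "Act" or "LoF", the driver's mode of action
    cna: str         # "amplification" or "deletion"
    expression: str  # "over", "under", or "unchanged" (hypothetical labels)


def is_coherent_cna_driver(c: CnaCandidate) -> bool:
    """Keep amplified, overexpressed Act genes and deleted, underexpressed LoF genes."""
    if c.mode == "Act":
        return c.cna == "amplification" and c.expression == "over"
    if c.mode == "LoF":
        return c.cna == "deletion" and c.expression == "under"
    return False  # unclassified genes are not called CNA drivers in this sketch


# Toy candidates; the mode and expression values are illustrative only.
candidates = [
    CnaCandidate("MDM2", "Act", "amplification", "over"),
    CnaCandidate("CDKN2A", "LoF", "deletion", "under"),
    CnaCandidate("GENE_X", "Act", "amplification", "unchanged"),  # hypothetical gene
]
cna_drivers = [c.gene for c in candidates if is_coherent_cna_driver(c)]
```

With these toy inputs, only the two candidates whose expression change agrees with the copy-number change are retained.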

The Cancer Drivers Database Identifies Tumorigenic Alterations in 90% of Tumors

Ninety percent of the 4,068 tumor samples in the core cohort bear at least one driver alteration defined in the Cancer Drivers Database. Tumor samples in the core cohort contain a median of three driver events, although the number of events per sample differs widely across tumor types (Figure 3C). These figures are underestimates because of, among other factors, drivers of low recurrence that remain unidentified at the current number of re-sequenced tumor genomes, driver non-coding mutations not assessed by exome-sequencing data, and other types of genomic or epigenomic alterations not considered in the present analysis. When only mutations are considered, the fraction of tumors with identified driver events drops from 90% to 86% in the core cohort and to 82% in the full cohort (Figure S2B).

Step 2: Collecting Drugs Targeting Driver Genes

The second step of the in silico prescription strategy consisted of exhaustively searching for therapeutic options, either approved by the Food and Drug Administration (FDA) or in clinical or pre-clinical studies, to target the mutational, CNA, and fusion driver genes detected in step one. We considered three different plausible approaches to target them: (1) inhibition of activated drivers with therapeutic molecules (direct targeting), such as Vemurafenib, which targets the activating BRAF V600E mutation; (2) inhibition of non-altered proteins functionally connected to the altered driver (indirect targeting), for example Temsirolimus, an MTOR inhibitor undergoing a clinical trial (Oza et al., 2011) to treat cancer patients bearing PTEN-inactivating mutations; and (3) gene therapies designed to compensate for the loss of activity of a tumor-suppressor driver. The process employed to identify these interactions and the ancillary information relevant to their prescription to cancer patients, as well as its results, is described in the following two sections. All this information composes the Cancer Drivers Actionability Database (Supplemental Experimental Procedures), made available to researchers through IntOGen (http://www.intogen.org/downloads).
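The relationship between a driver's mode of action and the three targeting approaches can be written down as a small lookup. The sketch below is illustrative only: the `ALLOWED_APPROACHES` mapping is an assumption drawn from the reasoning in the text (activating drivers can be inhibited directly; loss-of-function drivers need indirect targeting or gene therapy), and the two example rows restate the Vemurafenib and Temsirolimus cases mentioned above.

```python
# Assumed mapping from mode of action to plausible targeting approaches.
# "Act" = activating (GoF or SoF), "LoF" = loss of function.
ALLOWED_APPROACHES = {
    "Act": {"direct", "indirect"},
    "LoF": {"indirect", "gene_therapy"},
}

# The two examples given in the text, encoded as plain records.
EXAMPLES = [
    {"driver": "BRAF", "mode": "Act", "agent": "Vemurafenib", "approach": "direct"},
    {"driver": "PTEN", "mode": "LoF", "agent": "Temsirolimus", "approach": "indirect"},
]


def is_plausible(mode, approach):
    """Return True if the approach is allowed for the given mode of action."""
    return approach in ALLOWED_APPROACHES.get(mode, set())


checks = [is_plausible(e["mode"], e["approach"]) for e in EXAMPLES]
```

Under this mapping, a direct inhibitor proposed for a loss-of-function driver would be rejected, which mirrors the argument that only activated proteins can be inhibited directly.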

Drugs Targeting Cancer Drivers

Our thorough search for anticancer targeted therapeutic options identified 57 FDA-approved agents targeting 51 drivers, 47 agents in clinical trials targeting 66 drivers (26 in addition to the previous 51), and 20,470 ligands potently binding 77 drivers (19 of them absent from the previous two lists), totaling 96 targeted drivers (Figures 4A and 4B and Table S3). Seventy-four of them were exclusively directly targeted, whereas 13 were only targeted indirectly; seven drivers could be targeted both directly and indirectly. We also found two gene therapies aimed at restoring the activity of two drivers (Figure 4C). Drivers targeted by agents following each of the three strategies with different levels of approval in clinical practice are shown in Figure 4D. Finally, we found that 35 other drivers had a protein structure suitable for small molecule binding (named potentially druggable), and 26 could be accessed by antibody and protein therapies (named potentially biopharmable) (Figure 4A).

Of special interest in the catalog of targeted therapies are agents targeting multiple drivers, which could provide a wider therapeutic scope (Hopkins, 2008; Mestres et al., 2008). We observed that 32 out of 57 (56.14%) FDA-approved drugs target multiple drivers (between 2 and 17), stressing the fact that a large proportion of cancer drugs are poorly selective. Multi-target options are likely to increase in the near future, as 23 therapeutic agents in clinical trials and 4,746 preclinical ligands target multiple drivers (Figure S3).
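The driver counts in this section can be cross-checked with simple arithmetic. The short Python sketch below only restates numbers from the text; treating the two gene-therapy drivers as disjoint from the directly and indirectly targeted ones is an assumption, made because it is the reading under which the strategy counts add up to 96.

```python
# Targeted drivers by best level of approval (numbers from the text).
fda_approved = 51
added_by_clinical_trials = 26
added_by_preclinical_ligands = 19
total_targeted = fda_approved + added_by_clinical_trials + added_by_preclinical_ligands

# Targeted drivers by strategy; gene-therapy drivers assumed disjoint from the rest.
direct_only, indirect_only, direct_and_indirect, gene_therapy = 74, 13, 7, 2
total_by_strategy = direct_only + indirect_only + direct_and_indirect + gene_therapy

# Share of FDA-approved agents that hit more than one driver.
multi_target_share = round(100 * 32 / 57, 2)

summary = {
    "total_targeted": total_targeted,
    "total_by_strategy": total_by_strategy,
    "multi_target_share_percent": multi_target_share,
}
```

Both tallies reach 96 targeted drivers, and the multi-target share reproduces the 56.14% quoted above.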

Figure 4. Drugs Targeting Cancer Drivers

(A) Number of driver genes within each unique druggability category. Note that each driver gene is counted only once in this plot and is placed in the category according to the ordered hierarchy.

(B) Number of driver genes that are targets of therapeutic agents with the three levels of approval in clinical practice.

(C) Distribution of driver genes susceptible to the three types of targeting strategies with different levels of approval in clinical practice.

(D) Distribution of driver genes, within each targeting strategy, targeted by agents with the three levels of approval in clinical practice.

See also Figure S3 and Table S3.

Cancer Drivers Actionability Database

We complemented the list of therapeutic anticancer agents with ancillary information on their clinical use and the options for repurposing them and integrated all these data within the Cancer Drivers Actionability Database. The database contains information on the tumor type for which the drug is prescribed (for FDA-approved drugs) or is being tested (for drugs in clinical trials). It also contains rules that take into account the type of driver alteration considered actionable, according to clinical guidelines (e.g., Afatinib is clinically prescribed for the EGFR L858R mutation) and/or according to the specification in clinical trials or in the literature (e.g., Imatinib is being tested for PDGFRA mutations). Additionally, the database considers alterations known to cause primary drug resistance (e.g., KRAS mutations cause resistance to Cetuximab), according to clinical guidelines or clinical or pre-clinical investigations.

The database also defines rules for the possible expansions of the use of approved drugs in cancer treatment. In particular, we have considered six cases where the repurposing of FDA-approved drugs could prove useful. We stratified them into three tiers, according to their feasibility (Supplemental Experimental Procedures). Tier 1 includes (1) tumor type repurposing (use in tumor types different from the one considered in the clinical guidelines) and (2) disease repurposing (agents approved for other diseases that are nevertheless able to target cancer driver proteins). Tier 2 includes (1) alteration repurposing (agents targeting an alteration of a driver that could be employed in the event of another type of alteration of the same gene), (2) indirect targeting repurposing (agents that target a specific oncoprotein but have proven useful in counteracting the alteration in another connected driver), and (3) strong off-target repurposing (agents that bind an off-target driver more potently than their primary, clinically intended targets). Tier 3 includes mild off-target repurposing (agents that bind to off-targets with an affinity less potent than the one for their primary targets, but more potent than 1 µM). We have included in the database rules to filter out 12 known cases of repurposing that have rendered negative results in clinical trials (Table S4, part B), obtained manually from the literature. Furthermore, we have collected information from 98 currently ongoing clinical trials that assay repurposing opportunities considered in our database, some of which have proven successful (Table S4, part A).
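The six repurposing cases and their tiers lend themselves to a small lookup table, sketched below in Python. The case names, the `off_target_case` helper, and the convention of expressing affinities in nM (lower meaning more potent) are assumptions for illustration; the potency cutoff is passed in explicitly rather than fixed, and the 1000 nM value used in the example is only one possible choice.

```python
# Tier of each repurposing case, as stratified in the text (lower tier = more feasible).
REPURPOSING_TIERS = {
    "tumor_type": 1,
    "disease": 1,
    "alteration": 2,
    "indirect_targeting": 2,
    "strong_off_target": 2,
    "mild_off_target": 3,
}


def best_tier(cases):
    """Most feasible tier among the repurposing cases open to a patient, or None."""
    return min((REPURPOSING_TIERS[c] for c in cases), default=None)


def off_target_case(off_target_nm, primary_nm, cutoff_nm):
    """Classify an off-target interaction; affinities in nM, lower is more potent."""
    if off_target_nm < primary_nm:
        return "strong_off_target"
    if off_target_nm < cutoff_nm:
        return "mild_off_target"
    return None  # too weak to count as a repurposing opportunity


# Example: an agent with 20 nM affinity for its primary target,
# evaluated against a hypothetical cutoff of 1000 nM.
example_cases = [
    off_target_case(5.0, 20.0, 1000.0),
    off_target_case(200.0, 20.0, 1000.0),
    off_target_case(5000.0, 20.0, 1000.0),
]
```

Keeping the tier assignment in one table makes it easy to report, for each patient, the most feasible repurposing option among all that apply.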

Step 3: Connecting Targeted Therapeutic Agents to Patients

The third step of the in silico prescription strategy consists of determining which targeted therapies could benefit a patient (Figure 1). To make this decision, we select the driver alterations in the tumor on the basis of the Cancer Drivers Database. Then, therapies are matched to those alterations taking into account the interactions and rules in the Cancer Drivers Actionability Database.

We applied this procedure to the 4,068 patients of the core cohort, taking into account the observed somatic mutations, CNAs, and fusion events (Table S4, part D), and to the full cohort of 6,792 patients considering only somatic mutations (Table S4, part C). The obtained therapeutic landscapes allowed us to answer (1) what fraction of the patients with these tumor types could benefit from currently available targeted therapies, (2) what fraction of them might benefit from targeted therapies currently in clinical trials or in pre-clinical stages, and (3) what fraction of them may receive multiple agents aimed at improving their response to therapy. The following five sections describe these targeted therapeutic landscapes and answer these questions.
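The matching step can be illustrated with a toy rule table. This Python sketch is not the authors' implementation: the rule format, the `prescribe` function, and the Cetuximab rule keyed on EGFR amplification are hypothetical, while the BRAF, ERBB2, and KRAS resistance examples echo cases mentioned in the text.

```python
# Each rule links a drug to an actionable alteration, given as (gene, alteration type).
RULES = [
    {"drug": "Vemurafenib", "gene": "BRAF", "alteration": "mutation"},
    {"drug": "Trastuzumab", "gene": "ERBB2", "alteration": "amplification"},
    {"drug": "Cetuximab", "gene": "EGFR", "alteration": "amplification"},  # hypothetical rule
]

# Alterations that cause primary resistance to a drug.
RESISTANCE = {"Cetuximab": {("KRAS", "mutation")}}


def prescribe(tumor_alterations):
    """Return drugs whose rule matches a driver alteration and that face no resistance."""
    alterations = set(tumor_alterations)
    drugs = []
    for rule in RULES:
        if (rule["gene"], rule["alteration"]) not in alterations:
            continue  # the tumor lacks the actionable alteration
        if RESISTANCE.get(rule["drug"], set()) & alterations:
            continue  # a resistance alteration rules the drug out
        drugs.append(rule["drug"])
    return drugs


braf_tumor = prescribe([("BRAF", "mutation")])
kras_tumor = prescribe([("EGFR", "amplification"), ("KRAS", "mutation")])
```

In this toy setting the BRAF-mutant tumor is matched to Vemurafenib, whereas the tumor carrying a KRAS mutation receives no Cetuximab prescription despite matching its rule.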

Approved Targeted Therapies Could Benefit 5.9% of the Patients in the Core Cohort

Of the 4,068 samples of the core cohort, only 241 are prescribed FDA-approved drugs in silico following their primary indications. In other words, only 5.9% of the patients represented by this pan-cancer cohort could be treated with targeted therapies directed to modulate the activity of mutated, amplified, deleted, or fused driver proteins under current clinical guidelines (Figure 5). All samples tractable with approved targeted approaches are concentrated in six tumor types: cutaneous melanoma, colorectal adenocarcinoma, head and neck squamous cell carcinoma, renal clear cell carcinoma, breast carcinoma, and acute myeloid leukemia (Figure 6A). Most of these (101) are breast carcinoma samples bearing ERBB2 amplifications, with in silico prescriptions for Lapatinib, Pertuzumab, and Trastuzumab (Figure 6B).

Page 9: In Silico Prescription of Anticancer Drugs to Cohorts of ...bg.upf.edu/group/img/1-s2.0-S1535610815000574-main.pdfStep two involves finding the drugs targeting the driver protein

A B

Figure 5. Therapeutic Landscape of the Core Cohort
Sets of patients susceptible to different types of interventions are shown both in their overlaps (A) and as non-overlapping groups following a hierarchy (B) (i.e., each patient is uniquely labeled in one category following the ordered hierarchy). The same classification for the full cohort is available in Figure S4A. See also Table S4.
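The non-overlapping grouping described in the caption can be illustrated with a minimal sketch. It assumes each intervention category is a set of patient identifiers and that categories are ordered from most to least feasible; the category names, patient IDs, and the function `assign_by_hierarchy` are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Set


def assign_by_hierarchy(ordered_groups: List[str],
                        membership: Dict[str, Set[str]]) -> Dict[str, str]:
    """Label each patient with the first (highest-priority) group it belongs to."""
    label: Dict[str, str] = {}
    for group in ordered_groups:                  # highest-priority category first
        for patient in membership.get(group, set()):
            label.setdefault(patient, group)      # keep an earlier label if present
    return label


# Hypothetical toy data: names are illustrative only.
ORDER = ["approved_guidelines", "repurposing", "clinical_trials", "preclinical"]
MEMBERSHIP = {
    "approved_guidelines": {"P1"},
    "repurposing": {"P1", "P2"},
    "clinical_trials": {"P2", "P3"},
    "preclinical": {"P3", "P4"},
}
labels = assign_by_hierarchy(ORDER, MEMBERSHIP)
```

Each patient ends up in exactly one category, whereas the overlapping view simply reports the size of every set in `MEMBERSHIP`.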

When only drugs targeting mutational drivers in patients of the

full cohort are considered (Figure S4), there is a substantial

decrease in the proportion of tumors susceptible of treatment

with FDA-approved drugs according to clinical guidelines

(2.9%). The contrast between these two results highlights the

significance of exploring driver CNAs and fusions when making

clinical decisions on cancer patients. In any case, the proportion

of samples benefiting from FDA-approved drugs following clin-

ical guidelines is still slim.

Drug Repurposing Increases up to 40.2% the Fraction of Patients that Could Benefit from FDA-Approved Drugs

Drug repurposing is a promising strategy to maximize the benefits of currently available drugs (Gupta et al., 2013). The in silico drug prescription procedure assigned FDA-approved drugs according to one or more of the repurposing strategies considered here to 1,635 patients of the core cohort (40.2%) (Figure 5A).

As mentioned above, tumor type repurposing was divided into three tiers according to the feasibility of the therapy. Repurposing opportunities in tier 1 were mostly tumor type repurposing cases (835 out of the 838 patients benefiting from opportunities in the tier). Moreover, 697 of these patients could not benefit from approved drugs according to guidelines (Figure 5B). Most of these opportunities were concentrated in three cancer types, amounting to 182 thyroid carcinoma, 168 glioblastoma, and 56 lung adenocarcinoma tumors (Figures 6A and 6B). Kinase inhibitors represent the highest number of tumor type repurposing opportunities among all drug families, including cases of multiple in silico drug prescriptions for one tumor sample.

Repurposing opportunities in tier 2 could benefit 867 patients,

out of which 587 could not be linked to more feasible therapies.

FDA-approved drugs through indirect targeting repurposing

were assigned to 692 patients. Most of these samples were colo-

rectal (120), uterine carcinomas (112), and glioblastomas (99).

Deletions and/or truncating mutations of PTEN (377 samples),

NF1 (148), and APC (138) received the largest share of these

indirect opportunities (Figure 7A). On the other hand, alteration

repurposing opportunities were concentrated in glioblastoma

patients (76). Most of these patients (70) could receive Lapatinib

tosylate, prescribed for ERBB2 overexpression, to target EGFR

mutations.

Lastly, repurposing opportunities in tier 3 could be beneficial

for 1,153 patients, 110 of which could not be prescribed

more feasible therapies. In summary, the fraction of samples

benefiting from FDA-approved drugs prescribed in silico in-

creases markedly (from 5.9% to 40.2%) if repurposing opportu-

nities are considered, with important differences between tumor

types (Figure 6). Nevertheless, the usefulness of repurposing to

treat cancer patients is slimmer if only somatic mutations are

considered, from 2.9% to 24.7% in the full cohort (Figure S4).

1,346, or 33.1% Further Patients in the Cohort Could Benefit from Drugs Currently Under Clinical Trials

When targeted therapies currently in clinical trials are also

considered, the number of tractable tumors increases to 2,981

(73.3%) in the core cohort (Figure 5). Among them, 1,672 sam-

ples were in silico-prescribed drugs in clinical trials directly

targeting their driver genes, 1,254 were found susceptible of

undergoing gene therapies correcting for the LoF of a tumor-

suppressor gene, and 1,284 could receive drugs in clinical trials

indirectly targeting their driver alterations (Figure 6B). The impor-

tance of these drugs will prove paramount to patients suffering

from certain malignancies, such as lower-grade glioma, ovarian

cancer, and head and neck carcinoma, 76.2%, 69.1%, and

55.6% of whom, respectively, could benefit from agents in

clinical trials, but not from any other FDA-approved targeted

therapies. These numbers are explained, to a great extent, by

TP53 (Figure 7B), which is the subject of a gene therapy that

can compensate for its LoF mutations (Buller et al., 2002) and

is altered in 1,250 samples. The overall fraction of patients

benefiting exclusively from drugs in clinical trials is almost equally significant in the full cohort (34.9%; Figure S4).

Figure 6. The Landscape of Targeted Therapies in Patients of the Core Cohort
(A) Distribution of patients across tumor types of therapeutic interventions defined in the text (non-overlapping groups as in Figure 4C) in each cohort.
(B) Number of patients susceptible to each type of therapeutic intervention (overlapping as in Figure 4B) in each cohort. The bars at the right of the heatmap illustrate the total number of patients benefiting from each type of therapeutic intervention.
(C) Distribution of patients susceptible to receive different numbers of targeted therapeutic agents, based on their altered drivers, in each cohort (left) and across the entire core cohort (right). (Only drivers targeted by FDA-approved agents and agents in clinical trials were considered.)
See also Figure S4.

Finally, 2,055 patients (50.5%) were assigned ligands currently

in preclinical stages, and only 58 of them are not prescribed

agents either in clinical guidelines or FDA approved; the tumors

of 68 and 34 other patients have protein-affecting mutations in

potentially druggable and biopharmable drivers, respectively

(Figure 5).
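The percentages quoted in the preceding sections follow directly from the patient counts and the size of the core cohort (4,068 samples). The short check below reproduces them; the helper `percent` is an illustrative name, and the only inputs are counts stated in the text.

```python
CORE_COHORT = 4068  # samples in the core cohort, as stated in the text


def percent(n: int, total: int = CORE_COHORT) -> float:
    """Share of the cohort, in percent, rounded to one decimal."""
    return round(100.0 * n / total, 1)


approved = percent(241)             # FDA-approved drugs, primary indications
repurposed = percent(1635)          # FDA-approved drugs, any repurposing strategy
with_trials = percent(2981)         # approved drugs plus agents in clinical trials
further_patients = 2981 - 1635      # patients gained only through clinical-trial agents
further = percent(further_patients)
preclinical = percent(2055)         # patients assigned pre-clinical ligands
```

The difference between the 2,981 tractable tumors and the 1,635 patients reachable with FDA-approved drugs gives the 1,346 further patients named in the section heading.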

39% of Patients in the Core Cohort Could Receive Combination Therapies against Multiple Drivers

The combination of multiple agents is probably the most intuitive

strategy to improve the efficacy of the response of tumors to

treatment and to diminish their resistance to targeted therapies

(Al-Lazikani et al., 2012; Pemovska et al., 2013). The in silico

drug prescription approach identified 1,598 patients (39%) of

the core cohort with sets of altered drivers susceptible of being

targeted with combinations of therapeutic agents both approved

and undergoing clinical trials (Figure 6C). In other words, an

important fraction of the patients in the cohort could benefit

from existing or future combination therapies, although the ma-

jority of patients could currently only be prescribed a single tar-

geted agent. The landscape of available combinatorial targeting

strategies varies per tumor type (Figure 6C).

Polypharmacology could, in principle, achieve results similar

to drug combinations targeting more than one altered driver in

a tumor, without the potential problem of added toxicity. Sixty-

nine samples presented mutations in at least two drivers that

could be simultaneously targeted by the same FDA-approved

drug. Up to 451 (11.1%) patients would benefit from polyphar-

macology if the drugs in clinical trials are also considered. These

results highlight the need to develop new and smarter anticancer

therapies and strategies to expand the landscape of combina-

tion therapies.
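The polypharmacology criterion (one drug that targets at least two altered drivers in the same sample) can be sketched as a set intersection. This is a minimal illustration under the assumption that drug targets and altered drivers are available as sets of gene symbols; the drug and gene names below are placeholders, not entries of the Cancer Drivers Actionability Database.

```python
from typing import Dict, Set


def polypharmacology_hits(altered_drivers: Set[str],
                          drug_targets: Dict[str, Set[str]],
                          min_targets: int = 2) -> Dict[str, Set[str]]:
    """Return drugs that target at least `min_targets` altered drivers of a sample."""
    hits: Dict[str, Set[str]] = {}
    for drug, targets in drug_targets.items():
        shared = altered_drivers & targets   # drivers altered in the sample AND hit by the drug
        if len(shared) >= min_targets:
            hits[drug] = shared
    return hits


# Hypothetical toy data.
DRUG_TARGETS = {
    "drug_X": {"GENE_A", "GENE_B", "GENE_C"},
    "drug_Y": {"GENE_C"},
}
sample_drivers = {"GENE_A", "GENE_B", "GENE_D"}
hits = polypharmacology_hits(sample_drivers, DRUG_TARGETS)
```

Counting the samples for which `hits` is non-empty would give a figure analogous to the 69 and 451 samples reported above.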

Priority Gene Candidates for Drug Development

In our thorough survey of anticancer drugs and their targets, we

detected 19 non-LoF drivers tightly bound by preclinical small

molecules that are not currently targeted by any drugs that are

FDA-approved or in clinical trials (Figure 4A). Moreover, 61 other

non-LoF drivers present properties that make them potentially

druggable or biopharmable (Figure 8). These 80 drivers are

altered in 2,730 (67.1%) samples across the cohort. Interest-

ingly, 63 of them are not well-established cancer genes

(not included in the CGC). Six of them, bearing protein affecting

mutations in 370 (5.4%) samples, are major drivers, a fact

that makes them particularly promising as targets for drug

development.

Focusing on the 25 drivers in this list that bear mutations in more than 5% of tumors of at least one cancer type, which probably constitute the first-line candidates for drug development, we discovered 13 cancer genes that are not well-established (Table S5 describes ten of them in detail). At least four of these targets (SVEP1, PCDH18, PTPRU, and FAT2) function in biological processes related to cell adhesion and migration, while three others (KALRN, MACF1, and TRIO) modulate changes in the organization of the cytoskeleton. Another gene, TAF1, is involved in the regulation of the cell cycle: it encodes one component of the transcription initiation complex and is thus essential for the progression of the cell cycle into G1. The possibility to develop therapeutic agents targeting cell-cycle mediators is exciting because most drivers affecting this process lose their function upon mutation. Other potentially actionable drivers encode a component of a ubiquitin-mediated proteolysis complex (SPOP), a cell membrane receptor involved in exocytosis (LPHN2), and an enzyme involved in the pyrimidine synthesis pathway (CAD) (Figure 8). CTNNB1 presents especially appealing opportunities to develop targeted therapeutic agents to treat an important fraction of the patients suffering from uterine carcinomas (28%) and medulloblastomas (12%), two tumor types with few currently available targeted therapeutic options.
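The selection of first-line candidates (drivers mutated in more than 5% of tumors of at least one cancer type) amounts to a per-tumor-type frequency filter. The sketch below assumes per-gene mutated-sample counts and per-type cohort sizes are available as dictionaries; the gene names, tumor-type names, and counts are invented for illustration.

```python
from typing import Dict, List


def first_line_candidates(mutated: Dict[str, Dict[str, int]],
                          cohort_size: Dict[str, int],
                          threshold: float = 0.05) -> List[str]:
    """Genes whose mutation frequency exceeds `threshold` in at least one tumor type.

    mutated:     {gene: {tumor_type: number of mutated samples}}
    cohort_size: {tumor_type: number of samples}
    """
    selected = []
    for gene, per_type in mutated.items():
        if any(n / cohort_size[t] > threshold for t, n in per_type.items()):
            selected.append(gene)
    return sorted(selected)


# Hypothetical toy data.
MUTATED = {
    "GENE_A": {"TYPE_1": 12, "TYPE_2": 1},   # 12% in TYPE_1, passes
    "GENE_B": {"TYPE_1": 2, "TYPE_2": 3},    # 2% and 1.5%, fails
}
COHORT_SIZE = {"TYPE_1": 100, "TYPE_2": 200}
candidates = first_line_candidates(MUTATED, COHORT_SIZE)
```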

DISCUSSION

The availability of the sequences of thousands of tumor ge-

nomes has raised the hope of being able to identify all cancer

driver events that could be targeted by existing and novel

drugs. This in turn is expected to extend targeted therapies

from a few successful examples to a toolbox of tailored drugs

that deliver the promise of personalized cancer medicine. In

this work, exploiting one of the largest available tumor cohorts

(6,792 samples), we sought to (1) determine the scope of appli-

cability of anticancer targeted drugs and (2) propose ranked

lists of genes to develop novel targeted drugs. Pursuing these

goals, we devised an in silico prescription strategy of targeted

therapeutic agents. The strategy includes the development of

a Cancer Drivers Database, containing lists of genes driving

tumorigenesis upon mutations, CNAs, and fusion events in

different tumor types, and a Cancer Drivers Actionability Data-

base, containing a comprehensive set of current and prospec-

tive anticancer targeted agents and sets of rules to prescribe

them to patients. Connecting targeted agents to patients in

the cohort, we obtained its therapeutic landscape as a snap-

shot in time, which will be improved with more updated Cancer

Drivers Database and Cancer Drivers Actionability Database in-

formation as our knowledge of driver genes and anticancer

therapies progresses.

The analysis presented here has some limitations inherent to

the nature of the data employed. The identification of driver

genes is limited by three main hurdles. First, our analysis does

not include non-coding mutations and methylation changes.

Second, lowly recurrent drivers not well represented in the

analyzed cohort remain unidentified. Third, the data employed

to detect drivers may contain intrinsic errors, like those intro-

duced by the calling of somatic mutations.

Also, we relied on proxy features to predict the mode of action

of each driver as Act or LoF, with remarkably good results for the

CGC; however, 83 drivers of our list could not be unambiguously

classified, especially those whose mutation pattern could

not be assessed due to their low frequency. Furthermore, we

acknowledge that some genes exhibit divergent modes of action

in different cancer types (Schroeder et al., 2014).

The analysis of clonal frequency is limited by the use of low-

coverage sequencing data. These were suitable, however, to

our endeavor of identifying major drivers, whose therapeutic

intervention could exhibit a larger benefit. The analysis is based

on detection of genes bearing mutations biased toward larger

clonal frequencies across each tumor cohort, which is less

sensitive to the accuracy of the data as compared to other analyses aimed to reconstruct the exact clonal architecture of each tumor.

Figure 8. Mutation Frequency of Cancer Driver Genes that Are Bound by Pre-clinical Ligands or Are Potentially Druggable or Biopharmable
[Heatmap not reproduced. Axes: tumor types (ALL, AML, BLCA, BRCA, CLL, CM, COREAD, DLBCL, ESCA, GBM, HC, HNSC, LGG, LUAD, LUSC, MB, MM, NB, NSCLC, OV, PA, PAAD, PRAD, RCCC, SCLC, STAD, THCA, UCEC) and driver genes grouped as bound by a pre-clinical ligand, potentially druggable, or potentially biopharmable. Annotations: CGC gene, major driver gene, total mutated samples. Color scale: mutation frequency (scale marks 0, 0.1, 1).]
See also Table S5.

Finally, due to the inevitable incompleteness of drug-target

interaction databases, the results we present here are an under-

estimation of the actual number of targeted drivers (Mestres

et al., 2008).

Our in silico prescription approach, based solely on actionable

alterations in drivers, should be treated as an approach to esti-

mate the maximum applicability of direct targeted therapies in

the patients represented in our cohort. The rationale presents

five main limitations: (1) all coherent alterations in driver genes

(protein affecting mutations, amplifications, deletions, and

fusions) acting in the corresponding tumor type are considered

driver events, although some may well be passengers; (2)

intra-tumor heterogeneity in each individual sample is not taken

into account for the prescription of drugs due to limitations of low

coverage sequencing data, although it is conceivable that tar-

geting subclonal driver events may have lower efficacy or even

lead to paradoxical effects in some cases (McGranahan and

Swanton, 2015); (3) the Cancer Drivers Actionability Database

considers resistance biomarkers of targeted drugs, although

this is limited by the current landscape of already known bio-

markers of primary response; (4) although a set of known nega-

tive repurposing cases are eliminated by the in silico prescription

algorithm, these are limited to published results of clinical trials;

and (5) the approach does not take into account specificities

pertaining to the mode of administration of each drug that may limit their repurposing opportunities.

Figure 7. Examples of Targeted Drivers in the Core Cohort
(A) The top 20 genes ranking at the top of the number of patients tractable with FDA-approved targeted agents across the entire core cohort. The heatmap at the left illustrates the numbers of samples with alterations of each gene in each tumor type separately. See Figure S5A for the complete heatmap of the targeted genes.
(B) Same as (A) for genes ranking at the top of the number of patients tractable with agents in clinical trials. See Figure S5B for the complete heatmap of the targeted genes.
See also Figure S5.

Despite the aforementioned limitations, the approach presented here constitutes the proof of principle of a straightforward strategy to intelligently exploit cancer genome information to aid toward precise cancer medicine (Garraway, 2013). This strategy, which combines a Cancer Drivers Database and a Cancer Drivers Actionability Database by means of an in silico prescription method, presents two unique features.

First, our approach considers which alterations are drivers in each cancer type. It therefore diverges from other studies that start with a list of pre-defined cancer proteins and consider prescribing all drugs targeting their observed alteration regardless of their importance in the corresponding tumor type (Van Allen et al., 2014). The intervention only on driver events avoids overprescription (Figure S4D). The information of all driver alterations in the study cohort, provided via the IntOGen resource (Gonzalez-Perez et al., 2013a), can be exploited for other purposes by the cancer research community, such as the support for interpretation of newly sequenced tumor genomes or the design of optimal sequencing panels.

Second, targeted therapies in the Cancer Drivers Actionability Database comprise manually curated anticancer drugs currently in use in the clinic or currently under clinical trials, as well as automatically retrieved ligands that bind to cancer genes and are candidates for prospective drug development. Another feature of this database is the explicitly defined rules that guide

the different types of repurposing. In these two respects, it is

different from other existing resources that compile anticancer

drugs and their interactions with cancer proteins (Van Allen

et al., 2014; Halling-Brown et al., 2012).

The in silico drug prescription based on these two resources

was tested on one of the largest cohorts of tumor samples

currently collected for research. The main result highlights the current scope of targeted anticancer therapies and its prospects for

growth in view of the drugs that are currently in clinical trials or at

pre-clinical stages. The second important output of this work is a

ranked list of additional target opportunities for anticancer drug

development. Continuous update of drug-target interaction infor-

mation, as well as the application of the strategy to larger cohorts,

will improve the in silico prescription rules contained within the

Cancer Drivers and Cancer Drivers Actionability Databases, thus

enhancing its usefulness within personalized cancer medicine.

EXPERIMENTAL PROCEDURES

Somatic Alterations Data

We obtained the lists of somatic mutations detected across tumor samples of

48 individual cohorts from three main sources: TCGA pan-cancer (Weinstein

et al., 2013) (syn1710431 and syn300013), the ICGC Data Coordination Centre

(Hudson et al., 2010), and supplementary material of other published large

cancer cohort publications. Details on all datasets are provided in Table S1,

part A. The datasets were filtered as previously published (Tamborero et al.,

2013a) to eliminate possible false-positive somatic variants. Gene expression

and copy-number data of the 16 TCGA cancer projects were obtained from the

Synapse platform (syn300013).

Identification of Mutational Cancer Drivers

We searched for three complementary signals of positive selection in the

mutational pattern of genes across the samples of each individual cohort

of tumors and across the TCGA pan-cancer cohort. To that effect, we em-

ployed three methods designed to detect (1) abnormal accumulation of

high-impact mutations across samples (OncodriveFM [Gonzalez-Perez and

Lopez-Bigas, 2012]), (2) abnormal clustering of mutations in regions of the

protein sequences (OncodriveCLUST [Tamborero et al., 2013b]), and (3)

the recurrence of mutations above the background mutation rate (MutSigCV

[Lawrence et al., 2013]). The lists of genes identified by these methods were

combined as described in Tamborero et al. (2013a) and the Supplemental

Experimental Procedures to obtain the final list of mutational drivers acting

in each tumor type.

Classification of Mutational Drivers According to their Mode of

Action

We used OncodriveROLE (Schroeder et al., 2014), a random forest classifier-

based tool trained on several parameters related with the alteration pattern of

genes across samples in the TCGA pan-cancer cohort, to predict whether

driver genes act as LoF or activating (GoF/SoF) drivers. The training, test,

and parameters employed in this classification are detailed in Schroeder

et al. (2014) and the Supplemental Experimental Procedures.

Identification of Major Mutational Drivers

We took the ratio between the number of reads supporting the mutant and the reference alleles as an estimator of the variant allele frequency (VAF) of the mutation. Genes within amplified chromosomal segments were excluded since it is unknown to which extent they affect the mutant and wild-type allele counts. Several measurements were applied to correct the VAF values to take into account heterozygous deletions, copy-neutral loss-of-heterozygosity events, and differences in tumor purity between samples. We computed the observed trend of each driver to bear mutations with higher VAFs. We considered as major drivers of each tumor cohort those that exhibited a significant bias toward a higher VAF compared to other genes (Supplemental Experimental Procedures).
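As a rough illustration of the VAF estimate, the sketch below computes the fraction of reads supporting the mutant allele and a simple descriptive shift between a gene's VAFs and a background. Two assumptions are made here and are not taken from the paper: the ratio is interpreted as mutant reads over mutant plus reference reads, and the shift is a plain difference of medians. The corrections for copy number and purity, and the actual significance test, are described in the Supplemental Experimental Procedures and are not reproduced.

```python
from statistics import median
from typing import Sequence


def vaf(alt_reads: int, ref_reads: int) -> float:
    """Uncorrected variant allele frequency: mutant reads over all covering reads."""
    total = alt_reads + ref_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return alt_reads / total


def median_vaf_shift(gene_vafs: Sequence[float],
                     background_vafs: Sequence[float]) -> float:
    """Descriptive only: positive values suggest the gene's mutations sit at higher VAFs."""
    return median(gene_vafs) - median(background_vafs)


# Hypothetical read counts and VAF values.
example_vaf = vaf(alt_reads=30, ref_reads=70)
shift = median_vaf_shift([0.40, 0.50, 0.45], [0.10, 0.20, 0.30])
```

A gene would only qualify as a major driver if such a bias were statistically significant across the cohort, which this sketch does not test.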


Identification of CNA and Fusion Drivers

To identify genes putatively driving tumorigenesis in TCGA cohorts via CNAs,

we selected known cancer genes with targeted drugs found within recurrently

altered segments according to GISTIC (Mermel et al., 2011) and genes identi-

fied as mutational drivers by the aforementioned analysis per each cohort.

Only those LoF drivers exhibiting homozygous deletions and concomitant

downregulation, and non-LoF drivers with multi-copy amplifications and that

were upregulated, were included as putative CNA drivers. LoF genes that appeared to be heterozygously deleted and mutated in the same sample were

also included (Supplemental Experimental Procedures). In parallel, a list of

genes known to cause malignization via fusions was manually retrieved from

the literature. Then, we interrogated the TCGA Fusion Gene Data Portal (Yosh-

ihara et al., 2014) to obtain a list of the samples exhibiting in-frame fusions of

these genes with any partner in the core cohort (Supplemental Experimental

Procedures).
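The inclusion rules for CNA driver events stated above can be written as a small predicate. This is a sketch under assumptions: the string labels for mode of action, copy-number state, and expression change are hypothetical encodings, and the upstream calls (GISTIC segments, expression comparison) are taken as given.

```python
from dataclasses import dataclass


@dataclass
class GeneState:
    mode: str        # "LoF" or anything else for non-LoF (hypothetical labels)
    copy_state: str  # "homdel", "hetdel", "amp_multi", or "neutral" (hypothetical labels)
    expression: str  # "down", "up", or "unchanged" (hypothetical labels)
    mutated: bool    # protein-affecting mutation in the same sample


def is_cna_driver_event(g: GeneState) -> bool:
    """Apply the three inclusion rules described in the text."""
    if g.mode == "LoF":
        # Rule 1: homozygous deletion with concomitant downregulation.
        if g.copy_state == "homdel" and g.expression == "down":
            return True
        # Rule 3: heterozygous deletion plus a mutation in the same sample.
        return g.copy_state == "hetdel" and g.mutated
    # Rule 2: non-LoF driver with multi-copy amplification and upregulation.
    return g.copy_state == "amp_multi" and g.expression == "up"
```

For example, `is_cna_driver_event(GeneState("LoF", "hetdel", "unchanged", True))` evaluates to `True`, while an amplified LoF gene is rejected.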

All driver genes in each tumor type of the cohort (Supplemental Experi-

mental Procedures), the evidence supporting their involvement in oncogenesis

and their roles in malignization—and consequently the type of their somatic

alterations relevant to treatment—and their relative importance within the

clones making the tumor were collected into the Cancer Drivers Database.

Curation of Drug-Target Interactions

We considered three different targeting strategies: gene therapies and direct

and indirect targeting. Moreover, we grouped therapeutic agents into three categories: FDA-approved agents, agents in cancer clinical trials, and pre-clinical ligands.

Direct targeting interactions were mainly retrieved from ChEMBL (v18)

(Gaulton et al., 2012), a manually curated chemical database of bioactive

molecules. We took into consideration all interactions with affinity stronger than 1 μM. For FDA-approved drugs, we distinguished three different types of targets: primary (as specified in FDA label, according to ChEMBL), strong off target (interacting more potently than the primary targets of the drug), and mild off target (more potent than 1 μM but less than the primary target). We also

included some direct interactions from other sources (Supplemental Experi-

mental Procedures). Indirect targeting candidates were mainly retrieved

from the TARGET database (Van Allen et al., 2014). However, interactions

with molecules were obtained from literature, expert annotation, the drug’s

FDA label, or ClinicalTrials.gov (http://clinicaltrials.gov/). We also included in-

direct targeting of ligands by therapeutic agents targeting their receptors.

Gene therapies were included from records in The Journal of Gene Medicine

Clinical Trial site (Ginn et al., 2013), a comprehensive source of information

on worldwide gene therapy clinical trials.
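The three target types distinguished for FDA-approved drugs can be expressed as a comparison of potencies. In the sketch below, affinities are given in nM (lower means more potent), the cut-off is a parameter set to an assumed 1,000 nM, and ties with the primary target are treated as mild off target; these choices and the function name are assumptions made for illustration.

```python
from typing import Optional


def classify_target(affinity_nm: float,
                    primary_affinity_nm: float,
                    is_label_target: bool,
                    cutoff_nm: float = 1000.0) -> Optional[str]:
    """Classify a drug-target interaction; return None if it is weaker than the cut-off."""
    if is_label_target:
        return "primary"                 # target specified in the drug label
    if affinity_nm >= cutoff_nm:
        return None                      # not potent enough to be considered
    if affinity_nm < primary_affinity_nm:
        return "strong off target"       # more potent than the primary target
    return "mild off target"             # passes the cut-off, weaker than the primary
```

With a primary target at 50 nM, an off-target interaction at 5 nM is classified as strong, one at 200 nM as mild, and one at 5,000 nM is discarded.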

Next we added information for the tumor type of clinical prescription (for

FDA-approved drugs, manually curated) and the tumor type of study (for

drugs in clinical trials). We incorporated rules that take into account specific

genomic dependencies for targeting, according to clinical guidelines and/or according to clinical trials or literature, either in the targeted gene or others. Rules

were retrieved by compiling information from FDA labels, ClinicalTrials.gov,

and the literature. Additionally, we also considered biomarkers of primary

resistance from Gene Drug Knowledge Database (Dienstmann et al., 2014),

either in the FDA label or in NCCN guidelines, clinical trials, or case reports.

Moreover, we defined a set of rules to broaden the scope of application of

FDA-approved drugs as different types of repurposing (see the Results). We

manually searched the literature for negative and positive results of clinical

trials essaying the repurposing of opportunities in tier 1 (Table S4).

We also labeled genes as potentially druggable if they were likely to bind

small molecules according to the structure of their protein products and as

potentially biopharmable if they could be targeted with antibodies or protein

therapy (Supplemental Experimental Procedures). All of the information on

the three types of targeting strategies considered, the level of approval of

each agent, their clinical guidelines, and the possibilities to broaden their

scope of application were incorporated into the Cancer Drivers Actionability

Database.

Connecting Therapeutic Agents to Patients

Patients in the cohort were connected to all targeted therapies susceptible of

treating them. To do this, first we enumerated all relevant alterations—in a

sample of the core cohort or mutations in a sample of the full cohort—affecting


drivers of the tumor type of the sample, based on the information in the Cancer

Drivers Database. Then, all targeted therapies suitable to treat each driver

alteration in the sample were automatically retrieved from the Cancer Drivers

Actionability Database, respecting the specific requirements defined for

each therapeutic agent (of genomic alterations, tumor type prescription, and

drug primary resistances). In the absence of such specific information, we con-

nected the targeted drug to any relevant alteration of the target driver in the

sample in question.
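The matching procedure described in this section can be sketched as a rule lookup. The data model below is hypothetical: a `Rule` record with optional requirements stands in for an entry of the Cancer Drivers Actionability Database, each gene carries a single alteration type per sample, and resistance biomarkers are checked against any altered gene of the sample. When a requirement is absent (`None`), the drug is connected to any relevant alteration of the target driver, as stated in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Rule:
    drug: str
    target: str
    alterations: Optional[Set[str]] = None   # required alteration types; None means any
    tumor_types: Optional[Set[str]] = None   # allowed tumor types; None means any
    resistance_genes: Set[str] = field(default_factory=set)


def prescribe(tumor_type: str,
              sample_alterations: Dict[str, str],
              drivers_by_type: Dict[str, Set[str]],
              rules: List[Rule]) -> List[str]:
    """Return the drugs whose rules match the driver alterations of one sample."""
    drivers = drivers_by_type.get(tumor_type, set())
    # Step 1: keep only alterations in genes that are drivers of this tumor type.
    altered_drivers = {g: a for g, a in sample_alterations.items() if g in drivers}
    # Step 2: retrieve every therapy whose requirements are met.
    prescribed = []
    for rule in rules:
        if rule.target not in altered_drivers:
            continue
        if rule.alterations is not None and altered_drivers[rule.target] not in rule.alterations:
            continue
        if rule.tumor_types is not None and tumor_type not in rule.tumor_types:
            continue
        if rule.resistance_genes & set(sample_alterations):
            continue   # a known primary-resistance biomarker is altered
        prescribed.append(rule.drug)
    return prescribed


# Hypothetical toy data.
DRIVERS = {"TYPE_1": {"GENE_A", "GENE_B"}}
RULES = [
    Rule("drug_X", "GENE_A", alterations={"amplification"}),
    Rule("drug_Y", "GENE_B", resistance_genes={"GENE_R"}),
    Rule("drug_Z", "GENE_C"),
]
SAMPLE = {"GENE_A": "amplification", "GENE_B": "mutation", "GENE_C": "mutation"}
result = prescribe("TYPE_1", SAMPLE, DRIVERS, RULES)
```

In this toy case `drug_Z` is not prescribed because `GENE_C` is not a driver of the sample's tumor type, which mirrors how restricting interventions to driver events limits overprescription.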

SUPPLEMENTAL INFORMATION

Supplemental Information includes Supplemental Experimental Procedures,

five figures, and five tables and can be found with this article online at http://

dx.doi.org/10.1016/j.ccell.2015.02.007.

AUTHOR CONTRIBUTIONS

D.T. performed the driver detection analysis and identification of major drivers

and cowrote the manuscript. C.R.-P. collected most of the information of drug-target interactions, performed the in silico drug prescription, and cowrote the

manuscript. A.A. and J.M. participated in the collection of drug-target interac-

tions. D.T., C.R.-P., and A.G.-P. did manual curation of drug-target interac-

tions and drug mechanisms. C.P.-L. and J.D.-P. contributed on the driver

detection analysis. M.P.-S. characterized driver genes according to their

role and performed the mutually exclusive analysis. A.G.-P. performed the

pathway analyses. J.D.-P. and M.P.-S. prepared the IntOGen web page to

present the results. A.G.-P. and N.L.-B. oversaw and designed the study

and cowrote the manuscript.

ACKNOWLEDGMENTS

We thank Elias Campo and Caroline Heckman for helpful comments on the

manuscript and Cyriac Kandoth for help in obtaining mutation data from

TCGA. We are grateful to the TCGA Research Network, the International Cancer

Genome Consortium (ICGC), and members of each individual project for

the tumor genome resequencing data generated. We acknowledge funding

from the Spanish Ministry of Economy and Competitiveness (grant number

SAF2012-36199), La Fundació la Marató de TV3, and the Spanish National

Institute of Bioinformatics (INB). C.R.-P. and M.P.-S. are supported by an FPI

fellowship. D.T. is supported by the People Programme (Marie Curie Actions)

of the Seventh Framework Programme of the European Union

(FP7/2007-2013) under REA grant agreement number 600388 and by the Agency of

Competitiveness for Companies of the Government of Catalonia, ACCIÓ.

A.G.-P. is supported by a Ramón y Cajal contract.

Received: May 23, 2014

Revised: October 21, 2014

Accepted: February 17, 2015

Published: March 9, 2015
