
Pan-cancer image-based detection of clinically actionable genetic alterations

Jakob Nikolas Kather1,2,3, Lara R. Heij4,5,6, Heike I. Grabsch7,8, Loes F. S. Kooreman7, Chiara Loeffler1, Amelie Echle1, Jeremias Krause1, Hannah Sophie Muti1, Jan M. Niehues1, Kai A. J. Sommer1, Peter Bankhead9, Jefree J. Schulte10, Nicole A. Cipriani10, Nadina Ortiz-Brüchle6, Akash Patnaik11, Andrew Srisuwananukorn12, Hermann Brenner2,13,14, Michael Hoffmeister13, Piet A. van den Brandt15, Dirk Jäger2,3, Christian Trautwein1, Alexander T. Pearson11,*, Tom Luedde1,16,*

1 Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany
2 German Cancer Consortium (DKTK), Heidelberg, Germany
3 Applied Tumor Immunity, German Cancer Research Center (DKFZ), Heidelberg, Germany
4 Department of Surgery and Transplantation, University Hospital RWTH Aachen, Aachen, Germany
5 Department of Surgery, NUTRIM, School of Nutrition and Translational Research in Metabolism, Maastricht University, Maastricht, The Netherlands
6 Institute of Pathology, University Hospital RWTH Aachen, Aachen, Germany
7 Department of Pathology, GROW School for Oncology and Developmental Biology, Maastricht University Medical Center+, Maastricht, The Netherlands
8 Pathology & Data Analytics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK

bioRxiv preprint doi: https://doi.org/10.1101/833756; this version posted November 8, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.


9 MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK
10 Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
11 Department of Medicine, University of Chicago Medicine, Chicago, IL, USA
12 Department of Medicine, University of Illinois – Chicago, Chicago, IL, USA
13 Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
14 Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
15 Department of Epidemiology, GROW School for Oncology and Developmental Biology, Maastricht University Medical Center+, Maastricht, The Netherlands
16 Division of Gastroenterology, Hepatology and GI Oncology, University Hospital RWTH Aachen, Aachen, Germany

* these authors contributed equally to this work

Correspondence should be addressed to [email protected], [email protected] and [email protected]


Precision treatment of cancer relies on genetic alterations which are diagnosed by molecular biology assays.1 These tests can be a bottleneck in oncology workflows because of high turnaround time, tissue usage and costs.2 Here, we show that deep learning can predict point mutations, molecular tumor subtypes and immune-related gene expression signatures3,4 directly from routine histological images of tumor tissue. We developed and systematically optimized a one-stop-shop workflow and applied it to more than 4000 patients with breast5, colon and rectal6, head and neck7, lung8,9, pancreatic10, prostate11 cancer, melanoma12 and gastric13 cancer. Together, our findings show that a single deep learning algorithm can predict clinically actionable alterations from routine histology data. Our method can be implemented on mobile hardware14, potentially enabling point-of-care diagnostics for personalized cancer treatment in individual patients.

Clinical guidelines recommend molecular testing of tumor tissue for most patients with advanced solid tumors. However, in most tumor types, routine testing includes only a handful of alterations, such as KRAS, NRAS, BRAF mutations and microsatellite instability (MSI) in colorectal cancer. While new studies identify more and more molecular features of potential clinical relevance, current diagnostic workflows are not designed to incorporate an exponentially rising load of tests. For example, in colorectal cancer, previous studies have identified consensus molecular subtypes (CMS) as a candidate biomarker, but sequencing costs preclude widespread testing. While comprehensive molecular and genetic tests are hard to implement at scale, histological images stained with hematoxylin and eosin (H&E) are ubiquitously available. We hypothesized that these routine images contain information about established and candidate biomarkers and thus could be used for rapid pre-screening of patients, potentially alleviating the load of molecular assays. To test this, we developed, optimized and validated a deep learning algorithm to determine molecular features directly from histology images. Deep learning with convolutional neural networks has been used for tissue segmentation in cancer histology15-17 or for detecting molecular changes in circumscribed use cases in a single tumor type18-22, but our aim was to use deep learning in a pan-molecular, pan-cancer approach. Our method is a ‘one-stop-shop’ workflow: we collected large patient cohorts for individual tumor types, partitioning each cohort into


three groups for cross-validation (Fig. 1a). Whole slide images were tessellated into an image library of smaller tiles20,21 which were used for deep transfer learning (Fig. 1b). We chose prediction of microsatellite instability (MSI) in colorectal cancer as a clinically relevant benchmark task20 and sampled a large hyperparameter space with different commonly used deep learning models16,18,20,21. Unexpectedly, ‘inception’23 and ‘resnet’24 networks, which had been the previous de-facto standard, were markedly outperformed by ‘densenet’25 and ‘shufflenet’14 architectures, the latter demonstrating high accuracy at a low training time (raw data in Suppl. Table 1, N=426 patients in the “Cancer Genome Atlas” [TCGA] cohort). Shufflenet is optimized for mobile devices, making this deep neural network architecture attractive for decentralized point-of-care image analyses or direct implementation in microscopes26. We trained a shufflenet on N=426 patients in the TCGA-CRC cohort20 and validated it on N=379 patients in the DACHS cohort20, reaching an AUC of 0.89 [0.88; 0.92] (Fig. 1d). This represents a marked improvement over the previous best performance of 0.84 on that dataset20. Subsequently, we tested the full workflow in breast cancer for detection of standard molecular pathology features which are usually measured by immunohistochemistry: estrogen receptor [ER], progesterone receptor [PR] and HER2 status were highly significantly detectable from histology alone, reaching AUCs of up to 0.82 in a three-fold patient-level cross-validation (Fig. 1e).
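A patient-level AUC with a bootstrapped confidence interval, as reported above (e.g. 0.89 [0.88; 0.92]), can be computed from per-patient prediction scores roughly as follows. This is an illustrative sketch in plain Python, not the authors' Matlab implementation, and the example data are made up:

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the probability that a random positive patient scores higher
    than a random negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC, resampling patients
    with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:  # skip degenerate resamples
            continue
        stats.append(auc(ys, [scores[i] for i in sample]))
    stats.sort()
    lo = stats[int(alpha / 2 * len(stats))]
    hi = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lo, hi
```

Resampling is done at the patient level, matching the patient-level evaluation used throughout the study.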

Having optimized our method in these use cases, we applied it to more than 4000 patients across ten of the most prevalent solid tumor types from the TCGA reference database. We aimed to predict all clinically and/or biologically relevant mutations with a prevalence above 2% and affecting at least four patients. The list of candidate mutations (Suppl. Table 2) also included all point mutations targetable by FDA-approved drugs (www.oncokb.org). We found that in multiple major cancer types, the genotype of point mutations was predictable directly from images. For example, in lung adenocarcinoma (TCGA-LUAD8, N=464 patients), significant AUCs were achieved for TP53 mutational status (AUC 0.71, Fig. 2a) and EGFR mutational status (AUC 0.60), which is targetable by clinically approved treatments. Also in colon and rectal cancer (TCGA-COAD and TCGA-READ27, N=590 patients), the standard-of-care genetic biomarkers28 BRAF (AUC 0.66) and KRAS (AUC 0.60) were significantly detectable, as were oncogenic driver mutations linked to tumor aggressiveness, including CDC2729 (AUC 0.70, Fig. 2b). Similarly, in breast cancer (TCGA-BRCA5,


N=1007 patients), gene mutations of TP53 (AUC 0.75), MAP2K4 (a potential biomarker for response to MEK inhibitors30, AUC 0.66) and PIK3CA (directly targetable by a small molecule inhibitor31, AUC 0.63) were significantly detectable (Fig. 2c). In gastric cancer (TCGA-STAD13, N=363 patients), mutations of MTOR – a candidate for targeted treatment32 – were significantly detectable with a high AUC of 0.80 (Fig. 2d), as were a range of driver mutations including BRCA2 (AUC 0.67), PTEN (AUC 0.66) and PIK3CA (AUC 0.65), among others. In head and neck squamous cell carcinoma (TCGA-HNSC7, N=424 patients), the genotype of CASP8, which is linked to resistance to cell death33, was significantly detected with a high AUC of 0.72 (Suppl. Fig. 1a). In other tumor types such as melanoma (TCGA-SKCM12, N=429 patients) or lung squamous cell carcinoma (TCGA-LUSC9, N=412 patients), few mutations were significantly detected (Suppl. Fig. 1b-c). Lung squamous cell carcinoma is notoriously difficult to characterize molecularly and has few molecularly or genetically targeted treatment options, even in clinical trials. Thus, it is plausible that tumor histomorphology was not well correlated with mutations. In pancreatic adenocarcinoma (TCGA-PAAD10, N=166 patients), identifying KRAS wild type patients is of high clinical relevance because these patients are potential candidates for targeted treatment. Our method significantly identified KRAS genotype with an AUC of 0.66 (Suppl. Fig. 1d). Lastly, in prostate cancer (TCGA-PRAD11, N=402 patients), our method detected targetable mutations from histology – most remarkably PIK3CA, which was significantly detected with an AUC of 0.75 (Suppl. Fig. 1e). Furthermore, CDK12, which is linked to immune evasion in prostate cancer34, was detected with an AUC of 0.71. Together, these data show that deep learning can detect a wide range of targetable and potentially targetable point mutations directly from histology across multiple prevalent tumor types.

Next, we applied our method to a broader set of molecular signatures beyond single mutations. We chose features with known biological and potential clinical significance which are currently not part of clinical guidelines in most solid tumors. A major group of such features are immune-related gene expression signatures3 of CD8-positive lymphocytes, macrophages, proliferation, interferon-gamma (IFNg) signaling and transforming growth factor beta (TGFb) signaling. These biological processes are involved in response to cancer treatment, including immunotherapy. Detecting their morphological correlates in histology images could facilitate the development of more nuanced treatment strategies. Indeed, in lung adenocarcinoma, signatures of proliferation,


macrophage infiltration and T-lymphocyte infiltration were significantly detectable from images with high AUCs (Fig. 3a). Similarly, significant AUCs for these biomarkers were achieved in colorectal cancer (Fig. 3b), breast cancer (Fig. 3c) and gastric cancer (Fig. 3d). In gastric cancer, we additionally investigated a signature of stem cell properties (stemness), which was highly detectable in images (AUC 0.76, Fig. 3d). Recent studies have clustered tumors into comprehensive ‘immune subtypes’3, but again this classification system relies on deep molecular profiling unavailable in a clinical setting. We found that our method could detect these immune subtypes with up to AUC 0.75 in lung adenocarcinoma (Fig. 3a), up to AUC 0.72 in colorectal cancer (Fig. 3b) and up to AUC 0.71 in breast cancer (Fig. 3c). Together, these findings show that immunological processes that are quantifiable by molecular profiling are also accessible to deep-learning-based histology image analysis.

Finally, we investigated the use of deep learning on conserved molecular classes of tumors such as recently identified TCGA subtypes3, pan-gastrointestinal subtypes4 and consensus molecular subtypes of colorectal cancer6. Few of these classification systems are currently incorporated in clinical workflows, mainly because of the high cost and logistic effort associated with sequencing technology. In our experiments, TCGA molecular subtypes LUAD1-6 were highly detectable in histology images of lung adenocarcinoma (Fig. 3a), with AUCs of up to 0.74. In colorectal cancer (Fig. 3b) and gastric cancer (Fig. 3d), the pan-gastrointestinal (GI) subtypes GI-hypermutated-indel (GI-HM-indel), GI genome stable (GI-GS), GI chromosomally instable (GI-CIN), GI-hypermutated-single-nucleotide-variant predominant (GI-HM-SNV) and GI Epstein-Barr-Virus-positive (GI-EBV) were significantly detectable from histology. Correspondingly, in colorectal cancer, ‘consensus molecular subtypes’6 were detectable by deep learning (Fig. 3b). These findings could open up fundamentally new options for clinical trials in cancer: while accumulating evidence shows that molecular clusters of tumors are correlated with biological behavior and clinical outcome, deep molecular classification of these tumors is usually not available to patients in clinical routine or within clinical trials. Detecting these subtypes merely from histology would immediately allow these subtypes to be analyzed in clinical trials directly from routine material, potentially helping to identify new biomarkers for treatment response. A full description of the methods is available in the “Extended Methods” section.


Together, our results demonstrate the feasibility of pan-cancer deep learning image-based testing. We show that a unified workflow yields reliably high performance across multiple clinically relevant scenarios. Compared to conventional genetic tests, our methodology enables detailed prediction of the spatial heterogeneity of genotypes, which is not possible in molecular bulk testing of tumor tissue. An example of this visualization is shown in Fig. 4a-g: based only on a routine histological image of colorectal cancer (Fig. 4a), deep learning classifiers correctly predicted CDC27 mutational status (Fig. 4b-c) and consensus molecular subtype (Fig. 4d-g) with a high probability, while assigning a low probability to competing classes.
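Spatial prediction maps of this kind amount to placing tile-level class probabilities back onto the slide grid. A minimal sketch in plain Python, with hypothetical tile coordinates and scores (not the authors' QuPath/Matlab implementation):

```python
def prediction_map(tile_preds, tile_size=256):
    """Place per-tile class probabilities back onto the slide grid.

    tile_preds: list of (x_px, y_px, prob) for one class, where
    (x_px, y_px) is the tile's top-left corner in slide pixels.
    Returns a row-major 2D grid; None marks unscored positions.
    """
    cols = max(x for x, _, _ in tile_preds) // tile_size + 1
    rows = max(y for _, y, _ in tile_preds) // tile_size + 1
    grid = [[None] * cols for _ in range(rows)]
    for x, y, p in tile_preds:
        grid[y // tile_size][x // tile_size] = p
    return grid

# Hypothetical scores for a 2x2 region of tiles:
preds = [(0, 0, 0.91), (256, 0, 0.85), (0, 256, 0.12), (256, 256, 0.77)]
heatmap = prediction_map(preds)
```

One map per class (e.g. CDC27 wild type vs. mutated, or each CMS class) yields the multiplexed visualization, with spatial variation in the scores highlighting potential intratumor heterogeneity.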

Image-based genotyping could be used for definitive testing once its performance surpasses that of previous tests, potentially disrupting clinical workflows (Suppl. Fig. 3a-c). A limitation of our method is the low AUC values for some molecular features, but re-training on larger cohorts with up to 10,000 patients per tumor type is expected to increase performance.16 Another limitation is that for very unbalanced features – i.e. scarce molecular features – the uncertainty of the AUC estimate is high. Thus, before clinical implementation, multicenter validation is essential, requiring collaborative efforts. Together, our results show that deep learning can consistently unlock dormant patterns in widely available histology images, potentially improving current workflows for molecularly targeted therapy of cancer.


Fig. 1: Transfer learning workflow for histology images. (a) Patient cohorts are split into three partitions for cross-validation of deep classifiers. (b) Pre-trained networks are re-trained with only the deepest layers trainable, speeding up computation while enabling state-of-the-art performance. (c) A hyperparameter sweep with multiple networks shows that shufflenet consistently yields high accuracy and speed for detection of microsatellite instability (MSI) in colorectal cancer (N=426 patients); raw data in Suppl. Table 1. (d) External validation of the best shufflenet on the DACHS cohort (N=379 patients). (e) Validation of the workflow by prediction of estrogen receptor (ER), progesterone receptor (PR), HER2 status and tumor mutational burden (TMB) in breast cancer, assessed by cross-validated area under the receiver operating curve (AUC).


Fig. 2: Prediction of point mutations directly from histological images. Deep networks predicted genotype directly from histological images in (a) lung adenocarcinoma, (b) colorectal, (c) breast and (d) gastric cancer. Patient cohorts were randomly split for cross-validation and classifiers were assessed by the area under the receiver operating curve (AUC, horizontal axis) with a 95% bootstrapped confidence interval. Genotype was predicted from histology with a high AUC for multiple clinically actionable mutations. (*) denotes all cases where the lower confidence bound exceeds a random classifier (AUC 0.5). “n” denotes the number of patients. Mutations with an AUC<0.55 are not shown. For a full list of all tested alterations, see Suppl. Table 2.


Fig. 3: Prediction of gene expression signatures directly from histology. Deep networks were trained to predict clinically relevant gene expression signatures directly from histological images in (a) lung adenocarcinoma, (b) colorectal, (c) breast and (d) gastric cancer. Classifiers were assessed by the cross-validated area under the receiver operating curve with bootstrapped confidence intervals (AUC under ROC, horizontal axis). Continuous signatures were binarized at the mean. Variables with an average AUC<0.55 are not shown. (*) denotes all cases where the lower confidence bound exceeds a random classifier (AUC 0.5). “n” denotes the number of patients. For a full list of all tested alterations, see Suppl. Table 2. “subtype” denotes TCGA molecular subtypes.


Fig. 4: Multiplex genotype maps with local predictability uncovered by deep learning. (a) A whole slide image of a colorectal cancer from the TCGA cohort was used for genotype prediction by deep learning classifiers. (b) A prediction map for CDC27 wild type status and (c) a prediction map for CDC27 mutated status, correctly predicting that this particular patient is mutated. Similarly, prediction maps for consensus molecular subtype (CMS) classes (d) CMS1, (e) CMS2, (f) CMS3 and (g) CMS4 correctly show that deep learning robustly predicts CMS from histology alone while highlighting potential intratumor heterogeneity.


Funding

The results are in part based upon data generated by the TCGA Research Network: http://cancergenome.nih.gov/. Our funding sources are as follows. J.N.K.: RWTH University Aachen (START 2018-691906). A.T.P.: NIH/NIDCR (#K08-DE026500), Institutional Research Grant (#IRG-16-222-56) from the American Cancer Society, and the University of Chicago Medicine Comprehensive Cancer Center Support Grant (#P30-CA14599). T.L.: Horizon 2020 through the European Research Council (ERC) Consolidator Grant PhaseControl (771083), a Mildred-Scheel-Endowed Professorship from the German Cancer Aid (Deutsche Krebshilfe), the German Research Foundation (DFG) (SFB CRC1382/P01, SFB-TRR57/P06, LU 1360/3-1), the Ernst-Jung-Foundation Hamburg and the IZKF (Interdisciplinary Center of Clinical Research) at RWTH Aachen.

Author contributions

JNK, ATP and TL designed the study. LH, HIG, NAC, JJS, PAVDB, LFSK and AP oversaw the tumor annotation. CL, AE, JK, HSM, JMN and KAJS manually annotated all tumors. JNK, JK, JMN and PB designed and implemented the algorithm. JNK, CL, AS and NOB curated the list of molecular alterations. HB and MH provided samples from the DACHS study and gave statistical advice. CT, DJ, ATP and TL provided infrastructure and supervised the study. All authors contributed to the data analysis and to writing the manuscript.

Conflicts of interest

The authors declare that no conflict of interest exists.


Extended methods

All experiments were conducted in accordance with the Declaration of Helsinki and the International Ethical Guidelines for Biomedical Research Involving Human Subjects. Anonymized scanned whole slide images were retrieved from The Cancer Genome Atlas (TCGA) project through the Genomics Data Commons Portal (https://portal.gdc.cancer.gov/). Tissue samples from the DACHS trial35,36 were retrieved from the tissue bank of the National Center for Tumor Diseases (NCT, Heidelberg, Germany) as described before.20

Scanned whole slide images of tissue slides stained with hematoxylin and eosin were acquired in SVS format. Magnification was between 20x and 40x and the corresponding resolution was between 0.25 and 0.51 micrometers per pixel (µm/px). All images were manually reviewed by a trained observer who discussed non-trivial cases with an expert pathologist. After review by the expert pathologist, only those images with tumor tissue on the slide were used for downstream analysis. The observer manually delineated tumor tissue on the slide, which in most cases included more than half of the total tissue. This region was then tessellated into square tiles of 256 µm edge length. For the benchmark task, these images were resized to 1.14 µm/px to be consistent with a previous study20; for all subsequent tasks, images were processed at 0.5 µm/px. Some patients in the TCGA archive had more than one slide and in these cases, tiles from all slides were pooled on a per-patient basis. From every slide, only a subset of tiles was used for neural network training and prediction (default 1000 tiles per slide; values explored in hyperparameter sampling: 250, 500 and 750). A target variable (e.g. a particular mutation) was matched to each patient (see below) and all tiles corresponding to that patient inherited the label. The patient cohort was then randomly split into three parts in such a way that each part contained approximately the same number of patients with each label. These three parts of the patient cohort were then used for three-fold patient-level cross-validation. Before training, each cohort was randomly undersampled in such a way that the number of tiles per label was identical for each label. For training, we used on-the-fly data augmentation (random x-y-reflection and random horizontal and vertical shear of 5 px). No color normalization was used.
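The patient-level splitting with tile-label inheritance and the per-label undersampling described above can be sketched as follows (plain Python with hypothetical patient and tile IDs; the original pipeline was implemented in Matlab):

```python
import random
from collections import defaultdict

def patient_level_folds(patient_labels, n_folds=3, seed=0):
    """Split patients into folds so each fold contains roughly the
    same number of patients per label (approximate stratification).
    patient_labels: dict mapping patient ID -> label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for patient, label in patient_labels.items():
        by_label[label].append(patient)
    folds = [[] for _ in range(n_folds)]
    for patients in by_label.values():
        rng.shuffle(patients)
        for i, patient in enumerate(patients):
            folds[i % n_folds].append(patient)
    return folds

def undersample_tiles(tiles, seed=0):
    """tiles: list of (tile_id, label), where each tile inherited its
    patient's label. Randomly drop tiles so every label keeps the
    same number of tiles (the minority count)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for tile in tiles:
        by_label[tile[1]].append(tile)
    n_min = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, n_min))
    return balanced
```

Splitting by patient rather than by tile ensures that tiles from one patient never appear in both the training and the test partition.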


Molecular labels are listed in Suppl. Table 2 and were retrieved from the following sources: basic clinical and pathological data were retrieved through http://portal.gdc.cancer.gov. Mutational status (wild type or mutated) and high-level amplification were acquired through http://cbioportal.org. In that database, we used the “PanCancerAtlas” or “TCGA Provisional” project, whichever contained more patients in that particular tumor type. High-level data on gene expression signatures was retrieved from Thorsson et al. (10). For breast and endometrial cancer, additional data on tumor subtypes were retrieved from Berger et al. (27). For gastric and colorectal cancer, tumor subtype data was retrieved from Liu et al. (11).

Hyperparameter selection was performed for five deep neural networks which were pre-trained on ImageNet: resnet18, alexnet, inceptionv3, densenet201 and shufflenet. The sampled hyperparameter space was as follows: learning rate (fixed): 5e-5 and 1e-4; maximum number of tiles per whole slide image: 250, 500 and 750; number of hot layers (Fig. 1b): 10, 20 and 30. The number of epochs was four with a mini-batch size of 512, similar to previous experiments.20
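Enumerated as a full grid, this search space looks as follows. The values below are taken from the text; only the dictionary structure and function are ours, as one plausible way to organize the sweep (the source does not show the sampling code):

```python
from itertools import product

# Hyperparameter space from the Extended Methods.
grid = {
    "architecture": ["resnet18", "alexnet", "inceptionv3",
                     "densenet201", "shufflenet"],
    "learning_rate": [5e-5, 1e-4],
    "max_tiles_per_slide": [250, 500, 750],
    "hot_layers": [10, 20, 30],  # deepest layers left trainable
}

def enumerate_configs(grid):
    """Yield one configuration dict per point of the cartesian grid."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))

configs = list(enumerate_configs(grid))
# 5 architectures x 2 learning rates x 3 tile counts x 3 hot-layer
# settings = 90 candidate configurations.
```

Each configuration is then trained and scored on the MSI benchmark task (Suppl. Table 1 holds the raw results of the sweep).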

All algorithms for whole slide image processing, including tessellation of images and visualization of spatial activation maps, were implemented in QuPath v0.1.2 in Groovy (http://qupath.github.io). All deep learning algorithms, including training and prediction, were implemented in MATLAB R2018b (MathWorks, Natick, MA, USA).
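The patient-level pooling of tile-level predictions can be illustrated with a short sketch. This is a simplified Python re-statement, not the original MATLAB implementation, and mean pooling is only one possible aggregation rule:

```python
from collections import defaultdict

# Sketch: pool tile-level prediction scores into one score per patient
# by averaging. Input is a list of (patient_id, tile_score) pairs.
def pool_tile_predictions(tile_predictions):
    sums, counts = defaultdict(float), defaultdict(int)
    for patient_id, score in tile_predictions:
        sums[patient_id] += score
        counts[patient_id] += 1
    return {p: sums[p] / counts[p] for p in sums}

patient_scores = pool_tile_predictions(
    [("P1", 0.9), ("P1", 0.7), ("P2", 0.2)]
)
# P1 averages to 0.8, P2 to 0.2
```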

All images from the TCGA cohort are available at https://portal.gdc.cancer.gov/. All source code is available at https://github.com/jnkather/DeepHistology.


Bibliography

1. Cheng, M.L., Berger, M.F., Hyman, D.M. & Solit, D.B. Clinical tumour sequencing for precision oncology: time for a universal strategy. Nature Reviews Cancer 18, 527-528 (2018).
2. Rusch, M., et al. Clinical cancer genomic profiling by three-platform sequencing of whole genome, whole exome and transcriptome. Nature Communications 9, 3962 (2018).
3. Thorsson, V., et al. The Immune Landscape of Cancer. Immunity 48, 812-830.e814 (2018).
4. Liu, Y., et al. Comparative Molecular Analysis of Gastrointestinal Adenocarcinomas. Cancer Cell 33, 721-735.e728 (2018).
5. The Cancer Genome Atlas Network, et al. Comprehensive molecular portraits of human breast tumours. Nature 490, 61 (2012).
6. Guinney, J., et al. The consensus molecular subtypes of colorectal cancer. Nature Medicine 21, 1350 (2015).
7. The Cancer Genome Atlas Network, et al. Comprehensive genomic characterization of head and neck squamous cell carcinomas. Nature 517, 576 (2015).
8. The Cancer Genome Atlas Network, et al. Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543 (2014).
9. Hammerman, P.S., et al. Comprehensive genomic characterization of squamous cell lung cancers. Nature 489, 519-525 (2012).
10. The Cancer Genome Atlas Network. Integrated Genomic Characterization of Pancreatic Ductal Adenocarcinoma. Cancer Cell 32, 185-203.e113 (2017).
11. The Cancer Genome Atlas Network. The Molecular Taxonomy of Primary Prostate Cancer. Cell 163, 1011-1025 (2015).
12. The Cancer Genome Atlas Network. Genomic Classification of Cutaneous Melanoma. Cell 161, 1681-1696 (2015).
13. The Cancer Genome Atlas Network, et al. Comprehensive molecular characterization of gastric adenocarcinoma. Nature 513, 202 (2014).
14. Zhang, X., Zhou, X., Lin, M. & Sun, J. ShuffleNet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 6848-6856 (2018).
15. Kather, J.N., et al. Predicting survival from colorectal cancer histology slides using deep learning: A retrospective multicenter study. PLOS Medicine 16, e1002730 (2019).
16. Campanella, G., et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nature Medicine (2019).
17. Janowczyk, A. & Madabhushi, A. Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases. J Pathol Inform 7, 29 (2016).
18. Coudray, N., et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nature Medicine 24, 1559-1567 (2018).
19. Schaumberg, A.J., Rubin, M.A. & Fuchs, T.J. H&E-stained Whole Slide Image Deep Learning Predicts SPOP Mutation State in Prostate Cancer. bioRxiv, 064279 (2018).
20. Kather, J.N., et al. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nature Medicine (2019).
21. Kather, J.N., et al. Deep learning detects virus presence in cancer histology. bioRxiv, 690206 (2019).
22. Kim, R.H., et al. A Deep Learning Approach for Rapid Mutational Screening in Melanoma. bioRxiv, 610311 (2019).
23. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J. & Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2818-2826 (2016).
24. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 770-778 (2016).
25. Huang, G., Liu, Z., Van Der Maaten, L. & Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 4700-4708 (2017).
26. Chen, P.C., et al. An augmented reality microscope with real-time artificial intelligence integration for cancer diagnosis. Nature Medicine (2019).
27. The Cancer Genome Atlas Network, et al. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330 (2012).
28. Kather, J.N., Halama, N. & Jaeger, D. Genomics and emerging biomarkers for immunotherapy of colorectal cancer. Seminars in Cancer Biology 52, 189-197 (2018).
29. Qiu, L., et al. CDC27 Induces Metastasis and Invasion in Colorectal Cancer via the Promotion of Epithelial-To-Mesenchymal Transition. J Cancer 8, 2626-2635 (2017).
30. Xue, Z., et al. MAP3K1 and MAP2K4 mutations are associated with sensitivity to MEK inhibitors in multiple cancer models. Cell Research 28, 719-729 (2018).
31. André, F., et al. Alpelisib for PIK3CA-Mutated, Hormone Receptor-Positive Advanced Breast Cancer. New England Journal of Medicine 380, 1929-1940 (2019).
32. Fukamachi, H., et al. A subset of diffuse-type gastric cancer is susceptible to mTOR inhibitors and checkpoint inhibitors. Journal of Experimental & Clinical Cancer Research 38, 127 (2019).
33. Li, C., Egloff, A.M., Sen, M., Grandis, J.R. & Johnson, D.E. Caspase-8 mutations in head and neck cancer confer resistance to death receptor-mediated apoptosis and enhance migration, invasion, and tumor growth. Molecular Oncology 8, 1220-1230 (2014).
34. Wu, Y.-M., et al. Inactivation of CDK12 Delineates a Distinct Immunogenic Class of Advanced Prostate Cancer. Cell 173, 1770-1782.e1714 (2018).
35. Hoffmeister, M., et al. Statin use and survival after colorectal cancer: the importance of comprehensive confounder adjustment. J Natl Cancer Inst 107, djv045 (2015).
36. Brenner, H., Chang-Claude, J., Seiler, C.M. & Hoffmeister, M. Long-term risk of colorectal cancer after negative colonoscopy. J Clin Oncol 29, 3761-3767 (2011).


Supplementary Figures

Suppl. Fig. 1: Mutation prediction from histology in additional tumor types. Our method significantly predicted oncogenic mutations from histology in (a) head and neck squamous cell cancer, (b) melanoma, (c) lung squamous cell carcinoma, (d) pancreatic cancer and (e) prostate cancer. The horizontal axis shows the three-fold cross-validated area under the receiver operating characteristic curve (AUC) as mean +/- 95% bootstrapped confidence interval.


Suppl. Fig. 2: Prediction of high-level gene expression signatures in additional tumor types. Our method significantly predicted high-level gene expression signatures from histology in (a) head and neck squamous cell cancer, (b) melanoma, (c) lung squamous cell carcinoma, (d) pancreatic cancer and (e) prostate cancer. The horizontal axis shows the three-fold cross-validated area under the receiver operating characteristic curve (AUC) as mean +/- 95% bootstrapped confidence interval.


Suppl. Fig. 3: Proposed clinical workflow. (a) Starting with ubiquitously available routine histology slides, our method relies on a tessellation of digitized images ("image library preparation") which are passed to a deep convolutional neural network. The network predicts features on a tile level and the predictions are pooled on a patient level. (b) Histology-based testing can be applied to standard-of-care pathological biomarkers, driver mutations, and other features such as tumor expression subtypes. (c) We suggest that clinically meaningful findings of deep learning networks could be discussed in a tumor board, validated by orthogonal methods and ultimately guide targeted treatment.
