Download - Oncology oracle
Cell lines are a rich source of drug sensitivity data, enabling the training of 98 individual drug models
39,571 Unique Drug-Cell Line Sensitivity Values
Predict
675 Human cancer cell lines with RNA-seq Gene Expression
Ridge regression keeps only a few significant genes in model
Drug Sensitivity
Linear Model
Ridge Regression
Automatic Relevance Determination Regression tunes model sparsity from data
Drug Sensitivity
Linear Model
Ridge Regression
Automatic Relevance Determination Regression
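For reference, ridge regression is usually written as the penalized least-squares problem below. This is a standard textbook form, not taken from the slides, and the symbols are assumptions chosen for illustration: X is the cell line by gene expression matrix, y the vector of drug sensitivities, w the gene weights, and κ > 0 a penalty strength shared by all genes.

```latex
\hat{w}_{\text{ridge}}
  = \arg\min_{w} \; \lVert y - Xw \rVert_2^2 + \kappa \lVert w \rVert_2^2
  = \left( X^\top X + \kappa I \right)^{-1} X^\top y
```

The single penalty κ pulls every weight toward zero by the same rule. Automatic Relevance Determination instead places a separate prior on each weight, which is how it can tune sparsity from the data.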
Start with only the most highly varying genes
The 8,000 most highly varying genes in the Genentech data were chosen as features to reduce noise and keep computation tractable
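A minimal sketch of variance-based feature selection, assuming expression values are held in a plain dictionary keyed by gene name. The function name `top_varying_genes`, the gene names, and the toy values are hypothetical; the source does not say which variance estimator or tie-breaking rule was used.

```python
from statistics import pvariance


def top_varying_genes(expression, k):
    """Return the names of the k genes with the highest variance.

    expression maps gene name -> list of expression values, one per cell line.
    """
    variances = {gene: pvariance(values) for gene, values in expression.items()}
    # Sort by variance, largest first; break ties by gene name for determinism.
    ranked = sorted(variances, key=lambda gene: (-variances[gene], gene))
    return ranked[:k]


# Toy data (hypothetical values, not from the Genentech dataset).
toy_expression = {
    "GENE_A": [1.0, 1.1, 0.9, 1.0],
    "GENE_B": [0.0, 5.0, 10.0, 2.0],
    "GENE_C": [3.0, 3.5, 2.5, 4.0],
}
selected = top_varying_genes(toy_expression, k=2)
print(selected)
```

With real data, `k` would be 8000 and the dictionary would hold one entry per measured gene.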
Docetaxel model separates sensitive and resistant breast cancer patients with 0.87 ROC AUC
[Figure: ROC curve (x-axis: False Positive Rate; y-axis: True Positive Rate), with patients ordered by increasing predicted docetaxel sensitivity]
91% of sensitive patients treated
85% of resistant patients avoid unnecessary treatment
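The quantities above (ROC AUC, fraction of sensitive patients treated, fraction of resistant patients spared) can be sketched in a few lines. This is an illustration under assumptions, not the author's evaluation code: the scores, labels, and the 0.5 threshold are made up, and the source does not state which threshold produced the 91% and 85% figures.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    labels use 1 for sensitive and 0 for resistant; ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(scores, labels, threshold):
    """Treat patients whose score is at or above the threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted sensitivities and true labels for six patients.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
print(auc, sens, spec)
```

Sensitivity corresponds to the share of sensitive patients who would be treated, and specificity to the share of resistant patients who would avoid treatment.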
About Me
© 2014 Nature America, Inc. All rights reserved.
TECHNICAL REPORTS
Functional interpretation of genomic variation is critical to understanding human disease, but it remains difficult to predict the effects of specific mutations on protein interaction networks and the phenotypes they regulate. We describe an analytical framework based on multiscale statistical mechanics that integrates genomic and biophysical data to model the human SH2-phosphoprotein network in normal and cancer cells. We apply our approach to data in The Cancer Genome Atlas (TCGA) and test model predictions experimentally. We find that mutations mapping to phosphoproteins often create new interactions but that mutations altering SH2 domains result almost exclusively in loss of interactions. Some of these mutations eliminate all interactions, but many cause more selective loss, thereby rewiring specific edges in highly connected subnetworks. Moreover, idiosyncratic mutations appear to be as functionally consequential as recurrent mutations. By synthesizing genomic, structural and biochemical data, our framework represents a new approach to the interpretation of genetic variation.
TCGA and similar projects have generated extensive data on the mutational landscape of tumors1. To understand the functional consequences of these mutations, it is necessary to ascertain how they alter the protein-protein interaction (PPI) networks involved in regulating cellular
Low-throughput methods such as fluorescence polarization spectroscopy provide precise interaction data on a few dozen PIDs and ligands but cannot easily be scaled to the full proteome2, whereas high-throughput array-based methods provide greater scale but suffer from systematic artifacts and high false positive and false negative rates, resulting in data sets that only partly agree2. An additional challenge is modeling the effects of mutations for proteins with multiple binding domains and/or multiple sites of phosphorylation9, a reality for most signaling proteins (for example, the CRK oncoprotein). Existing methods are either limited to individual domains10–13 or are insufficiently precise to discern the effects of single-residue changes14,15.
The MSM framework we have developed combines genomic, binding and structural data and reconciles inconsistencies within and among data sets to generate PID networks for normal and cancer cells. We develop a bottom-up first-principles approach, involving a single mathematical equation based on statistical mechanical ensembles, that models domains, proteins and networks, and we then apply this approach to the analysis of SH2 networks and mutations found in TCGA16. We validate newly predicted interactions experimentally and demonstrate the sensitivity of MSM to single-residue mutations that cause subtle changes in binding affinity. Our analysis provides mechanistic insights into an important PID cancer network and validates a computational approach to PID networks that can be applied
A multiscale statistical mechanical framework integrates biophysical and genomic data to assemble cancer networks
Mohammed AlQuraishi1–3, Grigoriy Koytiger1,3, Anne Jenney1, Gavin MacBeath2 & Peter K Sorger1
[Figure: interaction network panels for PTPN11 (SHP2) and GRB2; nodes are labeled with gene names such as STAT1, SRC, PIK3R1, and CBL]
Ph.D., Harvard University, Department of Chemistry and Chemical Biology
Postdoc, Harvard Medical School, Department of Systems Biology
Management Consultant, Dean & Co.
Drug                 Train Size  Spearman r  Spearman p
17-AAG               401          0.35       0.00
681640               282          0.17       0.01
A-443654             107          0.04       0.66
A-770041             107          0.30       0.00
ABT-263              294          0.15       0.01
ABT-888              296          0.09       0.14
AEW541               281          0.14       0.02
AG-014699            293          0.27       0.00
AICAR                296          0.29       0.00
AKT inhibitor VIII   301          0.25       0.00
AMG-706              294          0.20       0.00
AP-24534             301          0.17       0.00
AS601245             300          0.12       0.03
ATRA                 296          0.20       0.00
AUY922               300          0.09       0.12
AZ628                106          0.30       0.00
AZD-0530             109          0.13       0.17
AZD-2281             296          0.21       0.00
AZD0530              282          0.27       0.00
AZD6244              394          0.17       0.00
AZD6482              283          0.21       0.00
AZD7762              296          0.25       0.00
AZD8055              292          0.24       0.00
Afatinib             240          0.18       0.01
Axitinib             296          0.18       0.00
BAY 61-3606          300          0.22       0.00
BI-2536              107          0.03       0.73
BIBW2992             296          0.33       0.00
BIRB 0796            294          0.14       0.02
BMS-509744           107          0.17       0.09
BMS-536924           107          0.41       0.00
BMS-708163           294          0.19       0.00
BMS-754807           300          0.12       0.04
BX-795               293          0.37       0.00
Bexarotene           303          0.11       0.06
Bicalutamide         306          0.16       0.01
Bleomycin            300          0.38       0.00
Bortezomib           122          0.00       0.96
Bosutinib            298          0.25       0.00
Bryostatin 1         300          0.13       0.02
CCT007093            294          0.05       0.37
CCT018159            287          0.21       0.00
CEP-701              296          0.25       0.00
CGP-082996           107          0.25       0.01
CGP-60474            107         -0.07       0.49
CHIR-99021           301          0.05       0.36
CI-1040              294          0.49       0.00
CMK                  107          0.08       0.40
Camptothecin         296          0.24       0.00
Cisplatin            296          0.14       0.01
Cyclopamine          106          0.13       0.18
Cytarabine           299          0.28       0.00
DMOG                 301          0.27       0.00
Dasatinib            117          0.33       0.00
Docetaxel            305          0.46       0.00
Doxorubicin          330          0.12       0.03
EHT 1864             294          0.11       0.06
Elesclomol           296          0.28       0.00
Embelin              300          0.18       0.00
Epothilone B         300          0.23       0.00
Erlotinib            346          0.29       0.00
Etoposide            323          0.15       0.01
FH535                300          0.21       0.00
FTI-277              301          0.03       0.60
GDC-0449             296         -0.03       0.61
GDC0941              291          0.12       0.04
GNF-2                107         -0.04       0.71
GSK-1904529A         300          0.14       0.01
GSK-650394           300          0.23       0.00
GSK269962A           106         -0.05       0.64
GW 441756            296          0.18       0.00
GW843682X            107          0.07       0.44
Gefitinib            300          0.26       0.00
Gemcitabine          316          0.12       0.04
IPA-3                300          0.30       0.00
Imatinib             128          0.18       0.04
Irinotecan           165          0.13       0.09
JNJ-26854165         293          0.34       0.00
JNK Inhibitor VIII   294          0.11       0.06
JNK-9L               301          0.04       0.52
JW-7-52-1            107         -0.03       0.77
KIN001-135           107          0.09       0.38
KU-55933             293          0.20       0.00
L-685458             275          0.22       0.00
LAQ824               300         -0.03       0.57
LBW242               281         -0.04       0.48
LFM-A13              300          0.03       0.58
Lapatinib            354          0.32       0.00
Lenalidomide         297          0.00       0.93
MG-132               107         -0.23       0.02
MK-2206              275          0.16       0.01
MS-275               105          0.12       0.24
Methotrexate         304          0.25       0.00
Midostaurin          301          0.24       0.00
Mitomycin C          316          0.12       0.03
NSC-87877            301          0.19       0.00
NU-7441              293          0.10       0.08
NVP-BEZ235           292          0.14       0.01
NVP-TAE684           109          0.20       0.03
Nilotinib            383          0.07       0.15
Nutlin-3             282          0.23       0.00
Nutlin-3a            294          0.33       0.00
OSI-906              300          0.25       0.00
OSU-03012            300          0.23       0.00
Obatoclax Mesylate   300          0.35       0.00
PAC-1                300          0.16       0.01
PD-0325901           401          0.49       0.00
PD-0332991           378          0.22       0.00
PD-173074            294         -0.06       0.31
PF-02341066          109          0.01       0.91
PF-4708671           293         -0.03       0.62
PF-562271            301          0.23       0.00
PF2341066            282          0.24       0.00
PHA-665752           348          0.21       0.00
PLX4720              398          0.24       0.00
Paclitaxel           369          0.17       0.00
Panobinostat         278          0.44       0.00
Parthenolide         106          0.07       0.49
Pazopanib            304          0.06       0.27
Ponatinib            273          0.14       0.02
Pyrimethamine        106          0.07       0.46
QS11                 300          0.11       0.06
RAF265               258         -0.02       0.77
RDEA119              292          0.51       0.00
RO-3306              294          0.14       0.02
Rapamycin            109          0.02       0.85
Roscovitine          105         -0.06       0.54
S-Trityl-L-cysteine  107          0.05       0.62
SB 216763            281          0.08       0.17
SB590885             282          0.19       0.00
SL 0101-1            291          0.00       0.99
Salubrinal           106         -0.18       0.07
Shikonin             301          0.21       0.00
Sirolimus            106          0.05       0.63
Sorafenib            353          0.02       0.73
Sunitinib            125          0.31       0.00
TAE684               282          0.14       0.02
TGX221               107          0.11       0.25
TKI258               282          0.09       0.13
TW 37                293          0.24       0.00
Temsirolimus         298          0.27       0.00
Thapsigargin         300          0.30       0.00
Tipifarnib           300          0.08       0.17
Topotecan            295          0.28       0.00
VX-680               107         -0.03       0.75
VX-702               296         -0.08       0.18
Vinblastine          306          0.29       0.00
Vinorelbine          312          0.15       0.01
Vorinostat           307          0.23       0.00
WH-4-023             107          0.18       0.07
WZ-1-84              107          0.19       0.05
XMD8-85              106          0.36       0.00
Z-LLNle-CHO          107         -0.17       0.07
ZD-6474              277          0.13       0.04
ZM-447439            283          0.02       0.77
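The Spearman r column is a rank correlation between predicted and measured sensitivity. The sketch below shows one way to compute it without third-party libraries. The function names and toy numbers are hypothetical, and the source does not say whether its values were computed on held-out or training cell lines.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical predicted and measured sensitivities for five cell lines.
predicted = [0.1, 0.4, 0.35, 0.8, 0.7]
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
rho = spearman_r(predicted, measured)
print(rho)
```

Because only ranks matter, the statistic is unchanged by any monotonic rescaling of the sensitivity values.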
Rescaling Breast Cancer dataset
NCI-60 U95 Array, Clinical U95 Array
ComBat Batch Correct
NCI-60 U95 Array, Clinical U95 Array, NCI-60 RNA-Seq
Fit Gene-Specific Linear Model
Apply Linear Rescaling
Clinical “RNA-Seq”
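The "fit gene-specific linear model, then apply linear rescaling" steps can be sketched as a per-gene least-squares line that maps array values to RNA-seq values. This is an assumption about what the flowchart means, not the author's code: the function names and toy values are hypothetical, and the batch correction step is not implemented here.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rescale_to_rnaseq(array_cell, rnaseq_cell, array_clinical):
    """Map clinical array values onto the RNA-seq scale, one gene at a time.

    array_cell and rnaseq_cell hold paired measurements of the same cell lines;
    array_clinical holds array measurements of patient samples.
    """
    rescaled = {}
    for gene, clinical_values in array_clinical.items():
        slope, intercept = fit_line(array_cell[gene], rnaseq_cell[gene])
        rescaled[gene] = [slope * v + intercept for v in clinical_values]
    return rescaled


# Hypothetical paired cell line data and clinical array data for two genes.
array_cell = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [0.0, 2.0, 4.0]}
rnaseq_cell = {"GENE_A": [3.0, 5.0, 7.0], "GENE_B": [1.0, 2.0, 3.0]}
array_clinical = {"GENE_A": [4.0, 0.0], "GENE_B": [6.0]}

clinical_rnaseq = rescale_to_rnaseq(array_cell, rnaseq_cell, array_clinical)
print(clinical_rnaseq)
```

Fitting each gene separately allows for the fact that array probes and sequencing reads can relate differently from gene to gene.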
Prune genes with correlation p<0.001
ARD Regression
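$$y = Xw + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \lambda I_n), \qquad w \sim \mathcal{N}(0, \alpha)$$

$$p(\lambda) = \Gamma(\lambda_1, \lambda_2), \qquad p(\alpha_i) = \Gamma(\gamma_{i1}, \gamma_{i2})$$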