The Art of Connectivity Mapping
Avi Ma’ayan, PhD
Professor, Department of Pharmacological Sciences
Director, Mount Sinai Center for Bioinformatics
Icahn School of Medicine at Mount Sinai, New York, NY
2011
$$$$$ + Data + Statistics + Wiz Kid
$$$$$ + Expert Knowledge
https://www.datanami.com
The Variety of Sources for Mammalian Molecular Data Collected from Cells, Tissues, Model Organisms and Patients
Genes, Proteins, Targets
Cells, Tissues
Diseases, Phenotypes, Adverse Events
Gene-Sets, Modules, Pathways
Drugs, Small molecules
Ma’ayan et al. Trends Pharmacol Sci. 2014 Sep;35(9):450-60.
The Harmonizome Project
http://amp.pharm.mssm.edu/Harmonizome/
Rouillard et al. Database (Oxford). 2016 Jul 3;2016.
Harmonizome Google Analytics
Enrichr: a search engine for gene sets…
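To make the idea of searching gene sets concrete, here is a minimal sketch of an over-representation test: a query gene list is compared with each set in a library, and the sets are ranked by a right-tailed hypergeometric p-value (equivalent to a one-sided Fisher exact test). This is an illustration only, not the Enrichr implementation. The function names, the background size of 20,000 genes, and the toy library (whose set memberships are made up) are all assumptions.

```python
from math import comb

def enrichment_p_value(query, gene_set, background_size):
    """Right-tailed hypergeometric p-value for the overlap of two gene lists."""
    query, gene_set = set(query), set(gene_set)
    k = len(query & gene_set)   # observed overlap
    n = len(query)              # query size
    K = len(gene_set)           # library gene-set size
    N = background_size         # assumed number of genes in the background
    # P(overlap >= k) when n genes are drawn at random from N
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / comb(N, n)

def rank_gene_sets(query, library, background_size=20000):
    """Return (term, overlap, p_value) tuples sorted by ascending p-value."""
    q = set(query)
    results = [
        (term, len(q & set(genes)), enrichment_p_value(q, genes, background_size))
        for term, genes in library.items()
    ]
    return sorted(results, key=lambda r: r[2])

# Toy library: the memberships are invented for illustration, not real annotations.
toy_library = {
    "toy_translation": ["RPL3", "RPL4", "RPS6", "EEF1A1"],
    "toy_signaling": ["TP53", "MYC", "EGFR", "KRAS"],
}
query = ["RPL3", "RPL4", "RPS6", "ACTB"]

for term, overlap, p in rank_gene_sets(query, toy_library):
    print(f"{term}\toverlap={overlap}\tp={p:.3g}")
```

A production tool would also correct for multiple hypothesis testing across the library; that step is omitted here to keep the sketch short.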
Enrichr Google Analytics
[Figure: schematic with labels Transcriptomics, Proteomics, Pathways, Perturbations, Tissues; Genes; Knockout Phenotypes; Observed; ?; f(T, P, I)]
[Figure: heatmap with color key (Value, 0 to 400; Density, 0 to 0.006)]
Column labels: DTP_B_3, SUN.URS_B_3, SUN.LOP_B_3, BOS_B_3, DOM_B_3, LAP_B_3, NIL_D_4, CAB_E_4, CRI_A_4, DAS_D_4, TOF_D_4, ERL_D_4, PON_D_4, IMA_D_4, SOR_D_4, VEM_D_4, GEF_D_4, SUN_D_4, LAP.URS_B_3, SOR.URS_B_3, VAN_B_3, DAB_E_4, DAB_D_4, DAB_A_4, VEM_E_4, CYC_B_4, IMA_B_4, CAB_B_4, VAN_A_3, TRS.DOM_B_3, AXI_B_4, SUN.URS_E_3, MIL_B_2, ERL_B_4, CAB_D_4, VEM_B_4, AMI_A_2, DAS.MTX_A_3, DAS.CYT_A_3, MIL_E_3, TOF_B_3, FLE_E_3, VAN_E_4, LAP_D_3, DOM_A_1, AFA_A_3, URS_D_3, TRS.LOP_A_1, ERL_A_3, TRS.URS_A_1, MTX_A_3, CYT_A_3
Row labels: Gene Expression; Single-Organism Cellular Localization; Protein Localization To Organelle; Cellular Component Disassembly; Nuclear-Transcribed mRNA Catabolic Process; Translation; Protein Targeting To Membrane; Protein Localization To Endoplasmic Reticulum; Establishment Of Protein Localization To Endoplasmic Reticulum; SRP-Dependent Cotranslational Protein Targeting To Membrane; Protein Targeting To ER; Cotranslational Protein Targeting To Membrane; Establishment Of Protein Localization To Organelle; Protein Localization To Membrane; Protein Targeting; Establishment Of Protein Localization To Membrane; Nuclear-Transcribed mRNA Catabolic Process, Nonsense-Mediated Decay; Protein Complex Disassembly; Cellular Protein Complex Disassembly; Viral Life Cycle; Translational Elongation; Translational Termination; Viral Transcription
[Figure legend: Not Observed; Unknown; n phenotypes]
Machine Learning is the Obvious Next Step
Deep Learning (ANN with many layers)
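As a minimal sketch of what "an ANN with many layers" means in practice, the following code stacks several fully connected layers with ReLU activations and a sigmoid output, and runs one forward pass. It is not the model from any of the cited studies: the layer sizes, the random untrained weights, and all function names are assumptions chosen for illustration.

```python
import math
import random

def relu(values):
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    # weights holds one row per output unit
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    # Small uniform weights scaled by the number of inputs; biases start at zero
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(x, layers):
    """Pass x through every hidden layer (ReLU), then a sigmoid output layer."""
    hidden = x
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = layers[-1]
    logits = dense(hidden, weights, biases)
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

rng = random.Random(0)
sizes = [8, 16, 16, 16, 3]  # hypothetical: 8 inputs, three hidden layers, 3 outputs
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
probabilities = forward([0.5] * 8, layers)
print(probabilities)
```

Training (backpropagation and an optimizer) is left out; the point is only that depth comes from composing the same layer operation several times.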
Predicting Side Effects with LINCS L1000 Data
Wang Z, Clark NR, Ma'ayan A. Bioinformatics. 2016 Aug 1;32(15):2338-45.
Fireworks Visualization of 17,041 Drug-Induced Signatures
3,713 drugs/compounds
63 cell lines
3 time points
51 dosages
17,041 significant drug-induced signatures
http://amp.pharm.mssm.edu/L1000FWD
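One common way to compare a query signature against a collection of drug-induced signatures is to score each pair with a similarity measure and sort the results. The sketch below uses cosine similarity over signatures stored as equal-length lists of gene-level values. This is an assumption for illustration, not the method used by L1000FWD, and the signature names and values are invented toy data.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_signatures(query, signatures):
    """Sort signatures from most opposite (reversers) to most similar (mimickers)."""
    scored = [(name, cosine(query, vector)) for name, vector in signatures.items()]
    return sorted(scored, key=lambda item: item[1])

# Toy data: each position is one gene; values are made-up expression changes.
query_signature = [2.0, -1.0, 0.5, -1.5]
toy_signatures = {
    "toy_drug_A": [2.0, -1.0, 0.5, -1.5],   # same direction as the query
    "toy_drug_B": [-2.0, 1.0, -0.5, 1.5],   # opposite direction
    "toy_drug_C": [1.0, 2.0, 0.0, 0.0],     # unrelated (orthogonal)
}

for name, score in rank_signatures(query_signature, toy_signatures):
    print(f"{name}\t{score:+.2f}")
```

Signatures at the top of the list point in the opposite direction to the query, and those at the bottom point in the same direction.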
Predicting Patient Age from Vital Signs and Lab Tests with Deep Learning
J Biomed Inform. 2017 Dec;76:59-68.
Network Analysis in Systems Biology Course on Coursera
https://class.coursera.org/netsysbio-002
GEO2Enrichr: browser extension and server app to extract gene sets from GEO and analyze them for biological functions
Bioinformatics. 2015 Sep 15;31(18):3060-2.
Microtask Signature Extraction Project: Using GEO2Enrichr to Extract and Annotate Signatures from GEO
- Single Gene Perturbations (n=2225)
- Drug/Toxin Perturbations (n=1319)
- Disease vs. Normal Signatures (n=1105)
http://amp.pharm.mssm.edu/CREEDS
Nature Communications 7, 12846 (2016)
http://archs4.cloud
ARCHS4: >300K Processed RNA-seq Samples from GEO for Human and Mouse
Nat Commun. 2018 Apr 10;9(1):1366.
ARCHS4: >177K Processed RNA-seq Samples from GEO for Human and Mouse
ARCHS4 Chrome Extension
BioJupies Chrome Extension
BioJupies
Summary
• Data abstraction for data integration
• The Harmonizome and Enrichr resources
• Next step is Machine Learning
• L1000 data processing and visualization
• Predicting MOA for L1000 small molecules
• Predicting the predictability of novel small molecules
• Integration with EMR based on drugs and vital signs
• Predicting gene function with ARCHS4
• ARCHS4 and BioJupies to facilitate data reuse, data integration, and signature extraction
NIH Support U54-HL127624 (LINCS-DCIC), U24-CA224260 (IDG-KMC), and OT3-OD025467 (NIH Data Commons)