Library of Integrated Network-based Cellular Signatures (LINCS)

September 20, 2013

Upload: abrial

Post on 24-Feb-2016


TRANSCRIPT

Page 1: Library of Integrated Network-based Cellular Signatures (LINCS)

Library of Integrated Network-based Cellular Signatures

(LINCS)

September 20, 2013

Page 2: Library of Integrated Network-based Cellular Signatures (LINCS)

LINCS concept

• perturbations scalable to genome
• high information content read-outs (e.g. gene expression)
• inexpensive
• mechanism to query database

[Figure: axes labelled cell types, phenotypic assays, perturbations]

Page 3: Library of Integrated Network-based Cellular Signatures (LINCS)

Look-up table of cellular activity

perturbations cell types read out

database

GENOME SCALE

GENETIC

PHARMACOLOGIC

MODERATE COMPLEXITY

10’S COMPLEX

COMMUNITY QUERIES

PLATFORM-INDEPENDENT

Page 4: Library of Integrated Network-based Cellular Signatures (LINCS)

The LINCS Network (NIH)

Data Production/Analysis Centers
• Broad Institute
• Harvard Medical School

Computational and Technology Development Centers
• Arizona State
• Broad Institute (Jake Jaffe)
• Columbia
• U. Cincinnati
• Miami School of Medicine
• Wake Forest
• Yale

External Collaborations
• Snyder Lab, Sanford-Burnham Medical Research Institute
• FDA
• GTEx
• ENCODE/Epigenomics
• Rao Lab, NIH CRM
• Scadden Lab, Massachusetts General Hospital
• McCray Lab, University of Iowa
• Loring Lab, Scripps Research Institute
• Edenberg Lab, Indiana University
• Spira Lab, Boston University
• Pandolfi Lab, BIDMC
• Chen Lab, NHLBI
• Kotton Lab, Boston University

Page 5: Library of Integrated Network-based Cellular Signatures (LINCS)

diseases genes drugs

mRNA Expression Database

• 453 Affymetrix profiles
• 164 drugs
• > 16,000 users
• 916 citations

Lamb et al, Science (2006)

Connectivity Map

Page 6: Library of Integrated Network-based Cellular Signatures (LINCS)

CMAP/LINCS is an approach to functional annotation

perturbagens cell types

Page 7: Library of Integrated Network-based Cellular Signatures (LINCS)

CMap is limited by profiling cost

low-cost, high-throughput method would enable…

• primary screening libraries: drug-like, non-drug-like, natural products
• genomic perturbagens: shRNA, ORF, variants (natural + synthetic)
• cellular contexts: tissues, types, culture conditions, genetics
• treatment parameters: concentrations, durations, combinations

• re-think: gene content × labeling × detection

Page 8: Library of Integrated Network-based Cellular Signatures (LINCS)

observation: gene expression is correlated

[Figure: expression matrix, genes × samples]

Page 9: Library of Integrated Network-based Cellular Signatures (LINCS)

Reduced Representation of Transcriptome

reduced representation transcriptome (‘landmarks’) → computational inference model → genome-wide expression profile

~ 100,000 profiles

[Figure (simulation): % connections vs. number of landmarks measured, with 80% and 1000 annotated]
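The slides do not describe the inference model itself. As a minimal sketch of the general idea, assuming a plain least-squares line that predicts one unmeasured gene from one landmark gene, the fit could look like this (the training values and the function name `fit_line` are made up for illustration):

```python
def fit_line(x, y):
    """Ordinary least squares fit of y ~ a + b * x (one landmark, one target gene)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical training profiles: landmark expression and the correlated target gene.
landmark = [1.0, 2.0, 3.0, 4.0]
target = [2.1, 3.9, 6.1, 7.9]

intercept, slope = fit_line(landmark, target)

# Infer the target gene in a new sample where only the landmark was measured.
predicted = intercept + slope * 5.0
print(f"target ~ {intercept:.2f} + {slope:.2f} * landmark; prediction at 5.0: {predicted:.2f}")
```

A real model would presumably combine many landmarks per inferred gene; the single-predictor case only shows why correlated expression makes inference possible.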

Page 10: Library of Integrated Network-based Cellular Signatures (LINCS)

1000-plex Luminex bead profiling

[Figure: assay schematic showing mRNA (5' AAAA 3') captured by a 3' TTTT probe, followed by RT → ligation (5'-PO4 probes) → PCR → hybridization to Luminex Beads (500 colors, 2 genes/color)]

Reagent cost: $5/sample

Page 11: Library of Integrated Network-based Cellular Signatures (LINCS)

“L1000” expression profiling

                     GeneChip                      L1000
content              22,000 transcripts measured   1,000 measured, rest of 22,000 inferred
technology           microarray                    Luminex beads
throughput           3 × 96 / week                 200 × 384 / week
unit cost (reagent)  $500                          $5
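The throughput and cost figures on this slide can be compared with a little arithmetic. The sketch below assumes that "3 × 96 / week" and "200 × 384 / week" mean plates per week times samples per plate; the dictionary keys are illustrative names, not part of any LINCS tool.

```python
# Figures copied from the slide; the plates-times-wells reading is an assumption.
platforms = {
    "GeneChip": {"plates_per_week": 3, "wells_per_plate": 96, "cost_per_sample": 500},
    "L1000": {"plates_per_week": 200, "wells_per_plate": 384, "cost_per_sample": 5},
}


def weekly_samples(platform):
    """Samples profiled per week for one platform."""
    return platform["plates_per_week"] * platform["wells_per_plate"]


genechip_week = weekly_samples(platforms["GeneChip"])
l1000_week = weekly_samples(platforms["L1000"])

throughput_ratio = l1000_week / genechip_week
cost_ratio = platforms["GeneChip"]["cost_per_sample"] / platforms["L1000"]["cost_per_sample"]

print(f"GeneChip: {genechip_week} samples/week, L1000: {l1000_week} samples/week")
print(f"throughput ratio: {throughput_ratio:.0f}x, reagent cost ratio: {cost_ratio:.0f}x")
```

Under that reading, L1000 profiles 76,800 samples per week against 288, at one hundredth of the reagent cost per sample.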

Page 12: Library of Integrated Network-based Cellular Signatures (LINCS)

LINCS Dataset

Page 13: Library of Integrated Network-based Cellular Signatures (LINCS)

Current LINCS Dataset

small-molecules: 5,178 compounds
• 1,300 off-patent FDA-approved drugs
• 700 bioactive tool compounds
• 2,000+ screening hits (MLPCN + others)

genomic perturbagens: 3,712 genes (shRNA + cDNA)
• targets/pathways of FDA-approved drugs (n=900)
• candidate disease genes (n=600)
• community nominations (n=500+)

15 cell types
• Banked primary cell types
• Cancer cell lines
• Primary hTERT-immortalized
• Patient-derived iPS cells
• 5 community nominated

1,000 landmark genes

21,000 inferred genes

1,209,824 profiles

Page 14: Library of Integrated Network-based Cellular Signatures (LINCS)

Coming soon (in beta)

Page 15: Library of Integrated Network-based Cellular Signatures (LINCS)

U54 Grant: Progress on Data Access

level 1: Raw data
• format: Plate folders
• availability: 3,812 folders
• common use cases: new computational approaches to data pre-processing and normalization

level 2: Normalized dataset
• format: Matrix: GCTX
• availability: 1.2M+ profiles
• common use cases: deriving signatures; other kinds of analysis

level 3: Signatures (differentially expressed genes)
• format: 1. MongoDB 2. Matrix: GCTX
• availability: 383,788 signatures (beta release)
• common use cases: high-level integration with analytics and websites, e.g. genes that are modulated by TP53; genes most correlated to the Akt1 pathway

level 4: Queries
• format: JSON objects
• availability: Q1 2014
• common use cases: genes connected to an external query signature

Page 16: Library of Integrated Network-based Cellular Signatures (LINCS)

findings

1) Large-scale gene-expression analysis

2) Analysis of L1000 shRNA signatures

Page 17: Library of Integrated Network-based Cellular Signatures (LINCS)

[Figure: # of profiles]

Page 18: Library of Integrated Network-based Cellular Signatures (LINCS)

Data quality: correlation between biological replicates

Page 19: Library of Integrated Network-based Cellular Signatures (LINCS)
Page 20: Library of Integrated Network-based Cellular Signatures (LINCS)

matching cell states

1) define a ‘query’: the set of genes up- and down-regulated in a cellular state of interest

2) assess strength of the query in the profile of all perturbagens in DB

3) rank order perturbagens by connectivity strength

[Figure: cumulative score vs. genes (thousands) for up-regulated and down-regulated query genes; two panels, connected and not connected]

rank  perturbagen  conn score
1     drug Y       1          positive connectivity
2     drug e       0.993
3     gene S       0.791
…     …
      gene n       0.000      no connectivity
      drug I
      drug L
…     …
997   drug N       -0.877     negative connectivity
998   gene E       -0.945
999   drug G       -1
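The three steps above can be sketched in code. This is a simplified illustration in the spirit of the Kolmogorov-Smirnov-style scoring described for the original Connectivity Map (Lamb et al, 2006), not the scoring used for the LINCS results in these slides; the gene names, profiles, and function names are made up.

```python
def ks_score(tag_genes, ranked_genes):
    """Signed Kolmogorov-Smirnov-style enrichment of tag_genes in a ranked gene list."""
    n = len(ranked_genes)
    position = {gene: i + 1 for i, gene in enumerate(ranked_genes)}  # 1-based ranks
    v = sorted(position[g] for g in tag_genes if g in position)
    t = len(v)
    if t == 0:
        return 0.0
    a = max((j + 1) / t - v[j] / n for j in range(t))
    b = max(v[j] / n - j / t for j in range(t))
    return a if a > b else -b


def raw_connectivity(up, down, ranked_genes):
    """Step 2: strength of the query in one perturbagen profile."""
    ks_up = ks_score(up, ranked_genes)
    ks_down = ks_score(down, ranked_genes)
    if ks_up * ks_down > 0:  # up and down tags enriched at the same end: no connection
        return 0.0
    return ks_up - ks_down


def rank_perturbagens(up, down, profiles):
    """Step 3: scale scores to [-1, 1] and sort from positive to negative connectivity."""
    raw = {name: raw_connectivity(up, down, ranked) for name, ranked in profiles.items()}
    hi, lo = max(raw.values()), min(raw.values())
    scores = {}
    for name, s in raw.items():
        if s > 0:
            scores[name] = s / hi
        elif s < 0:
            scores[name] = -(s / lo)
        else:
            scores[name] = 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Step 1: a hypothetical query (genes up and down in the state of interest).
query_up = {"g1", "g2"}
query_down = {"g9", "g10"}

# Hypothetical perturbagen profiles: genes ordered from most up- to most down-regulated.
genes = [f"g{i}" for i in range(1, 11)]
profiles = {
    "drug_A": genes,        # mimics the query
    "drug_B": genes[::-1],  # reverses the query
    "drug_C": ["g5", "g1", "g9", "g6", "g3", "g10", "g2", "g7", "g4", "g8"],  # unrelated
}

ranking = rank_perturbagens(query_up, query_down, profiles)
for name, score in ranking:
    print(f"{name}: {score:+.3f}")
```

In this toy case the mimicking profile lands at the top, the unrelated one in the middle with a score of zero, and the reversing profile at the bottom, mirroring the positive / no / negative connectivity bands of the table.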

Page 21: Library of Integrated Network-based Cellular Signatures (LINCS)

reversing drug resistance

signature: glucocorticoid resistant acute lymphoblastic leukemia (David Twomey and Scott Armstrong)

50 ‘sensitive’ and 50 ‘resistant’ markers

hypothesis: sirolimus induces glucocorticoid sensitivity

rank  perturbagen   cell    score
5     35-sirolimus  HL60    0.804
6     42-sirolimus  ssMCF7  0.789
27    26-sirolimus  MCF7    0.544

[Figure: marker heatmaps labelled resistant / sensitive; rank bar running from 1 to 464 with the sirolimus instances marked]

Page 22: Library of Integrated Network-based Cellular Signatures (LINCS)

The 1% challenge: the “tail” of current data is > ENTIRE previous dataset

Page 23: Library of Integrated Network-based Cellular Signatures (LINCS)

query: histone deacetylase inhibitors (Glaser et al 2003)

Rank  Compound ID    Compound Description  Connectivity Score
1     BRD-K69840642  ISOX                  0.995
2     BRD-K52522949  NCH-51                0.994
3     BRD-K12867552  THM-I-94              0.993
4     BRD-K64606589  apicidin              0.992
5     BRD-K56957086  dacinostat            0.99
6     BRD-A19037878  trichostatin-a        0.989
7     BRD-A94377914  merck-ketone          0.987
8     BRD-K17743125  belinostat            0.987
9     BRD-K75081836  BRD-K75081836         0.986
10    BRD-K81418486  vorinostat            0.986
11    BRD-K68202742  trichostatin-a        0.986
12    BRD-K22503835  scriptaid             0.986
13    BRD-K02130563  panobinostat          0.985
14    BRD-A39646320  HC-toxin              0.983
15    BRD-K13810148  givinostat            0.98
16    BRD-K85493820  KM-00927              0.977
17    BRD-K11663430  pyroxamide            0.977
18    BRD-K74761218  WT-171                0.975
19    BRD-K74733595  APHA-compound-8       0.97
20    BRD-A19248578  latrunculin-b         0.965
21    BRD-K49010888  BRD-K49010888         0.962
22    BRD-K53308430  SA-1017940            0.951
23    BRD-K64890080  BI-2536               0.95
24    BRD-K00627859  tubastatin-a          0.947
25    BRD-K31542390  mycophenolic-acid     0.946

0.5% Page 1 / 200

Page 24: Library of Integrated Network-based Cellular Signatures (LINCS)

query: compound identified to induce the lysosomal apoptosis pathway (D’Arcy et al Nature Medicine 2012)

Rank  Compound ID    Compound Description           Connectivity Score
1     BRD-K78659596  MLN2238                        0.998
2     BRD-K60230970  MG-132                         0.998
3     BRD-K88510285  bortezomib                     0.996
4     BRD-A55484088  BNTX                           0.993
5     BRD-A18725729  BRD-A18725729                  0.993
6     BRD-K74402642  NSC-632839                     0.992
7     BRD-K50234570  EMF-bca1-16                    0.992
8     BRD-A58924247  BRD-A58924247                  0.992
9     BRD-A39093044  K784-3187                      0.992
10    BRD-A72180425  K784-3188                      0.992
11    BRD-K50691590  bortezomib                     0.992
12    BRD-K19499941  BRD-K19499941                  0.99
13    BRD-K09854848  MD-II-008-P                    0.988
14    BRD-A76490030  K784-3131                      0.988
15    BRD-A36275421  MW-RAS12                       0.987
16    BRD-K28366633  BRD-K28366633                  0.987
17    BRD-A11007541  BCI-hydrochloride              0.987
18    BRD-K37392901  NSC-632839                     0.987
19    BRD-K66884694  BRD-K66884694                  0.987
20    BRD-A83124583  EMF-sumo1-39                   0.986
21    BRD-K10882151  BO2-inhibits-RAD51             0.986
22    BRD-K44366801  BRD-K44366801                  0.985
23    BRD-K61033289  15-delta-prostaglandin-j2      0.985
24    BRD-K07303502  arachidonyl-trifluoro-methane  0.984
25    BRD-K02822062  CT-200783                      0.984

0.5% Page 1 / 200

Page 25: Library of Integrated Network-based Cellular Signatures (LINCS)

query: HUVEC cells treated with pitavastatin (cell line not in panel)

Rank  Compound ID    Compound Description  Connectivity Score
1     BRD-A81772229  simvastatin           0.996
2     BRD-A70155556  lovastatin            0.994
3     BRD-U88459701  atorvastatin          0.991
4     BRD-A18763547  BAX-channel-blocker   0.988
5     BRD-K22134346  simvastatin           0.985
6     BRD-K12994359  valdecoxib            0.983
7     BRD-K09416995  lovastatin            0.981
8     BRD-K34581968  BMS-536924            0.979
9     BRD-K94176593  TWS-119               0.975
10    BRD-K20285085  fostamatinib          0.973
11    BRD-K94441233  mevastatin            0.972
12    BRD-K95785537  PP-2                  0.971
13    BRD-K53414658  tivozanib             0.97
14    BRD-K83213911  PF-750                0.968
15    BRD-K85606544  neratinib             0.968
16    BRD-A19248578  latrunculin-b         0.967
17    BRD-K68588778  BRD-K68588778         0.966
18    BRD-K06750613  GSK-1059615           0.966
19    BRD-A11678676  wortmannin            0.964
20    BRD-K05653692  DL-PDMP               0.963
21    BRD-K72420232  WZ-4002               0.961
22    BRD-K19796430  erismodegib           0.961
23    BRD-K78513633  lonidamine            0.961
24    BRD-K03618428  PP-110                0.961
25    BRD-K37940862  BRD-K37940862         0.961

0.5% Page 1 / 200

Page 26: Library of Integrated Network-based Cellular Signatures (LINCS)

query: imatinib-resistant chronic myeloid leukemia (Frank et al Leukemia 2006)

Rank  Compound ID    Compound Description  Connectivity Score
1     BRD-K12502280  TG-101348             0.992
2     BRD-K94176593  TWS-119               0.987
3     BRD-K20285085  fostamatinib          0.975
4     BRD-K49328571  dasatinib             0.969
5     BRD-K12867552  THM-I-94              0.969
6     BRD-K85493820  KM-00927              0.969
7     BRD-A02180903  betamethasone         0.969
8     BRD-K91701654  U-0126                0.966
9     BRD-K95785537  PP-2                  0.965
10    BRD-K53414658  tivozanib             0.964
11    BRD-A50454580  PD-0325901            0.96
12    BRD-K73789395  ZM-336372             0.96
13    BRD-K17743125  belinostat            0.952
14    BRD-K46419649  U0126                 0.95
15    BRD-K09499853  KU-0060648            0.949
16    BRD-K64890080  BI-2536               0.947
17    BRD-K70914287  BIBX-1382             0.947
18    BRD-K50168500  canertinib            0.946
19    BRD-U43867373  WH-4025               0.946
20    BRD-U25771771  WZ-4-145              0.945
21    BRD-K34581968  BMS-536924            0.943
22    BRD-K18787491  U-0126                0.942
23    BRD-K56343971  vemurafenib           0.941
24    BRD-K01877528  TL-HRAS-61            0.937
25    BRD-K66175015  afatinib              0.933

0.5% Page 1 / 200

Page 27: Library of Integrated Network-based Cellular Signatures (LINCS)

findings

1) Large-scale gene-expression analysis

2) Analysis of L1000 shRNA signatures

Page 28: Library of Integrated Network-based Cellular Signatures (LINCS)

Current CMap Dataset

Page 29: Library of Integrated Network-based Cellular Signatures (LINCS)

biological goal

1. Connections b/w genes and drugs
2. GWAS gene lists to pathways
3. Causal mutation to therapeutic leads
4. Discovering new cancer pathways
5. MoA of novel small-molecules
6. Biological novelty biasing

LINCS as a starting point for functional follow-up

Page 30: Library of Integrated Network-based Cellular Signatures (LINCS)

Core Signature DB

Core Gene signatures from KD (n=1387)

[Figure: signature matrix, 22268 Features × Genes (n=1387)]

Signature Diversity

263 Components explain 80% of the variance

Similarity Metric

[Figure: similarity matrix, Genes (n=1387) × Genes (n=1387)]

Mining the Similarity Matrix
• Unsupervised
  – Global Patterns
• Supervised
  – Gene->[Gene,Pathway,Compound]
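A statement such as "263 components explain 80% of the variance" comes from accumulating the variance captured by successive principal components until a target share is reached. A minimal sketch of that counting step, assuming the per-component variances (eigenvalues) are already available; the toy values and the function name are made up:

```python
def components_for_variance(eigenvalues, target=0.80):
    """Number of leading components needed to reach `target` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running >= target * total:
            return count
    return len(ordered)


# Hypothetical per-component variances, not taken from the LINCS data.
toy_eigenvalues = [6.0, 2.5, 1.0, 0.3, 0.2]
needed = components_for_variance(toy_eigenvalues)
print(f"{needed} of {len(toy_eigenvalues)} components reach 80% of the variance")
```

A small count relative to the number of signatures, as on the slide, indicates that many knockdown signatures share structure.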

Page 31: Library of Integrated Network-based Cellular Signatures (LINCS)

Global Views of Connections

49% of genes have at least 1 connection > 0.4

Connections per gene

PC3 cell line

Most connected genes
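The "at least 1 connection > 0.4" summary can be computed from a gene-by-gene similarity matrix by checking each row for any off-diagonal entry above the threshold. A minimal sketch with a made-up 4 × 4 matrix (the function name and values are illustrative only):

```python
def fraction_connected(similarity, threshold=0.4):
    """Share of genes with at least one off-diagonal similarity above `threshold`."""
    n = len(similarity)
    connected = 0
    for i in range(n):
        if any(similarity[i][j] > threshold for j in range(n) if j != i):
            connected += 1
    return connected / n


# Hypothetical symmetric similarity matrix for four genes.
toy_similarity = [
    [1.0, 0.6, 0.1, 0.0],
    [0.6, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.3],
    [0.0, 0.1, 0.3, 1.0],
]

share = fraction_connected(toy_similarity)
print(f"{share:.0%} of genes have at least 1 connection > 0.4")
```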

Page 32: Library of Integrated Network-based Cellular Signatures (LINCS)

querying LINCS for connections

• JAK2 knockdown connects to STAT1 signature
• FOS knockdown connects to JUN signature
• Cell cycle genes connected (CCND1, CDK2, CDK4, CDK6, CCNE1, E2F1)
• ER knockdown connected to ER antagonists & inversely connected to ER agonists
• JAK2 over-expression signature inversely connected to JAK2 inhibitor (lestaurtinib)
• HDAC knock-downs connected to HDAC inhibitors (vorinostat, others)
• NRF2 over-expression signature inversely connected to curcumin
• WNT1 gene connections: TCF7L1, GSK3B, CSNK2A2, PRKACA, SMAD3…

Page 33: Library of Integrated Network-based Cellular Signatures (LINCS)

Integrating queries across members of a pathway

genes: AKT1, AKT3, FOXO1, PDPK1, PHLPP1, PIK3CB

connections: Top 10 small-molecule connections

Page 34: Library of Integrated Network-based Cellular Signatures (LINCS)

allele classification

39 genes associated with T2D

• genes implicated by GWAS
  – can be many hundreds, most unannotated
• create profiles of ablation (shRNA) in suitable cells by L1000
  – universal functional bioassay
• cluster into “complementation groups”
  – assign genes to groups, groups to pathways, pathways to disease

S. Jacobs & D. Altshuler

Page 35: Library of Integrated Network-based Cellular Signatures (LINCS)

Target ID

[Figure: the query (drug signature in MCF7) is compared against all MCF7 CGS, which are ranked by wtcs score from similar to dissimilar; label: Molecular target of Drug A]

Page 36: Library of Integrated Network-based Cellular Signatures (LINCS)

An Example where integrating across many shRNAs improves Connections

[Figure: connectivity rank of small molecules (axis from 5000 to 1) for each of MTOR shRNA 1 through MTOR shRNA 13 and for the MTOR Consensus Gene Signature; each dot is a dose / timepoint of rapamycin]
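The slides do not say how the consensus gene signature is computed from the individual shRNAs. One plausible scheme, shown here purely as an assumption, weights each hairpin signature by how well it agrees with the others, so that an off-target hairpin contributes less. All values and names are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length signatures."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def consensus_signature(signatures, min_weight=0.01):
    """Weighted average of signatures; weight = mean correlation with the other signatures."""
    k = len(signatures)
    weights = []
    for i in range(k):
        others = [pearson(signatures[i], signatures[j]) for j in range(k) if j != i]
        weights.append(max(sum(others) / len(others), min_weight))
    total = sum(weights)
    weights = [w / total for w in weights]
    n_genes = len(signatures[0])
    consensus = [
        sum(w * sig[g] for w, sig in zip(weights, signatures)) for g in range(n_genes)
    ]
    return consensus, weights


# Hypothetical z-score signatures for three hairpins against the same gene.
hairpin_1 = [2.0, 1.0, -1.0, -2.0]
hairpin_2 = [1.8, 1.2, -0.9, -2.1]
hairpin_3 = [-1.0, 2.0, 1.0, -2.0]  # partly discordant, as an off-target hairpin might be

consensus, weights = consensus_signature([hairpin_1, hairpin_2, hairpin_3])
print("weights:", [round(w, 3) for w in weights])
print("consensus:", [round(z, 3) for z in consensus])
```

The discordant hairpin receives the smallest weight, which is the intuition behind integrating across many shRNAs: shared on-target effects reinforce each other while hairpin-specific effects are diluted.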

Page 37: Library of Integrated Network-based Cellular Signatures (LINCS)

Query with vemurafenib, highlight BRAF shRNAs

[Figure: rank of shRNA (%) per cell line, axis ends labelled Negative Correlation and Positive Correlation; each dot is an individual shRNA targeting BRAF]

Page 38: Library of Integrated Network-based Cellular Signatures (LINCS)

MTOR connects to BEZ235

Page 39: Library of Integrated Network-based Cellular Signatures (LINCS)

Rank  CGS ID         Gene Symbol  Connectivity Score
1     CGS001-2475    MTOR         0.999
2     CGS001-4609    MYC          0.99
3     CGS001-57521   RPTOR        0.976
4     CGS001-2623    GATA1        0.972
5     CGS001-5245    PHB          0.969
6     CGS001-2581    GALC         0.967
7     CGS001-9184    BUB3         0.965
8     CGS001-360023  ZBTB41       0.965
9     CGS001-4860    PNP          0.965
10    CGS001-11164   NUDT5        0.964
11    CGS001-89849   ATG16L2      0.964
12    CGS001-527     ATP6V0C      0.964
13    CGS001-2065    ERBB3        0.961
14    CGS001-3845    KRAS         0.954
15    CGS001-4486    MST1R        0.954
16    CGS001-3479    IGF1         0.951
17    CGS001-207     AKT1         0.95
18    CGS001-8607    RUVBL1       0.948
19    CGS001-54106   TLR9         0.948
20    CGS001-5045    FURIN        0.947
25    CGS001-9533    POLR1C       0.944

Rank  Compound ID    Compound Description  Connectivity Score
1     BRD-K12184916  NVP-BEZ235            1
2     BRD-K69932463  AZD-8055              1
3     BRD-K67566344  KU-0063794            1
4     BRD-K67868012  PI-103                0.999
5     BRD-K77008974  WYE-354               0.998
6     BRD-K94294671  OSI-027               0.998
7     BRD-A45498368  WYE-125132            0.998
8     BRD-K13049116  BMS-754807            0.997
9     BRD-K87343924  wortmannin            0.996
10    BRD-K67075780  TGX-115               0.996

Page 40: Library of Integrated Network-based Cellular Signatures (LINCS)

BEZ235: a dual ATP-competitive PI3K and mTOR inhibitor

Dose-dependent connectivity

Page 41: Library of Integrated Network-based Cellular Signatures (LINCS)

PIK3CA connects to BEZ235

Page 42: Library of Integrated Network-based Cellular Signatures (LINCS)

Current list of significant drug-CGS connectivities spans multiple MoAs

drug                    gene
losartan                AGTR1
MK-2206                 AKT1
10-DEBC                 AKT1
MK-2206                 AKT2
MK-2206                 AKT3
10-DEBC                 AKT3
brefeldin A             ARF1
gossypol                BCL2
YM-155                  BIRC5
ZM336372                BRAF
LFM-A13                 BTK
N9-isopropylolomoucine  CDK1
BML-259                 CDK2
fumonisin B1            CERS4
etomoxir                CPT1A
PNU-74654               CTNNB1
cyanoquinoline 11       EGFR
neratinib               EGFR
tyrphostin AG-1478      EGFR
tamoxifen               ESR1
PF-3845                 FAAH
Merck60                 HDAC1
ISOX                    HDAC6
2-bromopyruvate         HK1
lovastatin acid         HMGCR
linsitinib              IGF1R
selumetinib             MAP2K1
Compound 11e            MAPK1
sirolimus               MTOR
BEZ235                  MTOR
PIK-90                  MTOR
PP-30                   MTOR
parthenolide            NFKB1
triptolide              NFKB2
dexamethasone           NR3C1
olaparib                PARP1
olaparib                PARP2
veliparib               PARP2
GSK-2334470             PDK1
BX-795                  PDK1
AZD-7545                PDK2
TGX-115                 PIK3C2A
BEZ235                  PIK3CA
PIK-90                  PIK3CA
Compound 110            PIK3CA
GW-843682X              PLK1
LFM-A13                 PLK1
HA-1004                 PRKACB
KU 0060648              PRKDC
AM-580                  RARA
gemcitabine             RRM1
fatostatin              SREBF2
RITA                    TP53
nutlin-3                TP53
pifithrin-alpha         TP53
SJ-172550               TP53
gemcitabine             TYMS
MK 1775                 WEE1

Page 43: Library of Integrated Network-based Cellular Signatures (LINCS)

Goal: Given a chemical library:

• identify the bioactive subset of a library
• identify unique bioactivity

Gene-expression as a universal measure of bioactivity

If we see no robust gene expression consequence whatsoever across a diverse panel of cell types, then it's likely that the compound has no bioactivity.

Page 44: Library of Integrated Network-based Cellular Signatures (LINCS)

L1000 as a sensor of bioactivity

[Figure: S-C plot of signature strength (S) vs. signature robustness across replicates (C); labels: active analogs (high S-C), inactive analogs (low S-C), dose titration]
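The slides name the two axes of the S-C plot but not how they are computed. As a rough sketch under stated assumptions, S is taken here as the number of genes whose mean replicate z-score exceeds a threshold, and C as the mean pairwise Pearson correlation between replicates. The threshold, the data, and the function names are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def signature_strength(replicates, threshold=2.0):
    """S (assumed definition): genes whose mean z-score across replicates passes the threshold."""
    n_genes = len(replicates[0])
    means = [sum(rep[g] for rep in replicates) / len(replicates) for g in range(n_genes)]
    return sum(1 for z in means if abs(z) >= threshold)


def replicate_correlation(replicates):
    """C (assumed definition): mean pairwise correlation between replicate profiles."""
    pairs = list(combinations(replicates, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical z-score replicates for an active and an inactive analog.
active = [[3.0, -2.5, 0.1, 2.2], [2.8, -2.7, 0.0, 2.5], [3.1, -2.2, 0.2, 2.0]]
inactive = [[0.3, -0.1, 0.2, -0.4], [-0.2, 0.4, -0.3, 0.1], [0.1, 0.2, -0.1, -0.3]]

s_active, c_active = signature_strength(active), replicate_correlation(active)
s_inactive, c_inactive = signature_strength(inactive), replicate_correlation(inactive)
print(f"active:   S={s_active}, C={c_active:.2f}")
print(f"inactive: S={s_inactive}, C={c_inactive:.2f}")
```

With these toy inputs the active analog sits in the high-S, high-C corner and the inactive analog near the origin, matching the layout the slide describes.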

Page 45: Library of Integrated Network-based Cellular Signatures (LINCS)

biological novelty biasing of chemical libraries

[Figure: signal strength vs. reproducibility]

• global bioactivity detection using L1000 profiles
  – number and magnitude of expression changes, and robustness
• calibrate with 350 known bioactives across 47 cell lines
  – median sensitivity of individual cell lines is 42% (90% specificity)
  – rationally-designed panel of 7 cell lines achieves 95% sensitivity
• qualification, de-duplication, and novelty biasing
  – consolidate and subset libraries based on function

chemical library: n = 9,875
active: n = 487 (5%)
known MoA: n = 435 (4.5%)
novel: n = 52 (0.5%)
de-duplicated: n = 30 (0.3%)
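The funnel percentages can be checked directly from the counts on the slide. In the short sketch below only the variable names are assumptions; computed to one decimal place, the known-MoA share comes out at 4.4%, which the slide shows rounded as 4.5%.

```python
library_size = 9875

# Counts copied from the slide.
funnel = {
    "active": 487,
    "known MoA": 435,
    "novel": 52,
    "de-duplicated": 30,
}

# Share of the full chemical library at each stage, in percent.
percent = {stage: 100 * count / library_size for stage, count in funnel.items()}

for stage, value in percent.items():
    print(f"{stage}: n = {funnel[stage]} ({value:.1f}%)")

# Active compounds split into known-MoA and novel compounds.
print("known MoA + novel =", funnel["known MoA"] + funnel["novel"])
```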

Page 46: Library of Integrated Network-based Cellular Signatures (LINCS)

1. Data Generation: 1.2M+ profiles released to LINCS

2. Data Access: Multiple levels of data matrices, cloud-compute beta released

3. Biologist-friendly web user interfaces

4. Emerging scientific findings
   1. Causal mutation to therapeutic leads
   2. GWAS gene lists to pathways
   3. Discovering new cancer pathways
   4. Connecting small-molecules to biology
   5. Biological novelty biasing of chemical libraries

Broad LINCS U54

Page 47: Library of Integrated Network-based Cellular Signatures (LINCS)

CMap Analytical
• Rajiv Narayan
• Joshua Gould
• Corey Flynn
• Ted Natoli
• David Wadden
• Ian Smith
• Roger Hu
• Larson Hogstrom
• Peyton Greenside

CMap Data Generation
• David Peck
• John Davis
• Roger Cornell
• Xiaohua Wu
• Xiaodong Lu
• Melanie Donahue

Todd Golub

Broad Scientists
• Jesse Boehm
• Bang Wong
• Federica Piccioni
• John Doench
• David Root
• Suzanne Jacobs
• Paul Clemons
• Stuart Schreiber
• Aly Shamji

Broad Platforms
• RNAi platform
• Chemical Biology
• TD/TS