
Pharos: Shining Light on the Druggable Genome

Dac Trung Nguyen, Timothy Sheils, Geetha Mandava, Ajit Jadhav, Noel Southall, Rajarshi Guha

NCATS, NIH

2016 ACS Fall Meeting, Philadelphia

The interface to the KMC

• Entity browsing (filterable & linked)

• Search (full text, auto-suggest)

• Detailed view of entities

• Built on top of a robust REST API

Target Audience

Biologists & Clinical Researchers

• Characterize & validate novel targets

• Identify key small molecules or biologics

Informatics Scientists

• Data mining

• Support target validation projects

Program Staff

• Explore the research landscape

• New directions for research & funding

Infrastructure

• Built using industry standard tools

• Open Source, straightforward to run locally

• Sources at https://spotlite.nih.gov/ncats/pharos


What’s Included?

• Pharos presents data from a variety of sources, integrated by U. New Mexico

• Primary focus is the protein target

• Target related data include
– Identifiers, ontology terms, sequence, expression data, publications (curated & text mined)

• Wherever possible, targets are linked to other entities
– Small molecules, Diseases, Publications

The Data Sources

Antibodypedia.com, BioPlex, Druggable Epigenome Domains, DrugCentral, Ensembl Cross References, GO Consortium, GTEx, GWAS Catalog, HGNC, HPA, HPM, IMPC, AnimalTFDB, JAX/MGI, Panther, PubChem, PubMed, NCBI Gene, NIH RePORTER, OMIM, TIN-X, UniProt, Harmonizome, DISEASES, TISSUES, DTO, CHEMBL

Interactions inside & outside the IDG

Drug Target Ontology

• Employed as a navigation tool as well as a filtering tool

• Currently DTO terms are used as labels

• Exploring novel uses of the hierarchy

Target Ranking in PubMed

Novelty measures the scarcity of publications about a target: how much has been published about it, computed as the inverse of the sum of fractions of papers/patents (each paper contributes 1 / number of targets it mentions)
– E.g.: Target A is mentioned in 2 papers, the first with 4 other targets, the second with 9 other targets: Novelty = 1/(1/5 + 1/10) = 3.33

Importance measures the strength of the association between a target and a disease: fractional disease-target (FDT) score
– FDT = 1 / (nr targets + nr diseases) for each paper
– Bayesian smoothing is used to compare general terms (cancer) with specific ones (ovarian carcinosarcoma)
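The two scores can be sketched in a few lines of Python. This is only an illustration of the formulas on this slide, not the TIN-X implementation: the function names are hypothetical, summing the per-paper FDT values into a single importance score is an assumption, and the Bayesian smoothing step is left out.

```python
def novelty(targets_per_paper):
    """Inverse of the summed fractional mentions of a target.

    targets_per_paper: for each paper mentioning the target, the total
    number of targets in that paper (the target itself included).
    """
    return 1.0 / sum(1.0 / n for n in targets_per_paper)


def fdt(n_targets, n_diseases):
    """Fractional disease-target score for a single paper."""
    return 1.0 / (n_targets + n_diseases)


def importance(papers):
    """Sum of per-paper FDT scores (summation is an assumption here).

    papers: list of (n_targets, n_diseases) pairs, one per paper that
    mentions both the target and the disease.
    """
    return sum(fdt(t, d) for t, d in papers)


# Slide example: 2 papers, one with 5 targets in total, one with 10.
print(round(novelty([5, 10]), 2))  # 3.33
```

A target mentioned alone in many papers accumulates a large sum and therefore a low novelty, while a target that only appears in long lists of co-mentioned targets stays novel.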

C Bologa, D. Cannon et al. 5/14/15 revision


KNOWLEDGE

VALIDATION

TIN-X: http://newdrugtargets.org/

Harmonizome

Ma’ayan et al. Trends Pharmacol Sci. 2014 Sep;35(9):450-60. http://amp.pharm.mssm.edu/Harmonizome/

Harmonogram (Tclin, Kinase)

Harmonogram (Tdark, GPCR)

Compute target similarity in “data availability space”

Tdark targets whose most similar target is not Tdark
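One way to picture similarity in "data availability space" is to describe each target by the set of data types that exist for it and compare those sets. The sketch below is a toy illustration under stated assumptions: Jaccard similarity is swapped in as the metric (the slides do not name one), and the target names, data types and levels are invented.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of available data types."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest(name, availability):
    """Most similar other target (first one wins on ties)."""
    others = (t for t in availability if t != name)
    return max(others, key=lambda t: jaccard(availability[name], availability[t]))


def dark_with_non_dark_neighbor(availability, level):
    """Tdark targets whose most similar target is not Tdark."""
    return [
        t for t in availability
        if level[t] == "Tdark" and level[nearest(t, availability)] != "Tdark"
    ]


# Hypothetical targets and data types, for illustration only.
availability = {
    "T1": {"expression", "publications", "antibodies"},
    "T2": {"expression", "publications"},
    "T3": {"phenotype"},
    "T4": {"phenotype", "gwas"},
}
level = {"T1": "Tclin", "T2": "Tdark", "T3": "Tdark", "T4": "Tdark"}

print(dark_with_non_dark_neighbor(availability, level))  # ['T2']
```

Here T2 shares most of its data types with the Tclin target T1, so it is the kind of understudied target that this comparison is meant to surface.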

Different Ways to Use Pharos

Precomputation converts analysis into browsing

Supporting Both Types of Users

• Efficient full text search, coupled to relevant auto-suggestion
– Primary entry point when exploring and for hypothesis generation

• Extensive list of facets
– Supports easy construction of complex filtering rules

• Extensive details for each target
– Linked to external and internal resources
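Facet-based filtering can be illustrated with a small sketch. It assumes the common convention that values selected within one facet are OR-ed and different facets are AND-ed; the slides do not state how Pharos combines them, and the records and field names below are hypothetical.

```python
from collections import Counter


def facet_counts(records, field):
    """Count how many records carry each value of one facet."""
    return Counter(r[field] for r in records)


def apply_filters(records, selections):
    """Keep records matching every facet (OR within a facet, AND across).

    selections: dict mapping a facet name to the set of accepted values.
    """
    return [
        r for r in records
        if all(r.get(field) in values for field, values in selections.items())
    ]


# Hypothetical target records, for illustration only.
records = [
    {"name": "T1", "level": "Tclin", "family": "Kinase"},
    {"name": "T2", "level": "Tdark", "family": "GPCR"},
    {"name": "T3", "level": "Tdark", "family": "Kinase"},
    {"name": "T4", "level": "Tclin", "family": "GPCR"},
]

print(facet_counts(records, "level"))
hits = apply_filters(records, {"level": {"Tdark"}, "family": {"Kinase"}})
print([r["name"] for r in hits])  # ['T3']
```

Each extra facet selection narrows the result set, which is how a handful of clicks can express a fairly complex filtering rule.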

Entity Dossier

• As you explore the knowledge base it’s useful to keep track of data

• Pharos implements a dossier function
– Analogous to e-commerce shopping carts

• Support for task-specific dossiers

• Download a dossier as a ZIP file
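The shopping-cart analogy can be sketched as a small container with a ZIP export. This is a minimal illustration, not the Pharos implementation: the `Dossier` class, its methods and the one-JSON-file-per-entity layout are all assumptions.

```python
import io
import json
import zipfile


class Dossier:
    """A named collection of entity records, like a shopping cart."""

    def __init__(self, name):
        self.name = name
        self.items = {}

    def add(self, entity_id, record):
        self.items[entity_id] = record

    def remove(self, entity_id):
        self.items.pop(entity_id, None)

    def to_zip(self):
        """Return the dossier as ZIP bytes, one JSON file per entity."""
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
            for entity_id, record in self.items.items():
                path = f"{self.name}/{entity_id}.json"
                zf.writestr(path, json.dumps(record, indent=2))
        return buf.getvalue()


# Hypothetical entities, for illustration only.
dossier = Dossier("kinase-project")
dossier.add("T1", {"level": "Tclin", "family": "Kinase"})
dossier.add("T3", {"level": "Tdark", "family": "Kinase"})
archive = dossier.to_zip()
```

Keeping one dossier per task simply means creating several such containers, each with its own name and contents.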

Entity Dossier

Visualizations

• Interactive dashboard
– Use visualizations as filters

• Inline visualizations for summary
– Radar charts, word clouds, heatmaps, …
– Context dependent drill down

• Links to external visualization resources
– MSSM harmonogram
– TIN-X (linkout & reduced version incorporated locally)

Visualization Dashboard

• Different facets visualized appropriately

• Directly filter results from visualization

Summary Visualizations

• Summarize text mined publications using word clouds, but also provide access to the list


• Consensus gene expression across three datasets (GTEx, HPA & HPM)

Original figure from Christian Stolte


• Quickly scan targets that have similar types of data associated with them

Summary Visualizations - Drilldown

Facet Visualization

Pharos Usage

Pharos Indexing

The Long Term Vision

• Provide access to all known data about targets
– Multi-scale, multi-domain: bioactivity to symptoms

• Intelligent summarization
– Use explicit links & computational inference to generate natural language summary using all known data
– Influenced by the query

• The result is a biological dashboard, customized for the user and the query

Feedback

• Explore the UI, try it, break it, and let us know what works and what doesn’t

• Are there data types and relations that would help you but are not available?

http://pharos.nih.gov

[email protected]


Acknowledgements

• Steve Mathias, Oleg Ursu, Jeremy Yang, Jayme Holmes, Christian Bologa, Daniel Cannon, Tudor Oprea

• Stephan Schurer, Lars Juhl Jensen

• Nicholas Fernandez, Andrew Rouillard, Avi Ma’ayan

• Tomita Lab, Mike McManus, Gaia Skibinski

• Ajay Pillai, Aaron Pawlyk, Christine Colvis
