Slicing and dicing expert-curated protein targets in the Guide to PHARMACOLOGY

Post on 29-Jul-2015

TRANSCRIPT

1

www.guidetopharmacology.org

Slicing and dicing expert-curated protein targets by function, properties and ligands in the Guide to PHARMACOLOGY

Christopher Southan

Presented to AstraZeneca Mölndal, June 17th 2015

2

Outline

• Database context and features
• Targets and ligands
• Content statistics
• Slicing and dicing
• Function distributions for targets
• Comparing databases
• High vs low affinity Gene Ontology (GO) splits
• Advanced queries
• Ligands in PubChem
• Acknowledgments and questions

3

NAR Database Issue 2014

4

GtoPdb hallmarks

• Built on the NC-IUPHAR tradition of expertise and quality
• Target family Committee model
• ~90 subcommittees, ~650 expert members, de facto super-curator network
• Wellcome Trust grant objective to cover all human drug targets
• Focus on quantitative ligand > protein relationship capture
• Stringent and selective curatorial judgments from the team
• Extensive use of comments and cross-pointers
• Document binding as compound synonyms
• All recent ligands have a short description
• Flexible and responsive coverage
• Steady expansion, content stats and regular releases
• Unique content including peptides and small molecules
• Collaborative (e.g. PubChem, IMGT, Orphanet, HGNC, UniProt)
• Downloads and lists
• Tables of links in British Journal of Pharmacology reviews
• Coupled to the biennial BJP compendium “The Concise Guide to PHARMACOLOGY”
• Unique utility positioning w.r.t. other databases

5

Front page

6

Blog coupling

http://cdsouthan.blogspot.se/

7

Target example: BACE1

8

Ligands for BACE1

9

Ligand example

10

Kinase matrix screen

11

Content stats (June 2015, Development version)

Steady growth

13

GtoPdb + UniProt + Venny + Panther

• Extensive drug target “slice and dice” options
• Powerful query building
• Our cross-references in UniProt are regularly updated
• Hypothesis-test pharmacologically relevant questions
• Either contact us for advanced slicing from GtoPdb that you cannot get directly (e.g. the affinity cuts), or download our lists or even the whole database for DIY
• You can also slice and dice on the ligand chemistry side, for example via the PubChem Identifier Exchange Service

14

Function splits between 20,198 human proteins and 978 GtoPdb tractable targets (small-molecule quantitative interactions)

15

Comparative target content via UniProt/neXtProt-indexed bioactivity databases

Human Swiss-Prot intersects

16

Comparing drugged vs tractable

• Human targets, with any ligand = 1371
• Human targets, ligand is approved drug = 464
• Human targets, ligand has pAct 6-8 (lower affinity) = 532
• Human targets, ligand has pAct 9-10 (higher affinity) = 451
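As a rough illustration of the affinity cuts above: pAct is the negative log10 of an activity value (Ki, IC50, etc.) expressed in molar units. The sketch below converts nanomolar values and assigns them to the two ranges quoted on the slide. The function names, the inclusive bin edges and the "unbinned" label are assumptions made for this example, not part of GtoPdb.

```python
import math


def p_act(activity_nm: float) -> float:
    """Negative log10 of an activity value given in nanomolar (e.g. Ki, IC50)."""
    if activity_nm <= 0:
        raise ValueError("activity must be positive")
    return -math.log10(activity_nm * 1e-9)


def affinity_bin(p: float) -> str:
    """Assign a pAct value to the slide's two ranges (inclusive edges assumed)."""
    if 9.0 <= p <= 10.0:
        return "higher affinity"
    if 6.0 <= p <= 8.0:
        return "lower affinity"
    return "unbinned"  # hypothetical label for values outside both ranges


# Hypothetical activity values, in nM, chosen to fall inside or outside the ranges.
for nm in (0.3, 100.0, 50_000.0):
    p = p_act(nm)
    print(f"{nm:>8} nM -> pAct {p:.2f} -> {affinity_bin(p)}")
```

Note that values with pAct between 8 and 9 fall in neither range as the slide states them.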

17

Difference between drug targets with high vs low affinity

• Analysis shows clear bias towards transporters for low affinity drugs and receptors for high affinity drugs

18

Slicing and dicing in UniProt via target protein IDs

• Database queries that allow intersects (Boolean AND), unions (Boolean OR) and differentials (Boolean NOT)
• E.g. Which UniProtKB/Swiss-Prot entries cite “Southan” (= 34)
• AND where organism is “Homo sapiens” (= 9)
• AND have a cross-reference in Guide to PHARMACOLOGY (= 3)
• Can query with any extrinsic sets (e.g. from GtoPdb)
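The three Boolean operations map directly onto set operations on lists of protein accessions. The following is a minimal sketch using Python sets; the accession strings are hypothetical placeholders, and in practice each set would be loaded from a UniProt result list or a GtoPdb download.

```python
# Hypothetical accession sets standing in for real query results.
cites_author = {"ACC01", "ACC02", "ACC03", "ACC04", "ACC05"}
human = {"ACC02", "ACC03", "ACC04", "ACC09"}
gtopdb_xref = {"ACC03", "ACC04", "ACC07"}

intersect = cites_author & human & gtopdb_xref  # Boolean AND
union = cites_author | gtopdb_xref              # Boolean OR
differential = cites_author - human             # Boolean NOT

print("intersect:   ", sorted(intersect))
print("union:       ", sorted(union))
print("differential:", sorted(differential))
```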

19

Properties of tractable targets (via UniProt)

• 978 GtoPdb proteins with small-molecule binding
• 85% pathway membership
• 53% transmembrane domains
• 30% kinases
• 18% have a signal peptide
• 7% are enzymes for epigenetic chromatin modification
• 7% proteases

20

Advanced pharmacology queries

• Which targets have endogenous peptide interactions?
• AND exogenous synthetic peptide interactions?
• AND synthetic small molecule interactions?
• AND are the targets of approved drugs?
• Answer = 23 (all intersects and diffs can be inspected)
• Which of these have 3D structures? = 9

UniProt names & structures

Which pathways are they in? (Panther)
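A chained query of this kind can be thought of as a funnel in which every step keeps its own intersect and its own set of drop-outs, so that each stage stays inspectable. The sketch below is one way to do that under stated assumptions: the `funnel` helper and the target labels (`T1`, `T2`, ...) are hypothetical, and it does not reproduce how GtoPdb runs these queries internally.

```python
from typing import Dict, List, Set, Tuple


def funnel(criteria: List[Tuple[str, Set[str]]]) -> List[Dict[str, object]]:
    """Intersect the sets in order, recording survivors and drop-outs per step."""
    first_name, first_members = criteria[0]
    current = set(first_members)
    steps = [{"criterion": first_name, "kept": set(current), "dropped": set()}]
    for name, members in criteria[1:]:
        kept = current & members
        steps.append({"criterion": name, "kept": kept, "dropped": current - members})
        current = kept
    return steps


# Hypothetical target sets, one per question on the slide.
steps = funnel([
    ("endogenous peptide interactions", {"T1", "T2", "T3", "T4", "T5"}),
    ("synthetic peptide interactions", {"T1", "T2", "T3", "T6"}),
    ("synthetic small molecule interactions", {"T1", "T2", "T4", "T7"}),
    ("approved drug targets", {"T2", "T5", "T8"}),
])

for step in steps:
    print(step["criterion"], "kept:", sorted(step["kept"]),
          "dropped:", sorted(step["dropped"]))
```

Keeping the dropped sets alongside the survivors makes it easy to see which criterion removed a given target.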

21

Ligands in PubChem (I)

22

Ligands in PubChem (II)

1. Clin = PubMed clinical trial phase 1, last 5 years, in PubChem

2. MeSH ‘pharmacology’ in GtoPdb

3. PDB structure in GtoPdb

23

UniChem & ChEMBL indexing

UniChem 14-JUN-2015 = 90,269,494

24

Specialised data sets

25

Chemistry Connect: Ligands > GtoPdb

26

• Contact established with OP team
• Will be integrated in future releases
• First stage will be a GtoP <> ChEMBL mapping list
• Next stage should be RDF’isation of GtoPdb
• We are developing web services

27

Going forward

• Expansion < 1500 protein-ligand quant mappings by 3Q2015
• We welcome AZ steer on content, features and handling/integrating our downloads
• Wellcome Trust immunopharmacology grant submitted
• Open to non-published (but provenanced) data capture
• Challenges
• Enhancing ontology incorporation, data retrieval and mining
• Resolving structural multiplexing of approved drug structures
• Complex targets
• Optimising the out-link balance for users
• Triaging pharmacological relevance from the genomics data deluge
• Hybrid molecule handling including biologics
• Mapping historical peptide content into PubChem

28

Acknowledgements and questions

• GtoPdb database team:

• Jamie Davies (Principal Investigator)

• Adam Pawson, Helen Benson, Elena Faccenda (Curators)

• Joanna Sharman (Database Developer)

• Veronika Divincova (Project Administrator)

• Past and present members of NC-IUPHAR and its subcommittees, especially Michael Spedding, Steve Alexander, Anthony Davenport and John Peters

• The late Professor Tony Harmar, founder and original PI of GtoPdb
