Correct drug structures for pharmacology

How can pharmacologists know which drug structures are correct?

Christopher Southan, Elena Faccenda, Simon J. Harding, Joanna L. Sharman, Adam J. Pawson, and Jamie A Davies

IUPHAR/BPS Guide to Pharmacology (GtoPdb) University of Edinburgh, Centre for Integrated Physiology, EH8 9XD, UK.

Presentation for BPS | Pharmacology 2016, London. Scheduled for Wed, Dec 14, 2:15 PM

http://www.slideshare.net/cdsouthan/correct-drug-structures-for-pharmacology

Upload: chris-southan

Post on 27-Jan-2017

TRANSCRIPT

Page 1: Correct drug structures for pharmacology

How can pharmacologists know which drug structures are correct?

Christopher Southan, Elena Faccenda, Simon J. Harding, Joanna L. Sharman, Adam J. Pawson, and Jamie A Davies

IUPHAR/BPS Guide to Pharmacology (GtoPdb) University of Edinburgh, Centre for Integrated Physiology, EH8 9XD, UK.

Presentation for BPS | Pharmacology 2016, London. Scheduled for Wed, Dec 14, 2:15 PM

http://www.slideshare.net/cdsouthan/correct-drug-structures-for-pharmacology

Page 2: Correct drug structures for pharmacology

Abstract (will not be shown, should be online at BPS)

Introduction: Human medicines represent the crown jewels of pharmacology. Paradoxically, however, there is neither any “Gold Standard” set of approved chemical structures, nor agreement on totals. A 2009 comparison of three sets of approved drugs recorded only 807 exact structures-in-common from the expected ~1200 [1]. The IUPHAR/BPS Guide to Pharmacology (GtoPdb) team have grappled with this discordance issue when curating approved drugs and all ~6000 small-molecule ligands we deposit into PubChem [2]. Users face the same challenge of deciding on correct structures when procuring compounds for experiments or navigating links between journals and databases. This work examines the problems and partial solutions.

Methods: We used PubChem to explore relationships for selected drugs already curated into GtoPdb. Tools included the “same connectivity” operator that records distinct compound record (CID) representations of the same carbon backbone. We divided structural multiplexing causes between stereo differences, mixtures and isotopic derivatives. We then performed Venn-type comparisons between DrugBank, ChEMBL, and the Therapeutic Target Database. Additional metrics were generated to dissect contributing factors to discordance between these three and other sources.

Results: Atorvastatin has 51 different single representations in PubChem and 248 mixtures; paclitaxel (taxol) has 142 and 330, respectively. Comparing the three manually curated drug sets mentioned above inside PubChem showed that the consensus was only 25% of the sum. Results comparing other drug sources also showed discordance. Causes for CID multiplexing discordance will be presented. Using PubChem tools we assessed a curation strategy of selecting CIDs with structures supported by the majority of submitting sources. While not infallible, comparison with INN documentation indicated its effectiveness. We will also show how tagging our own approved drug records facilitates easy retrieval of just these entries from PubChem, but that vendor drug names sometimes mapped to different structures.

Conclusion: As PubChem pushes towards 100 million, we have examined problems of choosing correct structures of pharmacologically active compounds. The constitutive challenges of chemical representation and the high levels of discordance we recorded indicate that definitive drug lists (even our own) will remain elusive until pharmaceutical companies submit their own records directly to open databases. In the meantime, we have optimised our GtoPdb curation for the submission of our own 1088 approved CID entries as both a partial solution and a trusted reference set for the pharmacology community.

References:
[1] Southan et al. (2009) J Cheminform. 1:1-10.
[2] Southan et al. (2016) Nucl. Acids Res. 44 (Database Issue): D1054-68.
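The majority-support curation strategy mentioned in the Results can be illustrated with a small sketch. This is not the GtoPdb team's actual procedure or tooling: the function name, the source names and the CID numbers below are all hypothetical, and the real assessment was done with PubChem tools followed by manual cross-checks.

```python
from collections import Counter


def best_supported_cid(source_to_cid):
    """Return (cid, votes, total) for the CID named by the most sources.

    source_to_cid maps a submitting source name to the CID that the
    source links to the drug name. A tie at the top returns cid=None,
    so that a curator has to decide instead of the code.
    """
    votes = Counter(source_to_cid.values())
    ranked = votes.most_common()
    if not ranked:
        return None, 0, 0
    top_cid, top_votes = ranked[0]
    # Two CIDs with equal top support: no single best-supported structure.
    if len(ranked) > 1 and ranked[1][1] == top_votes:
        return None, top_votes, len(source_to_cid)
    return top_cid, top_votes, len(source_to_cid)


# Hypothetical submissions for one drug name (made-up sources and CIDs).
submissions = {
    "source_A": 111,
    "source_B": 111,
    "source_C": 222,
    "source_D": 111,
    "source_E": 333,
}

cid, n_votes, n_sources = best_supported_cid(submissions)
print(f"CID {cid} is supported by {n_votes} of {n_sources} sources")
```

As the abstract notes, majority support is not infallible, which is why the sketch reports the vote count alongside the chosen CID rather than the CID alone.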

Page 3: Correct drug structures for pharmacology

Outline

• Introduction to GtoPdb
• Context of the study
• Database chemistry and approved drug counts
• Intersecting curated drugs in PubChem
• Fuzzy drug structure relationships
• GtoPdb approved drugs
• GtoPdb structures in PubChem
• Conclusions
• References

Page 4: Correct drug structures for pharmacology

Introduction to IUPHAR/BPS Guide to Pharmacology (GtoPdb)

• IUPHAR = International Union of Basic and Clinical Pharmacology, BPS = British Pharmacological Society

• Formerly known as IUPHAR-DB for receptors and channels, since 2009

• Since 2012 funded by Wellcome Trust to cover all targets in the human genome

• Curated molecular mechanism of action (mmoa) as quantitative activity mapping to primary targets, including IUPHAR nomenclature

• 1429 human proteins, 14701 interactions, 8674 ligands

• Described in four Nucleic Acids Research Annual Database issues, PMIDs 26464438 (2016), 24234439 (2014), 23087376 (2013) and 21087994 (2011)

• Distilled into bi-annual British Journal of Pharmacology “Concise Guide to PHARMACOLOGY” as a nine-paper series

• Presents users with the best compounds for pharmacology research in silico, in vitro, in cellulo, in vivo, or in clinico

http://www.guidetopharmacology.org/

Page 5: Correct drug structures for pharmacology

Context of presentation

• In the last few years the GtoPdb team has been finding structure space around lead compounds, probes and drugs increasingly “fuzzy”

• Curatorial choices are consequently becoming more difficult

• We needed a molecular perspective on the causes of this “fuzz”

• We have increased our exploration of PubChem chemical structural neighbourhoods to gain this perspective

• This presentation distils key points

Page 6: Correct drug structures for pharmacology

There’s a lot of chemistry out there

Source                        Count
UniChem EBI                   138 million
CAS/SciFinder                 124 million
PubChem                       93 million
PubChem (vendors)             64 million
ChemSpider                    58 million
SureChEMBL (patents)          17 million
ChEMBL                        1.6 million
PubChemBioAssay (active)      1.0 million
MeSH Pharmacological action   14,879
PubChem INN or USAN           10,858
Preclinical                   6,861*
Phase I                       1,856*
Phase II                      2,261*
Phase III                     954*
Guide to PHARMACOLOGY         6,565

November 2016 counts

* The Citeline© 2015 drug counts include an average of ~25% biologicals

Page 7: Correct drug structures for pharmacology

Approved drug structure counts: take your pick

Source                           Year  Total  Reference      Notes
GVKBIO Drug Database             2013  4750   Slideshare     Global approved
NCATS Pharmaceutical Collection  2011  2356   PMID 21525397  FDA, from global 3936
Therapeutic Target Database      2015  2071   PMID 26578601  Small-molecule FDA
DrugCentral                      2016  2021   PMID 27789690  FDA, from 4456 APIs
DrugBank 5.0                     2016  2004   PMID 24203711  App. small-molecule, from 2225
ChEMBL 22                        2016  1855   PMID 24214965  SMILES from 2260 Phase 4
Drug3D db                        2015  1790   PMID 22539672  Small-molecule FDA
Cfam Chemical Families db        2015  1691   PMID 25414339  Approved
Map of molecular drug targets    2016  1578   PMID 27910877  FDA approved
FDA approved NME overview        2013  1543   PMID 24680947  Small-molecule FDA, no strucs.
Network analysis of FDA drugs    2007  1471   PMID 17516560  26th Orange Book, no strucs.
SWEETLEAD db                     2013  1427   PMID 24223973  FDA, from global 2836
FDA recommended dose db          2004  1309   PMID 15546675  Small-molecule FDA
Guide to PHARMACOLOGY 2016.4     2016  1291   PMID 26464438  Approved, selective curation

Page 8: Correct drug structures for pharmacology

Discordance of curated drug sets within PubChem

http://www.slideshare.net/cdsouthan/will-the-correct-drugs-please-stand-up-68239021

• Good news: 1361 structures with at least 3-way agreement

• Bad news: no “Gold Standard” set (but the 459 with 4-way agreement would do)

• Details below

NPC = National Center for Advancing Translational Sciences (NCATS) Pharmaceutical Collection
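The n-way agreement counts above come from intersecting curated drug sets inside PubChem. A minimal sketch of that kind of tally, assuming each source has already been reduced to a set of CIDs, is shown here. The four source labels and the integer CIDs are toy values, not the data behind the slide.

```python
from collections import Counter


def agreement_profile(sources):
    """Map k -> number of structures found in exactly k of the sources."""
    support = Counter()
    for cids in sources.values():
        # set() guards against a source listing the same CID twice.
        support.update(set(cids))
    return Counter(support.values())


def at_least(profile, k):
    """Number of structures supported by k or more sources."""
    return sum(count for ways, count in profile.items() if ways >= k)


# Toy CID sets for four hypothetical curated sources.
sources = {
    "source_A": {1, 2, 3, 4, 5},
    "source_B": {1, 2, 3, 6},
    "source_C": {1, 2, 4, 7},
    "source_D": {1, 3, 4, 8},
}

profile = agreement_profile(sources)
union_size = sum(profile.values())
print("4-way agreement:", at_least(profile, 4))
print("at least 3-way agreement:", at_least(profile, 3))
print("union of all sources:", union_size)
```

Reporting the union alongside the agreement counts makes the discordance visible: the consensus can be a small fraction of everything the sources list between them.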

Page 9: Correct drug structures for pharmacology

Exploring “fuzz” via PubChem: Which of 51 atorvastatins is correct?

• Powerful structural relationship navigation

• Needs cheminformatics expertise
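For readers who prefer a programmatic route, the sketch below builds a PubChem PUG REST request that asks for all CIDs sharing connectivity with a given CID. The slide itself appears to show the PubChem web interface, so using the PUG REST `fastidentity` search with `identity_type=same_connectivity` as the equivalent is an assumption, as is CID 60823 being the atorvastatin record; check both against the PubChem documentation before relying on them. The code only constructs the URL and makes no network call.

```python
from urllib.parse import urlencode

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def same_connectivity_url(cid):
    """Build a PUG REST URL listing CIDs with the same connectivity as cid.

    The returned records differ from the query only in stereochemistry
    or isotopic labelling, which is the "fuzz" discussed on this slide.
    """
    query = urlencode({"identity_type": "same_connectivity"})
    return f"{PUG_REST}/compound/fastidentity/cid/{int(cid)}/cids/TXT?{query}"


# 60823 is assumed here to be the PubChem CID for atorvastatin.
url = same_connectivity_url(60823)
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) should return one CID per line. The count will drift from the 51 quoted on the slide as PubChem grows.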

Page 10: Correct drug structures for pharmacology

Which of 145 taxols is correct?

145 distinct structures in PubChem

12 have BioAssay results

34 have vendors

Page 11: Correct drug structures for pharmacology

GtoPdb approved drug curation

• Our approach is stringent and parsimonious (i.e. not a pharmacopeia)

• Usually select the best-supported PubChem CID

• We “fuzz” check for chirality, strip salts and cross-check INN PDFs

• Focus on human diseases

• No inorganics (except Li), nutraceuticals or metabolites

• Mainly FDA and EMA

• Withdrawn or discontinued are flagged

• Cross-pointers to approved salt forms, active metabolites, drug > prodrug

• Every entry has a curator’s note

• Grateful for feedback and corrections
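Two of the checks listed above, salt stripping and a chirality check, can be approximated directly on SMILES strings. The sketch below is a deliberately crude, string-level stand-in and not the GtoPdb curation workflow: it assumes the parent is the largest dot-separated component and uses letter count as a rough proxy for size, so a cheminformatics toolkit should be used for real curation. Both function names are hypothetical.

```python
def strip_salt(smiles):
    """Return the largest dot-separated component of a SMILES string.

    Counter-ions are written as extra components joined by ".", so the
    longest component is usually the parent. Size is approximated by
    counting letters, which is only a rough proxy for atom count.
    The charge on the parent is left as written.
    """
    fragments = smiles.split(".")
    return max(fragments, key=lambda frag: sum(ch.isalpha() for ch in frag))


def has_stereo(smiles):
    """True if the SMILES carries any stereo marks.

    "@" marks tetrahedral centres; "/" and "\\" mark double-bond geometry.
    A record without any of these is a candidate for a chirality check.
    """
    return any(mark in smiles for mark in ("@", "/", "\\"))


# Sodium salt of aspirin: the parent is kept and [Na+] is dropped.
print(strip_salt("CC(=O)Oc1ccccc1C(=O)[O-].[Na+]"))

# L-alanine with and without its stereocentre specified.
print(has_stereo("C[C@H](N)C(=O)O"), has_stereo("CC(N)C(=O)O"))
```

A flat (stereo-free) SMILES for a drug that is marketed as a single enantiomer is exactly the kind of record that would then be cross-checked against the INN documentation.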

Page 12: Correct drug structures for pharmacology

GtoPdb drugs

• The PubChem query (approved[comment] AND "IUPHAR/BPS Guide to PHARMACOLOGY"[SourceName]) retrieves just our 1291 substances (SIDs)

• These convert to 1174 distinct compound entries (CIDs)

• 96% vendor matches in PubChem

• The 117 SID difference is mainly antibodies

Approved set now a clean PubChem select
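The query string on this slide can also be submitted through NCBI's Entrez E-utilities. The sketch below only builds the `esearch` URL for the PubChem Substance database (`db=pcsubstance`) and does not call the network. The query text is taken verbatim from the slide, while the choice of E-utilities, the function name and the `retmax` default are assumptions for illustration.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query text as given on the slide.
GTOPDB_APPROVED = (
    'approved[comment] AND "IUPHAR/BPS Guide to PHARMACOLOGY"[SourceName]'
)


def substance_search_url(term, retmax=0):
    """Build an E-utilities esearch URL against PubChem Substance.

    retmax=0 asks for the hit count only; raise it to get SIDs back.
    """
    params = {
        "db": "pcsubstance",
        "term": term,
        "retmode": "json",
        "retmax": retmax,
    }
    return f"{ESEARCH}?{urlencode(params)}"


url = substance_search_url(GTOPDB_APPROVED)
print(url)

# Bookkeeping from the slide: substances minus distinct compounds.
print(1291 - 1174)
```

The hit count returned today will differ from the 1291 reported here, since GtoPdb has continued to release updates since 2016.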

Page 13: Correct drug structures for pharmacology

GtoPdb curated small-molecules: overlaps in PubChem

Page 14: Correct drug structures for pharmacology

Conclusions

• Chemistry database coverage and annotation depth have expanded

• But so has the “fuzz”

• Ligand choices for pharmacology experiments can be challenging

• Controlling these factors is crucial for experimental reproducibility

• GtoPdb is a good “first-stop-shop” choice

• “Gold Standard” is illusory but we do our best to select the correct structures

• Feedback welcome on coverage gaps or structural equivocality

• We can assist with complex choices

• Explore PubChem as “second-stop-shop”

• Get acquainted with medicinal chemists and/or cheminformaticians

Page 15: Correct drug structures for pharmacology

Thank you; questions welcome

Find out more at the BPS stand

PMID: 26464438, PMCID: PMC4702778