Antimalarial drug discovery data disclosure

www.guidetopharmacology.org Open and Closed Antimalarial Drug Discovery: Comparing Data Connectivity Gaps and Disclosure Speed. Dr Christopher Southan, Senior Database Curator, IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb), University of Edinburgh. BioIT Boston 2016, Wed 6th April, Track 11, Open Source Innovations, 16:30. http://www.slideshare.net/cdsouthan/antimalarial-drug-dscovery-data-disclosure

Upload: chris-southan

Post on 27-Jan-2017

TRANSCRIPT

Page 1: Antimalarial drug discovery data disclosure

www.guidetopharmacology.org

Open and Closed Antimalarial Drug Discovery: Comparing Data Connectivity Gaps and Disclosure Speed

Dr Christopher Southan, Senior Database Curator, IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb), University of Edinburgh

BioIT Boston 2016, Wed 6th April, Track 11, Open Source Innovations, 16:30

http://www.slideshare.net/cdsouthan/antimalarial-drug-dscovery-data-disclosure

Page 2: Antimalarial drug discovery data disclosure

Abstract (will be skipped for presentation)

Antimalarial research is the poster child for Open Source Drug Discovery (OSDD). However, many lead compounds still have their origins in Traditional Closed Drug Discovery (TCDD), and uncertainty remains as to the differences. To provide an assessment, this work examined 32 recent antimalarial structures in terms of their PubChem connectivity. Of these, 21 had patent matches, only 23 linked to publications and only 21 had BioAssay records. Major data connectivity problems included 1) leads not findable by code name, 2) patents not cited in publications, 3) leads not reciprocally linked to Plasmodium protein targets and pathways, and 4) name-to-structure mappings only being declared years after patent disclosure. These issues will be contrasted with the Sydney University Open Source Malaria approach, where open lab books are used to surface structures (e.g. as Google-findable InChIKeys) and crowdsourced collaboration data close to real time, thereby shaving years off the discovery phase.

Page 3: Antimalarial drug discovery data disclosure

Outline

• Introduction to Open Source Drug Discovery (OSDD)
• Differences to Traditional Closed Drug Discovery (TCDD)
• Extracting antimalarial leads from the literature
• Profiling structures in PubChem
• A look into the MMV Pathogen Box
• Introducing Open Source Malaria (OSM)
• Profiling the OSM structure collection
• Speed sharing
• Google searching InChIKeys
• Conclusions
• Open structure sets
• References and questions please

Page 4: Antimalarial drug discovery data disclosure

Introduction

• The OSDD concept is not tied to any particular group
• While antimalarials have become a poster child for OSDD, many leads still come through the TCDD route, so the boundaries between the two are blurred
• OSDD has become a test bed (e.g. open data sets from GSK and others, the Medicines for Malaria Venture (MMV) “Malaria Box” and WIPO Re:Search IP sharing)
• Sydney Open Source Malaria project (@O_S_M) adheres to OSDD principles (see PMID 23985301)
• I have donated voluntary support to the OSM team since 2012 (i.e. in addition to my Guide to PHARMACOLOGY Senior Database Curator job)
• This has focused on structure searching and data surfacing
• I blog on data connectivity in general, and for antimalarials in particular
• The surfacing speed for structures reflects “shades of openness” that will be discussed

Page 5: Antimalarial drug discovery data disclosure

Open vs closed research routes to new medicines

TCDD
• Proprietary data
• Patent filings
• Leads may be blinded by code numbers
• Papers after patents
• No direct submissions to public databases
• Predominantly commercial software and databases
• Typically ~10 years R&D
• Still the dominant model

OSDD
• Open ELNs
• No patent filings
• Data surfaced rapidly for sharing
• Open access papers
• Submissions to public databases
• Anyone can contribute
• Crowdsourcing
• Preference for open source software and public databases
• Potential to shorten research
• Pure OSDD relatively rare

Page 6: Antimalarial drug discovery data disclosure

Recent review of leads, but:

• Link-free zone (except for references)
• PDF “tomb” with images for structures
• No chemical specifications
• No database identifiers
• No target protein identifiers
• DDD107498 was blinded at that time (no structure)
• I mapped to PubChem CIDs as a community service

Page 7: Antimalarial drug discovery data disclosure

Consequently, much effort was needed to get from this to this

Page 8: Antimalarial drug discovery data disclosure

Getting name-to-structure out of primary papers: not trivial

• On a good day, MeSH curators will index the lead structures specified in PubMed and connect them to PubChem
• On a bad day (as in this case), they may record the name but without a link to a chemical structure
• The code name is still PubChem-negative after a year

Page 9: Antimalarial drug discovery data disclosure

Curatorial ferreting: DDD107498 structure and patent

IUPAC name from supplementary data > chemicalize.org > PubChem > SureChEMBL > SAR table
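The PubChem step in this chain can also be scripted. The following is a minimal sketch, assuming the public PubChem PUG REST service is used for the name lookup; the helper name `name_to_cids_url` is hypothetical, and the slide itself does not say how the lookup was done. It only builds the request URL, so it runs offline.

```python
from urllib.parse import quote

# Base address of the PubChem PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def name_to_cids_url(name: str) -> str:
    """Build a PUG REST URL asking PubChem for the CIDs that match a name.

    Hypothetical helper for illustration; the name is percent-encoded so that
    spaces or slashes in a code name cannot break the URL path.
    """
    return f"{PUG_REST}/compound/name/{quote(name, safe='')}/cids/JSON"


# The code name from the slide; whether PubChem resolves it depends on
# whether a depositor has supplied it as a synonym.
url = name_to_cids_url("DDD107498")
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) is deliberately left out of the sketch.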

Page 10: Antimalarial drug discovery data disclosure

PubChem profile for 32 antimalarial lead structures

22 CIDs collated as Pathogen Box proposals plus 16 structures from the PMID 26000721 review (six in common, see blog posts below)

http://cdsouthan.blogspot.se/2014/06/getting-into-box-with-some-recent.html

http://cdsouthan.blogspot.se/2015/05/entity-resolution-for-antimalarial.html
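The overlap arithmetic behind the total of 32 (22 + 16 - 6) can be sketched as a set union. The identifiers here are placeholder integers, not real PubChem CIDs; only the set sizes come from the slide.

```python
# Placeholder integers stand in for PubChem CIDs (hypothetical values).
pathogen_box_proposals = set(range(1, 23))  # 22 structures: 1..22
review_candidates = set(range(17, 33))      # 16 structures: 17..32

# Structures present in both sources (17..22 in this toy numbering).
common = pathogen_box_proposals & review_candidates

# The union counts each shared structure once.
combined = pathogen_box_proposals | review_candidates

print(len(pathogen_box_proposals), len(review_candidates), len(common), len(combined))
```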

Page 11: Antimalarial drug discovery data disclosure

Profile for 114 antimalarial actives from the Pathogen Box

http://cdsouthan.blogspot.se/2016/03/a-peek-into-mmv-pathogen-box.html

Page 12: Antimalarial drug discovery data disclosure

With OSM, finding stuff is easier

Page 13: Antimalarial drug discovery data disclosure

The entire portfolio is open, including new designs

https://docs.google.com/spreadsheets/d/1Rvy6OiM291d1GN_cyT6eSw_C3lSuJ1jaR7AJa8hgGsc/edit#gid=510297618

http://www.cheminfo.org/flavor/malaria/Display_data.html

411 molecular records in March 2016 OSM master sheet (Mat Todd et al.) and custom ELN (Luc Patiny et al.)

Page 14: Antimalarial drug discovery data disclosure

Rapid triage of the OSM portfolio in PubChem

250 identity matches from 410 InChIs uploaded

Page 15: Antimalarial drug discovery data disclosure

PubChem profile for 250 OSM matches

• Note 160 from 410 had no exact matches (e.g. includes design proposals)
• Patents include matches for reference compounds (i.e. not antimalarial claims)
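The triage itself amounts to partitioning the uploaded identifiers into those with an exact match and those without. A minimal sketch follows, assuming the set of identifiers already present in PubChem is available locally; the `triage` function and the `KEY…` strings are hypothetical stand-ins, and only the counts (410 uploaded, 250 matched) come from the slides.

```python
def triage(uploaded, known):
    """Split uploaded identifiers into exact matches and non-matches.

    `uploaded` is an ordered list of identifier strings; `known` is a set of
    identifiers already present in the reference database.
    """
    matched = [key for key in uploaded if key in known]
    unmatched = [key for key in uploaded if key not in known]
    return matched, unmatched


# Toy data: 410 placeholder identifiers, the first 250 of which are "known".
uploaded = [f"KEY{i:03d}" for i in range(410)]
known = set(uploaded[:250])

matched, unmatched = triage(uploaded, known)
print(len(matched), len(unmatched))
```

Keeping the unmatched list is useful in its own right, since on this slide it includes the design proposals that have not yet been made or deposited.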

Page 16: Antimalarial drug discovery data disclosure

Speed sharing: OSM > Twitter (bot) > chemicalize.org

Page 17: Antimalarial drug discovery data disclosure

Googling the InChIKey for global findability

• Direct from Open Lab Books
• Or from a chemicalize conversion
• Search in ~0.3 sec
• Works with inner layer
• Can cross-check between PubChem and the ELN
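A search of this kind can be sketched as follows. Two assumptions are made: "inner layer" is taken to mean the first 14-character block of the InChIKey (the part that encodes connectivity), and the example key is aspirin's, used only because it is well known and not an OSM compound. The helper names are hypothetical.

```python
import re
from urllib.parse import quote_plus

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def first_block(inchikey: str) -> str:
    """Return the first 14-character block of a well-formed InChIKey."""
    if not INCHIKEY_RE.match(inchikey):
        raise ValueError(f"not a well-formed InChIKey: {inchikey!r}")
    return inchikey.split("-")[0]


def google_url(term: str) -> str:
    """Build a Google search URL for an exact-phrase (quoted) match."""
    return "https://www.google.com/search?q=" + quote_plus(f'"{term}"')


key = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"  # aspirin, illustrative only
full_search = google_url(key)
block_search = google_url(first_block(key))
print(full_search)
print(block_search)
```

Searching on the first block alone is the looser query: it should also retrieve records that share the same skeleton but differ in the later blocks.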

Page 18: Antimalarial drug discovery data disclosure

Getting structures into PubChem is not difficult

• As TW2Informatics I deposited MMV670437 in 2013 as a test case
• The bioactivity data was later submitted by OSM > ChEMBL > PubChem (but did not include the code name)
• Both SIDs were merged into CID 71819647, thereby linking name > structure > activity

Page 20: Antimalarial drug discovery data disclosure

Conclusions

• Encouragingly, published output of antimalarial leads is increasing
• However, challenges of curating and mapping are similar to those encountered by the GtoPdb team for human targets and ligands
• There is a grey zone between TCDD and OSDD and some leads are patented
• Authors and stakeholders should ensure their SAR is surfaced and name-to-structure connected in databases (i.e. FAIR principles, see PMID 26978244)
• Gaps persist in mappings between leads, targets and pathways
• The practice of OSDD by OSM and collaborators accelerates research
• PubChem MyNCBI collections are useful for sharing structure sets

Page 21: Antimalarial drug discovery data disclosure

PubChem MyNCBI open structure sets

• 16 clinical candidates from PMID 26000721

• 22 leads from various sources

• 114 from the Pathogen Box

• 250 from the OSM PubChem matches

n.b. Those engaged in antimalarial research can contact me if they need technical details and/or possible generation of new lists (e.g. CID subsets or patent extractions)

http://www.ncbi.nlm.nih.gov/sites/myncbi/christopher.southan.1/collections/48460617/public/

http://www.ncbi.nlm.nih.gov/sites/myncbi/christopher.southan.1/collections/49901772/public

http://www.ncbi.nlm.nih.gov/sites/myncbi/christopher.southan.1/collections/49700347/public/

http://www.ncbi.nlm.nih.gov/sites/myncbi/christopher.southan.1/collections/48358242/public/

Page 22: Antimalarial drug discovery data disclosure

References: questions welcome

http://www.ncbi.nlm.nih.gov/pubmed/23618056

http://www.ncbi.nlm.nih.gov/pubmed/23399051

http://cdsouthan.blogspot.com/ various blog posts on antimalarial ferreting

https://github.com/OpenSourceMalaria OSM on GitHub

http://www.ncbi.nlm.nih.gov/pubmed/23985301

http://www.guidetopharmacology.org/faq.jsp GtoPdb FAQ on curating human bioactives