Comparison of Compounds-to-targets between Databases
Bio-IT 2011
[1]
Comparison of Compound-to-Target Relationships in Chemogenomic and Drug Databases
Chris Southan
ChrisDS Consulting, Göteborg, Sweden
Presented to the NCBI PubChem team on 11 April, at the BioIT World Chemogenomics and Toxicogenomics Workshop on 12 April in Boston, USA, and, as a shorter version, at the ChEMBL users meeting at the EBI on 27 May 2011
April 2012 update: FYI, these two blog posts are on the same theme
http://cdsouthan.blogspot.se/2012/01/our-human-beta-lactamase-is-not_09.html
http://cdsouthan.blogspot.se/2011/08/compound-to-target-mappings-part-i.html
[2]
Acknowledgments and Context
• I profoundly appreciate the efforts of those who develop, manage and maintain the public resources specified here and the many others I enjoy accessing
• I have some history in evaluating the utility, exploitation and content quality of both bioinformatics and cheminformatics databases. I thus enjoy the dual roles (roughly in equal parts) of both fan and critic
• All databases have imperfections. This presentation investigates a selection of these, but critical analysis should not be misinterpreted as disparaging either the quality of primary sources or the work of curators and database teams
[3]
Outline
• Mapping concepts, sources and challenges
• Extremes of the distribution
• Atorvastatin, drug-to-targets
• HMG-CoA reductase, target-to-drugs
• Equivocal mapping examples
• Exploring data intersects
• Complex targets
• Conclusions and outlook
[4]
Activity-to-compound-to-protein Mapping: Capturing Relationships Between Four Concepts
MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWRCLRCLRQQHDDFADDISLLK
Document → Assay → Result → Compound → Protein
Unstructured data (papers & patents) → expert extraction and curation → structured data (databases)
[5]
The D-A-R-C-P Axis
Pathway/module/system
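One way to make the D-A-R-C-P axis concrete is as a record type. The Python sketch below is only an illustration: the class name, field names and example values (including the placeholder document ID and result value) are assumptions, not taken from any of the databases discussed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DarcpRecord:
    """One Document-Assay-Result-Compound-Protein link (field names are hypothetical)."""
    document_id: str                  # e.g. a PubMed ID or patent number
    assay_description: str            # free-text assay summary, "" if not captured
    result_type: str                  # e.g. "IC50", "" if not captured
    result_value_nm: Optional[float]  # None when the source reports no number
    compound_id: str                  # e.g. a PubChem CID
    protein_id: str                   # e.g. a Swiss-Prot entry name

    def axis(self) -> str:
        """Return the letters of the D-A-R-C-P chain that this record populates."""
        has_result = bool(self.result_type) and self.result_value_nm is not None
        parts = [
            ("D", bool(self.document_id)),
            ("A", bool(self.assay_description)),
            ("R", has_result),
            ("C", bool(self.compound_id)),
            ("P", bool(self.protein_id)),
        ]
        return "-".join(letter for letter, present in parts if present)


# Placeholder values for illustration only.
full = DarcpRecord("PMID:0000000", "enzyme inhibition", "IC50", 8.0, "CID:60823", "HMDH_HUMAN")
drug_only = DarcpRecord("PMID:0000000", "", "", None, "CID:60823", "HMDH_HUMAN")
print(full.axis())       # D-A-R-C-P
print(drug_only.axis())  # D-C-P
```

A source that records only a drug, its target and a supporting reference collapses to D-C-P, while an activity database fills the whole chain.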
[6]
Compound and drug-to-target Collations
Targets = 5,662 protein targets, compounds = 284,206, data points = 648,915
Targets = 8,091, small molecules = 658,075, data points = 3,030,317
BioAssays extracted from literature (ChEMBL) = 499,520, direct screening assays = 3,208, active compounds = 23,677, targets = 447
Approved compounds = 1,431, targets = 1,458
Experimental compounds = 5,212, research targets = 3,206
Targets = 358 successful, 251 clinical trial and 1,254 research
Drugs = 1,511 approved, 1,118 clinical trial and 2,331 experimental
D-C-P-S
D-A-R-C-P
D-A-R-C-P
(D)-A-R-C-P-S
D-C-P
[7]
PDB Drug-to-Protein Mappings in DrugPort
[8]
Target Mapping: Curatorial Challenges
• Target = (inferred) direct binding
• Primary (bona fide) target = therapeutic causality
• Polytargets = multiple
• Para-target = sub-family specificity
• Ortho-target = cross-species specificity
• Cross-screen = non-homologous
• Non-target (e.g. trypsin, albumin)
• Off-target = liability (ADR or side effect)
• Anti-target = known liability (e.g. HERG)
• Indirect target = non-binding (e.g. APP)
• Complex = resolvable to sequence IDs (e.g. proteasome)
• Complex = experimentally unresolved (e.g. PDE5s)
• Ambiguous = lack of metadata or curatorial judgment (e.g. BACE)
• Non-canonical = where metadata specifies mutation, splice or PTM
• Metabo-target = metabolic interactions
• Transport-target = transporters
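The categories above could be carried as an explicit label on each compound-to-protein link instead of a bare "target" flag. The following Python sketch assumes a hypothetical `TargetRelation` enumeration whose values paraphrase the slide; the two tagged links are illustrative assignments, not curated data.

```python
from enum import Enum


class TargetRelation(Enum):
    """Curatorial categories for a compound-to-protein link (hypothetical names)."""
    TARGET = "(inferred) direct binding"
    PRIMARY = "therapeutic causality"
    POLY = "multiple targets"
    PARA = "sub-family specificity"
    ORTHO = "cross-species specificity"
    CROSS_SCREEN = "non-homologous"
    NON_TARGET = "not a target"
    OFF = "liability (ADR or side effect)"
    ANTI = "known liability"
    INDIRECT = "non-binding"
    COMPLEX_RESOLVED = "complex resolvable to sequence IDs"
    COMPLEX_UNRESOLVED = "complex experimentally unresolved"
    AMBIGUOUS = "lack of metadata or curatorial judgment"
    NON_CANONICAL = "mutation, splice or PTM specified"
    METABO = "metabolic interactions"
    TRANSPORT = "transporters"


# Illustrative tagging only: (Swiss-Prot entry name, relation).
links = [
    ("HMDH_HUMAN", TargetRelation.PRIMARY),
    ("HMDH_RAT", TargetRelation.ORTHO),
]

# Keep only the links tagged as the primary (bona fide) target.
primary = [protein for protein, relation in links if relation is TargetRelation.PRIMARY]
print(primary)  # ['HMDH_HUMAN']
```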
[9]
Drug-target Networks
[10]
One target-to-many compounds: Dopamine Receptor D2
[11]
One compound-to-(367)-proteins
[12]
Mapping sources for the top selling drug
[13]
Target Matrix for Atorvastatin
Swiss-Prot ChEMBL (BindingDB) TTD DrugBank PubChem
HMDH_HUMAN X X X (PDB) X
HMDH_RAT X X
DPP4_HUMAN X
DPP4_PIG X
AHR_HUMAN X
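A presence/absence matrix of this kind can be assembled by pivoting per-database target sets. The Python sketch below shows one way to do it; the function name is hypothetical and the memberships in `example` are illustrative placeholders, not a transcription of the slide's cells.

```python
from typing import Dict, List, Set


def presence_matrix(mappings: Dict[str, Set[str]]) -> List[List[str]]:
    """Pivot {database: {target, ...}} into a header row plus one row per target."""
    databases = sorted(mappings)
    targets = sorted(set().union(*mappings.values()))
    header = ["target"] + databases
    rows = [
        [target] + ["X" if target in mappings[db] else "" for db in databases]
        for target in targets
    ]
    return [header] + rows


# Illustrative memberships only; PLACEHOLDER_* entries are invented for the example.
example = {
    "ChEMBL": {"HMDH_HUMAN", "PLACEHOLDER_1"},
    "TTD": {"HMDH_HUMAN"},
    "DrugBank": {"HMDH_HUMAN"},
    "PubChem": {"HMDH_HUMAN", "PLACEHOLDER_2"},
}

matrix = presence_matrix(example)
for row in matrix:
    print("\t".join(row))
```

Rows with a single mark are the ones worth checking against the primary source, since only one database asserts them.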
[14]
Other Statins: Different BioAssay Coverage
[15]
Different PubChem CIDs map to different submissions, structures and activity profiles
Atorvastatin -> 10 CID name matches
Substances: 19 links
Substances: 397 links (same structure: 33 links, mixture: 364 links)
CID 60823 39 canonical
[16]
Vice versa, Compounds-to-target: HMG-CoA Reductase
[17]
Drugs mapped to HMG-CoA reductase as target
Swiss-Prot cross-reference
[18]
Equivocal Mappings
[19]
Swiss-Prot Target Intersects
• 1,627 results for database:(type:drugbank)
• 297 results for database:(type:bindingdb)
• 45 results for database:(type:bindingdb) AND database:(type:drugbank) AND organism:"Homo sapiens"
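The three queries above share a pattern: one cross-reference clause per database, optionally combined with an organism filter. The Python sketch below composes such strings; the helper name is hypothetical, and the syntax is copied from the slide, so it may not match the current UniProt query language.

```python
from typing import Iterable, Optional


def xref_query(db_types: Iterable[str], organism: Optional[str] = None) -> str:
    """Join one cross-reference clause per database, plus an optional organism clause."""
    clauses = [f"database:(type:{name})" for name in db_types]
    if organism is not None:
        clauses.append(f'organism:"{organism}"')
    return " AND ".join(clauses)


single = xref_query(["drugbank"])
intersect = xref_query(["bindingdb", "drugbank"], organism="Homo sapiens")
print(single)     # database:(type:drugbank)
print(intersect)  # database:(type:bindingdb) AND database:(type:drugbank) AND organism:"Homo sapiens"
```

Note that the 45-entry intersect also carries the human filter, so it is not directly comparable with the two unfiltered counts.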
[20]
Mixed Mappings
[21]
Mannitol: drug? yes, ligand? yes, target? no
[22]
Polypropylene Glycol: drug? no, ligand? maybe, target? no
[23]
E-2012: False-negative?
[24]
Antifreeze: drug? no, ligand? no, 154 targets? no
Wikipedia: Ethylene glycol is moderately toxic, with an oral LDLo = 786 mg/kg for humans
[25]
Crowdsourcing Works!
[26]
Curation Challenges
[27]
Secretase matches in TTD
Mixed-concept targets but no small-molecule true positives
[28]
Gamma Secretase Activity: Variable Subunit Mappings
[29]
APP: Indirect Target, Three Mechanisms
“for small molecules that suppress the Amyloid Precursor Protein (APP) translation by binding to the 5' Untranslated Region of the APP mRNA”
[30]
Proteasome: Target Descriptions and Cross-screens for Bortezomib
[31]
PubChem Compound Intersects: Primary Drug Targets with Screening Data
[32]
Mycophenolic acid
[33]
Mycophenolic acid and Prodrug: Complex mappings
• Primary target: human IMPDH2
• IMPDH1?
• IMPDH2 hamster
• IMPDH2 Tritrichomonas
• Myfortic is an enteric-coated formulation of MPA in a delayed-release tablet
[34]
Conclusions
• Compared to what we had even a few years ago, let alone in LBPC (life-before-PubChem), these compound-to-protein sources are fantastic
• However, most things that could go wrong have
• We don’t often see QC statistics
• Data coverage is patchy, ad hoc and can be circular
• If you operate on these data at large scale you have no choice but to “trust and filter”
• If detailed relationships are important you need to “verify and judge” back to the primary source
• You can only really do this if you have at least some in vitro background rather than just in silico
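As a rough illustration of what “trust and filter” might look like in practice, the Python sketch below keeps only compound-to-protein pairs for one species that are asserted by a minimum number of sources. The function, thresholds, source names and records are all assumptions made for the example. Counting sources does not protect against circularity, because two databases may share one upstream origin.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (source database, compound name, Swiss-Prot entry name)
Record = Tuple[str, str, str]


def filter_mappings(
    records: Iterable[Record], species: str = "HUMAN", min_sources: int = 2
) -> Dict[Tuple[str, str], Set[str]]:
    """Keep (compound, protein) pairs for one species seen in at least min_sources sources."""
    sources: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for source, compound, protein in records:
        # Swiss-Prot entry names end in an underscore plus a species mnemonic.
        if protein.endswith("_" + species):
            sources[(compound, protein)].add(source)
    return {pair: dbs for pair, dbs in sources.items() if len(dbs) >= min_sources}


# Hypothetical records; the source names are placeholders.
records = [
    ("source_1", "atorvastatin", "HMDH_HUMAN"),
    ("source_2", "atorvastatin", "HMDH_HUMAN"),
    ("source_1", "atorvastatin", "HMDH_RAT"),
    ("source_2", "atorvastatin", "AHR_HUMAN"),
]

kept = filter_mappings(records)
print(kept)
```

Pairs dropped by such a filter are not necessarily wrong; they are the ones that call for “verify and judge” against the primary source.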
[35]
Wouldn’t it be nice if we had ...
• Interpreted mapping distribution statistics for each database
• Details about extraction triages, curation rules and parsing logic
• Harmonisation of mapping rules and cross-comparison of content
• Clear declarations and statistics of circularity between databases
• Curator judgments overruling document primacy
• Consolidated and extended Swiss-Prot cross-references
• Assay and target ontologies (Pistoia? Open PHACTS?)
• “Standards for Reporting Enzymology Data” (STRENDA, http://www.beilstein-institut.de/en/projekte/strenda/)
• “Minimum Information About a Bioactive Entity” (MIABE, http://www.psidev.info/index.php?q=node/394)
[36]
Our Efforts
http://www.jcheminf.com/content/3/1/14
http://www.jcheminf.com/content/1/1/10
[37]