ChemSpider and Traveling the Internet via Chemical Structures: Cheminformatics Presentation
DESCRIPTION
A short presentation given remotely to chemistry students in Jean-Claude Bradley's class at Drexel University.
TRANSCRIPT
![Page 1: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/1.jpg)
ChemSpider and Traveling the Internet via Chemical Structures
Antony Williams
Drexel University, November 2012
![Page 2: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/2.jpg)
Compounds and Identifiers
![Page 3: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/3.jpg)
Chemistry on the Internet
Where do you source chemistry information?
What can you trust online?
How can you recognize potential issues?
Cross-referencing and curating data
![Page 4: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/4.jpg)
Molfiles (http://en.wikipedia.org/wiki/Chemical_table_file)
![Page 5: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/5.jpg)
Molfiles
10 9 0 0 1 0 0 0 0 0 1 V2000
31.2937 -9.0366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
26.6526 -9.0366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
31.2937 -7.7066 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
30.1161 -9.6877 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
25.5096 -9.6877 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
28.9731 -9.0366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
27.8163 -9.7016 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
26.6664 -7.7066 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
32.4367 -9.6877 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
30.1161 -11.0177 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
3 1 2 0 0 0 0
4 1 1 0 0 0 0
9 1 1 0 0 0 0
7 2 1 0 0 0 0
5 2 2 0 0 0 0
8 2 1 0 0 0 0
6 4 1 0 0 0 0
4 10 1 6 0 0 0
7 6 1 0 0 0 0
M END
![Page 6: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/6.jpg)
Molfiles
Molfiles are the primary exchange format between structure drawing packages
Can be different between different drawing packages
Most commonly carry X,Y coordinates for layout
Can support polymers, organometallics, etc.
Can carry 3D coordinates
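To make the layout of the atom and bond blocks on the Molfiles slide concrete, here is a minimal Python sketch that reads the counts line, the atom block, and the bond block. It assumes whitespace-separated fields, as they appear in this transcript; real V2000 molfiles are fixed-width and start with three header lines that the slide does not show. The names `parse_ctab` and `MOLFILE_BODY` are illustrative only.

```python
from collections import Counter

# Counts line, atom block and bond block as shown on the slide.
# Assumption: the three header lines of a full molfile are omitted here.
MOLFILE_BODY = """\
10 9 0 0 1 0 0 0 0 0 1 V2000
31.2937 -9.0366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
26.6526 -9.0366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
31.2937 -7.7066 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
30.1161 -9.6877 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
25.5096 -9.6877 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
28.9731 -9.0366 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
27.8163 -9.7016 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
26.6664 -7.7066 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
32.4367 -9.6877 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
30.1161 -11.0177 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
3 1 2 0 0 0 0
4 1 1 0 0 0 0
9 1 1 0 0 0 0
7 2 1 0 0 0 0
5 2 2 0 0 0 0
8 2 1 0 0 0 0
6 4 1 0 0 0 0
4 10 1 6 0 0 0
7 6 1 0 0 0 0
M END
"""


def parse_ctab(text):
    """Return (atoms, bonds) from a headerless, whitespace-separated block."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    counts = lines[0].split()
    n_atoms, n_bonds = int(counts[0]), int(counts[1])

    atoms = []
    for ln in lines[1:1 + n_atoms]:
        x, y, z, symbol = ln.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    bonds = []
    for ln in lines[1 + n_atoms:1 + n_atoms + n_bonds]:
        first, second, order = (int(tok) for tok in ln.split()[:3])
        bonds.append((first, second, order))  # atom indices are 1-based
    return atoms, bonds


atoms, bonds = parse_ctab(MOLFILE_BODY)
print(len(atoms), "atoms,", len(bonds), "bonds")
print(Counter(symbol for symbol, *_ in atoms))
# Every Z coordinate is zero here, i.e. the file carries a 2D layout only.
print(all(z == 0.0 for _, _, _, z in atoms))
```

Splitting on whitespace is a shortcut: in a fixed-width file, adjacent three-digit counts can run together, so a production reader should slice by column instead.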
![Page 7: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/7.jpg)
SMILES (http://en.wikipedia.org/wiki/SMILES)
SMILES is a common format
Can support polymers, organometallics, etc.
Does NOT carry X,Y or Z coordinates for layout so requires layout algorithms – can be problematic!
Generally different between drawing packages
![Page 8: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/8.jpg)
Stereo
![Page 9: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/9.jpg)
Tautomers
![Page 10: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/10.jpg)
SMILES
ACD/Labs: CC(C)CCC[C@@H](C)CCC[C@@H](C)CCCC(\C)=C\CC2=C(C)C(=O)c1ccccc1C2=O
OpenEye: CC1=C(C(=O)c2ccccc2C1=O)C/C=C(\C)/CCC[C@H](C)CCC[C@H](C)CCCC(C)C
ChEMBL: CC(C)CCC[C@@H](C)CCC[C@@H](C)CCC\C(=C\CC1=C(C)C(=O)c2ccccc2C1=O)\C
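The three strings on this slide differ as text even though they are presented as the same compound. The rough Python sketch below shows one cheap sanity check: count the heavy atoms in each string and confirm that the counts agree. This is not canonicalization, which requires a cheminformatics toolkit; the regular expression is a simplification that covers only the organic-subset atoms and simple bracket atoms, and it ignores isotopes.

```python
import re
from collections import Counter

# Raw strings are needed because SMILES uses backslashes for bond direction.
SMILES_BY_SOURCE = {
    "ACD/Labs": r"CC(C)CCC[C@@H](C)CCC[C@@H](C)CCCC(\C)=C\CC2=C(C)C(=O)c1ccccc1C2=O",
    "OpenEye": r"CC1=C(C(=O)c2ccccc2C1=O)C/C=C(\C)/CCC[C@H](C)CCC[C@H](C)CCCC(C)C",
    "ChEMBL": r"CC(C)CCC[C@@H](C)CCC[C@@H](C)CCC\C(=C\CC1=C(C)C(=O)c2ccccc2C1=O)\C",
}

# Bracket atom (element symbol is the first thing inside the brackets),
# or an organic-subset atom written without brackets.
_ATOM = re.compile(
    r"\[(?P<bracket>[A-Z][a-z]?|[a-z])[^\]]*\]"
    r"|(?P<organic>Cl|Br|[BCNOPSFI]|[bcnops])"
)


def heavy_atom_counts(smiles):
    """Count non-hydrogen atoms per element in a SMILES string (simplified)."""
    counts = Counter()
    for match in _ATOM.finditer(smiles):
        symbol = (match.group("bracket") or match.group("organic")).capitalize()
        if symbol != "H":  # explicit [H] atoms are not heavy atoms
            counts[symbol] += 1
    return counts


for source, smiles in SMILES_BY_SOURCE.items():
    print(source, dict(heavy_atom_counts(smiles)))
```

Matching counts only show that the strings are consistent at the formula level; they say nothing about whether connectivity or stereochemistry agree, which is exactly where the variability between packages matters.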
![Page 11: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/11.jpg)
The InChI Identifier
![Page 12: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/12.jpg)
InChI
SINGLE code base managed by IUPAC – integrated into drawing packages. No variability as with SMILES
InChI Strings can be reversed to structures – same problem as with SMILES – no layout
Well adopted by the community (databases, publishers, blogs, Wikipedia) – good for searching the internet
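As an illustration of what an InChI string carries, the following Python sketch splits a standard InChI into its layers and checks whether any stereo layers are present. The two example strings (ethanol and L-alanine) are not from the slides, and the helper names are hypothetical. The sketch assumes a standard InChI with a formula layer and no repeated layer prefixes.

```python
ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
L_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"

# Layer prefixes that describe stereochemistry:
# b = double bond, t = tetrahedral, m and s = stereo type qualifiers.
STEREO_LAYERS = ("b", "t", "m", "s")


def split_inchi(inchi):
    """Return a dict of layers keyed by 'version', 'formula', or layer prefix."""
    prefix, _, body = inchi.partition("=")
    parts = body.split("/")
    if prefix != "InChI" or len(parts) < 2:
        raise ValueError("not a recognisable InChI string")
    layers = {"version": parts[0], "formula": parts[1]}
    for part in parts[2:]:
        layers[part[0]] = part[1:]  # first character names the layer
    return layers


def has_stereo(inchi):
    """True if the InChI contains at least one stereo layer."""
    layers = split_inchi(inchi)
    return any(name in layers for name in STEREO_LAYERS)


print(split_inchi(ETHANOL))
print(has_stereo(ETHANOL), has_stereo(L_ALANINE))
```

A missing stereo layer does not prove that a structure has no stereocentres, only that none were specified, so a check like this is a prompt to look more closely rather than a verdict.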
![Page 13: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/13.jpg)
The InChI Standard
![Page 14: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/14.jpg)
Tautomers – “Mobile H Perception”
![Page 15: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/15.jpg)
Double Bond Orientation
![Page 16: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/16.jpg)
Stereo
![Page 17: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/17.jpg)
Checking for Stereochemistry
![Page 18: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/18.jpg)
Checking for Stereochemistry
Use your drawing package!
![Page 19: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/19.jpg)
Checking for Stereochemistry
![Page 20: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/20.jpg)
Checking for Stereochemistry
![Page 21: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/21.jpg)
Checking for Stereochemistry
![Page 22: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/22.jpg)
InChIKeys
Search the Web by Structure
![Page 23: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/23.jpg)
InChIs
![Page 24: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/24.jpg)
Databases and Standardization
![Page 25: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/25.jpg)
Databases and Standardization
![Page 26: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/26.jpg)
InChI
No support for polymers, organometallics
Many option settings can lead to variability and make integration across databases difficult – FixedH option especially problematic
“Slight” chance of collisions of InChIKeys
VERY USEFUL FOR INTEGRATING THE WEB
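To put a rough number on the "slight" chance of InChIKey collisions, the Python sketch below applies the standard birthday approximation, p ≈ 1 − exp(−n(n−1)/(2N)), for n structures hashed into N possible values. The bit sizes are assumptions taken from my reading of the InChI technical documentation (65 bits in the first block, a further 37 bits in the second) and should be checked against that documentation; the estimate also assumes the hash values are uniformly distributed.

```python
import math

# Assumed hash sizes (not stated on the slides).
FIRST_BLOCK_BITS = 65              # connectivity ("skeleton") block
FULL_KEY_BITS = 65 + 37            # both hashed blocks together


def collision_probability(n_structures, hash_bits):
    """Birthday-bound estimate of at least one collision among n hashes."""
    space = 2.0 ** hash_bits
    exponent = n_structures * (n_structures - 1) / (2.0 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


for n in (10**6, 10**8, 10**9):
    print(
        f"{n:>13,d} structures: "
        f"first block {collision_probability(n, FIRST_BLOCK_BITS):.3e}, "
        f"full key {collision_probability(n, FULL_KEY_BITS):.3e}"
    )
```

The estimate grows roughly with the square of the collection size, which is why collisions only become a practical concern for very large aggregated databases.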
![Page 27: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/27.jpg)
Vancomycin
![Page 28: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/28.jpg)
Vancomycin
Search Molecular SKELETON
Search Full Molecule
![Page 29: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/29.jpg)
Full Skeleton Search: 104 Hits
![Page 30: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/30.jpg)
Full Molecule Search: 4 Hits
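The gap between the skeleton and full-molecule hit counts can be sketched in Python as follows, assuming that a skeleton search matches on the first 14-character block of the InChIKey and a full-molecule search matches the whole key. The records and keys below are synthetic placeholders built only to have the right shape; they are not real identifiers for vancomycin or any other compound.

```python
import re

# Shape of an InChIKey: 14 letters, 10 letters, 1 letter, joined by hyphens.
INCHIKEY_PATTERN = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def make_placeholder_key(skeleton_letter, stereo_letter):
    """Build a synthetic, well-formed key (hypothetical data only)."""
    return skeleton_letter * 14 + "-" + stereo_letter * 8 + "SA-N"


# Hypothetical records: three share a skeleton, one is unrelated.
RECORDS = [
    {"source": "site-1", "inchikey": make_placeholder_key("A", "B")},
    {"source": "site-2", "inchikey": make_placeholder_key("A", "C")},
    {"source": "site-3", "inchikey": make_placeholder_key("A", "D")},
    {"source": "site-4", "inchikey": make_placeholder_key("Z", "B")},
]


def skeleton_block(inchikey):
    """Return the first block of a validated InChIKey."""
    if not INCHIKEY_PATTERN.match(inchikey):
        raise ValueError(f"malformed InChIKey: {inchikey!r}")
    return inchikey.split("-")[0]


def search(records, query_key, mode="full"):
    """Match on the whole key ('full') or on the first block ('skeleton')."""
    if mode == "skeleton":
        wanted = skeleton_block(query_key)
        return [r for r in records if skeleton_block(r["inchikey"]) == wanted]
    return [r for r in records if r["inchikey"] == query_key]


query = make_placeholder_key("A", "B")
print(len(search(RECORDS, query, mode="skeleton")), "skeleton hits")
print(len(search(RECORDS, query, mode="full")), "full-molecule hits")
```

Skeleton matching is deliberately loose: it pulls in every stereoisomer and related form that shares the connectivity, which is useful for discovery but means each hit still has to be checked.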
![Page 31: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/31.jpg)
Where is chemistry online?
Encyclopedic articles (Wikipedia)
Chemical vendor databases
Metabolic pathway databases
Property databases
Patents with chemical structures
Drug Discovery data
Scientific publications
Compound aggregators
Blogs/Wikis and Open Notebook Science
![Page 32: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/32.jpg)
www.chemspider.com
![Page 33: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/33.jpg)
How do we build it?
We deal in Molfiles or SDF files – with coordinates
Valence checking, charge imbalance
We have our own “business logic” to standardize
InChI to “aggregate tautomers” to one record
We link out to external sites using their IDs
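The valence-checking step in the list above might look something like the following Python sketch. It is not ChemSpider's actual business logic: the `MAX_VALENCE` table is a deliberately small assumption covering neutral C, N, and O only, and aromatic or charged atoms are not handled. The atoms and bonds are the ones from the molfile shown earlier in the deck.

```python
# Assumed typical valences for neutral atoms (illustrative, not exhaustive).
MAX_VALENCE = {"C": 4, "N": 3, "O": 2}

# Atom symbols and (atom, atom, bond order) triples from the Molfiles slide.
ATOMS = ["C", "C", "O", "C", "O", "C", "C", "N", "O", "N"]
BONDS = [
    (3, 1, 2), (4, 1, 1), (9, 1, 1), (7, 2, 1), (5, 2, 2),
    (8, 2, 1), (6, 4, 1), (4, 10, 1), (7, 6, 1),
]


def bond_order_sums(atoms, bonds):
    """Sum of bond orders at each atom (bond indices are 1-based)."""
    sums = [0] * len(atoms)
    for first, second, order in bonds:
        sums[first - 1] += order
        sums[second - 1] += order
    return sums


def valence_problems(atoms, bonds):
    """List (index, symbol, bond_order_sum) for atoms over their valence."""
    problems = []
    totals = bond_order_sums(atoms, bonds)
    for index, (symbol, total) in enumerate(zip(atoms, totals), start=1):
        limit = MAX_VALENCE.get(symbol)
        if limit is None or total > limit:
            problems.append((index, symbol, total))
    return problems


def implicit_hydrogens(atoms, bonds):
    """Hydrogens needed to fill each atom up to its assumed valence."""
    totals = bond_order_sums(atoms, bonds)
    return sum(MAX_VALENCE[s] - t for s, t in zip(atoms, totals))


print(valence_problems(ATOMS, BONDS))                # no problems expected
print(implicit_hydrogens(ATOMS, BONDS))              # hydrogens implied
print(valence_problems(ATOMS, BONDS + [(1, 6, 1)]))  # extra bond overloads atom 1
```

Filling in the implied hydrogens gives the formula C5H10N2O3 for the slide's structure, and a check like this catches drawing errors, such as a five-valent carbon, before a record is deposited.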
![Page 34: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/34.jpg)
Searches: The INTERNET
All ChemSpider and Internet searches are “simply algorithms” but synonym searching is based on an assertion
![Page 35: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/35.jpg)
Validated Names for Searching…
![Page 36: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/36.jpg)
Validating structures
Check for “full stereo”, and use stereo descriptors in particular when checking!
Check for quality of associated data sources
Check against reference literature when available – but it can be wrong
Question EVERYTHING!
![Page 37: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/37.jpg)
Contributing to The Quality of Data
What is the Structure of Vitamin K?
![Page 38: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/38.jpg)
Contributing to The Quality of Data
What is the Structure of Vitamin K?
A lipid cofactor that is required for normal blood clotting. Several forms of vitamin K have been identified: VITAMIN K1 (phytomenadione) derived from plants, VITAMIN K2 (menaquinone) from bacteria & synthetic naphthoquinone provitamins, VITAMIN K3 (menadione).
![Page 39: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/39.jpg)
What is the Structure of Vitamin K1?
![Page 40: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/40.jpg)
CAS’s Common Chemistry
![Page 41: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/41.jpg)
Wikipedia
![Page 42: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/42.jpg)
Wolfram Alpha
![Page 43: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/43.jpg)
DailyMed
![Page 44: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/44.jpg)
![Page 45: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/45.jpg)
ALL Different, ALL “Domoic Acids”
![Page 46: ChemSpider and Traveling the Internet via Chemical Structures Cheminformatics Presentation](https://reader035.vdocuments.site/reader035/viewer/2022062303/554ead97b4c905fb7c8b4f0e/html5/thumbnails/46.jpg)
Thank you
Email: [email protected]
Twitter: ChemConnector
Blog: www.chemspider.com/blog
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams