![Page 1: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/1.jpg)
Connecting Chemistry Across the Internet Using ChemSpider
Antony J Williams and Valery Tkachenko
SERMACS, November 15th 2012
![Page 2: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/2.jpg)
Chemistry Data and the Weeds
![Page 3: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/3.jpg)
Tell me about Roundup
![Page 4: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/4.jpg)
So what is Round Up?
![Page 5: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/5.jpg)
The World’s Encyclopedia
![Page 6: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/6.jpg)
Roundup
![Page 7: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/7.jpg)
Where do we Round Up data?
Where can I find the molfile for Roundup?
Papers/Patents about Roundup?
What are the side effects of Roundup?
Where can I order Roundup?
What are the physicochemical properties?
Metabolic pathways?
Different synonyms of Roundup?
Synthesis of Roundup?
Etc.
![Page 8: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/8.jpg)
Where do I Round Up Data?
![Page 9: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/9.jpg)
![Page 10: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/10.jpg)
In an ever-growing Linked Data map…
![Page 11: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/11.jpg)
But I want to aggregate data? So…
![Page 12: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/12.jpg)
ChemSpider
Takes on the role of a structure-centric hub:
Connecting, validating, qualifying data
Enhancing data with connections to services
Provides access to data and services for others to use (Thermo, Agilent, Bruker, Waters, ACD/Labs, Accelrys, etc.)
Uses available services to integrate, connect and enhance the offering
![Page 13: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/13.jpg)
Roundup on ChemSpider
![Page 14: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/14.jpg)
What will ChemSpider give us??
![Page 15: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/15.jpg)
What will ChemSpider give us??
![Page 16: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/16.jpg)
What will ChemSpider give us??
![Page 17: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/17.jpg)
What will ChemSpider give us??
![Page 18: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/18.jpg)
What will ChemSpider give us??
![Page 19: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/19.jpg)
What will ChemSpider give us??
![Page 20: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/20.jpg)
ChemSpider is Collapsing Data???
![Page 21: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/21.jpg)
What will ChemSpider give us??
![Page 22: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/22.jpg)
For Glyphosate itself
![Page 23: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/23.jpg)
How did we build it?
We deal in Molfiles or SDF files – with coordinates
Deposit anything that has an InChI – we support what InChI can handle, good and bad
Standardization based on “InChI standardization”
InChIs aggregate (certain) tautomers
How much of ChemSpider is “on ChemSpider”?
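As a rough illustration of the Molfile/SDF deposition format mentioned on this slide, the sketch below splits SDF text into records (each record ends with a `$$$$` line) and reads the `> <FIELD>` data items that follow the `M  END` line of the embedded molfile. The function names and the two single-atom sample records are hypothetical placeholders, not ChemSpider code or ChemSpider data.

```python
# Minimal sketch, assuming plain V2000 SDF text; names here are illustrative only.

def split_sdf(sdf_text):
    """Split SDF text into records; each record is terminated by a '$$$$' line."""
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    return records


def data_fields(record):
    """Return the '> <FIELD>' data items that follow the molfile's 'M  END' line."""
    fields, name, values = {}, None, []
    in_data = False
    for line in record.splitlines():
        if line.startswith("M  END"):
            in_data = True          # everything after this is the data block
            continue
        if not in_data:
            continue
        if line.startswith(">"):
            # Header line such as "> <NAME>": take the text between < and >.
            name = line[line.find("<") + 1:line.rfind(">")]
            values = []
        elif line.strip() == "":
            if name is not None:    # a blank line closes the current data item
                fields[name] = "\n".join(values)
                name = None
        elif name is not None:
            values.append(line)
    if name is not None:            # flush an item that was not closed by a blank line
        fields[name] = "\n".join(values)
    return fields


# Two placeholder records (single-atom molfiles), for illustration only.
SAMPLE_SDF = """water
  hypothetical placeholder

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
M  END
> <NAME>
water

$$$$
methane
  hypothetical placeholder

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
M  END
> <NAME>
methane

$$$$
"""

for rec in split_sdf(SAMPLE_SDF):
    print(data_fields(rec))
```

A real deposition pipeline would hand each record to a cheminformatics toolkit to generate the InChI; that step is omitted here because it depends on the toolkit chosen.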
![Page 24: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/24.jpg)
Connecting Chemistry across the web
So much of what is seen on ChemSpider is retrieved in real time using services
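The idea of assembling a compound page from services called in real time can be sketched as a simple concurrent fan-out. Everything below is an assumption made for illustration: the service names, the stub functions and the `fetch_all` helper are hypothetical and do not describe ChemSpider's actual architecture or any real API.

```python
# Minimal sketch, assuming each external service is wrapped as a Python callable.
from concurrent.futures import ThreadPoolExecutor


def fetch_all(services, query, timeout=5):
    """Call every service with the same query in parallel and collect the results.

    A failing or slow service yields an error entry instead of breaking the page.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(services))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in services.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout)
            except Exception as exc:
                results[name] = {"error": str(exc)}
    return results


# Hypothetical stand-ins for remote services (literature, vendors, predictions).
def literature_stub(query):
    return "literature hits for " + query


def vendor_stub(query):
    return "vendor listings for " + query


def broken_stub(query):
    raise RuntimeError("service unavailable")


page = fetch_all(
    {"literature": literature_stub, "vendors": vendor_stub, "predictions": broken_stub},
    "glyphosate",
)
print(page)
```

Isolating failures per service is the design point: one unavailable provider should degrade a single panel of the record view, not the whole record.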
![Page 25: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/25.jpg)
Connecting Chemistry across the web
![Page 26: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/26.jpg)
Online Predictions
![Page 27: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/27.jpg)
A Comment on Quality
For >28 million chemical compounds there are some errors:
“Incorrect” structure representations
Mismatched name-structure relationships
Experimental properties (the values, the units)
Real vs. virtual compounds – text-mining and conversion
We have deprecated a LOT of data…
![Page 28: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/28.jpg)
Downsides of InChI
Good for small molecules – but no polymers, issues with inorganics, organometallics, imperfect stereochemistry. ChemSpider is “small molecules”
InChI used as the “deduplicator” – FIRST version of a compound into the database becomes THE structure to deduplicate against…
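The "first version in becomes THE structure" behaviour described above can be sketched with a registry keyed on the InChI string. The class and field names are hypothetical, and the molfile values are placeholder strings; this is an illustration of first-in-wins deduplication, not the ChemSpider implementation.

```python
# Minimal sketch, assuming the InChI has already been generated for each deposit.

class InChIRegistry:
    """Deduplicate deposits on their InChI; the first deposit fixes the stored structure."""

    def __init__(self):
        self._by_inchi = {}

    def __len__(self):
        return len(self._by_inchi)

    def deposit(self, inchi, molfile, source):
        record = self._by_inchi.get(inchi)
        if record is None:
            # First time this InChI is seen: this depiction becomes the reference.
            record = {"inchi": inchi, "molfile": molfile, "sources": [source]}
            self._by_inchi[inchi] = record
        else:
            # Later deposits only add provenance; their depiction is not kept.
            record["sources"].append(source)
        return record


ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
WATER = "InChI=1S/H2O/h1H2"

registry = InChIRegistry()
registry.deposit(ETHANOL, "depiction from source A (placeholder)", "source A")
registry.deposit(ETHANOL, "depiction from source B (placeholder)", "source B")
registry.deposit(WATER, "depiction from source B (placeholder)", "source B")

record = registry.deposit(ETHANOL, "depiction from source C (placeholder)", "source C")
print(len(registry), record["molfile"], record["sources"])
```

The sketch makes the downside visible: if the first deposited depiction is poor, every later and possibly better depiction of the same InChI is merged into it.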
![Page 29: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/29.jpg)
Side Effects of InChI Usage
![Page 30: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/30.jpg)
SMILES by comparison…
![Page 31: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/31.jpg)
Side Effects of InChI Usage
![Page 32: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/32.jpg)
Standardization Issues
Depiction based on molfile
![Page 33: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/33.jpg)
Downsides of Overall Approach
Meshing data together based on InChIs worked for simple molecules
2D layout errors inherited or limited by algorithm
Complex molecules that are meant to be the same thing were NOT deduplicated. Compounds differing by one stereocenter, named the same, meant to be the same, are not the same
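One way to surface records that "are meant to be the same" but differ only in stereochemistry is to compare InChIs with their stereo layers removed. The sketch below drops the `/b`, `/t`, `/m` and `/s` layers and groups InChIs by the remaining skeleton. The helper names are hypothetical, and the two alanine InChIs are given for illustration and should be checked against an InChI generator before being relied on.

```python
# Minimal sketch, assuming standard InChI strings whose layers are separated by '/'.
from collections import defaultdict

STEREO_PREFIXES = ("b", "t", "m", "s")  # double-bond, tetrahedral, and stereo-type layers


def strip_stereo(inchi):
    """Return the InChI without its stereochemical layers."""
    head, *layers = inchi.split("/")
    kept = [layer for layer in layers if not layer.startswith(STEREO_PREFIXES)]
    return "/".join([head] + kept)


def stereo_variants(inchis):
    """Group InChIs sharing a skeleton; keep only groups with more than one distinct member."""
    groups = defaultdict(set)
    for inchi in inchis:
        groups[strip_stereo(inchi)].add(inchi)
    return {skeleton: sorted(members) for skeleton, members in groups.items() if len(members) > 1}


L_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
D_ALANINE = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"
ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"

candidates = stereo_variants([L_ALANINE, D_ALANINE, ETHANOL])
for skeleton, members in candidates.items():
    print(skeleton, "->", len(members), "stereo variants")
```

Groups found this way are only candidates for curator review: genuinely different stereoisomers must stay separate, so the comparison flags possible name-structure conflicts rather than merging anything automatically.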
![Page 34: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/34.jpg)
So much data online is “erroneous”
![Page 35: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/35.jpg)
The confusion of name-structures
![Page 36: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/36.jpg)
Collapsing Data – Standardization
![Page 37: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/37.jpg)
What needs to happen?
If we could validate:
Catch errors in databases (and clean)
Proactively catch errors in publications/patents
Reduce junk in the ether – improve QUALITY!
If we collectively standardized:
Interlinking between databases should improve
CVSP – a separate presentation… stick around
![Page 38: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/38.jpg)
Crowdsourcing ChemSpider
ChemSpider is crowdsourced
Community deposition, annotation and curation
Anyone can “Leave Feedback”
Registered users can add data
![Page 39: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/39.jpg)
Internet Data
ChemSpider and Global Chemistry Hub
Commercial Software, Pre-competitive Data
Open Science, Open Data, Publishers, Educators
Open Databases, Chemical Vendors
Small organic molecules, Undefined materials, Organometallics, Nanomaterials, Polymers, Minerals, Particle bound, Links to Biologicals
![Page 40: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/40.jpg)
Delivering a Prediction Platform
Experimental data will be used as the basis of model generation – a predictive platform…
![Page 41: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/41.jpg)
The Future of ChemSpider
Continued focus on quality over quantity – but more data is good too!
ChemSpider Reactions – work in progress and includes >300,000 reactions
Plugging in a validation and standardization platform
Delivering personal and institutional repository capabilities
![Page 42: Connecting Chemistry Across the Internet Using ChemSpider](https://reader036.vdocuments.site/reader036/viewer/2022062319/554e7e5fb4c90545698b519d/html5/thumbnails/42.jpg)
Thank you
Email: [email protected]
Twitter: ChemConnector
Personal Blog: www.chemconnector.com
Slides: www.slideshare.net/AntonyWilliams