Standardizer Molecular Cosmetics for Chemoinformatics György Pirok Nóra Máte István Cseh Szilárd Dóránt Péter Kovács Szabolcs Csepregi Ferenc Csizmadia

Upload: sean-gibbs

Post on 26-Mar-2015


Page 1

Standardizer

Molecular Cosmetics for Chemoinformatics

György Pirok, Nóra Máté, István Cseh, Szilárd Dóránt, Péter Kovács, Szabolcs Csepregi, Ferenc Csizmadia

Page 2

Why standardize structures?

Canonicalisation: Uniformization of structures without changing the chemical content, to recognize duplicates and functional groups (aromatization, mesomers, tautomers, ...).

Beautification: Making the structures visually more attractive (dearomatization, cleaning coordinates, wedge orientation, ...).

Modification: Conversion of structures by modifying their original content as a preparation step for further chemoinformatics tasks (transformations, removing stereo, removing R-groups, ...).

It is often difficult to categorize the standardization actions.

Page 3

Canonicalisation

Hydrogens: making hydrogens explicit, making hydrogens implicit

Resonant structures: aromatizing Kekulé rings, converting to canonical mesomer form, transforming to user-defined mesomer form

Tautomers: converting to canonical tautomer form, transforming to user-defined tautomer form

Other: removing small fragments, removing user-defined fragments, expanding stoichiometry, setting the chiral flag

Page 4

Mesomers

Page 5

Tautomers: oxo-enol, enamine-imine

Page 6

Tautomers: pyridone-pyridol

Page 7

Fragment removal
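
The idea behind small-fragment removal can be sketched on dot-disconnected SMILES strings: split the input into components and keep the one with the most atoms. This is only a string-level illustration, not how Standardizer works internally; the class name LargestFragmentSketch and its methods are hypothetical, and the atom counter is deliberately crude (each bracket expression counts as one atom).

```java
/** Illustrative only: a string-level stand-in for small-fragment removal. */
final class LargestFragmentSketch {

    /** Crude atom count for one SMILES component (hypothetical helper). */
    static int countAtoms(String component) {
        int count = 0;
        for (int i = 0; i < component.length(); i++) {
            char c = component.charAt(i);
            if (c == '[') {
                // A whole bracket expression such as [Na+] is one atom.
                int close = component.indexOf(']', i);
                if (close < 0) {
                    throw new IllegalArgumentException("Unclosed bracket in: " + component);
                }
                count++;
                i = close;
            } else if (Character.isUpperCase(c)) {
                // Two-letter organic-subset symbols: Cl and Br.
                if (i + 1 < component.length()) {
                    char next = component.charAt(i + 1);
                    if ((c == 'C' && next == 'l') || (c == 'B' && next == 'r')) {
                        i++;
                    }
                }
                count++;
            } else if ("bcnops".indexOf(c) >= 0) {
                count++; // aromatic atom written in lower case
            }
            // Digits, bond symbols, and parentheses are not atoms.
        }
        return count;
    }

    /** Returns the component with the most atoms; the first one wins a tie. */
    static String keepLargestFragment(String smiles) {
        String best = null;
        int bestCount = -1;
        for (String part : smiles.split("\\.")) {
            int n = countAtoms(part);
            if (n > bestCount) {
                best = part;
                bestCount = n;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Sodium acetate: the acetate anion is kept, the sodium cation is dropped.
        System.out.println(keepLargestFragment("CC(=O)[O-].[Na+]"));
    }
}
```

Keeping the first component on a tie is an arbitrary choice made for this sketch; a real tool would need a documented rule for equally sized fragments.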

Page 8

Specific counterion removal

Page 9

Solvent removal
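
Specific counterion removal and solvent removal both amount to dropping components that appear on a user-supplied list. The sketch below shows that idea on dot-disconnected SMILES, assuming exact string comparison; a real structure-based tool would match fragments chemically, so a differently written SMILES for the same ion would not be caught here. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative only: removes listed components by exact string match. */
final class ListedFragmentRemovalSketch {

    /** Drops every dot-separated component that is contained in the unwanted set. */
    static String removeListed(String smiles, Set<String> unwanted) {
        List<String> kept = new ArrayList<>();
        for (String part : smiles.split("\\.")) {
            if (!unwanted.contains(part)) {
                kept.add(part);
            }
        }
        return String.join(".", kept);
    }

    public static void main(String[] args) {
        // Hypothetical removal list: water as solvent, sodium and chloride as counterions.
        Set<String> unwanted = Set.of("O", "[Na+]", "[Cl-]");
        System.out.println(removeListed("CC(=O)[O-].[Na+].O", unwanted));
    }
}
```

Unlike keeping only the largest fragment, a list-based rule leaves unlisted components untouched, which matters for mixtures where more than one component is of interest.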

Page 10

Stoichiometry expansion: expanding salt stoichiometry
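
A minimal sketch of salt stoichiometry expansion, assuming the action replaces each component's stoichiometric coefficient with that many explicit copies of the component. The map-based input and the class name StoichiometrySketch are hypothetical and chosen only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: writes out coefficients as repeated SMILES components. */
final class StoichiometrySketch {

    /** Repeats each component as many times as its coefficient, in map order. */
    static String expand(Map<String, Integer> coefficients) {
        List<String> parts = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : coefficients.entrySet()) {
            if (entry.getValue() < 1) {
                throw new IllegalArgumentException("Coefficient must be positive: " + entry);
            }
            for (int i = 0; i < entry.getValue(); i++) {
                parts.add(entry.getKey());
            }
        }
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        // Sodium sulfate written as 2 Na+ and 1 sulfate.
        Map<String, Integer> salt = new LinkedHashMap<>();
        salt.put("[Na+]", 2);
        salt.put("[O-]S(=O)(=O)[O-]", 1);
        System.out.println(expand(salt)); // prints [Na+].[Na+].[O-]S(=O)(=O)[O-]
    }
}
```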

Page 11

Stoichiometry expansion: expanding reaction stoichiometry

Page 12

Beautification

Hydrogens: making hydrogens implicit

Resonant structures: converting aromatic rings to Kekulé format

Cleaning: calculating 2D coordinates, reallocating wedge bonds, template-based cleaning, 3D geometry optimization

Groups: contracting/expanding/ungrouping abbreviated and multiple groups

Page 13

Template-based Cleaning: 2D-coordinate calculation of macrocycles or bridged systems

Page 14

Template-based Cleaning: aligning search results to the query

Figure label: query

Page 15

Canonicalization During Database Import

Diagram: the client sends input structures to the server, where Standardizer and JChem Base/Cartridge apply the canonicalization configuration. The relational database stores both the original structures and the canonicalized structures.

Page 16

Sending Query to the Database

Diagram: the client sends a query structure to the server, where Standardizer and JChem Base/Cartridge apply the canonicalization configuration to produce a canonicalized query. The query is compared to the canonicalized structures in the relational database.
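
The import and query slides rely on one principle: the same canonicalization is applied to stored structures and to incoming queries, so that equivalent representations meet on the same key while the originals stay available for display. The sketch below illustrates this for exact-match lookup only. The class CanonicalIndexSketch, the in-memory map standing in for the relational database, and the toy canonicalizer in main are all assumptions made for illustration; none of them is part of JChem Base or Standardizer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Illustrative only: originals are stored under their canonical form. */
final class CanonicalIndexSketch {

    private final UnaryOperator<String> canonicalizer;
    private final Map<String, List<String>> originalsByKey = new HashMap<>();

    CanonicalIndexSketch(UnaryOperator<String> canonicalizer) {
        this.canonicalizer = canonicalizer;
    }

    /** Import step: keep the original, indexed by its canonicalized form. */
    void importStructure(String original) {
        String key = canonicalizer.apply(original);
        originalsByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(original);
    }

    /** Query step: canonicalize the query the same way, then compare keys. */
    List<String> search(String query) {
        return originalsByKey.getOrDefault(canonicalizer.apply(query), List.of());
    }

    public static void main(String[] args) {
        // Toy canonicalizer (an assumption): keep only the first dot-separated component.
        CanonicalIndexSketch index = new CanonicalIndexSketch(s -> s.split("\\.")[0]);
        index.importStructure("CCO.O");
        index.importStructure("CCO");
        // Both stored originals share the key "CCO", so a salted query still finds them.
        System.out.println(index.search("CCO.Cl"));
    }
}
```

Passing the canonicalizer in as a function keeps the sketch independent of any particular standardization rule set; the only requirement is that import and query use the same one.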

Page 17

Displaying Result Structures

Diagram: the original structures are read from the relational database. On the server, Standardizer and JChem Base/Cartridge apply the beautification configuration, and the beautified structures are sent to the client.

Page 18

Modification

custom transformations

Page 19

API and command line interface

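// Load the standardization actions from an XML configuration file
Standardizer st = new Standardizer(new File("standardize.xml"));
// Apply the configured actions to a previously loaded molecule, mol
st.standardize(mol);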

standardize input.sdf -c config.xml -o output.smiles

Page 20

Live Demonstration

Page 21

Applications: Virtual Synthesis

Page 22

Applications: Structure Databases

Page 23

How can ChemAxon help?

Free for non-commercial websites

Free for academic teaching and research ("Academic Package")

Free Academic Package to be extended to cover academic networks (campus-wide rollout)

Page 24

Acknowledgments

Ferenc Csizmadia, Nóra Máté, István Cseh, Szabó Attila, Szilárd Dóránt, Péter Kovács, Szabolcs Csepregi