History and Current Status of Chemoinformatics

Upload: anand-maurya

Post on 02-Jun-2018

  • 8/10/2019 HISTORY AND CURRENT STATUS OF CHEMOINFORMATICSs

    specialised hardware. Also, rapid access not only to the primary literature but,

    possibly even more importantly, to the factual primary data about millions of

    chemical compounds, to reactions, structures, and spectra, and to the genomic

    data of various organisms including humans, can only be provided by digital

    storage and retrieval techniques.

    The roots of what we now call cheminformatics began very early in the history of

    computing: the 1950s for statistical models and the 1960s for the first computer

    representations, mainly by curious chemists. However, the term

    "cheminformatics" wasn't adopted until the early 1990s (the spelling,

    cheminformatics or chemoinformatics, is still in dispute). The bulk of the

    foundational work was done in the 1970s and 1980s, and was strongly supported by

    the pharmaceutical industry and the need for computational drug discovery

    research.

    The term chemoinformatics was defined by F.K. Brown in 1998. With the advent

    of computers and the ability to store and retrieve chemical information, serious

    efforts to compile relevant databases and construct information retrieval systems

    began. One of the first efforts to have substantial long-term impact was Olga

    Kennard's collection of crystal structure information for small molecules.

    Brown defined chemoinformatics as the mixing of information resources to

    transform data into information and information into knowledge for the intended purpose of

    making better decisions faster in the area of drug lead identification and

    optimization. Since then, both spellings have been used: some communities have

    settled on cheminformatics, while European academia settled on chemoinformatics

    in 2006.

    The first, and still the core, journal for the subject, the Journal of Chemical

    Documentation, started in 1961 (the name changed to the Journal of Chemical

    Information and Computer Sciences in 1975).

    The first book appeared in 1971 (Lynch, Harrison, Town and Ash, Computer

    Handling of Chemical Structure Information).

    The first international conference on the subject was held in 1973 at

    Noordwijkerhout, and it has been held every three years since 1987.

    HOW CAN CHEMOINFORMATICS HELP?

    Cheminformatics can help chemists and other scientists produce and

    manage information. In silico analysis using cheminformatics

    techniques can actually reduce the risks of developing a drug. Such

    techniques as virtual screening, library design, and docking figure into

    the analysis. Physical properties that might have an impact on whether

    a substance could potentially be developed as a drug are often

    examined in cheminformatics as features that can be compared among

    large numbers of substances. An example is clogP, the calculated

    logarithm of the octanol-water partition coefficient, which measures

    how lipophilic ("fatty") a compound is. Sometimes, inferences can be

    drawn about a related set of properties, as when Chris Lipinski

    formulated his now famous Rule of Five, which says that drug-like

    compounds tend to have 5 or fewer hydrogen-bond donors, 10 or fewer

    hydrogen-bond acceptors, a calculated logP less than or equal to 5, and

    a molecular weight up to 500. Compounds that exceed these values tend

    to have poor absorption or permeation.
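
    The four thresholds above can be written as a simple filter. The following is
    a minimal sketch, not a definitive implementation: it assumes the descriptor
    values have already been computed by some other tool, and the names
    Descriptors and rule_of_five_violations, as well as the example values, are
    illustrative assumptions rather than part of the original text.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties for one compound (supplied by the caller)."""
    h_bond_donors: int
    h_bond_acceptors: int
    clogp: float
    molecular_weight: float


def rule_of_five_violations(d):
    """Return a list describing which Rule of Five limits are exceeded."""
    violations = []
    if d.h_bond_donors > 5:
        violations.append("more than 5 hydrogen-bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen-bond acceptors")
    if d.clogp > 5:
        violations.append("calculated logP above 5")
    if d.molecular_weight > 500:
        violations.append("molecular weight above 500")
    return violations


# Approximate, illustrative values for a small drug-like molecule.
small = Descriptors(h_bond_donors=1, h_bond_acceptors=4,
                    clogp=1.2, molecular_weight=180.2)
# A purely hypothetical large, lipophilic compound.
large = Descriptors(h_bond_donors=7, h_bond_acceptors=12,
                    clogp=6.3, molecular_weight=720.0)

for name, desc in [("small", small), ("large", large)]:
    print(name, rule_of_five_violations(desc))
```

    An empty list means that no limit is exceeded. Returning the individual
    violations, rather than a single yes/no answer, keeps the filter usable as a
    soft guideline.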

    CURRENT STATUS OF CHEMOINFORMATICS

    In recent years, there has been an explosion in the availability of publicly

    accessible chemical information, including chemical structures of small molecules,

    structure-derived properties and associated biological activities in a variety of

    assays. These data sources present us with a significant opportunity to develop

    and apply computational tools to extract and understand the underlying

    structure-activity relationships. Furthermore, by integrating chemical data

    sources with biological information (protein structure, gene expression and so

    on), we can attempt to build up a holistic view of the effects of small molecules in

    biological systems. Equally important is the ability for non-experts to access and

    utilize state-of-the-art cheminformatics methods and models.

    The chemoinformatics field continues to evolve at the interface between

    computer science and chemistry. Chemical information and computational

    approaches in pharmaceutical research are major focal points of

    chemoinformatics. However, the boundaries of this discipline are rather fluid and

    the chemoinformatics spectrum is difficult to delineate.

    In the area of methodology development, recent work on characterizing

    structure-activity landscapes, Quantitative Structure Activity Relationship (QSAR)

    model domain applicability and the use of chemical similarity in text mining has

    been done. In the area of infrastructure, a distributed web services framework

    has been developed that allows easy deployment of, and uniform access to,

    computational methods (statistics, cheminformatics and computational

    chemistry), data and models. The development of PubChem-derived databases

    and highlight techniques allows us to scale the infrastructure to extremely

    large compound collections by use of distributed processing on grids.

    SCOPE OF CHEMOINFORMATICS

    Representation and structure searching

    Substructure searching

    Similarity searching, clustering

    Diversity analysis

    Searching databases

    Computer-aided structure elucidation

    3-D substructure searching

    QSAR and docking

    APPLICATIONS OF CHEMOINFORMATICS

    Storage and retrieval

    The primary application of cheminformatics is in the storage, indexing and search

    of information relating to compounds. The efficient search of such stored

    information includes topics that are dealt with in computer science as data

    mining, information retrieval, information extraction and machine learning.

    Related research topics include:

    Unstructured data

      Information retrieval

      Information extraction

    Structured data mining

      Database mining

      Graph mining

      Molecule mining

      Sequence mining

      Tree mining

    Digital libraries

    File formats

    The in silico representation of chemical structures uses specialized formats such

    as the XML-based Chemical Markup Language or SMILES. These representations

    are often used for storage in large chemical databases. While some formats are

    suited for visual representations in 2 or 3 dimensions, others are more suited for

    studying physical interactions, modeling and docking studies.
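
    To make the idea of a line notation concrete, the sketch below splits a SMILES
    string into tokens and counts the explicitly written atoms. It is an
    illustration only: the regular expression covers bracket atoms, the common
    organic-subset and aromatic atoms, bonds, branches and ring-closure digits,
    and it does not validate chemistry. The function names tokenize and
    count_atoms are assumptions introduced here.

```python
import re

# Bracket atoms first, then two-letter halogens, then single-letter atoms,
# followed by bond, branch and ring-closure symbols.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|[=#\-+\\/:~@?>*$().]|%\d{2}|\d)"
)


def tokenize(smiles):
    """Split a SMILES string into tokens; fail if any character is unrecognized."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError("unrecognized characters in SMILES: " + smiles)
    return tokens


def count_atoms(smiles):
    """Count atom tokens (bracket atoms and element symbols) written in the string."""
    return sum(1 for tok in tokenize(smiles)
               if tok[0] == "[" or tok[0].isalpha())


for smiles in ["CCO", "CC(=O)Oc1ccccc1C(=O)O"]:  # ethanol, aspirin
    print(smiles, tokenize(smiles), count_atoms(smiles))
```

    Hydrogens are normally implicit in SMILES, so the count reflects only the
    atoms that appear in the string. A production system would rely on a
    cheminformatics toolkit with a full parser instead of a regular expression.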

    Virtual libraries

    Chemical data can pertain to real or virtual molecules. Virtual libraries of

    compounds may be generated in various ways to explore chemical space and

    hypothesize novel compounds with desired properties.

    Virtual libraries of classes of compounds (drugs, natural products, diversity-

    oriented synthetic products) were recently generated using the FOG (fragment

    optimized growth) algorithm. [9] This was done by using cheminformatic tools to

    train transition probabilities of a Markov chain on authentic classes of

    compounds, and then using the Markov chain to generate novel compounds that

    were similar to the training database.
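
    The train-then-generate idea can be illustrated with a toy first-order Markov
    chain. This is not the FOG algorithm itself, which grows molecular graphs from
    fragments; here each "molecule" is simplified to a linear sequence of fragment
    labels, and the labels, the training set and the function names are all
    hypothetical.

```python
import random
from collections import defaultdict

START, END = "<start>", "<end>"


def train_transitions(sequences):
    """Estimate transition probabilities between consecutive fragments."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        path = [START] + list(seq) + [END]
        for current, following in zip(path, path[1:]):
            counts[current][following] += 1
    probs = {}
    for state, followers in counts.items():
        total = sum(followers.values())
        probs[state] = {s: c / total for s, c in followers.items()}
    return probs


def generate(probs, rng, max_len=10):
    """Sample a new fragment sequence by walking the chain from START."""
    state, result = START, []
    while len(result) < max_len:
        followers = probs[state]
        state = rng.choices(list(followers),
                            weights=list(followers.values()))[0]
        if state == END:
            break
        result.append(state)
    return result


# Hypothetical training "library": each entry is a sequence of fragment labels.
TRAINING = [
    ["phenyl", "amide", "methyl"],
    ["phenyl", "ester", "ethyl"],
    ["pyridyl", "amide", "ethyl"],
    ["phenyl", "amide", "ethyl"],
]

probs = train_transitions(TRAINING)
rng = random.Random(0)  # fixed seed so repeated runs sample the same sequences
for _ in range(3):
    print(generate(probs, rng))
```

    Because every step follows a transition observed in the training set, the
    generated sequences resemble the training data while still allowing new
    combinations.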

    Virtual screening

    In contrast to high-throughput screening, virtual screening involves

    computationally screening in silico libraries of compounds, by means of various

    methods such as docking, to identify members likely to possess desired properties

    such as biological activity against a given target. In some cases, combinatorial

    chemistry is used in the development of the library to increase the efficiency in

    mining the chemical space. More commonly, a diverse library of small molecules

    or natural products is screened.
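
    Docking requires a 3-D model of the target, so the sketch below illustrates a
    different and simpler form of virtual screening: ranking library compounds by
    Tanimoto similarity to a known active. Fingerprints are represented as sets of
    integer feature identifiers; the library contents, the threshold and the
    function names are assumptions made for illustration.

```python
def tanimoto(a, b):
    """Tanimoto coefficient: shared features divided by all features present."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def screen(query, library, threshold=0.5):
    """Return (name, similarity) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints: each integer stands for one structural feature.
QUERY = {1, 2, 3, 4, 5}
LIBRARY = {
    "cmpd_A": {1, 2, 3, 4, 5, 6},
    "cmpd_B": {1, 2, 3, 9},
    "cmpd_C": {7, 8, 9},
}

for name, score in screen(QUERY, LIBRARY):
    print(name, round(score, 2))
```

    In practice the fingerprints would come from a cheminformatics toolkit, and
    the threshold would be tuned for the fingerprint type in use.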

    Quantitative structure-activity relationship (QSAR)

    This is the calculation of quantitative structure-activity relationship and

    quantitative structure-property relationship (QSPR) values, used to predict the activity of

    compounds from their structures. In this context there is also a strong

    relationship to chemometrics. Chemical expert systems are also relevant, since

    they represent parts of chemical knowledge as an in silico representation.
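
    In its simplest form, a QSAR model is a regression of measured activity on one
    or more structure-derived descriptors. The sketch below fits a one-descriptor
    linear model by ordinary least squares. The data are made up for illustration,
    and the choice of clogP as the descriptor and pIC50 as the activity is an
    assumption, not something taken from the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("descriptor values are all identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the activity of a new compound from its descriptor value."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data: one descriptor value and one activity per compound.
CLOGP = [1.0, 2.0, 3.0, 4.0]
PIC50 = [5.1, 5.9, 7.2, 7.8]

model = fit_line(CLOGP, PIC50)
print("slope, intercept:", model)
print("predicted activity at clogP 2.5:", predict(model, 2.5))
```

    Real QSAR models use many descriptors and must be validated on compounds
    that were not used for fitting, and their predictions are only trustworthy
    inside the model's applicability domain.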

    MORE ABOUT CHEMOINFORMATICS

    Implementing, handling and searching chemical databases is a crucial aspect of

    chemoinformatics. Chemical database techniques and data mining methods will

    improve as this field evolves, in part through the implementation of new data

    structures. Methods for full-text data mining are likely to become very

    powerful in the years to come, and will presumably play a highly important role in

    the general area of chemoinformatics.

    Figure 1 shows the number of references to the words bioinformatics,

    chemoinformatics, chemogenomics and metabonomics in PubMed from 1992

    to 2004. It is seen that the present trend for chemoinformatics resembles the

    trend in bioinformatics five to ten years ago. It should be mentioned that this

    graph is based on one database only, PubMed, and is intended to give an idea

    about the development in publishing frequency in these areas, and not as a

    complete overview.

    RECENT ADVANCES IN THE AREA OF DRUG DESIGNING WITH

    THE HELP OF CHEMOINFORMATICS

    It is clear that the drug discovery and optimization process is

    undergoing very significant changes. Many more hits are found than

    previously, especially due to the advances gained in combinatorial

    chemistry and high throughput screening (HTS). The approach used in

    drug discovery has been linear with respect to various relevant

    properties, but more parallel approaches are evolving, where not only

    the potency (activity) and selectivity of the lead is examined at an early

    stage, but also other key properties. Many of the compounds drawn

    out of combinatorial libraries may look promising at first, but they fail

    at later stages in the drug discovery process due to undesired

    properties. A compound can, for example, be feasible based on

    molecular structure, but due to aggregation, limited solubility or limited

    uptake in the human organism it is not useful as a drug. Many

    pharmaceutical companies might even be repeating the same mistakes,

    due to these problems. Methods for assessing these properties at a

    very early stage, both experimentally and computationally, are thus

    highly desirable. This is expected to lower the cost of drug discovery

    and optimization significantly, and hopefully provide an increased

    number of useful leads. In cases where a lead has sufficiently high

    activity, but various properties need to be improved, chemoinformatics

    methods could be used to modify substructures within the lead space

    with minimal effect on the activity profile.

    Likewise, various computational methods are evolving rapidly at present.

    Computational techniques used to search through chemical libraries and

    databases, so-called virtual screening methods, have become increasingly popular

    in drug discovery. A whole range of computational techniques are used for

    searching for molecular similarities and dissimilarities, for extracting information

    about pharmacophores (structural models of targets or binding sites) from

    compound libraries, for prediction of properties, for studying molecular

    interactions at the atomic level, among other things. Chemoinformatics is strongly

    linked to computational chemistry and molecular modeling. Molecular modeling

    methods are particularly useful for conducting conformational analysis of

    molecules, and for assessing the strength of intermolecular interactions.

    Newly established fields like chemogenomics (or chemical genomics),

    metabonomics and metabolomics also play increasingly important roles in

    modern drug discovery and development. Chemogenomics (Browne et al., 2002)

    deals with interactions between chemical compounds and living systems in terms

    of induced genomic response. In metabonomics (Nicholson and Wilson, 2003)

    relatively low-molecular weight materials produced during genomic expression

    within a cell are studied, normally by use of 1H-NMR spectroscopy and

    multivariate data analysis (chemometrics) (Geladi and Kowalski, 1986a,b). It has

    been shown to be a useful tool for understanding drug efficacy and toxicity.

    Metabolomics is similar to metabonomics, but where metabonomics deals with

    integrated, multicellular, biological systems, metabolomics deals with simple cell

    systems.

    CONCLUSION

    Chemoinformatics is a rapidly growing field, with a huge application potential.

    Chemoinformatics concerns the gathering and systematic use of chemical

    information, and the use of those data to predict the behavior of unknown

    compounds in silico.