Drug Discovery Today: Technologies Vol. 1, No. 3 2004

Editors-in-Chief

Kelvin Lam – Pfizer, Inc., USA

Henk Timmerman – Vrije Universiteit, The Netherlands

Lead optimization

Pharmacophore definition and 3D searches

T. Langer1,*, G. Wolber2

1Computer Aided Molecular Design Group, Institute of Pharmacy, University of Innsbruck, Innrain 52, A-6020 Innsbruck, Austria

2Inte:Ligand GmbH, Clemens Maria Hofbauer-G. 6, A-2344 Maria Enzersdorf, Austria

The most common pharmacophore building concepts based on either 3D structure of the target or ligand information are discussed together with the application of such models as queries for 3D database search. An overview of the key techniques available on the market is given and differences with respect to algorithms used and performance obtained are highlighted. Pharmacophore modelling and 3D database search are shown to be successful tools for enriching screening experiments aimed at the discovery of novel bio-active compounds.

*Corresponding author: (T. Langer) [email protected]. URL: http://pharmazie.uibk.ac.at/CAMD

1740-6749/$ © 2004 Elsevier Ltd. All rights reserved. DOI: 10.1016/j.ddtec.2004.11.015

Section Editor: Hugo Kubinyi – University of Heidelberg, Germany

Pharmacophore models are hypotheses on the 3D arrangement of structural properties, such as hydrogen bond donor and acceptor properties, hydrophobic groups and aromatic rings of compounds that bind to a biological target. In the presence of the 3D structure of this target or by comparison with inactive analogs, further geometric and/or steric constraints can be defined. The article describes and evaluates strategies and commercial software for pharmacophore definition, starting from the 3D structures of ligand-protein complexes or from ligands alone. Once a pharmacophore model is established, 3D searches in large databases can be performed, leading to a significant enrichment of active analogs.

Introduction

The key goal of computer-aided molecular design methods in modern medicinal chemistry is to reduce the overall cost associated with the discovery and development of a new drug by identifying the most promising candidates on which to focus experimental efforts. Often, drug discovery projects have already reached a well-advanced stage before detailed structural data on the target become available. Experimental screening for lead structure determination is limited both by the number of compounds that can be submitted to a high-throughput bio-assay and by the low hit rate obtained, which is in the range of 0.1% [1]. Within this context, the pharmacophore approach has proven to be successful, allowing (i) the perception and understanding of key interactions between a target and a ligand and (ii) the enrichment of hit rates in experimental screening of subsets that have been selected by in silico screening experiments (Fig. 1) [2].
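To make the notion of enrichment concrete, the following minimal sketch computes a hit rate and an enrichment factor. It assumes the common definition of enrichment as the hit rate in the selected subset divided by the hit rate in the whole collection. The function names and the example counts are illustrative assumptions and not data from a real screen; only the 0.1% baseline is taken from the text.

```python
def hit_rate(n_hits: int, n_screened: int) -> float:
    """Fraction of screened compounds that turned out to be active."""
    if n_screened <= 0:
        raise ValueError("n_screened must be positive")
    return n_hits / n_screened


def enrichment_factor(hits_subset: int, n_subset: int,
                      hits_total: int, n_total: int) -> float:
    """Hit rate of the selected subset divided by the hit rate of the
    whole collection, written as one fraction to avoid rounding noise."""
    if n_subset <= 0 or hits_total <= 0:
        raise ValueError("n_subset and hits_total must be positive")
    return (hits_subset * n_total) / (n_subset * hits_total)


# Hypothetical numbers: a random screen of 100,000 compounds finds 100 actives
# (the 0.1% baseline), while a pharmacophore-selected subset of 1,000
# compounds contains 20 actives.
baseline = hit_rate(100, 100_000)
ef = enrichment_factor(20, 1_000, 100, 100_000)
print(f"baseline hit rate: {baseline:.3%}, enrichment factor: {ef:.1f}")
```

An enrichment factor above 1 means the in silico filter selected actives more often than random picking would.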

Key technologies: structure-based pharmacophores

A pharmacophore (pharmacophore model, pharmacophoric pattern) can be considered as the ensemble of steric and electrostatic features of different compounds which are necessary to ensure optimal supramolecular interactions with a specific biological target structure and to trigger or to block its biological response [3]. Feature-based pharmacophores have turned out to be the most effective type of pharmacophore model, and the utility of such models as queries for 3D database search has been reviewed recently [4,5]. The strength of this type of pharmacophore model is the general definition of the pharmacophoric points. The chemical function character allows searching for very diverse structural scaffolds because multiple structural elements can express the same chemical function. Pharmacophore key elements might be a group of atoms, a part of the volume of the molecule, or ‘classical’ pharmacophoric features like H-bond acceptors (HBA) and donors (HBD), charged or ionizable groups, hydrophobic groups (HY) and/or aromatic rings (RA), together with geometrical constraints like distances, angles, and dihedral angles. The set of these features is termed a pharmacophoric ‘model’ or ‘hypothesis’.

Figure 1. Typical pharmacophore-based virtual screening workflow.

There are different possibilities to derive pharmacophore models; which one applies depends mainly on the availability of the three-dimensional structure of the binding site of the target. When the 3D structure of the target has been characterized, and when a certain number of ligands (with or without associated binding affinity) are available, pharmacophore models can be generated directly from the complex structure of the ligand and the target. The LigandScout program [6], available from Inte:Ligand GmbH (http://www.inteligand.com/), is one possibility for automatically deriving a feature-based pharmacophore model from a ligand–target complex structure. In this program, the first step is the assignment of ligand information on hybridization status and bond characteristics, which is not present in the input data files from the Protein Data Bank [7], by using an extended heuristic approach together with template-based numeric analysis. Feature-based pharmacophores are then generated by determining interactions between ligand and target atoms on the basis of H-bond formation, charge and hydrophobic contact. These models can then be refined according to binding data, or several models can be combined into one common feature pharmacophore. The capability of searching 3D databases will be implemented shortly.

If only 3D information on the binding site is available, without an interacting ligand, another approach to derive a pharmacophore model can be undertaken: the structure-based focusing (SBF) technique within the Cerius2 software package [8], available from Accelrys Inc (http://www.accelrys.com/), allows the construction of binding-site pharmacophore hypotheses. The procedure is mainly based on (i) calculation of interaction sites using the algorithms defined in the LUDI program [9], (ii) clustering of the vectors for H-bond donating and accepting groups and of the hydrophobic regions, and (iii) transformation of the obtained clusters into a feature-based pharmacophore hypothesis representing the HBA, HBD, and HY functions.

The Unity program [10], available from Tripos Inc (http://www.tripos.com/), also allows the construction of structural pharmacophore queries based on molecules, molecular fragments, or receptor sites. In addition to atoms and bonds, 3D queries can include features such as lines, planes, centroids, extension points, hydrogen bond sites, and hydrophobic sites. Distance, angle, excluded volume, surface volume, and spatial constraints define the geometric relationships between features.

In the Molecular Operating Environment MOE (Chemical Computing Group, http://www.chemcomp.com/) [11], 3D pharmacophore queries can contain locations of features or chemical groups as well as restrictions on shape. Restrictions on shape can be imposed by specifying the included and/or excluded volume areas. In MOE, the position and the shape of the volume are defined by a single sphere or by the union of several spheres.

Additionally, a consensus query derived not from one molecule but from a set of aligned molecules can be used for the 3D pharmacophore database search, which provides a high degree of control, offering both partial and systematic matching as well as flexible matching rules.
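The query elements described in this section (typed features with a location, and volume restrictions built from a single sphere or a union of spheres) can be illustrated with a small data structure. This is a minimal sketch under stated assumptions and not the data model of Unity, MOE, or any other package: the class names, the use of a tolerance sphere around each feature, and the requirement that the ligand is already aligned to the query frame are all hypothetical simplifications.

```python
from dataclasses import dataclass, field
from math import dist

Point = tuple[float, float, float]


@dataclass
class Sphere:
    center: Point
    radius: float

    def contains(self, p: Point) -> bool:
        return dist(self.center, p) <= self.radius


@dataclass
class Feature:
    kind: str        # e.g. "HBA", "HBD", "HY", "RA"
    region: Sphere   # tolerance sphere around the ideal feature position


@dataclass
class Query:
    features: list[Feature]
    # Excluded volume modelled as a union of spheres.
    excluded: list[Sphere] = field(default_factory=list)

    def matches(self, ligand_features: list[tuple[str, Point]],
                ligand_atoms: list[Point]) -> bool:
        """Check a ligand conformer that is ALREADY aligned to the query."""
        # Each query feature needs a ligand feature of the same kind
        # inside its tolerance sphere.
        for f in self.features:
            if not any(kind == f.kind and f.region.contains(pos)
                       for kind, pos in ligand_features):
                return False
        # No ligand atom may lie inside any sphere of the excluded volume.
        return not any(s.contains(a) for s in self.excluded for a in ligand_atoms)


query = Query(
    features=[Feature("HBA", Sphere((0.0, 0.0, 0.0), 1.0)),
              Feature("HY", Sphere((4.0, 0.0, 0.0), 1.5))],
    excluded=[Sphere((2.0, 3.0, 0.0), 1.0), Sphere((2.0, -3.0, 0.0), 1.0)],
)
feats = [("HBA", (0.3, 0.2, 0.0)), ("HY", (4.5, 0.5, 0.0))]
ok_atoms = [(0.3, 0.2, 0.0), (2.0, 0.0, 0.0), (4.5, 0.5, 0.0)]
clash_atoms = ok_atoms + [(2.0, 2.5, 0.0)]   # pokes into the excluded volume
print(query.matches(feats, ok_atoms), query.matches(feats, clash_atoms))
```

Real search engines also have to find the alignment and handle partial matching. The sketch only shows how feature locations and a sphere-union shape restriction combine into one accept/reject decision.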

Key technologies: ligand-based pharmacophores

If only ligand information is available, the identification of a pharmacophore, in principle, involves two steps: (i) the analysis of the training set molecules themselves to identify pharmacophoric features, and (ii) the alignment of the assumed bio-active conformations of the molecules to determine the best overlay of corresponding features. Conformational flexibility represents one of the main difficulties in pharmacophore generation, because the bio-active conformations of the molecules are usually not known. Several programs are available for building pharmacophores from ligand information: Catalyst [12], available from Accelrys Inc, is by far the most widely used one, because it offers great flexibility during pharmacophore generation together with integrated high-speed 3D database searching capability. Other successful programs are DiscoTech [13] and Gasp [14], both from Tripos Inc. The main differences between the programs lie in the algorithms used for the alignment, in the way in which conformational flexibility is handled, and in how the 3D database search is performed.

In Catalyst, conformational flexibility is handled by computing a series of low-energy conformers for each molecule using a randomized search algorithm together with a poling function, allowing an extensive coverage of the conformational space. Two major automatic modes for pharmacophore model generation are implemented: HypoGen, the algorithm for quantitative models, and HipHop, the builder for purely qualitative, that is, common feature models. In the first step, Catalyst checks the surface accessibility of the molecules for receptor interaction and then defines the position of different features by comparison of absolute coordinates of all conformations stored for the training set molecules rather than by inter-feature distances. Model building starts with examination of the two most active molecules in the training set, and all possible pharmacophore hypotheses based on the features available in both of these molecules are enumerated. Subsequent steps reduce the number of hypotheses to be considered by omitting those models that cannot explain the actual bioactivity data by geometric fitting of the molecular structures to the chemical features. In quantitative models, each chemical function includes a weight descriptor that is related to its relative importance in conferring the activity. Catalyst constructs multiple hypotheses that can explain and validate the structure/activity data in a chemically reasonable fashion.

Catalyst also provides the ability to cluster and merge hypotheses to develop more comprehensive models and can process up to 255 conformations per compound.

In Disco [15], which is the basis for the commercial product DiscoTech [13], each molecule is characterized by ligand points and site points. The ligand points include atoms with hydrogen bond donor, hydrogen bond acceptor, or hydrophobic character, or with negative or positive charge. Site points represent the hypothetical position of complementary atoms in the binding site and are determined from the position of heavy atoms in the ligand structure. Conformational flexibility in this case is handled by precomputing a series of low-energy conformers for each molecule, with each conformer being treated as a rigid body during the alignment step. A conformer is represented by the interpoint distances calculated for the ligand and site points, and a clique detection algorithm is used to align structures based on these distances. In Disco, the molecule with the fewest conformations, following the active analogue approach paradigm [16], is used as a reference molecule. The output from a Disco run is a ranked list of all possible pharmacophore mappings where each feature of a pharmacophore must be present in all the molecules. This requirement might result in good pharmacophores being missed; hence, Disco has the option of finding solutions where some molecules are excluded from the model.

Gasp [14] is based on a genetic algorithm (GA) and differs from both Catalyst and Disco in its handling of the conformational problem: each molecule is input as a single conformation, and conformational analysis together with random rotations and a random translation are applied on-the-fly before any superimposition is made. The pharmacophoric features (hydrogen bond donor protons, acceptor lone-pairs, and ring centers, including projected site points but no charges) are determined in all compounds, and the molecule with the smallest number of features is chosen as the base molecule to which the other molecules are fitted.

Within the GA, the chromosomes encode the angles of rota-

tion of the rotatable bonds in all of the molecules and the

mapping of the pharmacophoric features in the base mole-

cule to corresponding features in each of the other molecules.
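The chromosome layout just described can be sketched as a small data structure: one block of torsion angles per molecule, plus one feature mapping per non-base molecule. This is an illustrative sketch only and not Gasp's actual implementation. The class and function names are hypothetical, feature types are ignored when mappings are drawn, and a mapping may assign two base features to the same target feature, which a real implementation would have to prevent.

```python
import random
from dataclasses import dataclass


@dataclass
class Molecule:
    name: str
    n_rotatable_bonds: int
    n_features: int


@dataclass
class Chromosome:
    # molecule name -> one rotation angle (degrees) per rotatable bond
    torsions: dict[str, list[float]]
    # non-base molecule name -> for each base-molecule feature,
    # the index of the feature it is mapped to in that molecule
    mappings: dict[str, list[int]]


def random_chromosome(molecules: list[Molecule], rng: random.Random):
    """Pick the base molecule (fewest features) and draw a random chromosome."""
    base = min(molecules, key=lambda m: m.n_features)
    torsions = {m.name: [rng.uniform(0.0, 360.0) for _ in range(m.n_rotatable_bonds)]
                for m in molecules}
    mappings = {m.name: [rng.randrange(m.n_features) for _ in range(base.n_features)]
                for m in molecules if m is not base}
    return base, Chromosome(torsions, mappings)


def mutate_torsion(chrom: Chromosome, rng: random.Random) -> Chromosome:
    """Return a copy in which one randomly chosen torsion angle is redrawn."""
    names = [n for n, angles in chrom.torsions.items() if angles]
    if not names:
        return chrom
    name = rng.choice(names)
    angles = list(chrom.torsions[name])
    angles[rng.randrange(len(angles))] = rng.uniform(0.0, 360.0)
    return Chromosome({**chrom.torsions, name: angles}, chrom.mappings)


rng = random.Random(0)
mols = [Molecule("a", 3, 4), Molecule("b", 5, 3), Molecule("c", 0, 5)]
base, chrom = random_chromosome(mols, rng)
child = mutate_torsion(chrom, rng)
print(base.name, {k: len(v) for k, v in chrom.torsions.items()}, chrom.mappings)
```

Keeping torsions and mappings in one object lets a single genetic operator change either the conformation or the feature correspondence, which is the coupling the text describes.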

The fitness function first generates conformations for each

molecule and then uses a least-squares procedure to overlay

each molecule onto the base molecule using the mappings.

Fitness is calculated as a combination of the similarity and the number of the overlaid features, together with the volume integral of the overlay. Genetic operators attempt to generate solutions that maximise the fitness function and thus correspond to the best possible structural overlay. Gasp's biggest strength over Disco and Catalyst is that it considers steric overlap of the ligands during pharmacophore model generation, whereas the latter two only attempt to match pharmacophore features without taking shape into account.
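The distance-based clique detection mentioned for Disco can be illustrated as follows. The sketch assumes the textbook formulation: build a correspondence graph whose nodes pair up points of the same type from two conformers, connect two nodes when the corresponding interpoint distances agree within a tolerance, and take the largest clique as the common pattern. It uses a plain Bron–Kerbosch search and hypothetical function names; it is not Disco's actual code, and because it compares distances only, it cannot distinguish mirror images.

```python
from itertools import combinations
from math import dist

TypedPoint = tuple[str, tuple[float, float, float]]


def correspondence_graph(points_a: list[TypedPoint], points_b: list[TypedPoint],
                         tol: float) -> dict:
    """Nodes are index pairs (i, j) of same-type points; edges join pairs
    whose interpoint distances in A and B differ by at most tol."""
    nodes = [(i, j) for i, (ka, _) in enumerate(points_a)
             for j, (kb, _) in enumerate(points_b) if ka == kb]
    adj = {n: set() for n in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # a point may be used only once in a mapping
        d_a = dist(points_a[i][1], points_a[k][1])
        d_b = dist(points_b[j][1], points_b[l][1])
        if abs(d_a - d_b) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def max_clique(adj: dict) -> list:
    """Largest clique found by Bron-Kerbosch (no pivoting, fine for small graphs)."""
    best: list = []

    def expand(r: list, p: set, x: set) -> None:
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = r
            return
        for v in list(p):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return sorted(best)


# Conformer B contains the same HBD/HBA/HY triangle as A (moved and rotated)
# plus one extra acceptor far away.
conf_a = [("HBD", (0.0, 0.0, 0.0)), ("HBA", (3.0, 0.0, 0.0)), ("HY", (0.0, 4.0, 0.0))]
conf_b = [("HBD", (10.0, 10.0, 0.0)), ("HBA", (10.0, 13.0, 0.0)),
          ("HY", (6.0, 10.0, 0.0)), ("HBA", (20.0, 20.0, 0.0))]
mapping = max_clique(correspondence_graph(conf_a, conf_b, tol=0.5))
print(mapping)
```

The resulting index pairs state which point of the first conformer corresponds to which point of the second. Because only internal distances are compared, no prior superposition is needed.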


Figure 2. 3D Database search strategies.

In a recent paper, results obtained with Catalyst/HipHop, Gasp, and Disco have been compared and discussed in detail [17], indicating that Catalyst and Gasp clearly outperform Disco at reproducing the five target pharmacophores described in this study. Catalyst and Gasp were found to provide almost equivalent performance, even though the results were not consistent for all the data sets. A very notable result is that, for both programs, the target pharmacophores were found within the first 10 solutions in four out of five data sets. Gasp was found to be inherently simpler than Catalyst, whereas the latter provides much more flexibility in setting and tuning parameters. The biggest advantage of Catalyst over Gasp is that the pharmacophoric features can be customized according to the requirements of the training set under investigation.

3D database searching

After a pharmacophore model has been generated, there are two ways to identify new molecules which share its features and can thus exhibit a desired biological response. The first is de novo design. This approach seeks to link the parts of the pharmacophore together with fragments to generate molecular structures that are chemically reasonable and novel. The second is 3D database pharmacophore searching, whose main advantage over de novo design is that it identifies molecules which can be obtained from corporate compound libraries or can be synthesized using a well-established protocol. In the ideal case, 3D database search is able to identify compounds exhibiting properties outside those of the set of compounds used for building the pharmacophore, allowing the identification of novel chemical structures and molecular features (termed scaffold hopping or lead hopping, respectively).

Technically, there are two possibilities to search 3D molecular databases with pharmacophore models (Fig. 2): firstly, using a database file format containing a set of pre-computed conformations, thus speeding up the search procedure; secondly, calculating conformers on-the-fly and performing the fitting analysis subsequently. The latter approach has the advantage that mass storage capacity is not relevant, which has long been an issue when using multiconformer databases. By contrast, using pre-computed conformations for pharmacophore fitting has been demonstrated to outperform the on-the-fly calculation approach. In Catalyst, both methods are possible: Catalyst databases are normally stored in multiconformational data format, but additional on-the-fly conformational tuning is permitted while fitting molecules to a pharmacophore model. This allows the searching of large databases, containing up to several millions of compounds, within a time frame of a few minutes.
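The first search strategy, screening a database of pre-computed conformers, can be sketched in a few lines. The example below is a deliberately simple illustration and not the search engine of Catalyst or any other product. It assumes that the query is a list of feature types with target distances between them, that each stored conformer is a list of typed feature positions, and that a compound is a hit as soon as one conformer satisfies all distance constraints within a tolerance. All names and coordinates are hypothetical.

```python
from itertools import permutations
from math import dist

TypedPoint = tuple[str, tuple[float, float, float]]


def conformer_matches(conformer: list[TypedPoint], query_kinds: list[str],
                      query_dists: dict[tuple[int, int], float], tol: float) -> bool:
    """True if some assignment of conformer features to query features has
    the right types and reproduces every query distance within tol."""
    for combo in permutations(range(len(conformer)), len(query_kinds)):
        if any(conformer[c][0] != kind for c, kind in zip(combo, query_kinds)):
            continue
        if all(abs(dist(conformer[combo[a]][1], conformer[combo[b]][1]) - d) <= tol
               for (a, b), d in query_dists.items()):
            return True
    return False


def search(database: dict[str, list[list[TypedPoint]]], query_kinds: list[str],
           query_dists: dict[tuple[int, int], float], tol: float = 0.5) -> list[str]:
    """Return the names of compounds with at least one matching stored conformer."""
    return [name for name, conformers in database.items()
            if any(conformer_matches(c, query_kinds, query_dists, tol)
                   for c in conformers)]


# Three-point query: donor, acceptor and hydrophobe forming a 3-4-5 triangle.
kinds = ["HBD", "HBA", "HY"]
dists = {(0, 1): 3.0, (0, 2): 4.0, (1, 2): 5.0}

database = {
    "cmpd_A": [  # folded conformer fails, extended conformer fits
        [("HBD", (0.0, 0.0, 0.0)), ("HBA", (1.5, 0.0, 0.0)), ("HY", (0.0, 1.5, 0.0))],
        [("HBD", (0.0, 0.0, 0.0)), ("HBA", (3.1, 0.0, 0.0)), ("HY", (0.0, 3.9, 0.0))],
    ],
    "cmpd_B": [
        [("HBD", (0.0, 0.0, 0.0)), ("HBA", (6.0, 0.0, 0.0)), ("HY", (0.0, 8.0, 0.0))],
    ],
}
hits = search(database, kinds, dists)
print(hits)
```

In the on-the-fly strategy, the stored conformer lists would be replaced by a call to a conformer generator for each compound. That trades storage for computation time, which is the trade-off discussed above.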

Strategy comparison

The key players on the market for pharmacophore-based 3D database search are Accelrys, Tripos, and the Chemical Computing Group, and their software solutions have been discussed in the previous sections. Additional programs are available on the market, including C@rol [18], available from Molecular Networks GmbH (http://www.molnet.de/), Feature Trees [19], available from BioSolveIT GmbH (http://www.biosolveit.de/), and several academic prototypes described in a recent literature review [20]. All commercial packages allow, more or less, efficient pharmacophore construction and 3D database search. The MOE system (Chemical Computing Group) is a highly integrated yet easily customizable molecular modelling environment, in which the pharmacophore approach is well embedded. The corresponding Tripos product, Sybyl, contains the modules Gasp and Disco for pharmacophore building; for 3D database search, the integrated system Unity is to be used. Both environments are also well integrated, and the Sybyl Programming Language (SPL) enables users to automate many procedures, including analysis of pharmacophores, hit lists, etc. The highest performance concerning 3D database search speed and pharmacophore model customization is offered by Accelrys' products Catalyst and Cerius2; however, the cumbersome graphical interface of the former has often been criticized [17]. Also, the integration between the Accelrys products is much lower than that offered by products distributed by their competitors.

The choice of software for a pharmacophore generation and 3D database search job might depend more on the preference of the user than on hard facts based on the possibilities offered by the different packages. In certain well-defined areas, the programs of the small software companies will offer better solutions than those of the key players. The success of such spin-off programs will probably depend strongly on their capability of being integrated into an existing workflow within the drug discovery and development process.

Conclusions

The pharmacophore concept has proven to be extremely successful, not only in rationalizing structure-activity relationships, but also through its large impact on the development of appropriate 3D tools for efficient virtual screening. Profiling of combinatorial libraries and compound classification are other often-used applications of this concept.

The prior use of pharmacophore models in biological screening of compounds is an efficient procedure, because it quickly eliminates molecules that do not possess the required features, thus leading to a dramatic increase in enrichment when compared to a purely random screening experiment. One should not forget, however, that additional molecular characteristics not reflected by pharmacophore models (physico-chemical, ADME and toxicological properties) must be taken into account when deciding which compounds should be further developed.

References

1 Oprea, T.I. (2002) Current trends in lead discovery: are we looking for the appropriate properties? J. Comput. Aided Mol. Des. 16, 325–334

2 Hoffmann, R.D. et al. (2004) Use of 3D pharmacophore searching. In Computational Medicinal Chemistry and Drug Discovery (Tollenaere, J., De Winter, H., Langenaeker, W., Bultinck, P. eds), pp. 461–482, Dekker Inc

3 Wermuth, C.-G. and Langer, T. (1993) Pharmacophore identification. In 3D-QSAR in Drug Design. Theory, Methods, and Applications (Kubinyi, H., ed.), pp. 117–136, ESCOM Science Publishers

4 Kurogi, Y. and Guner, O.F. (2001) Pharmacophore modeling and three-dimensional database searching for drug design using Catalyst. Curr. Med. Chem. 8, 1035–1055

5 Langer, T. and Krovat, E-M. (2003) Chemical feature-based pharmacophores and virtual library screening for discovery of new leads. Curr. Opin. Drug Discov. Dev. 6, 370–376

6 Wolber, G. and Langer, T. (2004) LigandScout: 3D pharmacophores derived from protein-bound ligands and their use as virtual screening filters. J. Chem. Inf. Comput. Sci. Webrelease 24 Nov. 2004, doi:10.1021/ci049885e

7 Berman, H. et al. (2000) The protein data bank. Nucleic Acids Res. 28, 235–242

8 Cerius2 available from Accelrys Inc, San Diego, CA, USA

9 Bohm, H-J. (1992) The computer program LUDI: a new method for the de novo design of enzyme inhibitors. J. Comput. Aided Mol. Des. 6, 61–78

10 Unity/Sybyl available from Tripos Inc., St. Louis, MO, USA

11 MOE available from Chemical Computing Group Inc., Quebec, Canada

12 Catalyst available from Accelrys Inc, San Diego, CA, USA

13 DiscoTech available from Tripos Inc., St. Louis, MO, USA

14 Gasp available from Tripos Inc., St. Louis, MO, USA

15 Martin, Y.C. et al. (1993) A fast new approach to pharmacophore mapping and its application to dopaminergic and benzodiazepine agonists. J. Comput. Aided Mol. Des. 7, 83–102

16 Marshall, G.R. et al. (1979) The conformational parameter in drug design: the active analogue approach. In Computer-Assisted Drug Design, (Vol. 112) (Olson, E.C., Christoffersen, R.E. eds), pp. 205–226, American Chemical Society

17 Patel, Y. et al. (2002) A comparison of the pharmacophore identification programs: Catalyst, DISCO and GASP. J. Comput. Aided Mol. Des. 16, 653–681

18 C@rol available from Molecular Networks GmbH, Erlangen, Germany

19 Feature Trees available from BioSolveIT GmbH, Sankt Augustin, Germany

20 Van Drie, J.H. (2003) Pharmacophore discovery – lessons learned. Curr. Pharm. Des. 9, 1649–1664

