Advanced Bioinformatics Lecture 7: Computer-aided lead identification
ZHU FENG [email protected]
![Page 1: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/1.jpg)
Advanced Bioinformatics
Lecture 7: Computer-aided lead identification
http://idrb.cqu.edu.cn/
Innovative Drug Research Centre in CQU
Innovative Drug Research and Bioinformatics Laboratory (创新药物研究与生物信息学实验室)
![Page 2: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/2.jpg)
Table of Contents
1. Schematic of DOCKing
2. Pharmacophore-based docking
3. INVDOCK Strategy
4. Ligand-based drug design
5. Classification of drugs by SVM
![Page 3: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/3.jpg)
What is docking?
Given two molecules, find their correct association.
Receptor + Ligand = Complex
Docking computationally predicts the structure of a protein-ligand complex by sampling ligand conformations and orientations. The orientation that maximizes the interaction is taken as the most likely structure of the complex.
![Page 4: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/4.jpg)
General protein–ligand binding
Ligand
− Molecule that binds with a protein
Protein active site(s)
− Allosteric binding
− Competitive binding
Function of binding interaction
− Natural and artificial
![Page 5: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/5.jpg)
Docking strategy
PDB file
Surface Representation
Patch Detection
Matching Patches
Scoring & Filtering
Candidate complexes
![Page 6: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/6.jpg)
Schematic of docking methodology
(A) The target binding site is filled with site points.
(B) Distances between atoms in a molecule are matched to those between site points.
(C) A transformation matrix is calculated for an orientation.
(D) The molecule is docked into the binding site, and the fit of that conformer is scored.
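Step (B) can be illustrated with a small sketch. The function below is a hypothetical brute-force matcher, not the DOCK implementation: it looks for triplets of ligand atoms whose three internal distances agree with those of a triplet of site points, within an assumed tolerance `tol`. The names `match_triplets`, `atoms`, `site_points` and the toy coordinates are illustrative assumptions.

```python
from itertools import combinations, permutations
from math import dist


def match_triplets(atoms, site_points, tol=0.5):
    """Return (atom_indices, site_indices) pairs with matching internal distances.

    atoms, site_points: lists of (x, y, z) coordinates in angstroms.
    tol: assumed distance tolerance in angstroms (hypothetical default).
    """
    matches = []
    for a_idx in combinations(range(len(atoms)), 3):
        for s_idx in permutations(range(len(site_points)), 3):
            # Compare the three pairwise distances of both triangles.
            agree = all(
                abs(
                    dist(atoms[a_idx[i]], atoms[a_idx[j]])
                    - dist(site_points[s_idx[i]], site_points[s_idx[j]])
                )
                <= tol
                for i, j in ((0, 1), (0, 2), (1, 2))
            )
            if agree:
                matches.append((a_idx, s_idx))
    return matches


# Toy data: the ligand triangle is the site-point triangle shifted by 10 angstroms in x.
site_points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
atoms = [(10.0, 0.0, 0.0), (13.0, 0.0, 0.0), (10.0, 4.0, 0.0)]
matches = match_triplets(atoms, site_points, tol=0.1)
```

Each match supplies the atom to site-point correspondences from which an orientation, as in step (C), could then be computed.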
![Page 7: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/7.jpg)
Design of HIV-1 protease inhibitor
Step 1: creation of spheres to fit a cavity
![Page 8: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/8.jpg)
Design of HIV-1 protease inhibitor
Step 2: place a ligand to match the position of spheres
![Page 9: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/9.jpg)
Design of HIV-1 protease inhibitor
Step 3: check chemical complementarity
![Page 10: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/10.jpg)
Scoring in ligand-protein docking
Potential energy description
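The slide's energy expression is not preserved in this transcript. As a hedged illustration only, the sketch below assumes a common force-field style form: a Lennard-Jones 12-6 term plus a Coulomb term summed over ligand-receptor atom pairs. The function names, the single shared `epsilon` and `sigma`, and the atom tuples are hypothetical simplifications, not the lecture's scoring function.

```python
from math import dist

# Coulomb constant in kcal*angstrom/(mol*e^2), for charges in units of e.
COULOMB_CONSTANT = 332.06


def pair_energy(r, epsilon, sigma, q1, q2):
    """Lennard-Jones 12-6 plus Coulomb energy (kcal/mol) for one atom pair."""
    lj = 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
    coulomb = COULOMB_CONSTANT * q1 * q2 / r
    return lj + coulomb


def interaction_energy(ligand, receptor, epsilon=0.1, sigma=3.5):
    """Sum pair energies; atoms are (x, y, z, charge) tuples.

    epsilon and sigma are assumed, identical for all pairs (a simplification).
    """
    total = 0.0
    for lx, ly, lz, lq in ligand:
        for rx, ry, rz, rq in receptor:
            r = dist((lx, ly, lz), (rx, ry, rz))
            total += pair_energy(r, epsilon, sigma, lq, rq)
    return total


# The Lennard-Jones minimum sits at r = 2^(1/6) * sigma with depth -epsilon.
r_min = 2 ** (1 / 6) * 3.5
```

A lower (more negative) energy corresponds to a more favorable pose under this assumed form.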
![Page 11: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/11.jpg)
Surface representation
A surface representation efficiently represents the docking surface and identifies the regions of interest.
Some techniques
− Connolly surface
− Lenhoff technique, etc.
Figure panels: dense MS surface (Connolly); sparse surface (Shuo Lin et al.)
![Page 12: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/12.jpg)
Connolly surface
Each atomic sphere is given the van der Waals radius of the atom.
Rolling a probe sphere over the van der Waals surface leads to the solvent reentrant surface, or Connolly surface.
![Page 13: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/13.jpg)
Lenhoff technique
Computes a “complementary” surface for the receptor instead of the Connolly surface, i.e. computes possible positions for the atom centers of the ligand.
Figure labels: atom centers of the ligand; van der Waals surface
![Page 14: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/14.jpg)
Pharmacophore-based docking
Basic idea
Appropriate spatial disposition of a small number of functional groups in a molecule is sufficient for achieving a desired biological effect.
The ensemble formation will be guided by these functional groups.
![Page 15: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/15.jpg)
3-D representation of a protein binding site
Distances between binding groups (in angstroms) and the type of interaction are searchable.
Figure distance labels (Å): 5.2, 4.2-4.7, 6.7, 4.8, 5.1-7.1
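A distance-based search of this kind can be sketched as a range check. The example is a minimal illustration: the feature types (`donor`, `acceptor`, `aromatic`) and their pairing with the 4.2-4.7 Å and 5.1-7.1 Å ranges are assumptions, since the transcript does not say which binding groups each distance connects.

```python
from math import dist

# Hypothetical query: allowed distance range (angstroms) for each feature pair.
QUERY = {
    ("donor", "acceptor"): (4.2, 4.7),
    ("donor", "aromatic"): (5.1, 7.1),
}


def matches_query(features, query=QUERY):
    """Check one conformer against the query.

    features: dict mapping a feature type to its (x, y, z) position.
    Returns True only if every queried pair is present and within range.
    """
    for (first, second), (low, high) in query.items():
        if first not in features or second not in features:
            return False
        if not low <= dist(features[first], features[second]) <= high:
            return False
    return True


# Toy conformers (coordinates are made up for illustration).
good = {"donor": (0.0, 0.0, 0.0), "acceptor": (4.5, 0.0, 0.0), "aromatic": (0.0, 6.0, 0.0)}
bad = {"donor": (0.0, 0.0, 0.0), "acceptor": (3.0, 0.0, 0.0), "aromatic": (0.0, 6.0, 0.0)}
```

A real search would also compare the type of interaction and allow several groups of the same type; this sketch keeps one position per type for brevity.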
![Page 16: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/16.jpg)
Pharmacophore Fingerprint
Appropriate spatial disposition of a small number of functional groups in a molecule is sufficient for achieving a desired biological effect.
The ensemble formation will be guided by these functional groups.
![Page 17: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/17.jpg)
Schematic of PhDOCK methodology
Figure panels: DOCK; PhDOCK
![Page 18: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/18.jpg)
Advantages and disadvantages of PhDOCK
Advantages:
(1) Speed increase due to rapid elimination of ligands containing functional groups that would interfere with binding.
(2) Speed increase over docking of individual molecules.
(3) More information pertaining to the entire molecule is retained (no rigid portions).
(4) Chemical matching and critical clusters are encouraged.
Disadvantages:
(1) Complex queries are extremely slow.
(2) The majority of the information contained in the target structure is not considered during the search.
![Page 19: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/19.jpg)
INVDOCK Strategy
Existing methods
− Given a protein, find putative binding ligands from a chemical database
− Given Lock, find Key
− Forward lead identification
− Science 1992; 257:1078
INVDOCK methods
− Given a ligand, find putative protein targets from a protein database
− Given Key, find Lock
− Backward MOA prediction
− Proteins 1999; 36:1
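The "given Key, find Lock" direction can be sketched as a loop over a protein collection. This is an illustration of the idea only, not the INVDOCK code: `dock_score` stands for any docking routine that returns an energy-like score (lower is better), and the protein names, scores, and threshold are toy placeholders.

```python
def inverse_dock(ligand, proteins, dock_score, threshold):
    """Dock one ligand against many proteins and return putative targets.

    dock_score(ligand, protein) is an assumed callable returning a score
    where lower means a better fit; proteins scoring at or below the
    threshold are returned, best first.
    """
    scored = [(dock_score(ligand, protein), protein) for protein in proteins]
    hits = sorted(pair for pair in scored if pair[0] <= threshold)
    return [protein for _, protein in hits]


# Toy stand-in for a docking engine: fixed, made-up scores.
TOY_SCORES = {"protein_A": -9.1, "protein_B": -3.0, "protein_C": -7.4}


def toy_score(ligand, protein):
    return TOY_SCORES[protein]


targets = inverse_dock("ligand_X", list(TOY_SCORES), toy_score, threshold=-5.0)
```

Forward lead identification is the same loop with the roles swapped: one protein is held fixed and the iteration runs over a chemical database.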
![Page 20: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/20.jpg)
INVDOCK Test on Drug Target Prediction
Anticancer Drug Tamoxifen

| PDB Id | Protein | Experimental Findings |
|---|---|---|
| 1a25 | Protein Kinase C | Secondary Target |
| 1a52 | Estrogen Receptor | Drug Target |
| 1bhs | 17 beta HSD dehydrogenase | Inhibitor |
| 1bld | bFGF Factor | Inhibitor |
| 1cpt | Cytochrome P450-TERP | Metabolism |
| 1dmo | Calmodulin | Secondary Target |

Proteins. 1999; 36:1
Tamoxifen is a well-known anticancer drug for the treatment of breast cancer. It was approved by the FDA in 1998 as the first cancer-preventive drug. 30 million people are expected to use it.
![Page 21: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/21.jpg)
INVDOCK Test on Drug Target Prediction
Drug Toxicity Targets (J. Mol. Graph. Mod. 2001, 20, 199)

| Compound | Experimentally confirmed or implicated toxicity targets | Toxicity targets predicted by INVDOCK | Toxicity targets missed by INVDOCK | Toxicity targets without structure or involving covalent bond | INVDOCK-predicted toxicity targets without experimental finding |
|---|---|---|---|---|---|
| Aspirin | 15 | 9 | 2 | 4 | 2 |
| Gentamicin | 17 | 5 | 2 | 10 | 2 |
| Ibuprofen | 5 | 3 | 0 | 2 | 2 |
| Indinavir | 6 | 4 | 0 | 2 | 2 |
| Neomycin | 14 | 7 | 1 | 6 | 6 |
| Penicillin G | 7 | 6 | 0 | 1 | 8 |
| Tamoxifen | 2 | 2 | 0 | 0 | 4 |
| Vitamin C | 2 | 2 | 0 | 0 | 3 |
| Total | 68 | 38 | 5 | 25 | 29 |
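The totals in the toxicity-target table can be re-derived from its rows. The sketch below recomputes them and, as one possible reading of the table rather than a figure quoted by the lecture, the fraction of confirmed targets recovered among those that could be docked at all (predicted plus missed). The variable names are illustrative.

```python
# Columns: confirmed, predicted, missed, no structure / covalent, predicted without finding
ROWS = {
    "Aspirin": (15, 9, 2, 4, 2),
    "Gentamicin": (17, 5, 2, 10, 2),
    "Ibuprofen": (5, 3, 0, 2, 2),
    "Indinavir": (6, 4, 0, 2, 2),
    "Neomycin": (14, 7, 1, 6, 6),
    "Penicillin G": (7, 6, 0, 1, 8),
    "Tamoxifen": (2, 2, 0, 0, 4),
    "Vitamin C": (2, 2, 0, 0, 3),
}

# Column-wise sums reproduce the "Total" row.
totals = tuple(sum(column) for column in zip(*ROWS.values()))
confirmed, predicted, missed, undockable, unconfirmed = totals

# Each row is consistent: confirmed = predicted + missed + undockable.
rows_consistent = all(c == p + m + u for c, p, m, u, _ in ROWS.values())

# Fraction of dockable confirmed targets that were recovered.
recovered_fraction = predicted / (predicted + missed)
```

Under this reading, 38 of the 43 dockable confirmed targets, about 88%, were recovered.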
![Page 22: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/22.jpg)
Results of docking studies
The docked (blue) and crystal (yellow) structure of ligands in some PDB ligand-protein complexes. The PDB Id of each structure is shown.
![Page 23: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/23.jpg)
Dataset and Testing Results
Protein-protein cases from the protein-protein docking benchmark:
− Enzyme-inhibitor: 22 cases
− Antibody-antigen: 16 cases
Protein-DNA docking: 2 unbound-bound cases
Protein-drug docking: tens of bound cases (estrogen receptor, HIV protease, COX)
Performance: several minutes for large protein molecules and seconds for small drug molecules on a standard PC.
Figure: Endonuclease I-PpoI (1EVX) with DNA (1A73). RMSD 0.87 Å, rank 2. Labels: DNA; endonuclease; docking solution.
Figure: Estrogen receptor with estradiol (1A52). RMSD 0.9 Å, rank 1, running time: 11 seconds. Labels: estrogen receptor; estradiol molecule from complex; docking solution.
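The RMSD values quoted above measure how far a docked pose lies from the crystal pose. A minimal sketch of the calculation follows, assuming both coordinate lists hold the same atoms in the same order and in the same reference frame (no superposition is performed); the function name and toy coordinates are illustrative.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists.

    Each list holds (x, y, z) tuples in angstroms for the same atoms.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return sqrt(squared / len(coords_a))


# Toy poses: every atom of the docked pose is displaced by 1 angstrom along x.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
docked = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0)]
deviation = rmsd(crystal, docked)
```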
![Page 24: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/24.jpg)
Classification of Drugs by SVM
A drug is classified as either belonging (+) or not belonging (−) to a class.
Drug class: inhibitor of a protein, BBB penetrating, genotoxic, etc.
Protein class: enzyme EC3.4 family, DNA-binding, etc.
By screening against all classes, the property of a drug or the function of a protein can be identified.
Figure: a drug is screened against Class-1 SVM (−), Class-2 SVM (+), ……, Class-n SVM (−); the drug belongs to class-2.
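Screening one drug against a bank of per-class classifiers can be sketched as below. The decision functions are hypothetical linear stand-ins for trained SVMs (positive value means "+"); their weights, the two-element feature vector, and the class names are made up for illustration.

```python
def screen(features, classifiers):
    """Return the names of all classes whose classifier votes '+'.

    classifiers: dict mapping a class name to a decision function that
    returns a signed value; a value above zero is read as membership.
    """
    return [name for name, decide in classifiers.items() if decide(features) > 0]


def linear(weights, bias):
    """Build a toy linear decision function w.x + bias (stand-in for an SVM)."""
    def decide(x):
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return decide


# Hypothetical classifier bank, one binary classifier per class.
classifiers = {
    "class-1": linear((1.0, -1.0), -0.5),
    "class-2": linear((-1.0, 1.0), -0.5),
    "class-n": linear((1.0, 1.0), -3.0),
}
drug = (0.2, 1.5)
classes = screen(drug, classifiers)
```

Because every class has its own binary classifier, a drug may be assigned to several classes or to none.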
![Page 25: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/25.jpg)
Classification of drugs by SVM
What is SVM?
• A support vector machine (SVM) is a machine learning method, rooted in artificial intelligence, learning by examples, and statistical learning, that classifies objects into one of two classes.
Advantages of SVM:
• Tolerates diversity among class members (members need not resemble one another).
• Uses structure-derived physico-chemical features as the basis for drug classification (no structural similarity is required by the algorithm).
![Page 26: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/26.jpg)
Artificial Intelligence (AI)
![Page 27: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/27.jpg)
Machine learning method
Inductive learning (example-based learning)
![Page 28: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/28.jpg)
Machine learning method
Feature vectors
A = (1, 1, 1)
B = (0, 1, 1)
C = (1, 1, 1)
D = (0, 1, 1)
E = (0, 0, 0)
F = (1, 0, 1)
![Page 29: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/29.jpg)
Machine learning method
Feature vectors in input space
A = (1, 1, 1)
B = (0, 1, 1)
C = (1, 1, 1)
D = (0, 1, 1)
E = (0, 0, 0)
F = (1, 0, 1)
Figure: feature vectors plotted as points (labels A, B, E, F) in the input space with axes X, Y, Z.
![Page 30: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/30.jpg)
SVM Method
Project to a higher dimensional space
Figure labels: border; new border; drug family members and nonmembers (both panels)
![Page 31: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/31.jpg)
SVM Method
Figure labels: support vectors; new border; protein family members; nonmembers
![Page 32: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/32.jpg)
SVM Method
Figure labels: protein family members; nonmembers; new border; support vectors
![Page 33: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/33.jpg)
Best Linear Separator?
![Page 34: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/34.jpg)
Find closest points in convex hulls
Figure labels: closest points c and d
![Page 35: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/35.jpg)
Plane bisects closest points
Figure labels: closest points c and d
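Once the closest points c and d of the two convex hulls are known, the bisecting plane follows directly: its normal is d − c and it passes through their midpoint. The sketch below assumes c and d are already given (finding them is a separate optimization problem); the function names and toy points are illustrative.

```python
def bisecting_plane(c, d):
    """Return (w, b) for the plane w.x = b that perpendicularly bisects c-d."""
    w = tuple(dj - cj for cj, dj in zip(c, d))
    midpoint = tuple((cj + dj) / 2.0 for cj, dj in zip(c, d))
    b = sum(wj * mj for wj, mj in zip(w, midpoint))
    return w, b


def side(x, w, b):
    """Classify a point: +1 on the d side of the plane, -1 on the c side."""
    value = sum(wj * xj for wj, xj in zip(w, x)) - b
    return 1 if value > 0 else -1


# Toy closest points of two hulls in a 2-D input space.
c, d = (0.0, 0.0), (2.0, 0.0)
w, b = bisecting_plane(c, d)
```

With these toy points the plane is the vertical line x = 1, so c falls on the −1 side and d on the +1 side.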
![Page 36: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/36.jpg)
Best Linear Separator
Supporting plane method
Maximize distance between two parallel supporting planes
Distance = “Margin” =
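The expression for the margin is not preserved in this transcript. Assuming the usual canonical scaling, in which the two supporting planes are written w·x − b = +1 and w·x − b = −1, the distance between them is 2/‖w‖, so maximizing the margin amounts to minimizing ‖w‖. The sketch below evaluates that distance; the function name and the example vector are illustrative.

```python
from math import hypot


def margin(w):
    """Distance between the planes w.x - b = +1 and w.x - b = -1.

    Assumes the canonical scaling of the supporting planes; the result
    does not depend on the offset b.
    """
    return 2.0 / hypot(*w)


# Example normal vector with Euclidean norm 5.
example_margin = margin((3.0, 4.0))
```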
![Page 37: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/37.jpg)
Best Linear Separator
Supporting plane method
![Page 38: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/38.jpg)
SVM Method
Border line is nonlinear
![Page 39: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/39.jpg)
SVM Method
Non-linear transformation: use of kernel function
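A kernel function returns the inner product of two points in the higher dimensional space without constructing that space explicitly. The sketch below checks this for one specific, assumed choice that the slide does not name: the degree-2 polynomial kernel K(x, y) = (x·y)² on 2-D inputs, whose explicit feature map is φ(x) = (x₁², √2·x₁x₂, x₂²).

```python
from math import sqrt


def phi(x):
    """Explicit degree-2 feature map from 2-D input space to 3-D feature space."""
    x1, x2 = x
    return (x1 * x1, sqrt(2.0) * x1 * x2, x2 * x2)


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def poly_kernel(x, y):
    """Degree-2 polynomial kernel computed entirely in the input space."""
    return dot(x, y) ** 2


x, y = (1.0, 2.0), (3.0, 4.0)
via_kernel = poly_kernel(x, y)         # (1*3 + 2*4)^2
via_mapping = dot(phi(x), phi(y))      # inner product after explicit projection
```

Both routes give the same number, which is why an SVM can draw a nonlinear border in the input space while only ever evaluating the kernel.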
![Page 40: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/40.jpg)
SVM Method
![Page 41: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/41.jpg)
SVM Method
![Page 42: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/42.jpg)
SVM Method
![Page 43: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/43.jpg)
SVM Method
![Page 44: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/44.jpg)
SVM Method
![Page 45: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/45.jpg)
SVM for classification of drugs
How to represent a drug?
• Each structure is represented by a specific feature vector assembled from structural and physico-chemical properties:
− Simple molecular properties (molecular weight, no. of rotatable bonds, etc.; 18 in total)
− Molecular connectivity and shape (28 in total)
− Electro-topological state polarity (84 in total)
− Quantum chemical properties (electric charge, polarizability, etc.; 13 in total)
− Geometrical properties (molecular size vector, van der Waals volume, molecular surface, etc.; 16 in total)
J. Chem. Inf. Comput. Sci. 44, 1630 (2004); J. Chem. Inf. Comput. Sci. 44, 1497 (2004); Toxicol. Sci. 79, 170 (2004)
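Assembling the feature vector amounts to concatenating the five descriptor groups in a fixed order. The sketch below uses the group sizes listed above, which sum to 159; the function name, the dictionary layout, and the all-zero placeholder values are assumptions for illustration, and no real descriptor calculation is performed.

```python
# Descriptor groups and sizes as listed on the slide.
DESCRIPTOR_GROUPS = [
    ("simple molecular properties", 18),
    ("molecular connectivity and shape", 28),
    ("electro-topological state", 84),
    ("quantum chemical properties", 13),
    ("geometrical properties", 16),
]
FEATURE_LENGTH = sum(size for _, size in DESCRIPTOR_GROUPS)


def assemble(values_by_group):
    """Concatenate per-group descriptor lists into one feature vector.

    values_by_group: dict mapping each group name to a list of numbers
    of exactly the expected size.
    """
    vector = []
    for name, size in DESCRIPTOR_GROUPS:
        values = values_by_group[name]
        if len(values) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(values)}")
        vector.extend(values)
    return vector


# Placeholder descriptors (all zeros) standing in for computed values.
placeholder = {name: [0.0] * size for name, size in DESCRIPTOR_GROUPS}
vector = assemble(placeholder)
```

Keeping the group order fixed ensures that the same position in the vector always carries the same descriptor for every drug.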
![Page 46: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/46.jpg)
SVM-based drug design and property prediction software
Which class does your drug belong to?
Figure (workflow) labels: your drug structure (chemical structure); option one; option two; input structure on local machine; input structure through internet; send structure to classifier; computer loaded with SVMProt; SVM classifier for every drug class; identified classes; drug designed or property predicted
![Page 47: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/47.jpg)
SVM drug prediction results
Protein inhibitor/activator/substrate prediction
• 86% of 129 estrogen receptor activators and 84% of 101 non-activators correctly predicted
• 81% of 116 P-glycoprotein substrates and 79% of 85 non-substrates correctly predicted
Drug toxicity prediction
• 97% of 102 TdP+ and 84% of 243 TdP- agents correctly predicted
• 73% of 229 genotoxic and 93% of 631 non-genotoxic agents correctly predicted
Pharmacokinetics prediction
• 95% of 276 BBB+ and 82% of 139 BBB- agents correctly predicted
• 90% of 131 human intestine absorption and 80% of 65 non-absorption agents correctly predicted
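The per-class percentages above can be combined into a single overall accuracy by weighting each by its class size. This is a derived figure that the slide does not report, computed from rounded percentages, so it is approximate; the function and dictionary names are illustrative.

```python
def overall_accuracy(pos_rate, n_pos, neg_rate, n_neg):
    """Size-weighted accuracy from per-class correct-prediction rates."""
    return (pos_rate * n_pos + neg_rate * n_neg) / (n_pos + n_neg)


# (rate on positives, number of positives, rate on negatives, number of negatives)
RESULTS = {
    "estrogen receptor activator": (0.86, 129, 0.84, 101),
    "P-glycoprotein substrate": (0.81, 116, 0.79, 85),
    "TdP": (0.97, 102, 0.84, 243),
    "genotoxicity": (0.73, 229, 0.93, 631),
    "BBB penetration": (0.95, 276, 0.82, 139),
    "human intestine absorption": (0.90, 131, 0.80, 65),
}

accuracies = {name: overall_accuracy(*values) for name, values in RESULTS.items()}
```

For the estrogen receptor set, for example, this gives (0.86 × 129 + 0.84 × 101) / 230, roughly 85%.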
![Page 48: Advanced Bioinformatics Lecture 7: Computer-aided lead identification](https://reader035.vdocuments.site/reader035/viewer/2022062519/56815027550346895dbe136e/html5/thumbnails/48.jpg)
Projects Q&A!
1. Biological pathway simulation
2. Computer-aided anti-cancer drug design
3. Disease-causing mutation on drug target
Any questions? Thank you!