![Page 1: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/1.jpg)
Interaction fingerprints
Vladimir Chupakhin, UNISTRA, 2011
1NTERACT10NF1NGERPR1NTS
Vladimir Chupakhin
Laboratory of Chemoinformatics
Structural Chemogenomics Group
University of Strasbourg
December 2011
![Page 2: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/2.jpg)
Virtual screening approaches
Ligand-based (QSAR, similarity search, pharmacophores)
Structure-based (docking, pharmacophores)
?
![Page 3: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/3.jpg)
Lock-and-key paradigm
Lock
Key
Interactions
![Page 4: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/4.jpg)
Molecular docking: main steps
1. Protein and ligand preparation
2. Binding site identification
3. Conformational search with scoring of the generated poses
![Page 5: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/5.jpg)
Geometry of interaction
H-bond length (3.0 Å)
H-bond angle (~175°)
Interactions are geometry!
![Page 6: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/6.jpg)
Different types of interactions
- Hydrophobic
- H-bonds
- Ionic
- Aromatic
- Cation-π
![Page 7: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/7.jpg)
Self-docking
1. Extract ligand
2. Modify geometry
3. Dock to the same protein
4. Extract ligand
5. Calculate RMSD
Figure: poses shown in blue, red and orange; RMSD 1.1 Å and 4.3 Å
![Page 8: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/8.jpg)
Docking quality: RMSD
$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\delta_i^{2}}$$
where δ_i is the distance between the i-th of N pairs of equivalent atoms (δ_1 … δ_N).
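The RMSD definition can be sketched in a few lines of Python. This is a minimal illustration that assumes both poses list the same atoms in the same order and share one coordinate frame (no superposition step). The coordinates are invented for the example.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # delta_i squared for every pair of equivalent atoms
    squared = [math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical coordinates (in Å): the docked pose is shifted by 1 Å along z.
xray = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
print(round(rmsd(xray, docked), 2))  # 1.0
```

In practice symmetric ligands need atom-mapping corrections before this calculation, which the sketch does not attempt.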
![Page 9: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/9.jpg)
Cross-docking
The procedure is the same as for self-docking. But why do it? Robustness!
These fluctuations have a huge influence on the docking results.
![Page 10: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/10.jpg)
Scoring functions
1. Force-field scoring functions (Dock, AutoDock, GOLD)
2. Empirical scoring functions (ChemScore, PLP, Glide SP/XP)
3. Knowledge-based scoring functions (PMF, DrugScore, ASP, SMoG)
Ligand atoms
Protein atoms
![Page 11: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/11.jpg)
Force-field scoring function
DOI:10.1038/nrd1549
Energy terms: protein-ligand interaction energy terms + ligand energy terms
Algorithm (force-field based)
For a given protein-ligand (PL) complex:
1. Calculate the interaction energies between atoms of the ligand and the protein (EvdW + EH-bond) using a force field.
2. Calculate the internal energy of the ligand (EvdW + Etorsion), optionally including internal H-bonds of the ligand.
3. Total energy = sum of the energy terms from steps 1 and 2.
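The force-field scheme can be sketched as follows. This is only an illustration, not the scoring function of any particular docking program. It assumes a single 12-6 Lennard-Jones term for the protein-ligand van der Waals energy, uses hypothetical `epsilon`, `sigma` and `cutoff` values, omits the H-bond term, and takes the ligand internal energy as a precomputed number.

```python
import math

def lennard_jones(r, epsilon=0.2, sigma=3.5):
    """12-6 van der Waals energy for one atom pair at distance r (Å).
    epsilon and sigma are hypothetical, not taken from a real force field."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 ** 2 - ratio6)

def interaction_energy(ligand, protein, cutoff=8.0):
    """Protein-ligand term: sum pairwise vdW energies within a cutoff."""
    total = 0.0
    for lig_xyz in ligand:
        for prot_xyz in protein:
            r = math.dist(lig_xyz, prot_xyz)
            if 0.0 < r <= cutoff:
                total += lennard_jones(r)
    return total

def force_field_score(ligand, protein, ligand_internal_energy):
    """Total energy = protein-ligand term + ligand internal term."""
    return interaction_energy(ligand, protein) + ligand_internal_energy

# One ligand atom placed at the energy minimum of one protein atom.
r_min = 3.5 * 2 ** (1 / 6)
print(round(force_field_score([(0.0, 0.0, 0.0)], [(r_min, 0.0, 0.0)], 0.0), 3))  # -0.2
```

At the minimum distance the pair energy equals minus `epsilon`, which is a quick sanity check for the functional form.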
![Page 12: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/12.jpg)
Empirical scoring function
Algorithm (additive scheme)
1. Define interaction types and geometries.
2. Look them up in the database of interaction energies.
3. Total energy = sum of the contributions of every component (plus the influence of the geometry term).
Example: LUDI
DOI:10.1038/nrd1549
Empirical scoring functions (ESF) are made to reproduce binding energies or conformations; an ESF depends on the training set used to develop it.
![Page 13: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/13.jpg)
Knowledge-based scoring function
DOI:10.1038/nrd1549
Algorithm
1. Define interaction types and geometries.
2. Look them up in the database of ligand-protein atom interactions.
3. Total score (energy) = sum of the interaction scores (energies) (γ: adjustable parameter; SAS0: solvent-accessible surface area in the solvated state).
Knowledge-based scoring functions (KBSF) are developed to reproduce the binding pose rather than the binding energy.
![Page 14: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/14.jpg)
Scoring functions: the purposes
Docking = finding the correct binding pose
Scoring = predicting the activity of the compound (Ki, IC50, etc.)
![Page 15: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/15.jpg)
Scoring functions: docking
Docking
The average success rate for docking a compound within RMSD < 2 Å is around 70%.
Comparative Assessment of Scoring Functions on a Diverse Test Set, Wang, 2009
![Page 16: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/16.jpg)
Scoring functions: scoring
Scoring
When compounds are ranked by score, the correlation coefficient with measured activity is on average 0.55-0.64 (55-64%).
Comparative Assessment of Scoring Functions on a Diverse Test Set, Wang, 2009
![Page 17: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/17.jpg)
GOLD Score failure
| | pose1 | pose2 |
|---|---|---|
| GOLD Score | 59.19 | 59.30 |
| RMSD, Å | 1.10 | 4.27 |

Top-scored pose: pose2
![Page 18: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/18.jpg)
Molecular scoring functions: problems
1. Problems when the binding site is highly charged or highly hydrophobic/hydrophilic
2. Problems when the binding site contains waters, ions, or cofactors
3. Fragment-like docking is very tricky
4. Even the input conformation can influence the docking results
![Page 19: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/19.jpg)
Interaction fingerprints
![Page 20: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/20.jpg)
Chemical fingerprint
Fingerprints encode the presence or absence of certain features in a compound, e.g., fragments.
0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0
KISS: Keep It Short and Simple! Keep It Simple Stupid
![Page 21: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/21.jpg)
Structural Interaction Fingerprints
Zhan Deng, Claudio Chuaqui, and Juswinder Singh Structural Interaction Fingerprint (SIFt): A Novel Method for Analyzing Three-Dimensional Protein−Ligand Binding Interactions (DOI: 10.1021/jm030331x), Biogen Inc.
Detect interactions of the ligand with every amino acid of the binding site
![Page 22: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/22.jpg)
Interaction Fingerprints: preparation
Interaction types encoded for each residue (one bit per type):
• Hydrophobic
• Aromatic face to face
• Aromatic face to edge
• H-bond (protein donor)
• H-bond (protein acceptor)
• Ionic (protein cation)
• Ionic (protein anion)
Bitstring for 1 residue: 1 0 0 0 1 0 0
Bitstring for the whole binding site (Interaction Fingerprint):
100100010000101000000100000010000001 … (Residue 1, Residue 2, Residue 3, Residue 4, Residue 5, … Residue X)
Source: Optimizing Fragment and Scaffold Docking by Use of Molecular Interaction Fingerprints, 2007
![Page 23: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/23.jpg)
Molecular Interaction Fingerprints (IFP)
ILE10 1000000
VAL18 1000000
ALA31 1000000
LYS33 1000000
VAL64 1000000
PHE80 1010000
GLU81 0000100
PHE82 1100000
LEU83 1001000
HIS84 1000000
GLN85 1000000
ASP86 1000101
LEU134 1000000
ALA144 1000000
ASP145 1000000
Zhan Deng, Claudio Chuaqui, and Juswinder Singh Structural Interaction Fingerprint (SIFt): A Novel Method for Analyzing Three-Dimensional Protein−Ligand Binding Interactions (DOI: 10.1021/jm030331x), Biogen Inc.
1000000100000010000001000000100000010000001000101100000010000001000000
3D → 1D (bit string)
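The 3D to 1D conversion amounts to concatenating the per-residue bitstrings in a fixed residue order. The sketch below uses the residue bits listed on this slide. The order of the seven interaction types within a residue is an assumption made for illustration, and the function names are hypothetical.

```python
# Assumed bit order within one residue (not stated explicitly on the slide).
INTERACTION_TYPES = [
    "hydrophobic",
    "aromatic face to face",
    "aromatic face to edge",
    "H-bond (protein donor)",
    "H-bond (protein acceptor)",
    "ionic (protein cation)",
    "ionic (protein anion)",
]

# Per-residue bitstrings as listed on the slide.
RESIDUE_BITS = [
    ("ILE10", "1000000"), ("VAL18", "1000000"), ("ALA31", "1000000"),
    ("LYS33", "1000000"), ("VAL64", "1000000"), ("PHE80", "1010000"),
    ("GLU81", "0000100"), ("PHE82", "1100000"), ("LEU83", "1001000"),
    ("HIS84", "1000000"), ("GLN85", "1000000"), ("ASP86", "1000101"),
    ("LEU134", "1000000"), ("ALA144", "1000000"), ("ASP145", "1000000"),
]

def build_ifp(residue_bits):
    """Concatenate per-residue bitstrings into one 1D fingerprint."""
    return "".join(bits for _, bits in residue_bits)

def decode_ifp(ifp, residue_names):
    """Map each residue to the interaction types whose bit is set."""
    width = len(INTERACTION_TYPES)
    decoded = {}
    for i, name in enumerate(residue_names):
        chunk = ifp[i * width:(i + 1) * width]
        decoded[name] = [t for t, bit in zip(INTERACTION_TYPES, chunk) if bit == "1"]
    return decoded

ifp = build_ifp(RESIDUE_BITS)
decoded = decode_ifp(ifp, [name for name, _ in RESIDUE_BITS])
print(len(ifp))          # 105 (15 residues x 7 bits)
print(decoded["ASP86"])
```

Because every complex of the same binding site uses the same residue order, fingerprints of different ligands or poses can be compared bit by bit.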
![Page 24: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/24.jpg)
Parameters of IFP
Ligand ↔ Receptor
• interacting patterns (an amino acid can be represented as a residue or a pharmacophoric point; the interacting fragment of the ligand can be encoded as an atom, a fragment, or a pharmacophoric point);
• type of interaction (hydrogen bonds, hydrophobic interactions, etc.);
• direction of interaction (this parameter distinguishes the direction of the interaction: for example, whether the hydrogen-bond donor is the protein or the ligand);
• strength of interaction and distance between interacting patterns (these parameters are research specific);
• number of bits per interaction point (one or many).
![Page 25: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/25.jpg)
Gold scoring function failure: IFP wins!
Ligand A07 from LR-complex (PDB ID: 3LFS), docked into CDK2 binding site (PDB ID: 2A0C).
Pose 1 (orange): TC real vs docked = 0.75, RMSD = 1.10 Å, GoldScore = 59.20
Pose 2 (blue): TC real vs docked = 0.52, RMSD = 4.27 Å, GoldScore = 59.30
X-ray pose: brick red
TC: Jaccard (Tanimoto) coefficient
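The Jaccard (Tanimoto) coefficient of two bitstrings is the number of bits set in both divided by the number of bits set in either. A minimal sketch, assuming fingerprints are stored as strings of "0" and "1" of equal length; the two example strings are invented.

```python
def tanimoto(fp_a, fp_b):
    """Jaccard (Tanimoto) coefficient of two equal-length bitstrings."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    common = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" and b == "1")
    union = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" or b == "1")
    # Two empty fingerprints share nothing; return 0.0 by convention here.
    return common / union if union else 0.0

# Hypothetical two-residue fingerprints (7 bits per residue).
xray_ifp = "10010000000100"
docked_ifp = "10010001000000"
print(tanimoto(xray_ifp, docked_ifp))  # 0.5
```

A pose that reproduces more of the X-ray interactions gets a TC closer to 1, even when its docking score is nearly identical to that of a wrong pose.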
![Page 26: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/26.jpg)
IFP usage
• store interactions in a useful format
• analyze experimental LR-complexes
• assess the quality of docking studies
• cluster results (even peptides and PPI)
• analyze docked LR-complexes (drug-like and fragment-like compounds)
• retrieve the correct binding pose
• retrieve a specific binding pose
![Page 27: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/27.jpg)
Use cases for IFP: storage
Useful way to store interaction information from experimentally derived LR-complexes:
• scPDB database – Laboratory of Didier Rognan, UNISTRA, Illkirch (DOI: 10.1021/ci050372x)
• CREDO database (DOI:10.1111/j.1747-0285.2008.00762.x).
![Page 28: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/28.jpg)
Use cases for IFP: x-ray LR analysis
Compounds
Binding site
Specific interactions
DOI: 10.1021/jm030331x
![Page 29: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/29.jpg)
Use cases for IFP: pose retrieval (1)
RMSD is not a 100% reliable evaluation function!
DOI: 10.1021/ci600342e
![Page 30: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/30.jpg)
Use cases for IFP: VS
Compare the reference X-ray IFP with the IFPs of docked poses using the Tanimoto coefficient (TC).
Compounds database → virtual screening results
Using standard SF: X% of the real hits
Using standard SF + TC: X% + up to 20%
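One way to combine a standard scoring function with the Tanimoto coefficient is to discard poses whose IFP differs too much from the reference and then rank the remainder by docking score. The sketch below is a hypothetical post-processing step: the cutoff of 0.6, the compound names, scores and bitstrings are all invented for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Jaccard (Tanimoto) coefficient of two equal-length bitstrings."""
    common = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" and b == "1")
    union = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" or b == "1")
    return common / union if union else 0.0

def rescore(poses, reference_ifp, tc_cutoff=0.6):
    """Keep poses whose IFP resembles the reference, then rank by docking score."""
    kept = []
    for name, score, ifp in poses:
        tc = tanimoto(reference_ifp, ifp)
        if tc >= tc_cutoff:
            kept.append((name, score, round(tc, 2)))
    # Higher docking score first (as for GOLD-style fitness scores).
    return sorted(kept, key=lambda pose: pose[1], reverse=True)

reference = "1001000010"           # hypothetical X-ray IFP
docked = [
    ("cmpd_A", 55.1, "1001000000"),
    ("cmpd_B", 59.3, "0000011010"),  # best score, but wrong interactions
    ("cmpd_C", 52.0, "1001000010"),
]
for row in rescore(docked, reference):
    print(row)
```

Here `cmpd_B` has the best docking score but is filtered out because it shares only one interaction bit with the reference.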
![Page 31: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/31.jpg)
Use cases for IFP: PPI
IFP is suitable even for the analysis of protein-protein interactions!
![Page 32: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/32.jpg)
Use cases for IFP: agonists/antagonists
(A) Procaterol: agonist, (B) Carvedilol: antagonist
Selective Structure-Based Virtual Screening for Full and Partial Agonists of the β2 Adrenergic Receptor, DOI: 10.1021/jm800710x
![Page 33: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/33.jpg)
IFP modifications
![Page 34: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/34.jpg)
IFP modifications: r-SIFt – R-group IFP
LEU83: 1001000
C R1 R2: 1 1 0
Benefits: combinatorial library analysis (~100,000 compounds)
Independent of interaction type! Just the fact of interaction!
DOI: 10.1021/jm050381x
![Page 35: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/35.jpg)
IFP modifications: w-SIFt – weighted IFP
+ Biological activity
+ Machine learning approach: find the correlation between bit frequency and activity
DOI: 10.1021/ci800466n
Figure classes: most active, moderate activity, less active
Benefits:
• helps to find which interactions are critical for compound potency
• interpretable, position-dependent scoring function for ligand-protein interactions
![Page 36: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/36.jpg)
Binding site independent IFP
![Page 37: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/37.jpg)
BS-independent IFP: APIF
APIF: A New Interaction Fingerprint Based on Atom Pairs and Its Application to Virtual Screening
Atom pair; distance = range
Algorithm
1. Detect interaction patterns (hydrophobic, HBA, HBD).
2. Define distance1 and distance2 for the quadruplet interaction.
3. Convert distances to distance ranges.
4. Map distance ranges and types …
Quadruplet IFP
![Page 38: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/38.jpg)
BS-independent IFP: APIF - Quadruplet
Quadruplet: two interactions (each a ligand atom paired with a protein atom) together with two distances, Distance 1 and Distance 2
Each quadruplet = 1 bit in the APIF
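A quadruplet can be turned into a bit index by combining the two interaction types with the two binned distances. The sketch below is an illustration under stated assumptions, not the published APIF implementation. It assumes that one distance is measured between the two ligand atoms and the other between the two protein atoms, and the bin edges in `BIN_EDGES` are hypothetical.

```python
import math
from itertools import combinations

TYPES = ["hydrophobic", "HBA", "HBD"]
# All unordered pairs of interaction types (6 combinations).
TYPE_PAIRS = [(a, b) for i, a in enumerate(TYPES) for b in TYPES[i:]]
BIN_EDGES = [2.5, 4.0, 6.0, 9.0, 13.0, 18.0]  # hypothetical, gives 7 ranges

def distance_bin(d):
    """Convert a distance (Å) to the index of its distance range."""
    for i, edge in enumerate(BIN_EDGES):
        if d < edge:
            return i
    return len(BIN_EDGES)

def apif_bits(interactions):
    """interactions: list of (type, ligand_xyz, protein_xyz) tuples."""
    n_bins = len(BIN_EDGES) + 1
    bits = [0] * (len(TYPE_PAIRS) * n_bins * n_bins)
    for (t1, l1, p1), (t2, l2, p2) in combinations(interactions, 2):
        pair = tuple(sorted((t1, t2), key=TYPES.index))
        lig_bin = distance_bin(math.dist(l1, l2))    # assumed: ligand-ligand
        prot_bin = distance_bin(math.dist(p1, p2))   # assumed: protein-protein
        index = (TYPE_PAIRS.index(pair) * n_bins + lig_bin) * n_bins + prot_bin
        bits[index] = 1
    return bits

# Three invented interactions give three quadruplets.
example = [
    ("HBA", (0.0, 0.0, 0.0), (0.0, 0.0, 3.0)),
    ("hydrophobic", (5.0, 0.0, 0.0), (5.0, 0.0, 4.0)),
    ("HBD", (0.0, 3.0, 0.0), (0.0, 3.0, 3.0)),
]
fingerprint = apif_bits(example)
print(len(fingerprint), sum(fingerprint))  # 294 3
```

Since no residue numbering enters the index, fingerprints built this way can be compared across different binding sites.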
![Page 39: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/39.jpg)
BS-independent IFP: APIF
Benefits:
• independent of the binding site
• comparable to current scoring functions
![Page 40: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/40.jpg)
BS-independent IFP: Pharm-IF
Algorithm
1. Detect interaction patterns (hydrophobic, HBA, HBD).
2. Define ligand atom pairs based ONLY on ligand atoms interacting with the protein.
3. Measure their distance.
4. Map the distance to a range (quantization) = Pharm-IF
Benefits:
• independent of the binding site
• comparable to current scoring functions
DOI: 10.1021/ci900382e
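The Pharm-IF steps can be sketched as counting pairs of interacting ligand atoms by pharmacophore type and quantized distance. This is a simplified illustration: the 1 Å bin width is a hypothetical choice, the input atoms are invented, and any smoothing used in the published method is left out.

```python
import math
from collections import Counter
from itertools import combinations

def pharm_if(interacting_atoms, bin_width=1.0):
    """interacting_atoms: (type, xyz) for ligand atoms that contact the protein.
    Returns counts keyed by ((type1, type2), distance_bin)."""
    features = Counter()
    for (t1, xyz1), (t2, xyz2) in combinations(interacting_atoms, 2):
        types = tuple(sorted((t1, t2)))                       # order-independent pair
        dist_bin = int(math.dist(xyz1, xyz2) // bin_width)    # quantization
        features[(types, dist_bin)] += 1
    return features

# Hypothetical ligand atoms already flagged as interacting with the protein.
atoms = [
    ("HBA", (0.0, 0.0, 0.0)),
    ("HBD", (2.5, 0.0, 0.0)),
    ("hydrophobic", (0.0, 4.2, 0.0)),
]
for key, count in sorted(pharm_if(atoms).items()):
    print(key, count)
```

Only ligand coordinates enter the descriptor, so it does not depend on residue numbering of the binding site.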
![Page 41: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/41.jpg)
IFP-based scoring functions
![Page 42: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/42.jpg)
IFP-based SF: AuPosSOM
Automatic clustering of docking poses in virtual screening process using self-organizing map (AuPosSOM)
• Dock decoys and compounds with known activity
• Generate a vector of interactions (H-bonds, hydrophobic interactions)
• Train a model of the actives and inactives (the vector is the input)*
f(input IFP) = 1 or 0, where 1 is a binder and 0 is a non-binder
*Simplified representation
![Page 43: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/43.jpg)
IFP-based SF: RF-Score
A machine learning approach to predicting protein–ligand binding affinity with applications to molecular docking (RF-Score), DOI:10.1093/bioinformatics/btq112
• Vector of 36 features; each feature is the occurrence count of a j-i atom pair
• Feature generation: take all protein atoms within 12 Å of a selected ligand atom, filter out interactions beyond the cutoff range, and sum the result (for each interaction pair)
• PDBbind was used to train the Random Forest model
• The model is trained with activity as output and interactions as input
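The 36 occurrence counts can be sketched as below. The slide gives only the feature count and the 12 Å cutoff; the split into 4 protein and 9 ligand element types is an assumption of this sketch, and the atoms in the example are invented. Training the Random Forest itself is not shown.

```python
import math

# Assumed element types: 4 protein x 9 ligand = 36 features.
PROTEIN_ELEMENTS = ["C", "N", "O", "S"]
LIGAND_ELEMENTS = ["C", "N", "O", "F", "P", "S", "Cl", "Br", "I"]

def rf_score_features(ligand_atoms, protein_atoms, cutoff=12.0):
    """Count protein-ligand element pairs closer than the cutoff (Å).
    Atoms are (element, xyz) tuples; other elements are ignored."""
    counts = {(p, l): 0 for p in PROTEIN_ELEMENTS for l in LIGAND_ELEMENTS}
    for l_elem, l_xyz in ligand_atoms:
        for p_elem, p_xyz in protein_atoms:
            key = (p_elem, l_elem)
            if key in counts and math.dist(l_xyz, p_xyz) < cutoff:
                counts[key] += 1
    # Fixed feature order: protein element major, ligand element minor.
    return [counts[(p, l)] for p in PROTEIN_ELEMENTS for l in LIGAND_ELEMENTS]

ligand = [("C", (0.0, 0.0, 0.0)), ("N", (1.4, 0.0, 0.0))]
protein = [("O", (3.0, 0.0, 0.0)), ("C", (20.0, 0.0, 0.0)), ("N", (0.0, 5.0, 0.0))]
features = rf_score_features(ligand, protein)
print(len(features), sum(features))  # 36 4
```

Such a vector, computed for every complex in the training set, would be the input of the regression model, with measured activity as the target.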
![Page 44: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/44.jpg)
Literature overview: SVM-SP
Support Vector Regression Scoring of Receptor–Ligand Complexes for Rank-Ordering and Virtual Screening of Chemical Libraries DOI: 10.1021/ci200078f
• Two types of vectors: SVR-KB (146 features) uses knowledge-based pairwise potentials (the same as those mentioned above, but trained with SVR), while SVR-EP is based on physicochemical properties. The SVR-EP vector consists of features extracted from X-Score (polar/nonpolar SASA, MW, vdW energy, etc.)
• SVR-KB is better than SVR-EP
Vector is unique! Vector is atom-pair based.
![Page 45: Interaction fingerprint: 1D representation of 3D protein-ligand complexes](https://reader034.vdocuments.site/reader034/viewer/2022052210/5557208ed8b42a320c8b477b/html5/thumbnails/45.jpg)
Thanks a lot!