bioinformatics iv quantitative structure-activity relationships (qsar) and comparative molecular...
Post on 21-Dec-2015
![Page 1: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/1.jpg)
Bioinformatics IV
Quantitative Structure-Activity Relationships (QSAR)
and
Comparative Molecular Field Analysis (CoMFA)
Martin Ott
![Page 2: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/2.jpg)
Outline
• Introduction
• Structures and activities
• Regression techniques: PCA, PLS
• Analysis techniques: Free-Wilson, Hansch
• Comparative Molecular Field Analysis
![Page 3: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/3.jpg)
QSAR: The Setting
Quantitative structure-activity relationships are used when there is little or no receptor information, but there are measured activities of (many) compounds.
They are also useful to supplement docking studies, which take much more CPU time.
![Page 4: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/4.jpg)
From Structure to Property
[Figure: series of compound structures with a plot of their EC50 values]
![Page 5: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/5.jpg)
From Structure to Property
[Figure: series of compound structures, with LD50 as the property]
![Page 6: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/6.jpg)
From Structure to Property
[Figure: series of compound structures]
![Page 7: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/7.jpg)
QSAR: Which Relationship?
Quantitative structure-activity relationships correlate chemical/biological activities with structural features or atomic, group or molecular properties within a range of structurally similar compounds.
![Page 8: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/8.jpg)
Free Energy of Binding and Equilibrium Constants
The free energy of binding is related to the equilibrium and rate constants of ligand-receptor complex formation:
$$\Delta G_{\text{binding}} = -2.303\,RT \log K = -2.303\,RT \log (k_{\text{on}} / k_{\text{off}})$$
Equilibrium constant $K$
Rate constants $k_{\text{on}}$ (association) and $k_{\text{off}}$ (dissociation)
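The relation above can be tried numerically. The following is a minimal sketch, not part of the original slides: the function names, the default temperature of 298.15 K and the example rate constants are all assumptions chosen for illustration.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)

def equilibrium_constant(k_on, k_off):
    """K = k_on / k_off for ligand-receptor complex formation."""
    return k_on / k_off

def binding_free_energy(K, temperature=298.15):
    """Return Delta G of binding in J/mol for equilibrium constant K."""
    if K <= 0:
        raise ValueError("K must be positive")
    # math.log(10) is the exact value behind the rounded factor 2.303
    return -math.log(10) * R * temperature * math.log10(K)

if __name__ == "__main__":
    # Hypothetical rate constants, for illustration only
    K = equilibrium_constant(k_on=1e6, k_off=1e-3)
    print(f"K = {K:.3g}, dG = {binding_free_energy(K) / 1000:.1f} kJ/mol")
```

A larger K (tighter binding) gives a more negative free energy, and K = 1 gives zero.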
![Page 9: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/9.jpg)
Concentration as Activity Measure
• A critical molar concentration C that produces the biological effect is related to the equilibrium constant K
• Usually log (1/C) is used (cf. pH)
• For meaningful QSARs, activities need to be spread out over at least 3 log units
![Page 10: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/10.jpg)
Molecules Are Not Numbers!
[Figure: a molecular structure shown among assorted unrelated numbers and constants]
Where are the numbers? Numerical descriptors
![Page 11: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/11.jpg)
An Example: Capsaicin Analogs
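| X | EC50 (µM) | log(1/EC50) |
|---|---|---|
| H | 11.80 | 4.93 |
| Cl | 1.24 | 5.91 |
| NO2 | 4.58 | 5.34 |
| CN | 26.50 | 4.58 |
| C6H5 | 0.24 | 6.62 |
| NMe2 | 4.39 | 5.36 |
| I | 0.35 | 6.46 |
| NHCHO | ? | ? |

[Structure: capsaicin analog scaffold (MeO, OH, amide NH and C=O) with variable substituent X]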
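The log(1/EC50) column can be recomputed from the EC50 column. This sketch assumes the EC50 values are given in micromolar, which is the unit that reproduces the tabulated logarithms; the function and variable names are illustrative.

```python
import math

# EC50 values copied from the table, assumed to be in micromolar
ec50_uM = {"H": 11.80, "Cl": 1.24, "NO2": 4.58, "CN": 26.50,
           "C6H5": 0.24, "NMe2": 4.39, "I": 0.35}

def log_inverse_conc(ec50_micromolar):
    """log10(1/C) with C converted from micromolar to molar."""
    return -math.log10(ec50_micromolar * 1e-6)

activities = {x: log_inverse_conc(c) for x, c in ec50_uM.items()}

# Range covered by the activities, in log units
spread = max(activities.values()) - min(activities.values())

if __name__ == "__main__":
    for x, a in activities.items():
        print(f"{x:5s} {a:.2f}")
    print(f"spread: {spread:.2f} log units")
```

The `spread` value is the quantity that the "at least 3 log units" guideline refers to.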
![Page 12: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/12.jpg)
An Example: Capsaicin Analogs
| X | log(1/EC50) | MR | π | σ | Es |
|---|---|---|---|---|---|
| H | 4.93 | 1.03 | 0.00 | 0.00 | 0.00 |
| Cl | 5.91 | 6.03 | 0.71 | 0.23 | -0.97 |
| NO2 | 5.34 | 7.36 | -0.28 | 0.78 | -2.52 |
| CN | 4.58 | 6.33 | -0.57 | 0.66 | -0.51 |
| C6H5 | 6.62 | 25.36 | 1.96 | -0.01 | -3.82 |
| NMe2 | 5.36 | 15.55 | 0.18 | -0.83 | -2.90 |
| I | 6.46 | 13.94 | 1.12 | 0.18 | -1.40 |
| NHCHO | ? | 10.31 | -0.98 | 0.00 | -0.98 |

MR = molar refractivity (polarizability) parameter; π = hydrophobicity parameter; σ = electronic sigma constant (para position); Es = Taft size parameter
![Page 13: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/13.jpg)
An Example: Capsaicin Analogs
[Structure: capsaicin analog scaffold with variable substituent X]
$$\log(1/EC_{50}) = -0.89 + 0.019\,MR + 0.23\,\pi - 0.31\,\sigma - 0.14\,E_s$$
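An equation of this form is obtained by regressing the activities on the descriptors. The sketch below fits ordinary least squares to the seven measured analogs and applies the result to NHCHO. It assumes the descriptor columns are ordered MR, π, σ, Es; the helper names are illustrative, and the coefficients it returns need not match those printed above. With seven compounds and five coefficients only two degrees of freedom remain, so the fit is shown for mechanics rather than as a reliable model.

```python
# Descriptors (MR, pi, sigma, Es) and observed log(1/EC50) from the table
data = {
    "H":    ((1.03, 0.00, 0.00, 0.00), 4.93),
    "Cl":   ((6.03, 0.71, 0.23, -0.97), 5.91),
    "NO2":  ((7.36, -0.28, 0.78, -2.52), 5.34),
    "CN":   ((6.33, -0.57, 0.66, -0.51), 4.58),
    "C6H5": ((25.36, 1.96, -0.01, -3.82), 6.62),
    "NMe2": ((15.55, 0.18, -0.83, -2.90), 5.36),
    "I":    ((13.94, 1.12, 0.18, -1.40), 6.46),
}
nhcho = (10.31, -0.98, 0.00, -0.98)  # compound with unknown activity

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(rows, ys):
    """Return [intercept, b1, ..., bk] from the normal equations."""
    X = [[1.0, *row] for row in rows]
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * y for r, y in zip(X, ys)) for i in range(k)]
    return solve(XtX, Xty)

def predict(coef, row):
    """Evaluate intercept + sum of coefficient * descriptor."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

rows = [descriptors for descriptors, _ in data.values()]
ys = [activity for _, activity in data.values()]
coef = fit_ols(rows, ys)

if __name__ == "__main__":
    print("coefficients:", [round(c, 3) for c in coef])
    print("predicted log(1/EC50) for NHCHO:", round(predict(coef, nhcho), 2))
```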
![Page 14: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/14.jpg)
Basic Assumption in QSAR
The structural properties of a compound contribute in a linearly additive way to its biological activity, provided there are no non-linear dependencies of transport or binding on some properties.
![Page 15: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/15.jpg)
Molecular Descriptors
• Simple counts of features, e.g. of atoms, rings, H-bond donors, molecular weight
• Physicochemical properties, e.g. polarisability, hydrophobicity (logP), water-solubility
• Group properties, e.g. Hammett and Taft constants, volume
• 2D Fingerprints based on fragments
• 3D Screens based on fragments
![Page 16: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/16.jpg)
2D Fingerprints
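[Structure: capsaicin analog scaffold with X = Br]

| C | N | O | P | S | X | F | Cl | Br | I | Ph | CO | NH | OH | Me | Et | Py | CHO | SO | C=C | C≡C | C=N | Am | Im |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |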
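A fragment-based fingerprint is a fixed list of keys with one bit per key. The sketch below encodes the bromo analog by hand: the set of fragments present is assigned manually here (a real program would detect them by substructure search), the triple-bond key is written as C≡C, and the function name is illustrative.

```python
# Fragment keys in the order used on the slide
KEYS = ["C", "N", "O", "P", "S", "X", "F", "Cl", "Br", "I", "Ph", "CO",
        "NH", "OH", "Me", "Et", "Py", "CHO", "SO", "C=C", "C≡C", "C=N",
        "Am", "Im"]

def fingerprint(fragments_present, keys=KEYS):
    """Return a list of 0/1 bits, one per key, set where the fragment occurs."""
    unknown = set(fragments_present) - set(keys)
    if unknown:
        raise ValueError(f"fragments not in key list: {sorted(unknown)}")
    return [1 if key in fragments_present else 0 for key in keys]

# Fragments assigned by hand for the bromo-substituted analog (assumption)
bromo_analog = {"C", "N", "O", "X", "Br", "Ph", "CO", "NH", "OH", "Me",
                "C=C", "Am"}
bits = fingerprint(bromo_analog)

if __name__ == "__main__":
    print(" ".join(str(b) for b in bits))
```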
![Page 17: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/17.jpg)
Principal Component Analysis (PCA)
• Many (>3) variables to describe objects = high dimensionality of descriptor data
• PCA is used to reduce dimensionality
• PCA extracts the most important factors (principal components or PCs) from the data
• Useful when correlations exist between descriptors
• The result is a new, small set of variables (PCs) which explain most of the data variation
![Page 18: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/18.jpg)
PCA – From 2D to 1D
![Page 19: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/19.jpg)
PCA – From 3D to 3D-
![Page 20: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/20.jpg)
Different Views on PCA
• Statistically, PCA is a multivariate analysis technique closely related to eigenvector analysis
• In matrix terms, PCA is a decomposition of matrix X into two smaller matrices plus a set of residuals: $X = TP^{\mathsf{T}} + R$
• Geometrically, PCA is a projection technique in which X is projected onto a subspace of reduced dimensions
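The matrix view can be illustrated with a one-component decomposition of mean-centred data into scores t, loadings p and residuals R. This is a minimal sketch under stated assumptions: it uses power iteration on the cross-product matrix rather than a library eigen-solver, centres but does not scale the data, and the example data are made up.

```python
import math

def center(X):
    """Subtract the column means from every row."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]

def first_component(Xc, iterations=200):
    """Loading vector p of the first PC by power iteration on Xc^T Xc."""
    d = len(Xc[0])
    C = [[sum(r[i] * r[j] for r in Xc) for j in range(d)] for i in range(d)]
    p = [1.0 / (i + 1) for i in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        q = [sum(C[i][j] * p[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in q))
        if norm == 0.0:
            break
        p = [v / norm for v in q]
    return p

def decompose(X):
    """Return scores t, loadings p and residuals R so that Xc = t p^T + R."""
    Xc = center(X)
    p = first_component(Xc)
    t = [sum(v * w for v, w in zip(row, p)) for row in Xc]
    R = [[v - ti * w for v, w in zip(row, p)] for row, ti in zip(Xc, t)]
    return t, p, R

def explained_fraction(X, t):
    """Share of the total centred sum of squares captured by the scores."""
    total = sum(v * v for row in center(X) for v in row)
    return sum(ti * ti for ti in t) / total

if __name__ == "__main__":
    # Two strongly correlated, made-up descriptors: 2D data reduced to 1D
    X = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.9]]
    t, p, R = decompose(X)
    print("loadings:", [round(v, 3) for v in p])
    print("explained:", round(explained_fraction(X, t), 4))
```

When the descriptors are strongly correlated, as in the example data, one component captures nearly all of the variation and the residuals are small.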
![Page 21: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/21.jpg)
Partial Least Squares (PLS)
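$$
\begin{aligned}
y_1 &= a_0 + a_1 x_{11} + a_2 x_{12} + a_3 x_{13} + \dots + e_1 && \text{(compound 1)} \\
y_2 &= a_0 + a_1 x_{21} + a_2 x_{22} + a_3 x_{23} + \dots + e_2 && \text{(compound 2)} \\
y_3 &= a_0 + a_1 x_{31} + a_2 x_{32} + a_3 x_{33} + \dots + e_3 && \text{(compound 3)} \\
&\;\;\vdots \\
y_n &= a_0 + a_1 x_{n1} + a_2 x_{n2} + a_3 x_{n3} + \dots + e_n && \text{(compound n)}
\end{aligned}
$$
$$Y = XA + E$$
X = independent variables
Y = dependent variables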
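PLS estimates the coefficients of Y = XA + E through a small number of latent components instead of solving for A directly. The following is a minimal sketch of single-response PLS (often called PLS1) using NIPALS-style deflation; it is one common formulation, not necessarily the variant the lecture had in mind, and all function names and the toy data are assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_fit(X, y, n_components):
    """Fit PLS1 by extracting components and deflating X and y in turn."""
    n = len(X)
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # centred X
    f = [v - y_mean for v in y]                               # centred y
    components = []
    for _ in range(n_components):
        w = [dot(col, f) for col in zip(*E)]        # weights, proportional to X^T y
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:                            # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]              # scores
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in zip(*E)]   # X loadings
        q = dot(f, t) / tt                          # y loading
        E = [[v - ti * pj for v, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [v - q * ti for v, ti in zip(f, t)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict y for one descriptor vector x by replaying the deflation."""
    x_mean, y_mean, components = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [v - t * pj for v, pj in zip(e, p)]
    return y_hat

if __name__ == "__main__":
    # Toy data generated from y = 1 + 2*x1 - x2 (hypothetical)
    X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]]
    y = [1.0, 4.0, 2.0, 6.0, 3.0]
    model = pls1_fit(X, y, n_components=2)
    print(round(pls1_predict(model, [6.0, 2.0]), 3))
```

With as many components as there are independent descriptors, PLS1 reproduces the ordinary least-squares solution; its practical value lies in using fewer components when descriptors are numerous and correlated.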
![Page 22: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/22.jpg)
PLS – Cross-validation
• Squared correlation coefficient R²
  • Value between 0 and 1 (desired: > 0.9)
  • Indicating explanative power of regression equation
With cross-validation:
• Squared correlation coefficient Q²
  • Value between 0 and 1 (desired: > 0.5)
  • Indicating predictive power of regression equation
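The difference between the fitted and the cross-validated coefficient can be sketched with leave-one-out cross-validation. For brevity this example uses a one-descriptor least-squares line in place of PLS, and computes Q² as 1 − PRESS/SS with SS taken about the overall mean, which is one common convention; the data are made up. For a poorly predictive model this Q² can come out negative.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for one descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def total_sum_of_squares(ys):
    my = sum(ys) / len(ys)
    return sum((y - my) ** 2 for y in ys)

def r_squared(xs, ys):
    """R^2: fit on all compounds, evaluate on the same compounds."""
    a, b = fit_line(xs, ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)

def q_squared_loo(xs, ys):
    """Q^2: each compound is predicted by a model fitted without it."""
    press = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    return 1.0 - press / total_sum_of_squares(ys)

if __name__ == "__main__":
    # Hypothetical descriptor values and activities
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    ys = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
    print(f"R2 = {r_squared(xs, ys):.3f}, Q2 = {q_squared_loo(xs, ys):.3f}")
```

Q² never exceeds R² for a least-squares fit, which is why the acceptance threshold for Q² is set lower.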
![Page 23: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/23.jpg)
Free-Wilson Analysis
$$\log(1/C) = \sum_i a_i x_i + \mu$$
$x_i$: presence of group i (0 or 1)
$a_i$: activity group contribution of group i
$\mu$: activity value of unsubstituted compound
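Evaluating a Free-Wilson model is a sum of the contributions of the groups present plus the activity of the unsubstituted compound. In this sketch the group names, contribution values and reference activity are hypothetical, chosen only to show the arithmetic.

```python
def free_wilson_activity(groups_present, contributions, mu):
    """log(1/C) = sum of a_i over the groups present (x_i = 1) plus mu."""
    missing = [g for g in groups_present if g not in contributions]
    if missing:
        # No contribution was fitted for these groups, so no prediction
        raise KeyError(f"no contribution available for: {missing}")
    return mu + sum(contributions[g] for g in groups_present)

# Hypothetical group contributions a_i and reference activity mu
contributions = {"4-Cl": 0.40, "3-Me": -0.15, "4-OMe": 0.25}
mu = 5.00

if __name__ == "__main__":
    print(free_wilson_activity(["4-Cl", "3-Me"], contributions, mu))
```

The `KeyError` branch reflects a built-in limit of the approach: a substituent that was not in the training set has no fitted contribution.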
![Page 24: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/24.jpg)
Free-Wilson Analysis
+ Computationally straightforward
– Predictions only for substituents already included
– Requires large number of compounds
![Page 25: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/25.jpg)
Hansch Analysis
Drug transport and binding affinity depend nonlinearly on lipophilicity:
$$\log(1/C) = a\,(\log P)^2 + b \log P + c\,\sigma + k$$
P: n-octanol/water partition coefficient
σ: Hammett electronic parameter
a, b, c: regression coefficients
k: constant term
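Because the Hansch equation is quadratic in log P, a negative coefficient a implies an optimum lipophilicity at log P = −b / (2a), found by setting the derivative with respect to log P to zero. The sketch below evaluates the equation and that optimum; the coefficient values are hypothetical.

```python
def hansch_activity(log_p, sigma, a, b, c, k):
    """log(1/C) = a*(log P)^2 + b*log P + c*sigma + k."""
    return a * log_p ** 2 + b * log_p + c * sigma + k

def optimal_log_p(a, b):
    """log P at which the parabola peaks; requires a < 0."""
    if a >= 0:
        raise ValueError("a must be negative for the parabola to have a maximum")
    return -b / (2.0 * a)

if __name__ == "__main__":
    # Hypothetical regression coefficients, for illustration only
    a, b, c, k = -0.5, 2.0, 1.0, 3.0
    best = optimal_log_p(a, b)
    print("optimal log P:", best)
    print("activity at optimum (sigma = 0):", hansch_activity(best, 0.0, a, b, c, k))
```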
![Page 26: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/26.jpg)
Hansch Analysis
+ Fewer regression coefficients needed for correlation
+ Interpretation in physicochemical terms
+ Predictions for other substituents possible
![Page 27: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/27.jpg)
Pharmacophore
• Set of structural features in a drug molecule recognized by a receptor
• Sample features: H-bond donor, charge, hydrophobic center
• Distances, 3D relationship
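A pharmacophore of a few features can be represented by the distances between them. The sketch below computes the pairwise distances for three features and compares a candidate against a reference within a tolerance. The coordinates, the feature labels used as dictionary keys and the 0.5 Å tolerance are assumptions for illustration; `math.dist` requires Python 3.8 or later.

```python
import math
from itertools import combinations

def feature_distances(features):
    """Map each pair of feature labels to the distance between them."""
    return {(a, b): math.dist(features[a], features[b])
            for a, b in combinations(features, 2)}

def matches(candidate, reference, tolerance=0.5):
    """True if every reference distance is reproduced within the tolerance."""
    return all(abs(candidate[pair] - reference[pair]) <= tolerance
               for pair in reference)

# Hypothetical 3D positions (in angstrom) of the pharmacophore features
reference_features = {"L": (0.0, 0.0, 0.0),
                      "PD": (3.0, 0.0, 0.0),
                      "D": (3.0, 4.0, 0.0)}
candidate_features = {"L": (0.0, 0.0, 0.0),
                      "PD": (3.2, 0.0, 0.0),
                      "D": (3.2, 4.1, 0.0)}

if __name__ == "__main__":
    reference = feature_distances(reference_features)
    candidate = feature_distances(candidate_features)
    print(reference)
    print("candidate matches:", matches(candidate, reference))
```

Comparing distances rather than coordinates makes the test independent of how each molecule is positioned in space.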
![Page 28: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/28.jpg)
Pharmacophore Selection
L = lipophilic site; A = H-bond acceptor; D = H-bond donor; PD = protonated H-bond donor
[Figure: dopamine pharmacophore with features L, PD and D separated by distances d1, d2 and d3, shown alongside candidate structures]
![Page 29: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/29.jpg)
Pharmacophore Selection
L = lipophilic site; A = H-bond acceptor; D = H-bond donor; PD = protonated H-bond donor
[Figure: dopamine pharmacophore (features L, PD and D; distances d1, d2 and d3) shown with the candidate structures]
![Page 30: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/30.jpg)
Comparative Molecular Field Analysis (CoMFA)
• Set of chemically related compounds
• Common pharmacophore or substructure required
• 3D structures needed (e.g., Corina-generated)
• Flexible molecules are “folded” into pharmacophore constraints and aligned
![Page 31: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/31.jpg)
CoMFA Alignment
[Figure: structures with rings labelled A to D aligned on a common "pharmacophore" of three lipophilic sites (L) separated by distances d1, d2 and d3]
![Page 32: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/32.jpg)
CoMFA Grid and Field Probe
(Only one molecule shown for clarity)
![Page 33: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/33.jpg)
Electrostatic Potential Contour Lines
![Page 34: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/34.jpg)
CoMFA Model Derivation
• Molecules are positioned in a regular grid according to alignment
• Probes are used to determine the molecular field:
Van der Waals field (probe is neutral carbon)
$$E_{vdw} = \sum_i \left( A_i\, r_{ij}^{-12} - B_i\, r_{ij}^{-6} \right)$$
Electrostatic field (probe is charged atom)
$$E_c = \sum_i \frac{q_i q_j}{D\, r_{ij}}$$
3D Contour Map for Electronegativity
![Page 36: Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott](https://reader030.vdocuments.site/reader030/viewer/2022033022/56649d565503460f94a33db5/html5/thumbnails/36.jpg)
CoMFA Pros and Cons
+ Suitable to describe receptor-ligand interactions
+ 3D visualization of important features
+ Good correlation within related set
+ Predictive power within scanned space
– Alignment is often difficult
– Training required