![Page 1: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/1.jpg)
A Network Visualization of Structure Activity Landscapes
Rajarshi Guha, NIH Chemical Genomics Center
March 24, 2010, National ACS Meeting, San Francisco
![Page 2: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/2.jpg)
Structure Activity Relationships
• Similar molecules will have similar activities
• Small changes in structure will lead to small changes in activity
• One implication is that SARs are additive
• This is the basis for QSAR modeling
Martin, Y.C. et al., J. Med. Chem., 2002, 45, 4350–4358
![Page 3: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/3.jpg)
Exceptions Are Easy to Find
Tran, J.A. et al., Bioorg. Med. Chem. Lett., 2007, 15, 5166–5176
Ki = 39.0 nM Ki = 1.8 nM
Ki = 10.0 nM Ki = 1.0 nM
![Page 4: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/4.jpg)
Structure Activity Landscapes
• Rugged gorges or rolling hills?
– Small structural changes associated with large activity changes represent steep slopes in the landscape
– But traditionally, QSAR assumes gentle slopes
– Machine learning is not very good for special cases
Maggiora, G.M., J. Chem. Inf. Model., 2006, 46, 1535–1535
![Page 5: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/5.jpg)
Structure Activity Landscapes
![Page 6: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/6.jpg)
Characterizing the Landscape
• A cliff can be numerically characterized
• Structure Activity Landscape Index (SALI)

$$\mathrm{SALI}_{i,j} = \frac{|A_i - A_j|}{1 - \mathrm{sim}(i,j)}$$

• Cliffs are characterized by elements of the matrix with very large values
Guha, R.; Van Drie, J.H., J. Chem. Inf. Model., 2008, 48, 646–658
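The SALI definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `sali_matrix`, the example activities (assumed to be on a log scale such as pIC50) and the similarity matrix are all hypothetical. Pairs with similarity exactly 1 are assigned infinity when their activities differ, which is one possible convention.

```python
import math

def sali_matrix(activities, similarity):
    """Pairwise SALI values: |A_i - A_j| / (1 - sim(i, j))."""
    n = len(activities)
    sali = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            diff = abs(activities[i] - activities[j])
            denom = 1.0 - similarity[i][j]
            if denom == 0.0:
                # Identical fingerprints: the ratio is undefined, so we
                # (by assumption) use infinity unless the activities match.
                value = math.inf if diff > 0 else 0.0
            else:
                value = diff / denom
            sali[i][j] = sali[j][i] = value
    return sali

# Hypothetical data: three compounds with made-up activities and similarities.
activities = [6.0, 8.0, 6.5]
similarity = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.3],
    [0.2, 0.3, 1.0],
]
sali = sali_matrix(activities, similarity)
```

In this toy data, compounds 0 and 1 are very similar but differ by two log units, so their SALI value (about 20) dominates the matrix and marks the pair as a cliff.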
![Page 7: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/7.jpg)
Fingerprints
• Lots of types of fingerprints
• Each bit indicates the presence or absence of a structural feature
• Length can vary from 166 to 4096 bits or more
• Fingerprints are usually compared using the Tanimoto metric
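As a rough illustration of the Tanimoto comparison, the sketch below represents each fingerprint as the set of its "on" bit positions. The bit positions are made up, and returning 0 for two empty fingerprints is an assumed convention.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Two empty fingerprints: similarity is undefined, 0.0 is assumed here.
        return 0.0
    return len(a & b) / union

# Hypothetical fingerprints given as sets of on-bit positions.
fp1 = {1, 5, 9, 42}
fp2 = {1, 5, 9, 77}
sim = tanimoto(fp1, fp2)  # 3 shared bits out of 5 distinct bits
```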
![Page 8: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/8.jpg)
Visualizing the SALI Matrix
![Page 9: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/9.jpg)
Visualizing SALI Values
• Alternatives?
– A heatmap is an easy-to-understand visualization
– Coupled with brushing, it can be a handy tool
– A more flexible approach is to consider a network view of the matrix
• The SALI graph
– Compounds are nodes
– Nodes i,j are connected if SALI(i,j) > X
– Only display connected nodes
![Page 10: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/10.jpg)
Visualizing the SALI Graph
• Nodes are ordered such that the tail node in an edge has lower activity than the head node
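The graph construction can be sketched as follows, under stated assumptions: the function name `sali_graph` and the data are hypothetical, the cutoff is an absolute SALI value, and larger activity numbers are taken to mean "more active" (as with pIC50). Each edge runs from the less active compound (tail) to the more active one (head), and compounds with no edge are dropped.

```python
def sali_graph(activities, sali, cutoff):
    """Return (nodes, edges) where edges are (tail, head, sali_value)."""
    n = len(activities)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if sali[i][j] > cutoff:
                # Orient the edge from lower to higher activity.
                if activities[i] <= activities[j]:
                    tail, head = i, j
                else:
                    tail, head = j, i
                edges.append((tail, head, sali[i][j]))
    # Only compounds that take part in at least one edge are displayed.
    nodes = sorted({v for tail, head, _ in edges for v in (tail, head)})
    return nodes, edges

# Hypothetical activities and SALI matrix for four compounds.
activities = [6.0, 8.0, 6.5, 6.1]
sali = [
    [0.0, 20.0, 0.625, 0.2],
    [20.0, 0.0, 2.14, 0.9],
    [0.625, 2.14, 0.0, 0.5],
    [0.2, 0.9, 0.5, 0.0],
]
nodes, edges = sali_graph(activities, sali, cutoff=2.0)
```

With this cutoff, compound 3 has no SALI value above the threshold and so does not appear in the graph.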
![Page 11: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/11.jpg)
Varying the Cutoff
• The cutoff controls the complexity of the graph
• Higher cutoffs will highlight the most significant activity cliffs
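To see how the cutoff controls graph complexity, one can count edges at several cutoffs. The slides do not say how a percentage cutoff maps to a SALI value, so this sketch assumes it is a fraction of the largest (finite) SALI value in the matrix; the function name and data are hypothetical.

```python
def edge_count_by_cutoff(sali, fractions):
    """Number of SALI graph edges at each cutoff (fraction of the max value)."""
    n = len(sali)
    # Upper triangle only: each compound pair is counted once.
    values = [sali[i][j] for i in range(n) for j in range(i + 1, n)]
    top = max(values)  # assumes all SALI values are finite
    counts = {}
    for fraction in fractions:
        threshold = fraction * top
        counts[fraction] = sum(1 for v in values if v > threshold)
    return counts

# Hypothetical SALI matrix for four compounds.
sali = [
    [0.0, 20.0, 0.625, 0.2],
    [20.0, 0.0, 2.14, 0.9],
    [0.625, 2.14, 0.0, 0.5],
    [0.2, 0.9, 0.5, 0.0],
]
counts = edge_count_by_cutoff(sali, [0.0, 0.1, 0.5])
```

The edge count can only stay the same or fall as the cutoff rises, which is why high cutoffs leave just the steepest cliffs.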
![Page 12: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/12.jpg)
Varying Fingerprint Methods
• Shorter fingerprints will lead to more “similar” pairs
• Requires a higher cutoff to focus on significant cliffs
![Page 13: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/13.jpg)
Varying the Similarity Metric
![Page 14: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/14.jpg)
Different Molecular Representations
• The nature of the representation can significantly affect the landscape
• SALI matrices for a benzodiazepine dataset, generated using a 2D and a 3D representation
Sutherland, J.J. et al., J. Chem. Inf. Comput. Sci., 2003, 43, 1906–1915
![Page 15: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/15.jpg)
Different Activity Representations
• Using the Hill parameters from a dose-response curve represents richer data than a single IC50
$$P = \{S_0,\ S_{\mathrm{inf}},\ \mathrm{AC}_{50},\ H\}$$

$$\mathrm{SALI}_{i,j} = \frac{d(P_i, P_j)}{1 - \mathrm{sim}(i,j)}$$
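A minimal sketch of the dose-response variant follows. The slide does not specify the distance function d, so Euclidean distance between the four Hill parameters is an assumption here, as are the function names and the parameter values. In practice the parameters have different units and would probably need scaling before a distance is meaningful.

```python
import math

def hill_distance(p_i, p_j):
    """Euclidean distance between Hill parameter vectors (assumed choice of d)."""
    return math.dist(p_i, p_j)  # requires Python 3.8+

def sali_drc(p_i, p_j, sim):
    """SALI using the distance between dose-response parameters as numerator."""
    d = hill_distance(p_i, p_j)
    denom = 1.0 - sim
    if denom == 0.0:
        # Identical fingerprints: infinity is an assumed convention.
        return math.inf if d > 0 else 0.0
    return d / denom

# Hypothetical parameter vectors in the order (S0, Sinf, AC50 on a log scale, H).
p_a = (0.0, 100.0, 5.0, 1.0)
p_b = (0.0, 97.0, 9.0, 1.0)
value = sali_drc(p_a, p_b, sim=0.5)
```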
![Page 16: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/16.jpg)
SALI Curves from DRCs
• No difference in major cliffs
• Some of the minor cliffs are highlighted using the DRC instead of IC50
Panels: IC50 and DRC
![Page 17: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/17.jpg)
Better Visualization - SALIViewer
http://sali.rguha.net
![Page 18: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/18.jpg)
Glucocorticoid Inhibitors
• 62 dihydroquinoline derivatives
• IC50s reported, some values were censored
• 50% SALI graph generated using 1052-bit BCI fingerprints
Takahashi, H. et al., Bioorg. Med. Chem. Lett., 2007, 17, 5091–5095
![Page 19: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/19.jpg)
Glucocorticoid Inhibitors
![Page 20: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/20.jpg)
Glucocorticoid Inhibitors
• Moving from allyl or phenylethyl to ethyl causes a 6-fold increase in activity
• Reducing bulk at this position seems to improve activity
• But ethyl is not much smaller than allyl
• We need more detail
07-20 2000 nM
07-23 2000 nM
07-17 355 nM
![Page 21: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/21.jpg)
Glucocorticoid Inhibitors
Generated using a 30% cutoff
![Page 22: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/22.jpg)
Glucocorticoid Inhibitors
• Suggests that electron density is also important
• Lower π density possibly correlates to increased activity
• Confirmed by 07-23 → 07-18
• 07-15 → 07-17 is interesting since the change increases the bulk
07-20 2000 nM
07-17 355 nM
07-18 710 nM
07-15 2000 nM
![Page 23: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/23.jpg)
Glucocorticoid Inhibitors
• These observations match those made by Takahashi et al.
• More detailed graphs exhibit longer paths that focus on the bulk of side chains at the C4–α position
• A number of paths consider changes to the epoxide substitution
– Usually of length 1
– Highlights the fact that bulk at the C4–α has greater impact on activity than epoxide substitutions
• The SALI graph stresses the non-linearity of the SAR
![Page 24: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/24.jpg)
SALI Graphs & Predictive Models
• The graph view allows us to view SARs and identify trends easily
• The aim of a QSAR model is to encode SARs
• Traditionally, we consider the quality of a model in terms of RMSE or R²
• But in general, we’re not as interested in RMSEs as we are in whether the model predicted something as more active than something else
– What we want to have is the correct ordering
– We assume the model is statistically significant
![Page 25: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/25.jpg)
Measuring Model Quality
• A QSAR model should easily encode the “rolling hills”
• A good model captures the most significant cliffs
• Can be formalized as
How many of the edge orderings of a SALI graph does the model predict correctly?
• Define S(X), representing the number of edges correctly predicted for a SALI network at a threshold X
• Repeat for varying X and obtain the SALI curve
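The SALI curve can be sketched as a loop over cutoffs. This follows the slide's wording, where S(X) is the count of correctly ordered edges. The function name, the data, the use of absolute SALI values as cutoffs and the "larger value means more active" convention are all assumptions. The total edge count is returned too, so the count can be normalized if desired.

```python
def sali_curve(observed, predicted, sali, cutoffs):
    """For each cutoff X, return (X, edges ordered correctly, total edges)."""
    n = len(observed)
    curve = []
    for x in cutoffs:
        correct = 0
        total = 0
        for i in range(n):
            for j in range(i + 1, n):
                if sali[i][j] <= x:
                    continue  # pair is not an edge at this cutoff
                # Edge runs from the less active (tail) to the more active (head).
                tail, head = (i, j) if observed[i] <= observed[j] else (j, i)
                total += 1
                if predicted[tail] < predicted[head]:
                    correct += 1
        curve.append((x, correct, total))
    return curve

# Hypothetical observed and predicted activities for four compounds.
observed = [6.0, 8.0, 6.5, 6.1]
predicted = [6.4, 7.5, 6.9, 6.3]
sali = [
    [0.0, 20.0, 0.625, 0.2],
    [20.0, 0.0, 2.14, 0.9],
    [0.625, 2.14, 0.0, 0.5],
    [0.2, 0.9, 0.5, 0.0],
]
curve = sali_curve(observed, predicted, sali, cutoffs=[0.0, 2.0, 10.0])
```

In this toy case the model misorders only the shallow pair (0, 3), so it gets every edge right once the cutoff is high enough to exclude that pair.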
![Page 26: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/26.jpg)
SALI Curves
![Page 27: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/27.jpg)
QSAR Model Comparisons
• QSAR has traditionally used statistical or machine learning methods
• But ‘Q’ is for quantitative – lots of ways to get a quantitative model
• A model can encode a SAR in two ways
– Implicit (via surrogates)
– Explicit (via a physical model)
![Page 28: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/28.jpg)
QSAR Model Comparisons
Derived from a docking model Derived from a pharmacophore model
Holloway, K. et al., J. Med. Chem., 1995, 38, 305–317
Cavalli, A. et al., J. Med. Chem., 2002, 45, 3844–3853
![Page 29: A Network Visualization of Structure Activity Landscapes](https://reader030.vdocuments.site/reader030/viewer/2022013003/554b8f22b4c905463d8b458f/html5/thumbnails/29.jpg)
Acknowledgements
• John Van Drie
• Gerry Maggiora
• Mic Lajiness
• Jurgen Bajorath