www.ccdc.cam.ac.uk
UNIVERSITY OF CAMBRIDGE
Peter T.A. Galek • Cambridge Crystallographic Data Centre
24th March 2009 – ACS Spring Meeting, Salt Lake City, USA
Hydrogen Bond Propensities: Knowledge-based Predictions to Aid Pharmaceutical Solid Form Selection
Outline
► Present H-bond propensity model screens for 2 major APIs
► How the theory is applied
► The role of propensity screens in drug development
► What is an H-bond propensity model?
► Future Directions
Structural Stability
• What structural features are conserved in comparing two modifications?
– Atom/group interactions
– Conformational changes?
– Hydrogen bonding?
– Other comparisons?
P.T.A. Galek, L. Fábián & F. H. Allen, 2009. Acta Cryst. B65, 68-85
Spiperone (FBPAZD : FBPAZD01)
Predicting H-bonds?
• Assumption: unusual/metastable forms can display weaker (≡ less likely) hydrogen bonding
• Contrasts are often subtle (e.g. a strong synthon is always observed, while a third interaction changes upon polymorphism)
• Use all H-bonds as a polymorph screen: create a flexible measure of crystal stability
• Apply the CSD as an H-bond knowledge base (Feb ’09: #472,200)
Predicting H-bonds?
Peter T. A. Galek, László Fábián, W. D. Samuel Motherwell, Frank H. Allen and Neil Feeder, Acta Cryst. B63, 768-782, 2007
“Predict which donors and acceptors form hydrogen bonds in a crystal structure…”
“…Identify both likely and unusual hydrogen bonding.”
Predicting H-bonds?
• Model H-bonds as binary (true/false) distribution
– Pairs that do and do not interact
– Logit model
– Built on a set of descriptors that try to capture the “influential chemistry”
– Regression procedure optimises their contribution
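The binary logit model outlined above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the two descriptors, their values and the "true" coefficients are synthetic, the names `sigmoid` and `fit_logit` are hypothetical, and simple gradient ascent stands in for whatever regression procedure the authors actually used.

```python
import math
import random


def sigmoid(z):
    """Inverse logit, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logit(X, y, lr=0.1, epochs=1000):
    """Fit intercept alpha and coefficients beta by batch gradient ascent
    on the mean log-likelihood of a binary (true/false) outcome."""
    n, k = len(X), len(X[0])
    alpha, beta = 0.0, [0.0] * k
    for _ in range(epochs):
        g_alpha, g_beta = 0.0, [0.0] * k
        for xi, yi in zip(X, y):
            err = yi - sigmoid(alpha + sum(b * x for b, x in zip(beta, xi)))
            g_alpha += err
            for j in range(k):
                g_beta[j] += err * xi[j]
        alpha += lr * g_alpha / n
        beta = [b + lr * g / n for b, g in zip(beta, g_beta)]
    return alpha, beta


# Synthetic data only: two made-up descriptors per donor-acceptor pair,
# both constructed so that larger values make an H-bond less likely.
rng = random.Random(0)
X = [[rng.uniform(0, 3), rng.uniform(0, 3)] for _ in range(400)]
y = [1 if rng.random() < sigmoid(1.5 - 1.0 * a - 0.8 * b) else 0 for a, b in X]

alpha, beta = fit_logit(X, y)
print("alpha =", round(alpha, 2), "beta =", [round(b, 2) for b in beta])
```

Each row of `X` plays the role of one donor-acceptor pair and `y` records whether that pair interacts; the fitted coefficients express how strongly each descriptor shifts the log odds.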
Creating the Tool
[Flowchart boxes: Start: Structure Analysis; Process Data (True/False? Molecular descriptors?)]
Model Descriptors: Capturing Influences
• For the best predictivity, influences on H-bond formation need to be well described
• Descriptors are key to the method
– H-bond competition
– Atom accessibility / steric effects
– D & A chemical type
– π-stacking / donor-π interactions
– …
• Influences change per system
– Descriptors redundant
– Highly correlated
– Need for flexibility
• Aim for truly predictive approach
[Figure: Caffeine, Theobromine]
Creating the Tool
[Flowchart boxes: Start: Structure Analysis; Process Data (True/False? Molecular descriptors?); Process Observations; Logistic Regression; Model f(π)]
Linking Observations to Predictions
• Model equation approximates log odds of prediction.
– Assumes linear function
$$\operatorname{logit}(\pi_i) = \log\left(\frac{\pi_i}{1-\pi_i}\right) = \alpha + \sum_k \beta_k x_{ik}$$
• π predictions obtained from inversion of logit. Gives function:
$$\pi_i = \frac{\exp\left(\alpha + \sum_k \beta_k x_{ik}\right)}{1 + \exp\left(\alpha + \sum_k \beta_k x_{ik}\right)}$$
Peter T. A. Galek, László Fábián, W. D. Samuel Motherwell, Frank H. Allen and Neil Feeder, Acta Cryst. B63, 768-782, 2007
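The inversion step on this slide can be spelled out as a short derivation. The shorthand η for the linear predictor is introduced here for convenience and is not notation from the slides; the subscript i (one observation) and the index k (one descriptor) are a reading of the slide's indices.

```latex
\begin{align*}
\eta_i &= \alpha + \sum_k \beta_k x_{ik}
  && \text{(linear predictor, shorthand)} \\
\log\left(\frac{\pi_i}{1-\pi_i}\right) &= \eta_i
  && \text{(model equation: log odds)} \\
\frac{\pi_i}{1-\pi_i} &= e^{\eta_i}
  && \text{(exponentiate both sides)} \\
\pi_i \left(1 + e^{\eta_i}\right) &= e^{\eta_i}
  && \text{(multiply out and collect } \pi_i\text{)} \\
\pi_i &= \frac{e^{\eta_i}}{1 + e^{\eta_i}}
  && \text{(propensity, always between 0 and 1)}
\end{align*}
```

Because the result is bounded between 0 and 1 for any value of the linear predictor, it can be read directly as a probability of H-bond formation.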
Critical Assessment
• Model statistics
Creating the Tool
[Flowchart boxes: Start: Structure Analysis; Process Data (True/False? Molecular descriptors?); Process Observations; Logistic Regression; Model f(π); Target Prediction; Screening (A vs. B?); In Silico & In Vitro]
Application I: Indomethacin
• 2 known polymorphic forms characterised in literature.
– Archived in CSD
– INDMET02, INDMET03
• Predict interactions with the methodology using existing crystal structures…
Identify donors and acceptors: carboxyl, tertiary amide, methoxy, chlorine
Generate related crystal set: 1333 CSD structures.
Extract model data: 1616 H-bond observations (35% true; 65% false). Convergence after 9 iterations. AUC = 84.2%. 3% predictivity loss under 30% random hold-out validation
Perform predictions on target…
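The two validation figures quoted above (AUC and random hold-out) can be illustrated with a small sketch. The 1616 observations and the 30% hold-out fraction come from the slide; the toy scores, the seed and the function names `auc` and `holdout_split` are assumptions, and the slide does not say exactly how "predictivity loss" was computed, so that step is not reproduced here.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen true pair scores higher than a randomly chosen false pair
    (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def holdout_split(n_items, holdout_fraction, seed=0):
    """Randomly partition indices 0..n_items-1 into (train, test) lists."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    cut = int(round(n_items * holdout_fraction))
    return idx[cut:], idx[:cut]


# Toy predicted propensities and true/false outcomes (illustrative only).
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)

# Split sized like the indomethacin model data: 1616 observations, 30% held out.
train_idx, test_idx = holdout_split(1616, 0.30)
print(round(toy_auc, 3), len(train_idx), len(test_idx))
```

In a hold-out check the model would be refitted on `train_idx` only and its AUC on `test_idx` compared with the AUC of the full model.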
Indomethacin: Predictions
• Carboxyl interaction clearly most likely (suggests 2-fold ring motif)
• There is a 50:50 chance that the amide will accept
• The remaining acceptors have a low likelihood
• Individual parameters show that the accepting ability of carboxyl and amide is about the same (coefficients = 1.273, 1.217)
• Sterics around the acceptors differentiate them in this target (1.9, 3.167)
• Expect γ form most stable…
Form γ: P-1
Form α: P21
• α form converts to γ form at 152-154 °C
• TGA and DSC analysis
• Melting point of γ form: 160-161 °C
Lin S.-Y. (2006). J. Pharm. Sci. 6, 572-576.
Chen, X. et al. (2002). J. Am. Chem. Soc., 124, 15012-15019.
Indomethacin: Form I vs. II
• Five potential hydrogen bonds; 1 acceptor dominates, 2 acceptors very weak
• Stable structure likely to have strongest carboxyl donor and acceptor paired.
• Next most likely pairing (carboxyl-amide) denies the strongest acceptor.
– High Z′ structures can allow combinations
• Seem rather suboptimal
– Other Z′ = 1 structures very unlikely
• γ Form with most probable H-bonding is revealed as most stable form
Application II: Ritonavir
Which form is likely to be more stable?
• Pair-wise predictions:
Generate related crystal set: 836 CSD structures.
Extract model data: 8731 H-bond observations (36% true; 64% false). Convergence after 9 iterations. AUC = 83.2%. 1% predictivity loss under 40% random hold-out validation
Perform predictions on target…
Form I vs. Form II…
Galek et al. Angew. Chem. Int. Ed. 2009 submitted
Form I vs. Form II H-bond Geometries
Galek et al. Angew. Chem. Int. Ed. 2009 submitted
Ritonavir: Form I vs. Form II
• Many likely pairings. Unlikely pairings are indicators in this system.
• 2 low-propensity H-bonds in Form I suggest that an alternative arrangement is possible.
• Given that the Form II structure, once discovered, displays probable and stable H-bonds, it is likely more stable.
• Form I is now known as the kinetic form; thermodynamics governs its disappearance.
• Form II, with the most probable H-bonding, is revealed as the most stable form.
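The screening logic on this slide (look for observed H-bonds with unusually low propensity, then compare forms) might be sketched as follows. All numbers and labels below are placeholders invented for illustration and are not the ritonavir results; the 0.3 cut-off and the helper names `flag_unlikely` and `mean_propensity` are likewise assumptions.

```python
def flag_unlikely(observed, threshold=0.3):
    """Return the observed donor-acceptor pairs whose predicted
    propensity falls below the chosen threshold."""
    return [(pair, p) for pair, p in observed.items() if p < threshold]


def mean_propensity(observed):
    """Average predicted propensity over the H-bonds a form actually shows."""
    return sum(observed.values()) / len(observed)


# Placeholder data: keys are (donor, acceptor) labels, values are predicted
# propensities of the H-bonds observed in each hypothetical form.
form_a = {("D1", "A1"): 0.72, ("D2", "A3"): 0.15, ("D3", "A4"): 0.10}
form_b = {("D1", "A1"): 0.72, ("D2", "A2"): 0.65, ("D3", "A3"): 0.58}

flags_a = flag_unlikely(form_a)
flags_b = flag_unlikely(form_b)
print(len(flags_a), len(flags_b))
print(round(mean_propensity(form_a), 3), round(mean_propensity(form_b), 3))
```

A form whose observed H-bonds include low-propensity pairs, while higher-propensity partners are left unused, is the kind of case the slide reads as a hint that a more stable arrangement may exist.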
Summary
• Relative polymorphic stability of 2 APIs correctly predicted by comparing H-bond propensities
• Chemically unrelated examples: flexible method has diverse potential application
– E.g. cocrystals; salts; large, flexible targets
• Inherent uncertainty estimation
– Important for risk analysis
• Scope for further development and application
– Intramolecular H-bonding, simplifying data extraction, linking to existing CCDC product base
Thank you
Collaborators: László Fábián CCDC, Pfizer Institute & CCDC
Frank Allen, Sam Motherwell CCDC
Neil Feeder Pfizer Global R&D
Model Descriptors
• Functional group labels
a: amino
b,c: amido
d: carboxyl
Peter T. A. Galek, László Fábián, W. D. Samuel Motherwell, Frank H. Allen and Neil Feeder, Acta Cryst B63, 768-782,2007
Model Descriptors
• Functional group labels
• Competition
e.g. primary amide-carboxyl(=O)
κc(N3, O2) = (5+7)/(2+2) = 3
[Equation: competition function κ, a ratio built from sums over the donor (D) and acceptor (A) counts]
Peter T. A. Galek, László Fábián, W. D. Samuel Motherwell, Frank H. Allen and Neil Feeder, Acta Cryst B63, 768-782,2007
Model Descriptors
• Functional group labels
• Competition
• Donor and acceptor steric density
e.g. indole
ρA(N1) = 10 / 7 = 1.43
[Equation: steric density function ρ, a sum (Σ) over terms r indexed by j]
Peter T. A. Galek, László Fábián, W. D. Samuel Motherwell, Frank H. Allen and Neil Feeder, Acta Cryst. B63, 768-782, 2007
Model Descriptors
• Functional group labels
• Competition
• Donor and acceptor steric density
• Aromaticity
a = 10 / 22 = 0.4545
$$a_c(i) = \frac{\sum \mathrm{bonds}_{\mathrm{arom.}}}{\sum \mathrm{bonds}_{\mathrm{pot.}}}$$
Peter T. A. Galek, László Fábián, W. D. Samuel Motherwell, Frank H. Allen and Neil Feeder, Acta Cryst B63, 768-782,2007
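The worked numbers on the descriptor slides are simple ratios, which can be checked with a few lines of Python. This sketch reads the competition example as (5+7)/(2+2); the helper name `ratio_descriptor` is hypothetical, and what each count physically represents is not restated here because the slides do not spell it out.

```python
def ratio_descriptor(numerator_terms, denominator_terms):
    """Generic ratio-type descriptor: sum of the numerator counts divided
    by the sum of the denominator counts."""
    return sum(numerator_terms) / sum(denominator_terms)


# Counts copied from the slides' worked examples.
kappa = ratio_descriptor([5, 7], [2, 2])   # competition example, kappa_c(N3, O2)
rho = ratio_descriptor([10], [7])          # steric density example, rho_A(N1), indole
arom = ratio_descriptor([10], [22])        # aromaticity example

print(kappa, round(rho, 2), round(arom, 4))
```

Values like these would form the descriptor entries x for one donor-acceptor pair in the logit model.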