Data-Driven Approaches to Medicinal Chemistry: How Large-Scale Normalized Data Empowers Drug Discovery

RICT 2015, Drug Discovery and Selection

Barberan Olivier, Senior Product Manager, Reaxys Medicinal Chemistry

July 1



The Lead Optimization Challenge:

Optimization of early substances into potential drugs

Potency & Selectivity

DMPK properties

Physical properties

Safety pharmacology

Opportunity: Knowledge-driven drug design using a structure-activity relationship knowledge base

• General descriptor-property relationships

• Sub-structural alerts

• QSAR

• Matched Molecular Pair analyses

• Predictive pharmacology

Etc…

Cumming, J.G., Davis, A.M. et al. Nat. Rev. Drug Disc. (2013) 12, 948–962

Data → Knowledge → Predictions

• Data normalization

• Taxonomies

• Quality control

Etc…

“Those who cannot remember the past are condemned to repeat it.” George Santayana: Life of Reason, Reason in Common Sense, Scribner's, 1905, page 284

• Integration of high-value data sources supporting lead finding and lead optimization

Elsevier Solution for Lead Optimization: Reaxys Medicinal Chemistry

Extract, Transform, Load

• Substances: 1 M; biological results: 3.5 M

• Substances: 3.5 M; biological results: 8 M

• Substances: 4.2 M; biological results: 22 M

• Substances: 6 M; biological results: 29 M

Data normalization (parameters, units, etc.)

Structure normalization

Taxonomies (targets, species, cell lines, tissues/organs, bioassays)

Reaxys Medicinal Chemistry Coverage

Substances

Chemical structure, name, code and synonyms of the compound, calculated physicochemical properties (log P, HBA, HBD, PSA, RotB), Lipinski rules

Druggable target

Explore target affinity patterns of chemical compounds

In vitro and cell-based assays

In vitro assays (binding, second messenger, etc.) and cell-based assays, for example: aggregation, angiogenesis, apoptosis, cell differentiation, cell cycle

Animal disease models

Zucker rats as an obesity model, ovariectomized rats in osteoporosis, treatment of glaucoma, tumor-xenografted animals to test antineoplastic drugs

Pharmacokinetic and ADME properties

Metabolic stability, intrinsic clearance, elimination half-life, bioavailability, in vivo clearance

Toxicity

Cytotoxicity, cardiotoxicity, chronic toxicity

Reaxys Medicinal Chemistry: Journal coverage

• 345,000 articles are included in Reaxys Medicinal Chemistry, corresponding to more than 5,000 journals from 1980 to the present.

• Some articles stored in Reaxys Medicinal Chemistry predate 1980.

• Elsevier and other publishers are covered.

• Medicinal chemistry journals are the cornerstone of Reaxys Medicinal Chemistry, but pharmacology, biology and chemistry journals are also included.

Reaxys Medicinal Chemistry: Patent excerption examples

Chemical diversity per target: JAK3 and NPY5

JAK3                   Substances   Diversity (B&M scaffolds)
RMC (patents only)     43365        16045
RMC (articles only)    3283         2199
RMC                    45715        17828
ChEMBL                 2443         1490

NPY5                   Substances   Diversity (B&M scaffolds)
RMC (patents only)     12698        5700
RMC (articles only)    2537         1014
RMC                    14544        5963
ChEMBL                 1483         652

Chart annotations: 95%, +914%, +1196%, 90%

- Patents increase the chemical diversity by around 1000% versus articles only

- Patents represent around 90% of the overall chemical diversity
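The percentages on this slide can be checked against the scaffold counts. The sketch below is only an arithmetic check, not part of the original deck; the dictionary keys and function names are illustrative. Under this reading, the 90% and 95% annotations appear to match the patent share of RMC scaffolds for JAK3 and NPY5, and +1196% and +914% appear to match the ratio of RMC to ChEMBL scaffold counts.

```python
# Scaffold (Bemis-Murcko) counts copied from the JAK3 and NPY5 tables.
SCAFFOLDS = {
    "JAK3": {"patents": 16045, "articles": 2199, "rmc": 17828, "chembl": 1490},
    "NPY5": {"patents": 5700, "articles": 1014, "rmc": 5963, "chembl": 652},
}


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


def summarize(target):
    """Return the ratios that can be compared with the chart annotations."""
    s = SCAFFOLDS[target]
    return {
        "patent_share_of_rmc": percent(s["patents"], s["rmc"]),
        "rmc_vs_chembl": percent(s["rmc"], s["chembl"]),
        "rmc_vs_articles": percent(s["rmc"], s["articles"]),
    }


for target in SCAFFOLDS:
    print(target, {k: round(v, 1) for k, v in summarize(target).items()})
```

Note that the ratio of all RMC scaffolds to article-only scaffolds comes out lower (roughly 6x to 8x) than the ratio to ChEMBL, so "around 1000%" is best read as an order of magnitude.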

Putting Data to Work |

Hit to lead: Virtual screening

Ligand-Based Virtual Screening – Using Reaxys Medicinal Chemistry

Objective

• Describe an in silico screening approach using Reaxys Medicinal Chemistry

Case study on T-type calcium channels

Ligand-Based In Silico Screening

A simple target name search returns all results

Filter on active compounds: pX > 7

Answer set: 130 compounds and 1200 experimental data points
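As an illustration of the pX > 7 filter, the sketch below assumes that pX is the negative base-10 logarithm of the measured activity expressed in mol/L, so that pX > 7 corresponds to values below 100 nM. The unit table, compound identifiers and activity values are made up for the example and do not come from Reaxys.

```python
import math

# Conversion factors to mol/L for common concentration units (illustrative).
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def to_px(value, unit):
    """Return pX = -log10(activity in mol/L)."""
    return -math.log10(value * UNIT_TO_MOLAR[unit])


# Hypothetical records: (compound id, activity value, unit).
records = [("cpd-1", 25.0, "nM"), ("cpd-2", 0.5, "uM"), ("cpd-3", 3.0, "uM")]

# Keep only compounds that pass the pX > 7 activity filter.
actives = [name for name, value, unit in records if to_px(value, unit) > 7]
print(actives)
```

Normalizing everything to one scale first is what makes a single cutoff usable across results reported in different units.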


130 query structures

Flat file

Representation & chemical space: molecular descriptors & fingerprints

Virtual screening: pharmacophoric similarity

314 hits

"Drug-like" filtering

1. Molecular diversity and chemical originality

2. Compound availability

39 compounds ordered for testing
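The similarity step of such a workflow can be sketched as follows. This is a simplified stand-in, not the method used in the case study: it uses a plain Tanimoto coefficient on sets of feature indices instead of pharmacophoric similarity, and the fingerprints, library names and the 0.6 threshold are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of feature ids."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def screen(queries, library, threshold=0.6):
    """Return library entries whose best similarity to any query meets the threshold."""
    hits = []
    for name, fingerprint in library.items():
        best = max(tanimoto(fingerprint, query) for query in queries)
        if best >= threshold:
            hits.append((name, best))
    # Most similar compounds first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical query fingerprints (the active set) and screening library.
queries = [{1, 4, 9, 15, 22}, {2, 4, 9, 30}]
library = {"lib-A": {1, 4, 9, 15, 23}, "lib-B": {5, 6, 7}, "lib-C": {2, 4, 9, 30}}

hits = screen(queries, library)
print(hits)
```

In practice the hit list would then go through the drug-likeness, diversity and availability filters listed on the slide.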


FEATURES

The Reaxys Medicinal Chemistry Flatfile

• Substance information (~ 26 million substances)

• The substances are delivered as a series of SD files containing all structures from Reaxys in Molfile format together with their identification data and a list of available facts and reactions for each compound

• Unstructured substances are included as empty Molfiles

• Bioactivity data (> 29 million bioactivity data points)

• The bioactivity data are delivered as a series of linked data files in XML format, using the Resource Description Framework (RDF), compliant with the OpenPHACTS guidelines

• The XML files contain information on bioassays, citations, bioactivity data points, substance facts and bioactivity targets

• This includes pharmacokinetic and ADME property data, toxicity data
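A minimal sketch of reading such SD files, assuming the standard layout (a Molfile block, data items introduced by `> <FIELD>` lines, and records terminated by `$$$$`). The field names `ID` and `NAME` and the sample record are hypothetical; the actual tags in the Reaxys flat file are not given here. The sample record is an empty Molfile, the form used for unstructured substances.

```python
def parse_sdf(text):
    """Split SD file text into (molblock, data_fields) records."""
    records = []
    mol_lines, data, field, in_data = [], {}, None, False
    for line in text.splitlines():
        if line.strip() == "$$$$":          # end of record
            fields = {key: "\n".join(value) for key, value in data.items()}
            records.append(("\n".join(mol_lines), fields))
            mol_lines, data, field, in_data = [], {}, None, False
        elif line.startswith(">"):          # data header such as "> <ID>"
            in_data = True
            field = line[line.index("<") + 1:line.rindex(">")]
            data[field] = []
        elif in_data:
            if line.strip() and field is not None:
                data[field].append(line)
            else:                           # blank line closes the data item
                field = None
        else:
            mol_lines.append(line)
    return records


def atom_count(molblock):
    """Read the atom count from the V2000 counts line (fourth line of the Molfile)."""
    return int(molblock.split("\n")[3][:3])


SAMPLE = "\n".join([
    "", "  sketch", "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000",
    "M  END",
    "> <ID>", "123", "",
    "> <NAME>", "example", "",
    "$$$$", "",
])

records = parse_sdf(SAMPLE)
for molblock, fields in records:
    print(fields, "atoms:", atom_count(molblock))
```

For production use a cheminformatics toolkit reader would be more robust; this only shows how the record structure maps to structures plus identification data.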

Substance information


Biological activity

Electrophysiology experiments: screening at 10 µM on Cav3.2 T-type channels

[Bar chart: peak current inhibition (%) per compound, for compounds 1 to 39 tested at 10 µM]

9 compounds with inhibition > 75%

15 compounds with inhibition > 50%

ADMET properties influencing medicinal chemistry design

Prediction of ADMET properties

Influencing medicinal chemistry design

• logD7.4

• Protein binding

• Solubility

• Metabolic stability

• hERG

• Etc…

Step 1

• Rat PPB

• Hu heps

• CYP inhib

• Caco2

• NaV1.5

• Etc…

Step 2

• logD7.4

• Solubility

• Protein binding

• hERG

• Rat PPB

• Metabolic stability

• CYP inhib

• Caco2

• etc.

Step 0

AstraZeneca’s global hERG QSAR model has contributed to the reduction in the synthesis of ‘red flag’ compounds (compounds that are measured to have a hERG potency of < 1 μM) from 25.8% of all compounds tested in 2003 to only 6% in 2010.

Cumming, J.G., Davis, A.M. et al. Nat. Rev. Drug Disc. (2013) 12, 948–962

Predictive modeling of chemical solubility

Case Study: Solubility Modeling

Overview

• Reaxys contains a large amount of compound data reported in the literature

• Using the Reaxys API, one can access the data and create predictive models

• This example uses aqueous solubility as reported in the literature. This is not the intrinsic solubility of the neutral molecule; it is whatever the authors reported, which is generally measured at neutral pH.

• Every value has a reference that can be read to verify where the value came from

Model Making Process

• Extracted 6893 aqueous solubilities reported in Reaxys, in g/L

Converted reported values to molarity using molecular weight, then computed logS

Averaged values when there were multiple reports for a compound

• Created a KNIME workflow to do the analysis

• Used CDK molecular descriptors and R to build a simple solubility model

Simple multiple regression model (“lm” in R)

• Also created a model that reports the solubility of the most similar compound

This has been reported to be surprisingly effective!

This works well for Reaxys because of the large number of compounds with solubilities
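The data preparation steps can be sketched as below. The slide does not say whether averaging was done on the raw values or on logS; this sketch assumes the average is taken over logS values. The compound identifiers, solubilities and molecular weights are invented for illustration.

```python
import math
from collections import defaultdict


def log_s(solubility_g_per_l, mol_weight_g_per_mol):
    """Convert a solubility in g/L to log10 of the molar solubility."""
    return math.log10(solubility_g_per_l / mol_weight_g_per_mol)


def average_log_s(reports):
    """Average logS per compound over (compound_id, g/L, molecular weight) reports."""
    grouped = defaultdict(list)
    for compound_id, g_per_l, mol_weight in reports:
        grouped[compound_id].append(log_s(g_per_l, mol_weight))
    return {cid: sum(values) / len(values) for cid, values in grouped.items()}


# Hypothetical literature reports; compound "A" has two reported values.
reports = [("A", 1.8, 180.0), ("A", 18.0, 180.0), ("B", 0.25, 250.0)]

averaged = average_log_s(reports)
print(averaged)
```

Averaging in log space keeps one outlying report from dominating the value used for model training.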

Relevant data where and when they are needed

ELSS content integrated into the existing environment of tools and processes

Input: set of compound structures, list of target names, patent numbers

Search, retrieve & process element: script, Pipeline Pilot or KNIME node

Output: bioactivity values, compound structures, chemical properties

Further processing

– Visualisation, Spotfire input

– Reporting, dashboard production

– Excel tables

– QSAR/QSPR modeling

– Hit-to-lead optimization

– Reaction modeling

– Text mining

KNIME workflow for solubility modeling

KNIME Workflow

• The same workflow can be created in Pipeline Pilot

• Can auto-update with new data in Reaxys, since it pulls directly from the server

Solubility modeling: predicted vs. actual

• Predicted vs. actual log(S [M])

• Could filter for a “better” subset of compounds

• More scatter than recent work; in this framework one can try various descriptors to find the ones that work best

Residual standard error: 1.253 on 3437 degrees of freedom

Multiple R-squared: 0.5728, Adjusted R-squared: 0.5636

F-statistic: 62.29 on 74 and 3437 DF, p-value: < 2.2e-16
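The three lines of the regression summary are mutually consistent, which can be verified from the standard formulas for adjusted R-squared and the overall F-statistic. The sketch assumes a model with an intercept (the default for R's `lm`), so the number of observations is the residual degrees of freedom plus the 74 predictors plus one.

```python
def adjusted_r2(r2, n_obs, n_pred):
    """Adjusted R-squared for a linear model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_pred - 1)


def f_statistic(r2, n_pred, df_resid):
    """Overall F-statistic expressed through R-squared."""
    return (r2 / n_pred) / ((1.0 - r2) / df_resid)


R2, N_PRED, DF_RESID = 0.5728, 74, 3437   # values from the summary output
N_OBS = DF_RESID + N_PRED + 1             # observations used in the fit

print(N_OBS, round(adjusted_r2(R2, N_OBS, N_PRED), 4),
      round(f_statistic(R2, N_PRED, DF_RESID), 2))
```

The small difference from the reported F-statistic comes from using R-squared rounded to four decimals. The implied number of compounds in the fit is lower than the 6893 extracted values, which would be expected after averaging multiple reports per compound, although the deck does not give the exact breakdown.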

• Recent work of Yalkowsky, 1642 selected compounds.

• Used group-contribution methods

• Std error 0.8 log units

• Int J Pharm. 2008 Aug 6;360(1-2):122-47. doi: 10.1016/j.ijpharm.2008.04.028


Conclusions

• One can use the valuable properties reported in Reaxys for creating models

• Much more biological information available in Reaxys Medicinal Chemistry!

• All sources are referenced

• The API allows easy access to the data outside of the web user interface for models

• One can make several kinds of models and show all results, or make a consensus determination

Safety pharmacology: avoiding hERG inhibition

Why avoid hERG inhibition?

Case study: Which 5-HT2A antagonists have low affinity for the hERG channel?

• 5-HT2A receptor antagonism contributes to the therapeutic effect of several clinically effective and potential atypical antipsychotics, as well as several antidepressants.

• The ability of selective 5-HT2A receptor antagonists to interfere with the heightened state of dopamine activity without altering basal tone suggests that these drugs possess antipsychotic activity and may provide the basis for new therapies for psychosis and drug dependence.

Search for 5-HT2A antagonists

Search for compounds tested on hERG

Click on the heatmap overlay to retrieve the 5-HT2A antagonists tested on hERG

Combine hitsets

The heatmap displays 99 5-HT2A antagonists that were also tested on the hERG channel

Most active antagonist on 5-HT2A (~10 nM) with low affinity for hERG
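Outside the web interface, combining two hitsets amounts to an intersection on compound identifiers followed by a filter on both activities. The sketch below assumes each hitset is a mapping from compound id to pX; the identifiers, pX values and both cutoffs (pX ≥ 8 for roughly 10 nM on 5-HT2A, pX ≤ 5 as a stand-in for "low affinity" on hERG) are hypothetical.

```python
def combine_hitsets(primary, off_target):
    """Keep compounds present in both hitsets, pairing their pX values."""
    shared = primary.keys() & off_target.keys()
    return {cid: (primary[cid], off_target[cid]) for cid in shared}


def select(combined, min_primary_px=8.0, max_off_target_px=5.0):
    """Return potent primary-target compounds with weak off-target activity."""
    picked = [
        (cid, p, o)
        for cid, (p, o) in combined.items()
        if p >= min_primary_px and o <= max_off_target_px
    ]
    # Largest selectivity window (primary pX minus off-target pX) first.
    return sorted(picked, key=lambda row: row[1] - row[2], reverse=True)


# Hypothetical pX values per compound for each search.
ht2a = {"c1": 8.2, "c2": 8.5, "c3": 6.0, "c4": 9.0}
herg = {"c1": 4.8, "c2": 6.5, "c3": 4.0, "c5": 5.0}

combined = combine_hitsets(ht2a, herg)
selected = select(combined)
print(selected)
```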

How to avoid hERG inhibition

Prediction of cardiotoxic drugs related to hERG blockade

Introduction

QT interval prolongation can lead to serious arrhythmias, which can be fatal.

A large number of non-cardiac drugs that induce QT prolongation have been reported, and the number continues to grow, with some blockbuster medicines withdrawn from the market.

hERG seems to be the main target responsible for this adverse effect.

In silico models could be rapid and powerful tools to screen out potential hERG blockers as early as possible in the discovery process.

Extraction & Methodology

hERG data set: 640 molecules tested on hERG (Kv11.1)

Subsets of molecules according to detailed biological protocols

2D molecular descriptor sets

Predictive models: Recursive Partitioning (RP), MOE QuaSAR Classify

Cross-validation

External validation

[Figure: representation of hERG ligands within the chemical space of the NCI database, according to the first two PCA axes; legend: NCI database, hERG data set]

Model 3

Classes: HIGH and WEAK; thresholds: 1 µM and 10 µM; set sizes: 50 / 9 mol and 50 / 46 mol

                 Training              Test
Descriptors      Relevant   P_VSA      Relevant   P_VSA
High             48/50      47/50      45/46      42/46
Weak             49/50      49/50      8/9        9/9
All              97%        96%        96%        93%

Correct classifications determined by 5-fold cross-validation (Training) and by external validation (Test) for each descriptor set.
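The "All" row follows directly from the per-class counts: overall accuracy is the total number of correct classifications divided by the total number of molecules in that set. A short check, with the counts copied from the table (the dictionary layout is only for illustration):

```python
# (correct, total) per class for each validation set and descriptor set.
RESULTS = {
    ("Training", "Relevant"): {"High": (48, 50), "Weak": (49, 50)},
    ("Training", "P_VSA"): {"High": (47, 50), "Weak": (49, 50)},
    ("Test", "Relevant"): {"High": (45, 46), "Weak": (8, 9)},
    ("Test", "P_VSA"): {"High": (42, 46), "Weak": (9, 9)},
}


def overall_accuracy(per_class):
    """Percentage of correctly classified molecules across all classes."""
    correct = sum(c for c, _ in per_class.values())
    total = sum(t for _, t in per_class.values())
    return 100.0 * correct / total


for key, per_class in RESULTS.items():
    print(key, round(overall_accuracy(per_class)))
```

Because the external test set is unbalanced (46 high versus 9 weak), the overall figure is dominated by the high-potency class; the per-class rows are the more informative ones.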

[Figure: recursive-partitioning decision trees with + / - leaves. Split descriptors shown: SlogP_VSA7, SlogP_VSA2, SMR_VSA6, PEOE_VSA+1, SMR_VSA5, SMR_VSA4, PEOE_VSA+0, SlogP, PEOE_VSA_FHYD, SMR_VSA1, vsa_pol. Panel label: Relevant P_VSA]

Conclusion

Reaxys Medicinal Chemistry made it possible to retrieve a high-quality data set with chemical diversity and homogeneous biological activities.

Relevant predictive models of hERG activity were built using recursive partitioning analysis with 2D molecular descriptors.

With data from Reaxys Medicinal Chemistry, a fast virtual screening approach could be used as an early tool in the drug discovery process to avoid cardiotoxic side effects related to hERG blockade.

AstraZeneca’s global hERG QSAR model has contributed to the reduction in the synthesis of ‘red flag’ compounds (compounds that are measured to have a hERG potency of < 1 μM) from 25.8% of all compounds tested in 2003 to only 6% in 2010.

Cumming, J.G., Davis, A.M. et al. Nat. Rev. Drug Disc. (2013) 12, 948–962

Even if models perform well, designers need interpretable models

Mine Reaxys Medicinal Chemistry for metabolic stability

How to search the metabolism of a certain phenotype

Bioassay category, parameter

Broad search (all parameters)

Precise search (by parameter)

Pyrrolidine versus azetidine metabolic stability

How to access metabolism details

Enzyme, tissue/organ, cell fraction, enzyme substrate

Metabolic stability export

Showcase 1: Pyrrolidine metabolic stability

Pyrrolidines are known to be metabolically unstable. Are there pyrrolidines with an intrinsic clearance in microsomes < 20 µL/min/mg of protein? What do the complete structures of these compounds look like?

The overall search on intrinsic clearance (mL/min/g or µL/min/mg of protein) of pyrrolidines in Reaxys Medicinal Chemistry returns 1031 substances and 1777 clearance results, extracted from 138 citations.

The most stable pyrrolidine compounds

• The top 10 pyrrolidine compounds with the lowest intrinsic clearance are displayed

Showcase 2: Azetidine metabolic stability

• Different cyclic amines might be tolerated. What is known about the metabolic stability of azetidines? Are they more stable than pyrrolidines? What modifications of the azetidine are known?

What modifications of the azetidine are known?

Among the stable azetidines (CLint < 20 µL/min/mg of protein), the following scaffolds were found.

Are azetidines more stable than pyrrolidines?

Based on the graph of azetidine and pyrrolidine clearance results found in RMC, it appears that azetidines are in general more stable than pyrrolidines.

• 40% of all azetidine clearance results are below 20 µL/min/mg of protein, compared with only 30% for pyrrolidines.

• This is even more obvious in the 20 to 100 µL/min/mg of protein clearance range, which contains 42% of azetidine results but only 20% of pyrrolidine results.
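A comparison of this kind can be reproduced on exported clearance results by binning the values into the same ranges. The sketch below assumes the values are already in µL/min/mg of protein; the function name and the sample values are invented for illustration and are not the RMC data.

```python
def clearance_fractions(clint_values, low=20.0, high=100.0):
    """Percentage of results below `low`, between `low` and `high`, and above `high`."""
    n = len(clint_values)
    below = sum(1 for v in clint_values if v < low)
    middle = sum(1 for v in clint_values if low <= v <= high)
    above = n - below - middle
    return {
        "below": 100.0 * below / n,
        "middle": 100.0 * middle / n,
        "above": 100.0 * above / n,
    }


# Hypothetical intrinsic clearance results for one ring class.
example = [5, 12, 19.9, 20, 60, 100, 150, 300, 45, 80]

fractions = clearance_fractions(example)
print(fractions)
```

Running the same function on the azetidine and pyrrolidine exports gives the two distributions side by side.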


Lead optimization

Exploration of structural features of a lead series of compounds

• Within NK3, the 3,4-dichlorophenyl group appears to be an important structural feature

• Are there more target classes in which the 3,4-dichlorophenyl group plays an important role?

• Does the 3,4-dichlorophenyl group cause a certain activity profile?

• Are other 3,4-diX-phenyl structures known, and what is their pharmacological profile?

• Are there other di-substitution patterns (other than 3,4) with a strong pharmacological response?

Does the 3,4-dichlorophenyl group cause a certain activity profile?

Substructure search for the 3,4-dichlorophenyl fragment

30% of the substances containing the 3,4-dichlorophenyl fragment have a bioactivity below 0.1 µM

Are there more target classes in which the 3,4-dichlorophenyl group plays an important role?

Select bioactivities below 0.1 µM

• Target profile of 3,4-dichlorophenyl

• Targets are ranked by count of bioactivities. Yellow bars indicate the count of bioactivities below 0.1 µM
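The ranking behind such a target profile can be sketched as a simple count per target. The sketch assumes each bioactivity is a (target, pX) pair and that "below 0.1 µM" corresponds to pX > 7; the pX values and the placeholder target name "Target X" are made up for the example.

```python
from collections import Counter


def target_profile(bioactivities, px_cutoff=7.0):
    """Rank targets by bioactivity count; also count results more potent than the cutoff."""
    total, potent = Counter(), Counter()
    for target, px in bioactivities:
        total[target] += 1
        if px > px_cutoff:
            potent[target] += 1
    # most_common() sorts targets by descending total count.
    return [(target, n, potent[target]) for target, n in total.most_common()]


# Hypothetical bioactivities for substances containing the fragment.
bioactivities = [
    ("NK3", 8.1), ("NK3", 7.4), ("NK3", 6.2),
    ("Target X", 7.9), ("Target X", 6.5),
    ("Histamine 1", 5.0),
]

profile = target_profile(bioactivities)
print(profile)
```

The second count in each tuple plays the role of the yellow bars on the slide.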

Off-targets / CNS adverse effect: addiction/psychostimulant

Off-targets / CNS adverse effect: attention/perception

Off-targets / CNS adverse effect: learning/memory

3,4-Dichlorophenyl groups are also involved in off-target activities, mainly CNS-related

Target profile of 3,4-dichlorophenyl with an affinity below 0.1 µM and at least 100 bioactivities

Substances tested on NK3 are not tested on other targets, except one substance tested on Histamine 1

Are other 3,4-diX-phenyl structures known, and what is their pharmacological profile?

3,4-Difluorophenyl, 3,4-dibromophenyl, 3,4-dibromophenyl

Target profile of 3,4-diX-phenyl with an affinity below 0.1 µM

Are there other di-substitution patterns (other than 3,4) with a strong pharmacological response?

2,4-Dichlorophenyl, 2,5-dichlorophenyl, 2,3-dichlorophenyl

• Cannabinoid receptors (1 and 2)

• Melanocortin 4

• 5-HT2C

• Amyloid precursor protein (APP)

• Dopamine receptors (2 and 3)

• p38a

• Cytochrome P450 3A4: potential drug-drug interactions

Target profile of X,Y-dichlorophenyl with an affinity below 0.1 µM

Reaxys Medicinal Chemistry accelerates drug discovery through knowledge-based design

• Mine large datasets to find hits (virtual screening)

• Mine large datasets to accelerate understanding and derive useful medicinal chemistry knowledge

• Apply this knowledge to propose and evaluate new, better molecules that fulfil the multi-objective design needs of lead optimisation

• Apply this to develop clinical candidates faster

• Apply this knowledge base to repurpose drugs