
In silico drug designing

Dinesh Gupta, Structural and Computational Biology Group, ICGEB

Modern drug discovery process

[Figure: pipeline stages Target identification → Target validation → Lead identification → Lead optimization → Preclinical phase; timeline labels: drug discovery, 2-5 years; 6-9 years]

• Drug discovery is an expensive process involving high R&D cost and extensive clinical testing.

• A typical development time is estimated to be 10-15 years.

Drug discovery technologies

• Target identification: genomics, gene expression profiling and proteomics

• Target validation: gene knock-out, inhibition assay

• Lead identification: high-throughput screening, fragment-based screening, combinatorial libraries

• Lead optimization: medicinal chemistry driven optimization, X-ray crystallography, QSAR, ADME profiling (bioavailability)

• Preclinical phase: pharmacodynamics (PD), pharmacokinetics (PK), ADME, and toxicity testing in animals

• Clinical phase: human trials

Rational Approach to Drug Discovery

[Flowchart, boxes in source order:]

• Identify and validate target

• Clone gene encoding target

• Express target

• Synthesize modified lead compounds

• Crystal structures/MM of target and target/inhibitor complexes

• Preclinical trials

• Identify lead compounds

• Toxicity & pharmacokinetic studies

Bioinformatics tools in DD

• Comparison of Sequences: Identify targets

• Homology modelling: active site prediction

• Systems Biology: Identify targets

• Databases: Manage information

• In silico screening (Ligand based, receptor based): Iterative steps of Molecular docking.

• Pharmacogenomic databases: assist safety related issues

Currently used drug targets

[Figure: J. Drews, Science 287, 1960-1964 (2000). Published by AAAS.]

This information is used by bioinformaticians to narrow the search in the groups

In silico methods in Drug Discovery

• Molecular docking

• Virtual high-throughput screening (vHTS)

• QSAR (Quantitative structure-activity relationship)

• Pharmacophore mapping

• Fragment based screening

Molecular Docking

[Figure: receptor (R) and ligand (L) combining into the complex RL]

• Docking is the computational determination of binding affinity between molecules (protein structure and ligand).

• Given a protein and a ligand, find the binding free energy of the complex formed by docking them.
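Binding affinity and binding free energy are two views of the same quantity. A minimal sketch, assuming the standard thermodynamic relation ΔG° = RT ln Kd (1 M standard state) and hypothetical Kd values that do not come from the slides:

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    return R_KCAL * temperature_k * math.log(kd_molar)

# Hypothetical millimolar, micromolar and nanomolar binders.
for kd in (1e-3, 1e-6, 1e-9):
    print(f"Kd = {kd:.0e} M -> dG = {binding_free_energy(kd):.2f} kcal/mol")
```

Tighter binders (smaller Kd) give more negative free energies, which is the quantity a docking score tries to approximate.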

Molecular Docking: classification

• Docking, or computer-aided drug design, can be broadly classified into:

– Receptor-based methods: make use of the structure of the target protein.

– Ligand-based methods: based on the known inhibitors.

Receptor based methods

• Uses the 3D structure of the target receptor to search for the potential candidate compounds that can modulate the target function.

• These involve molecular docking of each compound in the chemical database into the binding site of the target and predicting the electrostatic fit between them.

• The compounds are ranked using an appropriate scoring function such that the scores correlate with the binding affinity.

• Receptor-based methods have been successfully applied to many targets.

Ligand based strategy

• In the absence of structural information on the target, ligand-based methods make use of the information provided by known inhibitors of the target receptor.

• Structures similar to the known inhibitors are identified from chemical databases by a variety of methods.

• Some of the widely used methods are similarity and substructure searching, pharmacophore matching and 3D shape matching.

• Numerous successful applications of ligand-based methods have been reported.
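Similarity searching is commonly done by comparing molecular fingerprints with the Tanimoto coefficient. The sketch below assumes fingerprints are plain sets of feature names; the compound names, features and the 0.5 threshold are all hypothetical and only illustrate the idea.

```python
def tanimoto(a, b):
    """Tanimoto coefficient: shared features divided by all features present."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_search(query, library, threshold=0.5):
    """Return (name, similarity) pairs at or above the threshold, best first."""
    hits = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [h for h in hits if h[1] >= threshold]
    return sorted(hits, key=lambda h: h[1], reverse=True)

# Hypothetical feature sets for one known active and a tiny database.
known_active = {"aromatic_ring", "amide", "amidine", "primary_amine"}
database = {
    "cmpd_A": {"aromatic_ring", "amidine", "primary_amine"},
    "cmpd_B": {"aromatic_ring", "carboxylic_acid"},
    "cmpd_C": {"alkyl_chain", "hydroxyl"},
}

hits = similarity_search(known_active, database)
print(hits)
```

Real fingerprints are bit vectors computed by a cheminformatics toolkit, but the comparison step has this shape.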

Ligand based strategy

[Figure: search for similar compounds; labels: database, known actives, structures found]

Binding free energy

• Binding free energy is calculated as the sum of the following energies:

- Electrostatic energy

- van der Waals energy

- Internal energy change due to flexible deformations

- Translational and rotational energy

• The lower (more negative) the binding free energy of a complex, the more stable it is.
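The first two terms of this sum can be sketched with textbook pairwise potentials. This is a minimal illustration, assuming a Coulomb term and a 12-6 Lennard-Jones term with a single hypothetical epsilon and sigma for every atom pair; the deformation and translational/rotational terms are not modelled, and no docking program's actual force field is reproduced here.

```python
import math

COULOMB_K = 332.06  # kcal*Angstrom/(mol*e^2), converts charge products to kcal/mol

def coulomb(q1, q2, r, dielectric=1.0):
    """Electrostatic energy between two point charges at distance r (Angstrom)."""
    return COULOMB_K * q1 * q2 / (dielectric * r)

def lennard_jones(r, epsilon, sigma):
    """12-6 van der Waals energy; minimum of -epsilon at r = 2**(1/6) * sigma."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)

def interaction_energy(ligand, receptor, epsilon=0.1, sigma=3.5):
    """Sum electrostatic and van der Waals terms over all ligand-receptor atom pairs.

    Each atom is ((x, y, z), partial_charge). epsilon and sigma are
    hypothetical values shared by all pairs.
    """
    elec = vdw = 0.0
    for pos_l, q_l in ligand:
        for pos_r, q_r in receptor:
            r = math.dist(pos_l, pos_r)
            elec += coulomb(q_l, q_r, r)
            vdw += lennard_jones(r, epsilon, sigma)
    return elec, vdw

# Hypothetical one-atom ligand and two-atom receptor site.
ligand = [((0.0, 0.0, 0.0), 0.5)]
receptor = [((3.0, 0.0, 0.0), -0.5), ((0.0, 4.0, 0.0), 0.0)]
elec, vdw = interaction_energy(ligand, receptor)
print(f"electrostatic = {elec:.2f}, van der Waals = {vdw:.2f} kcal/mol")
```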

Basic binding mechanism

Complementarity between the ligand and the binding site:

• Steric complementarity, i.e. the shape of the ligand is mirrored in the shape of the binding site.

• Physicochemical complementarity

Components of molecular docking

A) Search algorithm

• To find the best conformation of the ligand and the protein system.

• Rigid and flexible docking

B) Scoring function

• Rank the ligands according to the interaction energy.

• Based on the energy force-field function.
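The two components can be seen working together in a toy rigid docking loop. This is only a sketch under strong assumptions: atoms are bare points, the scoring function is a single Lennard-Jones term with hypothetical parameters, and the search is a greedy random walk over translations only (no rotation, no flexibility). Real docking programs use far richer searches and scores.

```python
import math
import random

def score(ligand, receptor, epsilon=0.1, sigma=3.5):
    """Scoring function: summed 12-6 Lennard-Jones energy (lower is better)."""
    total = 0.0
    for a in ligand:
        for b in receptor:
            r = max(math.dist(a, b), 0.5)  # clamp to avoid huge values at contact
            s6 = (sigma / r) ** 6
            total += 4.0 * epsilon * (s6 * s6 - s6)
    return total

def translate(ligand, shift):
    """Move every ligand atom by the same vector (rigid-body translation)."""
    return [tuple(c + d for c, d in zip(atom, shift)) for atom in ligand]

def rigid_search(ligand, receptor, steps=2000, step_size=0.5, seed=0):
    """Search algorithm: accept a random translation only if the score improves."""
    rng = random.Random(seed)
    best_pose, best_score = ligand, score(ligand, receptor)
    for _ in range(steps):
        shift = tuple(rng.uniform(-step_size, step_size) for _ in range(3))
        candidate = translate(best_pose, shift)
        candidate_score = score(candidate, receptor)
        if candidate_score < best_score:
            best_pose, best_score = candidate, candidate_score
    return best_pose, best_score

# Hypothetical coordinates (Angstrom) for a three-atom site and a two-atom ligand.
receptor = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
ligand = [(10.0, 10.0, 10.0), (11.5, 10.0, 10.0)]

start_score = score(ligand, receptor)
pose, final_score = rigid_search(ligand, receptor)
print(f"start = {start_score:.4f}, final = {final_score:.4f}")
```

Keeping the search and the score as separate functions mirrors the A/B split on the slide: either one can be swapped without touching the other.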

Success with vHTS

• Dihydrofolate reductase inhibitor (1992)

• HIV-protease (1992)

• Phospholipase A2 (1994)

• Thrombin (1996)

• Carbonic anhydrase inhibitors(2002)

Virtual High Throughput Screening

• Less expensive than high-throughput screening (HTS)

• Faster than conventional screening

• Scans a large number of potential drug-like molecules in very little time.

• HTS itself is a trial-and-error approach but can be complemented by virtual screening.

QSAR

• QSAR is a statistical approach that attempts to relate physical and chemical properties of molecules to their biological activities.

• Various descriptors such as molecular weight, number of rotatable bonds, LogP etc. are commonly used.

• Many QSAR approaches are in practice, classified by the dimensionality of the data.

• They range from 1D QSAR to 6D QSAR.
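In its simplest form a QSAR model is a regression of activity on a descriptor. The sketch below fits a one-descriptor least-squares line; the LogP and activity values are invented for illustration and do not come from the slides, and real QSAR models use many descriptors and proper validation.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical training set: LogP descriptor versus measured activity (pIC50).
logp = [1.0, 1.5, 2.0, 2.5, 3.0]
activity = [4.1, 4.6, 5.0, 5.6, 6.0]

slope, intercept = fit_line(logp, activity)

# Predict the activity of a new, untested compound from its descriptor alone.
predicted = slope * 2.2 + intercept
print(f"activity = {slope:.2f} * LogP + {intercept:.2f}; predicted = {predicted:.2f}")
```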

Pharmacophore mapping

• It is a 3D description of a pharmacophore, developed by specifying the nature of the key pharmacophoric features and the 3D distance map among all the key features.

• A Pharmacophore map can be generated by superposition of active compounds to identify their common features.

• Based on the pharmacophore map either de novo design or 3D database searching can be carried out.
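The 3D distance map and its use in database searching can be sketched as follows. This assumes a simplified pharmacophore with one feature per type, hypothetical coordinates in Angstrom, and a hypothetical 1.0 Angstrom tolerance; real tools handle multiple features per type and conformer flexibility.

```python
import math
from itertools import combinations

def distance_map(features):
    """Map each pair of feature names to the distance between the features."""
    return {
        (a, b): math.dist(features[a], features[b])
        for a, b in combinations(sorted(features), 2)
    }

def matches(query_map, candidate_features, tolerance=1.0):
    """True if every query distance is reproduced by the candidate within tolerance."""
    candidate_map = distance_map(candidate_features)
    return all(
        pair in candidate_map and abs(candidate_map[pair] - d) <= tolerance
        for pair, d in query_map.items()
    )

# Hypothetical pharmacophore derived from superposed actives.
pharmacophore = {"acceptor": (0.0, 0.0, 0.0), "donor": (3.0, 0.0, 0.0), "aromatic": (0.0, 4.0, 0.0)}
query = distance_map(pharmacophore)

# Hypothetical database entries: one close to the map, one with the donor far away.
candidate_hit = {"acceptor": (1.0, 1.0, 0.0), "donor": (4.2, 1.0, 0.0), "aromatic": (1.0, 5.3, 0.0)}
candidate_miss = {"acceptor": (0.0, 0.0, 0.0), "donor": (9.0, 0.0, 0.0), "aromatic": (0.0, 4.0, 0.0)}

print(matches(query, candidate_hit), matches(query, candidate_miss))
```

Because only inter-feature distances are compared, the match does not depend on how a candidate is positioned or oriented in space.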

Modeling and informatics in drug design

Increased application of structure-based drug design is facilitated by:

• Growth in the number of targets

• Growth in the number of 3D structures determined (PDB database)

• Growth of computing power

• Growth in the quality of protein-compound interaction predictions

Summary: role of Bioinformatics?

• Identification of homologs of functional proteins (motifs, protein families, domains)

• Identification of targets by cross-species examination

• Visualization of molecular models

• Docking, vHTS

• QSAR, pharmacophore mapping

Example: use of Bioinformatics in Drug discovery

Identification of novel drug targets against human malaria

Malaria – A global problem!

• Malaria causes at least 500 million clinical cases and more than one million deaths each year.

• A child dies of malaria every 30 seconds.

• Of the four Plasmodium species causing human malaria, P. falciparum poses the most serious threat because of its virulence, prevalence and drug resistance.

• Malaria takes an economic toll - cutting economic growth rates by as much as 1.3% in countries with high disease rates.

• There are four types of human malaria:

– Plasmodium falciparum

– Plasmodium vivax

– Plasmodium malariae

– Plasmodium ovale

• Approximately half of the world's population is at risk of malaria, particularly those living in lower-income countries.

• Today, there are 109 malaria affected countries in 4 regions

a) Chloroquine

b) Quinine

c) Artemether

d) Sodium artesunate

e) Dihydroartemisinin

f) Pyrimethamine

g) Sulfadoxine

h) Mefloquine

i) Halofantrine

j) Primaquine

k) Tafenoquine

l) Chlorproguanil

m) Dapsone

Chemical structures of drugs widely used for the treatment of malaria

http://malaria.who.int/docs/adpolicy_tg2003.pdf

Problems with the existing drugs

• Drug resistance is the most common problem.

• Adverse effects (shock and cardiac arrhythmias caused by chloroquine)

• Poor patient compliance (quinine tastes very unpleasant, causes dizziness, nausea etc.)

• High cost of production for some effective drugs (atovaquone)

• Urgent need for identification of novel drug targets that lead to effective and affordable drugs.

Strategies for drug target identification in P. falciparum

• Parasite cultures for functional assays are difficult and expensive, making computational approaches more relevant.

• Malaria remains a neglected disease: very few stakeholders!

• Availability of the genomic data of P. falciparum and H. sapiens has facilitated the effective application of comparative genomics.

• Comparative genomics helps in the identification and exploitation of features that differ between the host and the parasite.

• Identifying metabolic pathways specific to P. falciparum and targeting their crucial proteins is an attractive approach to target-based drug discovery.

Comparison of proteomes helps in identifying important indispensable parasite proteins

• Out of 5334 predicted proteins in P. falciparum, 60% did not show any similarity to known proteins.

• Hence, assigning a physiological functional role to these hypothetical proteins using bioinformatics approaches still remains a challenge.

[Figure: comparison of the predicted proteomes of A. gambiae, P. falciparum and H. sapiens; large set of proteins with no/low similarity]

Novel drug target identification in P. falciparum

[Flowchart, boxes in source order:]

• BlastP

• ~40% identity threshold for three-dimensional modeling

• Relational database of homology models

• 476 P. falciparum proteins

• Human proteome

• Putative drug targets in P. falciparum

• Comparative genomics studies

• Literature search for all these proteins

• Check for physiological and biochemical functions, etc.

• Proteasome machinery (ClpQY and ClpAP) in P. falciparum
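The filtering logic of such a workflow can be sketched on BLAST tabular output (the 12-column `-outfmt 6` layout, where the third column is percent identity). Everything below is an assumption for illustration: the protein and template identifiers, the hit rows, and the 30% cutoff used to call a parasite protein "human-like" are hypothetical; only the ~40% modelling threshold is taken from the slide.

```python
def best_identity(blast_tabular):
    """Return {query_id: highest percent identity} from BLAST outfmt 6 text."""
    best = {}
    for line in blast_tabular.strip().splitlines():
        fields = line.split()
        query, pident = fields[0], float(fields[2])
        best[query] = max(pident, best.get(query, 0.0))
    return best

# Hypothetical hits of parasite proteins against structural templates.
VS_TEMPLATES = """
pf_0001 tmpl_1 45.0 200 110 2 1 200 5 204 1e-50 180
pf_0002 tmpl_2 38.0 150 93 1 1 150 3 152 1e-20 90
pf_0003 tmpl_3 52.0 170 82 0 1 170 1 170 1e-60 210
"""

# Hypothetical hits of the same proteins against the human proteome.
VS_HUMAN = """
pf_0001 hs_A 61.0 198 77 1 1 198 10 207 1e-70 250
pf_0003 hs_B 22.0 90 70 3 5 94 30 119 0.5 30
"""

MODEL_THRESHOLD = 40.0  # ~40% identity threshold for 3D modelling (from the slide)
HUMAN_CUTOFF = 30.0     # hypothetical cutoff for "similar to a human protein"

modelable = {q for q, p in best_identity(VS_TEMPLATES).items() if p >= MODEL_THRESHOLD}
human_like = {q for q, p in best_identity(VS_HUMAN).items() if p >= HUMAN_CUTOFF}

# Putative targets: can be modelled, and have no close human counterpart.
putative_targets = sorted(modelable - human_like)
print(putative_targets)
```

Candidates that survive this kind of filter would still go through the literature search and functional checks listed in the flowchart.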

Targets identified by comparison of protein models

• Identification of two proteasomal proteins of prokaryotic origin, not present in hosts.

• Protein degradation is an important process in parasite development inside host RBCs.

Eukaryotic and prokaryotic proteasome machinery

26S proteasome: eukaryotic type

• 19S regulatory + 20S proteolytic particle

• Present only in eukaryotes and archaea

• Degrades ubiquitinated proteins

• > 20 different proteins involved

ClpQY system: prokaryotic type

• ClpY cap + ClpQ core particle

• Present only in prokaryotes

• No ubiquitination in prokaryotes

• Substrate specificity is not known

• Only two proteins: ClpQ & ClpY

[Figure labels: 20S proteasome; ClpQ; ClpY; substrate protein; peptides]

ATP-dependent protease machinery ClpQY (PfHslUV system)

• The HslUV complex in prokaryotes is composed of an HslV threonine protease and an HslU ATP-dependent protease, a chaperone of the Clp/Hsp100 family.

• HslV (ClpQ) subunits are arranged in the form of two stacked hexameric rings and are capped by two HslU (ClpY) hexamers, one at each end.

• The HslU (ClpY) hexamer recognizes and unfolds peptide substrates in an ATP-dependent process, and translocates them into HslV for degradation.

[Figure: crystal structure of the HslUV complex in H. influenzae; PfClpQY complex model in P. falciparum]


PfClpQ component

MFIRNFVNIIGSQKSITKTIARNYFSDNSKLIIPRHGTTILCVRKNNEVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFETKIDEYPNQL

LRSCVELAKLWRTDRYLRHLEAVLIVADKDILLEVTGNGDVLEPSGNVLGTGSGGPYAMA

AARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL

For the full-length and mature active protein:

• Length: 207 aa (mature protein: 170 aa)

• Pro domain: 37 aa

Important motifs found:

• TT at the N terminus of the mature protein

• GSGG: common chymotrypsin protease signal

• Lys(28) and Arg(35) are two conserved amino acids that play some role in the activity.
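The lengths and motif positions quoted for PfClpQ can be checked directly against the sequence. A small sketch, assuming the 37-residue pro domain is simply the first 37 residues and that residue numbers refer to the mature protein:

```python
# Full-length PfClpQ sequence as given on the slide.
FULL_LENGTH = (
    "MFIRNFVNIIGSQKSITKTIARNYFSDNSKLIIPRHGTTILCVRKNNEVCLIGDGMVSQGTMIVKGNAKK"
    "IRRLKDNILMGFAGATADCFTLLDKFETKIDEYPNQL"
    "LRSCVELAKLWRTDRYLRHLEAVLIVADKDILLEVTGNGDVLEPSGNVLGTGSGGPYAMA"
    "AARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL"
)
PRO_DOMAIN_LENGTH = 37

# Mature protein = full-length sequence with the pro domain removed.
mature = FULL_LENGTH[PRO_DOMAIN_LENGTH:]

print(len(FULL_LENGTH), len(mature))  # 207 170
print(mature[:2])                     # TT
print(mature.find("GSGG") + 1)        # 1-based start of the GSGG motif: 122
print(mature[28 - 1], mature[35 - 1]) # K R
```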

PK_ClpQ TTILCVRKNNEVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFE

PV_ClpQ TTILCVRKNNEVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFE

PF_ClpQ TTILCVRKNNEVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFE

PY_ClpQ TTILCVRKNNEVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFE

PB_ClpQ TTILCVRKNNEVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFE

************************************************************

PK_ClpQ TKIDEYPDQLLRSCVELAKLWRTDRYLRHLEAVLIVADKDVLLEVTGNGDVLEPSGNVLG

PV_ClpQ TKIDEYPDQLLRSCVELAKLWRTDRYLRHLEAVLIVADKDVLLEVTGNGDVLEPSGNVLG

PF_ClpQ TKIDEYPNQLLRSCVELAKLWRTDRYLRHLEAVLIVADKDILLEVTGNGDVLEPSGNVLG

PY_ClpQ TKIDEYPDQLLRSCVELAKLWRTDRYLRHLEAVLIVADKDTLLEVTGNGDVLEPSGNVLG

PB_ClpQ TKIDEYPDQLLRSCVELAKLWRTDRYLRHLEAVLIVADKDTLLEVTGNGDVLEPSGNVLG

*******:******************************** *******************

PK_ClpQ TGSGGPYAIAAARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL

PV_ClpQ TGSGGPYAIAAARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL

PF_ClpQ TGSGGPYAMAAARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL

PY_ClpQ TGSGGPYAMAAARALYDIENLSAKDIAYKAMNIAADMCCHTNHNFICETL

PB_ClpQ TGSGGPYAIAAARALYDIENLSAKDIAYKAMNIAADMCCHTNHNFICETL

********:********:************************:*******

Homologs of PfClpQ protein in other Plasmodium spp
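How similar the homologs are can be quantified from the alignment itself. The sketch below uses only the last alignment block (50 columns) and computes pairwise percent identity plus a simple conservation line; unlike the Clustal-style line above, it marks only fully identical columns and does not score conservative substitutions.

```python
# Last alignment block, copied from the slide.
BLOCK = {
    "PK_ClpQ": "TGSGGPYAIAAARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL",
    "PV_ClpQ": "TGSGGPYAIAAARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL",
    "PF_ClpQ": "TGSGGPYAMAAARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL",
    "PY_ClpQ": "TGSGGPYAMAAARALYDIENLSAKDIAYKAMNIAADMCCHTNHNFICETL",
    "PB_ClpQ": "TGSGGPYAIAAARALYDIENLSAKDIAYKAMNIAADMCCHTNHNFICETL",
}

def percent_identity(a, b):
    """Percentage of aligned columns where the two sequences agree."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def conservation(sequences):
    """'*' where every sequence has the same residue, ' ' otherwise."""
    return "".join("*" if len(set(col)) == 1 else " " for col in zip(*sequences))

for name, seq in BLOCK.items():
    print(name, percent_identity(BLOCK["PF_ClpQ"], seq))
print(conservation(BLOCK.values()))
```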

Homology modeling of PfClpQ

[Figure: structural alignment of PfClpQ and HslV (H. influenzae); structures labelled PfClpQ and 1kyi]

Conservation of catalytic residues: S125-G45-T1-K33

Homology Modeling of PfClpQ

[Figure: sequence comparison across E. coli, S. enterica, H. influenzae, X. campestris, W. pipientis, P. falciparum, T. brucei, T. cruzi and L. infantum]

• Most of the residues conserved in the different bacterial species were either identical or similar in PfClpQ.

Biochemical characterization of PfClpQ protein

Activity assay for PfClpQ protein

[Figure: assay schematic (fluorogenic peptide substrate + protease → fluorescence). Plots: AMC released (m moles) against time and against substrate concentration (mM) for each activity.]

Activity                      Substrate       Inhibitor      Km
Threonine protease like       Cbz-GGL-AMC     Lactacystin    19.18 mM
Chymotrypsin like             Suc-LLVY-AMC    chymostatin    58.22 mM
Peptidyl glutamyl hydrolase   Z-LLE-AMC       MG132          37.79 mM
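A Km value of this kind is typically read off a rate-versus-substrate curve using the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]). The slides do not say how the fit was done, so the sketch below is an assumption: it generates noise-free synthetic rates from the reported Km of 19.18 mM and a hypothetical Vmax, then recovers both by a Lineweaver-Burk (double-reciprocal) linear fit.

```python
def michaelis_menten(s, vmax, km):
    """Reaction rate at substrate concentration s."""
    return vmax * s / (km + s)

def fit_lineweaver_burk(s_values, v_values):
    """Estimate (Vmax, Km) from the line 1/v = (Km/Vmax) * (1/S) + 1/Vmax."""
    xs = [1.0 / s for s in s_values]
    ys = [1.0 / v for v in v_values]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    return vmax, slope * vmax

TRUE_KM = 19.18    # mM, value reported on the slide
TRUE_VMAX = 100.0  # hypothetical, arbitrary units

# Synthetic, noise-free measurements at hypothetical substrate concentrations (mM).
substrate = [5.0, 10.0, 20.0, 40.0, 80.0]
rates = [michaelis_menten(s, TRUE_VMAX, TRUE_KM) for s in substrate]

vmax_fit, km_fit = fit_lineweaver_burk(substrate, rates)
print(f"Vmax = {vmax_fit:.2f}, Km = {km_fit:.2f} mM")
```

With real, noisy data a direct nonlinear fit is usually preferred, because the double-reciprocal transform amplifies errors at low substrate concentration.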

ClpQ interaction with ligand identified by virtual screening

[Figure: crystal structure of HslV complexed with a vinyl sulfone inhibitor. Labelled residues: Phe46, Arg36, Val21, Gly49, Gly48, Ser22, Thr2, Thr50]

Compound   GOLD score   FlexX score
1          52.54        -25.14
2          54.76        -17.37
3          54.66        -24.43
4          52.84        -24.47

[Chemical structure column: images]
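When two docking programs are used, their rankings can be combined by a simple rank sum. The sketch below applies this to the four compounds in the table, assuming that a higher GOLD score and a lower (more negative) FlexX score each indicate a better predicted binder; the slides do not state how the two scores were combined, so the rank-sum scheme is an illustration only.

```python
# (GOLD score, FlexX score) per compound, from the table.
scores = {
    1: (52.54, -25.14),
    2: (54.76, -17.37),
    3: (54.66, -24.43),
    4: (52.84, -24.47),
}

def ranks(values, higher_is_better):
    """Return {compound: rank}, with rank 1 for the best value."""
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    return {compound: i + 1 for i, compound in enumerate(ordered)}

gold_rank = ranks({c: s[0] for c, s in scores.items()}, higher_is_better=True)
flexx_rank = ranks({c: s[1] for c, s in scores.items()}, higher_is_better=False)
rank_sum = {c: gold_rank[c] + flexx_rank[c] for c in scores}

for c in scores:
    print(c, gold_rank[c], flexx_rank[c], rank_sum[c])
```

Here every compound ends up with the same rank sum of 5: the two scoring functions order the compounds in exactly opposite ways, which is a reminder that scores from different programs are not interchangeable.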

Identification of P. falciparum ClpY (PfClpY) gene

• A regulatory component of the ClpQY system

• Recognizes the substrate, unfolds it, and feeds it into the degradation machine (ClpQ)

• Belongs to the AAA+ family of proteins

• PfClpY: ~1.3 kb

• Contains all three ClpY domains: N, I and C

• Variation in the I domain plays a role in recognition of different substrates

[Figure: domain organisation of ClpY (N-domain, I-domain, C-domain; ATPase domain with Walker A and Walker B motifs); complex labels: ClpY, ClpY, ClpQ]

Homology of PfClpY protein with homologs in other organisms

Targeting the ClpQY interaction

[Figure: crystal structure of HslUV in H. influenzae; modeled ClpQY interaction in P. falciparum]

J Biomol Struct Dyn. 2009 Feb;26(4):473-9

• Extracting the microarray data from NCBI GEO

• Normalization if necessary; otherwise preparing Excel files for WGCNA analysis

• Excel sheet of normalized data and gene significance

• Analysing these files in R using the R package WGCNA

• The principle behind constructing the network is that genes which are co-expressed are related and can be connected to make a network, using the Pearson correlation coefficient

• Visualization of networks by different graphs and software in the R package

• Finding hub genes and modules which can be used as drug targets by referring to these networks
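The network principle can be sketched in a few lines: correlate every pair of expression profiles, turn correlations into connection strengths, and call the most connected gene a hub. The workflow above uses the R package WGCNA; the Python below is only an illustration of the idea, with hypothetical expression values, an unsigned power adjacency |r|^beta, and beta = 6 chosen as an assumed soft-threshold power.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def connectivity(expression, beta=6):
    """Sum of adjacencies |r|**beta from each gene to all other genes."""
    genes = list(expression)
    return {
        g: sum(abs(pearson(expression[g], expression[h])) ** beta for h in genes if h != g)
        for g in genes
    }

# Hypothetical expression of four genes across five samples.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "g3": [1.0, 3.0, 2.0, 5.0, 4.0],
    "g4": [1.0, -1.0, 1.0, -1.0, 1.0],
}

k = connectivity(expression)
hub = max(k, key=k.get)
print({g: round(v, 3) for g, v in k.items()}, "hub:", hub)
```

Raising |r| to a power suppresses weak correlations, so connectivity is dominated by strongly co-expressed partners.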

Identification of drug targets using interaction networks

• These networks can be used for finding drug targets

• They can also be used for annotation of proteins and genes by comparing them in interactome studies

• These networks can be used for pathway annotation

• Better than other studies as they are based on the microarray data

Tools used:

• Sequence analysis: pairwise and multiple sequence alignments, Pfam

• Molecular modelling: Modeller

• Docking: Tripos FlexX, GOLD, ArgusLab

• PP network: R package and VisANT

Molecular docking hands-on

• Download and install ArgusLab on Windows

• Load a PDB file, practice ArgusLab tools

• Follow the tutorial at http://www.arguslab.com/tutorials/tutorial_docking_1.htm

Molecular docking using ArgusLab. Example: benzamidine inhibitor docked into beta-trypsin

Create a binding site from bound ligand

Setting docking parameters

Analyzing docking results

Polypeptide builder.