Cheminformatics platform for drug discovery application

Hsi-Kai Wang, Academia Sinica Grid Computing
EGI User Forum, 13 April 2011
EGI-InSPIRE RI-261323 · www.egi.eu


Page 1

EGI-InSPIRE

Cheminformatics platform for drug discovery application

Hsi-Kai Wang, Academia Sinica Grid Computing
EGI User Forum, 13 April 2011

Page 2

• Introduction to drug discovery
• Computing requirements of high-throughput virtual screening
• Cheminformatics case study

Page 3: Drug discovery development

Computational chemistry / molecular modeling
• useful across the pipeline, but with very different techniques
• aim for success, but if not: fail early, fail cheap

Ref: Makus R. and Ralph W., Nature Rev. Drug Discov. (2003), 2, 123-131

Page 4: Strategy in drug discovery

• Receptor (3D structure) unknown
  – Ligand unknown: Combichem, HTS
  – Ligand known: Virtual screening, Pharmacophore, Similarity, QSAR
• Receptor (3D structure) known
  – Ligand unknown: Receptor-based searching, De novo design
  – Ligand known: Structure-based drug design, Receptor-ligand interaction, Docking

Page 5: Drug discovery on Grid (1/2)

• What is grid?
  – Many definitions exist in the literature
  – Foster and Kesselman, 1998: “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational facilities.”
• Grid can provide
  – Large-scale and on-demand resources
  – Computing resources (computing grids)
  – Storage resources (data grids)

Page 6: Drug discovery on Grid (2/2)

[Figure: chemical compounds + receptor structures → molecular docking (takes years; as a data challenge on Grid it can be done in weeks) → hits sorting and refining → in vitro screening of the best ~50 hits]

Problem
– Millions of compounds and drug molecules are presently available for screening
– But developing an efficient laboratory assay for such work is time-consuming and very expensive

Solution
– Grids offer high-speed computing and large-scale data management capability
– Possible variant targets can be studied quickly with present modelling applications
– This will help medicinal chemists respond to major immediate threats

Page 7: GVSS, GAP Virtual Screening Service

Page 8: GAP Service Architecture

Page 9: DIANE, DIstributed ANalysis Environment

• A lightweight framework for parallel scientific applications in the master-worker model
• The framework takes care of all synchronization, communication, and workflow management details on behalf of the application

[Figure labels: User Application Interface; GRID environments]
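
The master-worker model named on this slide can be sketched in a few lines. This is only an illustration built on Python's standard threading and queue modules, not DIANE's actual API; the function name run_master_worker and its parameters are hypothetical.

```python
import queue
import threading


def run_master_worker(tasks, work_fn, n_workers=4):
    """Hypothetical sketch: the master fills a queue, workers drain it."""
    todo = queue.Queue()
    for task_id, payload in enumerate(tasks):
        todo.put((task_id, payload))

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task_id, payload = todo.get_nowait()
            except queue.Empty:
                return  # no work left, the worker exits
            value = work_fn(payload)
            with lock:  # the framework, not the application, handles locking
                results[task_id] = value

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Return results in task order, independent of which worker ran them.
    return [results[i] for i in range(len(tasks))]
```

The application only supplies the task list and work_fn; synchronization and result collection stay inside the framework function, which is the division of labour the slide describes.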

Page 10: The profile of a DIANE job

• Each horizontal line segment = one task = one docking
• Unhealthy workers are removed from the worker list
• Failed tasks are rescheduled to healthy workers

[Figure annotations: the “bad” worker removed; good load balance]
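
The two recovery rules on this slide (drop unhealthy workers, reschedule failed tasks) can be illustrated with a small sequential simulation. It is a sketch under stated assumptions, not DIANE's scheduler: the schedule function, the round-robin dispatch, and the max_failures threshold are all hypothetical.

```python
from collections import deque


def schedule(tasks, workers, run_task, max_failures=1):
    """Hypothetical sketch: requeue failed tasks, drop failing workers."""
    pending = deque(tasks)
    healthy = deque(workers)
    failures = {w: 0 for w in workers}
    done = {}
    while pending:
        if not healthy:
            raise RuntimeError("no healthy workers left")
        task = pending.popleft()
        worker = healthy.popleft()
        try:
            done[task] = (worker, run_task(worker, task))
            healthy.append(worker)  # worker goes back into rotation
        except Exception:
            pending.append(task)  # failed task is rescheduled
            failures[worker] += 1
            if failures[worker] < max_failures:
                healthy.append(worker)
            # otherwise the worker is removed from the worker list
    return done, list(healthy)
```

For example, with workers ["w1", "bad", "w2"] and a run_task that raises whenever the worker is "bad", every task still completes and "bad" no longer appears in the returned worker list.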

Page 11: Efficiency and throughput of DIANE

• 280 DIANE worker agents were submitted as LCG jobs
• 200 jobs (~71%) were healthy
  – ~16% failures related to middleware errors
  – ~12% failures related to application errors
• DIANE utilizes ~95% of the healthy resources

[Figure annotation: stable throughput]
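
The percentages on this slide can be cross-checked with simple arithmetic. The variable names are illustrative, and reading "~95% of the healthy resources" as a fraction of the 200 healthy workers is an assumption.

```python
submitted = 280   # DIANE worker agents submitted as LCG jobs (from the slide)
healthy = 200     # healthy workers (from the slide)

healthy_fraction = healthy / submitted          # about 0.71, matching "~71%"
failed = submitted - healthy                    # 80 workers
failed_fraction = failed / submitted            # about 0.29, vs. ~16% + ~12%

# Assumption: 95% utilization applies to the healthy workers only.
utilized_worker_equivalents = 0.95 * healthy
overall_fraction = utilized_worker_equivalents / submitted
```

Under that reading, roughly two thirds of the submitted capacity ends up doing useful work, with most of the loss coming from unhealthy workers rather than from scheduling overhead.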

Page 12: GVSS application: dengue virus

Ref: Hsin-Yen C. et al, J Grid Computing (2010), 8, 529-541

Page 13: Worldwide dengue distribution

[Map legend: areas infested with Aedes aegypti; areas with Ae. aegypti and dengue epidemics]

Ref: Clark G.G., “Dengue: An emerging arboviral disease”, 2006

Page 15: Dengue NS3 protease

[Figure residue labels: H51, D75, S135]

Ref: PDB: 2vbc (2008) J.Virol. 82: 173

Page 16: Dengue Fever Data Challenge / resources & 1st result

Total number of completed docking jobs: 300,000
Estimated needed computing power: 4,167 CPU-days
Duration of the experiment: 60 days
Cumulative computing results: 42.5 GB
Total computing resources in EUAsia VO: 268 cores
Number of used Computing Elements: 6
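
The figures in this table can be related to each other with a short calculation. The inputs come from the slide; the derived quantities are back-of-the-envelope values, and treating the estimated 4,167 CPU-days as the work actually done over the 60 days is an assumption.

```python
docking_jobs = 300_000   # completed docking jobs (from the slide)
cpu_days = 4_167         # estimated computing power (from the slide)
cores = 268              # cores in the EUAsia VO (from the slide)
duration_days = 60       # duration of the experiment (from the slide)

# Implied average cost of one docking, in minutes.
minutes_per_docking = cpu_days * 24 * 60 / docking_jobs

# Lower bound on wall-clock time if all cores were busy all the time.
ideal_days = cpu_days / cores

# Assumption: the estimate equals the work done, so this is the
# average number of cores kept busy during the experiment.
average_busy_cores = cpu_days / duration_days
```

The estimate corresponds to about 20 minutes per docking. The gap between the ideal duration and the reported 60 days suggests that, on average, only part of the 268 cores was available to the challenge at any time.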

Page 17: Joint Computing Resources & Users

• Accumulated computing resources in EUAsia VO: 268 CPU cores
  – 100 ASGC (TW), 2 TH, 4 VN, 18 MIMOS (MY), 80 UPM (MY), 64 CESNET (CZ)
  – lcg-infosites --vo euasia ce
• Registered VQS accounts:
  – 6 users (TW)
  – 17 users (PH: 15 in AdMU, 2 in ASTI)
  – 2 users (TH: 1 in NECTEC, 1 in HAII)
  – 1 user (MY: UPM)
  – 1 user (ID: ITB)
  – 2 users (VN: IAMI)
  – 1 user (FR: HealthGrid)
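
As a quick consistency check, the per-site breakdown can be summed and compared with the stated total. The dictionary layout is only an illustration; the numbers are taken from the slide.

```python
# Cores per site, as listed on the slide.
cores_by_site = {
    "ASGC (TW)": 100,
    "TH": 2,
    "VN": 4,
    "MIMOS (MY)": 18,
    "UPM (MY)": 80,
    "CESNET (CZ)": 64,
}

# Registered accounts per country, as listed on the slide.
users_by_country = {"TW": 6, "PH": 17, "TH": 2, "MY": 1, "ID": 1, "VN": 2, "FR": 1}

total_cores = sum(cores_by_site.values())
total_users = sum(users_by_country.values())
site_count = len(cores_by_site)
```

The site counts add up to the 268 cores quoted on the slide, and the number of sites equals the six Computing Elements reported for the data challenge.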

Page 18: Integration of SG (service grid) & DG (desktop grid) by EDGeS

Page 19: Scenario 1 – DG to SG via bridge

Page 20: Scenario 2 – SG to DG via bridge

Page 21: Scenario 3 – SG/DG resources but not through EDGeS bridges

[Figure labels: Job Manager; Task Manager]

Page 22: Web UI Service Architecture

Page 23: Prototype Web UI Screenshot

Page 24: Simulation of drug discovery workflow

[Workflow figure: Ligand + Protein → Preparing ligand & protein → Docking (generating conformation) → Scoring (analyzing & ranking data)]

Page 25: Protein Database

Ref: PDB, http://www.rcsb.org/pdb/home/home.do
PDBbind, http://sw16.im.med.umich.edu/databases/pdbbind/index.jsp

Page 26: General class of docking algorithm

• Genetic algorithm
  – A search heuristic that mimics the process of natural evolution. It generates solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover.
  – AutoDock, GOLD…
• Molecular dynamics
  – Used to find poses with force fields. The conformation search usually involves simulated annealing to locate the global optimum in a large search space.
  – AMBER, CHARMM…
• Shape complementarity
  – A description of the molecules, including solvent-accessible surface area, geometric constraints, H-bonds, and hydrophobic/hydrophilic interactions between all atoms in the complex.
  – DOCK, FRED…
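
The genetic-algorithm bullet can be made concrete with a toy optimizer showing selection, crossover, and mutation on a list of numbers. It is a generic sketch, not the search used by AutoDock or GOLD, which implement far more elaborate variants; genetic_minimize, its parameters, and the test score function are hypothetical.

```python
import random


def genetic_minimize(score, n_genes, pop_size=30, generations=40, seed=0):
    """Toy GA: lower score is better. Returns (best, best-score history)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=score)                  # selection: rank by score
        history.append(score(pop[0]))
        parents = pop[: pop_size // 2]       # the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
            child = a[:cut] + b[cut:]        # crossover: genes inherited from both
            i = rng.randrange(n_genes)
            child[i] += rng.gauss(0, 0.3)    # mutation: small random change
            children.append(child)
        pop = parents + children
    best = min(pop, key=score)
    return best, history + [score(best)]
```

In a docking setting the genes would encode a ligand pose (position, orientation, torsion angles) and score would be the docking energy; here any function of a list of floats works, for example the sum of squares. Because the better half of each generation survives unchanged, the best score never gets worse from one generation to the next.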

Page 27: General class of scoring function

• Force field
  – Affinities are estimated from intermolecular van der Waals, electrostatic, and other interactions between all atoms of the two molecules in the complex.
  – AMBER…
• Empirical
  – Count the number of interactions and assign a score based on the number of occurrences, for example H-bond, ionic, and hydrophobic/hydrophilic interactions.
  – LUDI, X-Score…
• Knowledge-based
  – Observe known protein/ligand structures, and favor interactions and geometries that are seen often.
  – DrugScore, PMF…
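
The empirical class is the simplest to illustrate: count each interaction type and take a weighted sum. The weights and counts below are made-up illustrative numbers, not the parameters of LUDI or X-Score, and empirical_score is a hypothetical name.

```python
def empirical_score(interaction_counts, weights):
    """Weighted sum of interaction counts (more negative = better binding)."""
    return sum(weights[kind] * count for kind, count in interaction_counts.items())


# Hypothetical weights per interaction type, in arbitrary energy units.
example_weights = {"h_bond": -1.0, "ionic": -2.0, "hydrophobic": -0.5}

# Hypothetical counts for one docked pose.
example_counts = {"h_bond": 3, "ionic": 1, "hydrophobic": 4}

example_score = empirical_score(example_counts, example_weights)
```

Real empirical functions fit such weights against measured binding affinities and usually add geometric penalties, but the additive structure is the same.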

Page 28: Tools of docking and scoring

Ref: AutoDock, http://autodock.scripps.edu/
X-SCORE, http://sw16.im.med.umich.edu/software/xtool/

Page 29: Simulated Condition

• Ligand and protein
  – PDBbind database v2010 (3429 complexes)
• Docking
  – software: AutoDock
  – computing time: 30 ~ 50 min per docking
• Rescoring
  – software: X-Score
  – computing time: 1 ~ 2 min per scoring
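
The per-job timings translate into a total cost as follows. The sketch assumes one docking and one rescoring per PDBbind complex, which the slide does not state; cpu_days is a hypothetical helper.

```python
MINUTES_PER_DAY = 24 * 60


def cpu_days(n_jobs, minutes_low, minutes_high):
    """Return the (low, high) CPU-day range for n_jobs of the given duration."""
    return (n_jobs * minutes_low / MINUTES_PER_DAY,
            n_jobs * minutes_high / MINUTES_PER_DAY)


n_complexes = 3429  # PDBbind v2010 (from the slide)

# Assumption: each complex is docked once and rescored once.
docking_low, docking_high = cpu_days(n_complexes, 30, 50)
rescoring_low, rescoring_high = cpu_days(n_complexes, 1, 2)
```

Under this assumption docking dominates the budget (roughly 71 to 119 CPU-days), while rescoring adds only a few CPU-days.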

Page 30: Free energy in AutoDock, X-Score

Page 31: Free energy R² in ligand molecular weight

Page 32: Free energy R² in protein enzyme type
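
The slides report R² values for predicted versus experimental free energies, but the plots are not recoverable from this transcript. As a reminder of what such a value measures, the sketch below computes the squared Pearson correlation between two series; that this is the R² definition used in the talk is an assumption, and pearson_r_squared is a hypothetical name.

```python
def pearson_r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy * sxy / (sxx * syy)
```

Grouping the complexes by ligand molecular weight or by enzyme type and calling the function once per group would give per-class values of the kind the slide titles suggest.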

Page 33: RMSD in AutoDock, X-Score
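
RMSD here presumably compares a docked ligand pose with the crystallographic pose. A minimal version, assuming both poses list the same atoms in the same order and share the receptor's coordinate frame (so no superposition is applied), looks like this; rmsd is a hypothetical helper name.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched sets of 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))
```

For two atoms where only the second is displaced by 2 Å along z, the result is sqrt(4 / 2), about 1.41 Å.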


Page 36: Future work

• Finish implementing the Web-based Virtual Screening Service with the EDGeS infrastructure.
• Complete the 691 proteins x 691 ligands docking tasks and the data analysis.
• Classify the other proteins by enzyme code.
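
The 691 x 691 cross-docking plan amounts to one task per protein-ligand pair. The sketch below enumerates those pairs lazily and groups them into batches that could be handed to workers; the batch size of 1000 and the function names are hypothetical choices, not part of the original plan.

```python
from itertools import islice, product


def docking_tasks(n_proteins, n_ligands):
    """Yield every (protein index, ligand index) pair."""
    return product(range(n_proteins), range(n_ligands))


def batches(iterable, size):
    """Group an iterable into lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


total_tasks = sum(1 for _ in docking_tasks(691, 691))
n_batches = sum(1 for _ in batches(docking_tasks(691, 691), 1000))

# CPU-days at the 30 ~ 50 min per docking quoted for AutoDock earlier.
cpu_days_low = total_tasks * 30 / (24 * 60)
cpu_days_high = total_tasks * 50 / (24 * 60)
```

This gives 477,481 dockings, or roughly 10,000 to 16,600 CPU-days at the quoted per-docking time, which is larger than the dengue data challenge and explains why the task is listed as future work on grid resources.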

Page 37

Thank you for your attention