1

Combining Desktop and Service Grids to Support e-Scientists to Run Simulations

European Desktop Grid Infrastructure = EDGI

G. Terstyanszky, T. Kukla, T. Kiss, S. Winter, J.: Centre for Parallel Computing, School of Electronics and Computer Science, University of Westminster, London, United Kingdom

J. Kovacs, Z. Farkas, P. Kacsuk: MTA-SZTAKI, Budapest, Hungary

2010-04-27



2

Docking and Molecular Dynamics Simulations

[Figure labels: binding pocket, sugar (ligand), protein (receptor)]

3

Docking and Molecular Dynamics Simulations

In-vitro (or wet lab) research
• Investigates components of an organism that have been isolated from their usual biological surroundings, which permits a more detailed and convenient analysis than is possible with whole organisms.

In-silico simulation
• Simulates components of an organism, for example the docking of ligands and proteins: downloading them from public libraries, binding them and analysing the properties of the compound molecules.

Aims of in-silico docking simulation
• Understanding how pathogens bind to cell surface proteins can lead to the design of carbohydrate-based drugs and diagnostic and therapeutic agents.
• Highlighting potential novel inhibitors and drugs for in vitro and on-chip testing.

4

Docking and Molecular Dynamics Simulations

• Advantages of in-silico methods:
  • Reduced time and cost
    • In vitro experiments are expensive
  • Better focusing of wet laboratory resources:
    • Better planning of experiments by selecting the best molecules to investigate
  • Increased number of molecules screened

• Problems of in-silico experiments:
  • Time consuming
    • Weeks or months on a single computer
  • Simulation tools are too complex for an average bio-scientist
    • Linux command line interfaces
  • Bio-molecular simulation tools are not widely tested and validated
    • Are the results really useful and accurate?

5

In-silico Simulation in Service Grids

[Figure: workflow diagram. Inputs: PDB file 1 (receptor) and PDB file 2 (ligand). Steps: Energy Minimization (Gromacs), Validate (Molprobity), Check (Molprobity), Perform docking (AutoDock), Molecular Dynamics (Gromacs). The steps are grouped into Phase 1, Phase 2, Phase 3 and Phase 4.]

6

In-silico Simulation in Service Grids

• Phase 1: pre-processing of protein
• Phase 2: pre-processing of sugar
• Phase 3: docking
• Phase 4: molecular dynamics simulation

• Executed on 5 different sites of the UK NGS
• Parameter sweeps in phases 3 and 4
• MPI in phase 4

[Figure: workflow graph annotated with Phases 1 to 4]
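The relationship between the four phases can be sketched as a small task graph. The Python below is an illustration only, not the workflow engine used on the UK NGS. It assumes that docking (phase 3) needs both pre-processing phases and that molecular dynamics (phase 4) follows docking. The `sweep` helper and the parameter names `seed` and `runs` are hypothetical.

```python
from graphlib import TopologicalSorter
from itertools import product

# Assumed dependencies: phase -> set of phases that must finish first.
PHASES = {
    "phase1_preprocess_protein": set(),
    "phase2_preprocess_sugar": set(),
    "phase3_docking": {"phase1_preprocess_protein", "phase2_preprocess_sugar"},
    "phase4_molecular_dynamics": {"phase3_docking"},
}


def execution_order(phases):
    """Return one valid order in which the phases can be run."""
    return list(TopologicalSorter(phases).static_order())


def sweep(**parameters):
    """Expand named value lists into one dict per parameter combination."""
    names = list(parameters)
    return [dict(zip(names, values)) for values in product(*parameters.values())]


order = execution_order(PHASES)
# Hypothetical sweep for phase 3: every seed combined with every run count.
docking_jobs = sweep(seed=[1, 2, 3], runs=[10, 50])
```

Each dictionary in `docking_jobs` stands for one independent job, which is what makes a parameter sweep a natural fit for distributed execution.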

7

EDGI Infrastructure

8

Usage Scenario in Desktop – Service Grids

[Figure: architecture diagram. Components: EDGI Portal, EDGI Application Repository, SG Broker, SG->DG Bridge, Service Grid with Compute Elements (1), (2), ... (n), Desktop Grid with Desktop Grid Server and Worker Nodes (1), (2), ... (m). Actors: e-scientist, DG admin. Labelled interactions: search, select & download application's implementation; submit application's implementation; query implementation; retrieve & deploy implementation.]

9

EDGI Application Repository: Actors, Entities and Operations

Repository Actors and Operations

Operations (table columns): user/group management | platform management | upload application | mark application valid | browse/search application | download application

E-scientists: x x
Application Developers: x x x x
Application Validators: x x
Desktop Grid Administrators: x x
Repository Administrators: x x x x x x

Legend: with registration / without registration

Repository Entities
• Application represents an application whose implementations can be executed on the EDGI infrastructure. It describes the inputs and outputs and explains what the application does.
• Implementation is an implementation of an application. It contains references (e.g. via URLs) to all the files and data necessary to run the application on a given platform, together with metadata.
• Platform describes the desktop Grid and/or service Grid environment where the implementation can be executed.
• Configuration contains the implementation files required to run the application.
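The repository entities can be pictured as a few linked records. The following Python sketch is an assumption-laden illustration, not the actual EDGI Application Repository schema: all field names, the `implementations_for` helper and the example URLs are hypothetical. The Configuration entity is left out because the slide does not spell out how it relates to an Implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Platform:
    """Desktop Grid and/or service Grid environment an implementation runs on."""
    name: str


@dataclass
class Implementation:
    """One implementation of an application for a given platform."""
    platform: Platform
    file_urls: List[str]  # references to the files and data needed to run
    metadata: Dict[str, str] = field(default_factory=dict)
    validated: bool = False  # hypothetical flag set by "mark application valid"


@dataclass
class Application:
    """Describes what the application does, its inputs and its outputs."""
    name: str
    description: str
    inputs: List[str]
    outputs: List[str]
    implementations: List[Implementation] = field(default_factory=list)


def implementations_for(app, platform_name, only_valid=True):
    """Select the implementations of `app` available for one platform."""
    return [
        impl for impl in app.implementations
        if impl.platform.name == platform_name and (impl.validated or not only_valid)
    ]


# Hypothetical example data.
dg = Platform("desktop-grid")
sg = Platform("service-grid")
autodock = Application(
    name="AutoDock",
    description="Molecular docking",
    inputs=["receptor.pdb", "ligand.pdb"],
    outputs=["dlg files"],
    implementations=[
        Implementation(dg, ["https://example.org/autodock-dg.zip"], validated=True),
        Implementation(sg, ["https://example.org/autodock-sg.zip"]),
    ],
)
valid_dg = implementations_for(autodock, "desktop-grid")
```

A query of this kind corresponds to the "browse/search" and "download" operations: a Desktop Grid administrator would only be interested in validated implementations for the Desktop Grid platform.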

10

EDGI Application Repository: User Interface

• Main menu: select users & groups, applications (implementations), platforms and validation pages
• Action menu: create/delete entities, upload/download applications & implementations, add/edit/remove metadata
• Search: users & groups, applications & implementations, platforms

11

EDGI Application Repository: Application Metadata

12

EDGI Application Repository: Implementation Metadata

13

EDGI Application Repository in the EDGI Infrastructure

[Figure: architecture diagram. Components: EDGI Portal, EDGI Application Repository, SG Broker, SG->DG Bridge, Service Grid with Compute Elements (1), (2), ... (n), Desktop Grid with Desktop Grid Server and Worker Nodes (1), (2), ... (m). Actors: e-scientist, DG admin. Labelled interactions, numbered 1 to 6 in the figure: search, select & download application's implementation; submit application's implementation; query implementation; retrieve & deploy implementation.]

14

University of Westminster Local Desktop Grid

DG clients:
• New Cavendish St: 576 nodes
• Marylebone Campus: 559 nodes
• Regent Street: 395 nodes
• Wells Street: 31 nodes
• Little Titchfield St: 66 nodes
• Harrow Campus: 254 nodes

Lifecycle of a DG node:
1. PCs are primarily used by students/staff
2. If unused, they switch to Desktop Grid mode
3. No more work from the DG server -> shutdown (green solution)
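The node lifecycle can be read as a three-state machine. The Python sketch below is an interpretation of the three steps, not the actual Westminster client logic. The state names and the `next_state` function are hypothetical, and it assumes that an active user always takes priority over grid work.

```python
from enum import Enum


class NodeState(Enum):
    IN_USE = "used by students/staff"
    DESKTOP_GRID = "Desktop Grid mode"
    SHUT_DOWN = "shut down"


def next_state(state, user_active, work_available):
    """Return the next lifecycle state of a PC in the local Desktop Grid."""
    if user_active:
        # Assumption: a user (including one powering the PC on) always wins.
        return NodeState.IN_USE
    if state is NodeState.IN_USE:
        # Step 2: an unused PC switches to Desktop Grid mode.
        return NodeState.DESKTOP_GRID
    if state is NodeState.DESKTOP_GRID and work_available:
        return NodeState.DESKTOP_GRID
    # Step 3: no more work from the DG server, so the PC shuts down.
    return NodeState.SHUT_DOWN


idle_pc = next_state(NodeState.IN_USE, user_active=False, work_available=True)
drained_pc = next_state(NodeState.DESKTOP_GRID, user_active=False, work_available=False)
```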

15

In Silico Docking User Scenario

[Figure: docking pipeline run by the bio scientist. pdb file (ligand) -> prepare_ligand4.py -> pdbqt file; pdb file (receptor) -> prepare_receptor4.py -> pdbqt file; gpf file -> AUTOGRID -> map files; dpf file -> AUTODOCK (several instances) -> dlg files; SCRIPT1 -> best dlg files; SCRIPT2 -> pdb file.]

Research objectives:
• Constructing a library of tens of thousands of small molecule candidates available in databases (e.g. DrugBank) and preparing PDBQT files
• The library is to be screened against known targets using AutoDock Vina
• The small molecule library will be made available to other researchers
• Promising candidates can be validated in vitro
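The steps a bio scientist would otherwise type by hand can be sketched as a list of command lines. The Python below only builds the commands and does not run them. It assumes the AutoDock 4 toolchain (the `autogrid4` and `autodock4` executable names and the `.glg` log suffix are assumptions, since the slides name only AUTOGRID and AUTODOCK), and the `docking_commands` function is hypothetical.

```python
from pathlib import Path


def docking_commands(ligand_pdb, receptor_pdb, gpf, dpf):
    """Build the command lines for one docking run, in execution order."""
    ligand_pdbqt = str(Path(ligand_pdb).with_suffix(".pdbqt"))
    receptor_pdbqt = str(Path(receptor_pdb).with_suffix(".pdbqt"))
    return [
        # Convert both pdb files to pdbqt.
        ["prepare_ligand4.py", "-l", ligand_pdb, "-o", ligand_pdbqt],
        ["prepare_receptor4.py", "-r", receptor_pdb, "-o", receptor_pdbqt],
        # The grid parameter file (gpf) drives the map file generation.
        ["autogrid4", "-p", gpf, "-l", str(Path(gpf).with_suffix(".glg"))],
        # The docking parameter file (dpf) drives the docking; output is a dlg file.
        ["autodock4", "-p", dpf, "-l", str(Path(dpf).with_suffix(".dlg"))],
    ]


commands = docking_commands("ligand.pdb", "receptor.pdb", "dock.gpf", "dock.dpf")
```

Each entry could be passed to `subprocess.run` on a machine where the tools are installed; wrapping this sequence is what hides the Linux command line from the end user.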

16

In-Silico Docking Workflow

[Figure labels: receptor.pdb; ligand.pdb; Autogrid executables and scripts (uploaded by the developer, do not change them); gpf descriptor file; dpf descriptor file; number of work units; dlg files; output pdb file]

• The Generator job creates the specified number of AutoDock jobs.
• The AutoGrid job creates pdbqt files from the pdb files, runs the autogrid application and generates the map files. It zips them into an archive file, which is the input of all AutoDock jobs.
• The AutoDock jobs run on the Desktop Grid. As output they provide dlg files.
• The Collector job collects the dlg files, takes the best results and concatenates them into a pdb file.
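The Generator and Collector roles described above can be sketched in a few lines of Python. This is a simplified illustration, not the EDGI workflow code: the function names and the work unit fields are hypothetical, the regular expression assumes that each docked result in a dlg file reports an "Estimated Free Energy of Binding" line, and "best" is taken to mean lowest energy. The sketch only ranks dlg files; it does not write the final pdb file.

```python
import re

# Assumed pattern for the energy line of an AutoDock dlg file.
ENERGY_RE = re.compile(r"Estimated Free Energy of Binding\s*=\s*([-+]?\d+(?:\.\d+)?)")


def generate_work_units(dpf_name, count):
    """Generator job: describe `count` AutoDock jobs sharing one dpf file."""
    return [{"unit": i, "dpf": dpf_name, "dlg": f"unit_{i}.dlg"} for i in range(count)]


def best_energy(dlg_text):
    """Lowest binding energy reported in one dlg file, or None if absent."""
    energies = [float(value) for value in ENERGY_RE.findall(dlg_text)]
    return min(energies) if energies else None


def collect_best(dlg_texts, keep):
    """Collector job: names of the `keep` dlg files with the lowest energies."""
    scored = [(best_energy(text), name) for name, text in dlg_texts.items()]
    scored = [(energy, name) for energy, name in scored if energy is not None]
    return [name for _, name in sorted(scored)[:keep]]


units = generate_work_units("dock.dpf", 3)
# Hypothetical dlg contents, reduced to the lines the collector looks at.
sample_dlgs = {
    "unit_0.dlg": "Estimated Free Energy of Binding    =  -5.20 kcal/mol\n"
                  "Estimated Free Energy of Binding    =  -7.10 kcal/mol\n",
    "unit_1.dlg": "Estimated Free Energy of Binding    =  -6.00 kcal/mol\n",
    "unit_2.dlg": "run failed, no energy reported\n",
}
best = collect_best(sample_dlgs, keep=2)
```

Because every work unit reads the same archive and writes its own dlg file, the AutoDock jobs are independent of each other, which suits execution on Desktop Grid worker nodes.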

17

EDGI Docking Portal

• Free access to pre-deployed molecular docking "primitive" scenarios running on the EDGI infrastructure: random blind docking and virtual screening
• DG versions of the applications come from the EDGI AR
• Docking workflows are executed on the EDGeS@home Desktop Grid

18

Docking the Protozoan Neuraminidase

19

Docking the Protozoan Neuraminidase

20

Conclusions

Computer Scientists
• They created the combined desktop grid and service grid infrastructure on which e-scientists can run their applications
• The EDGI Application Repository and Portal are able to support application developers, e-scientists and application validators

Bio Scientists
• The EDGI infrastructure can provide the potential for unlimited computational power to the biologists
• They can offer access to methodology (application porting) and tools (portal and repository)
• They have a library of small molecules available for screening and access to chip-based technology