Science Gateways for Life Sciences – Balancing Usability and Re-Usability
Sandra Gesing, Center for Research Computing
19 September 2013
Life Sciences
“the sciences concerned with the study of living organisms, including biology, botany, zoology, microbiology, physiology, biochemistry, and related subjects” (http://www.thefreedictionary.com)
• Technologies and methods for creating, analyzing, and predicting data are available
• Immense amount of data, e.g.,
  • ZINC database: ~20 million molecular structures
  • Human genome: ~3 billion DNA base pairs
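To put the genome figure above into storage terms, here is a back-of-the-envelope sketch. It assumes a plain four-letter alphabet (A, C, G, T) stored at a fixed number of bits per base; real file formats add headers, quality scores, and compression, so actual sizes differ. The function name `packed_size_bytes` is made up for this illustration.

```python
# Rough storage estimate for the genome size quoted on the slide.
BASE_PAIRS_HUMAN_GENOME = 3_000_000_000  # ~3 billion base pairs (from the slide)
BITS_PER_BASE = 2                        # assumption: only A, C, G, T -> 2 bits each


def packed_size_bytes(base_pairs: int, bits_per_base: int = BITS_PER_BASE) -> int:
    """Bytes needed to store one strand at a fixed number of bits per base."""
    return (base_pairs * bits_per_base + 7) // 8  # round up to whole bytes


genome_bytes = packed_size_bytes(BASE_PAIRS_HUMAN_GENOME)    # 2 bits per base
text_bytes = packed_size_bytes(BASE_PAIRS_HUMAN_GENOME, 8)   # 1 text character per base

print(f"packed:     {genome_bytes / 1e6:.0f} MB")
print(f"plain text: {text_bytes / 1e9:.0f} GB")
```

Under these assumptions a single genome already needs several hundred megabytes, before any per-sample or per-experiment data is added.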
Life Sciences and Computation
The Genomics Boom
• February 15, 2001: The Human Genome Project
• February 16, 2001: biotech company Celera
Craig Venter (left) and Francis Collins (right)
Areas in the Life Sciences
• Many “omics” sciences, e.g.
  • Genomics
  • Proteomics
Black Swallowtail - larvae and butterfly
Molecular Simulations and Docking
• Prediction and analysis of molecular structures
• Numerous applications, e.g.
  • Materials science
  • Drug design
[Diagram: ligands, target, docking, ?]
Molecular Simulations and Docking
[Diagram: ligands, target, docking, scoring functions, binding energy, binding]
• Prediction and analysis of molecular structures
• Numerous applications, e.g.
  • Materials science
  • Drug design
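The docking diagram above (ligands and target go in, a scoring function estimates a binding energy) can be sketched as a simple ranking loop. This is only an illustration: `toy_score`, its weights, the feature names, and the three ligands are all invented placeholders, not a real scoring function or real data. The only convention taken as given is that a lower (more negative) estimated binding energy means better predicted binding.

```python
from typing import Callable, Dict, List, Tuple


def rank_ligands(
    ligands: Dict[str, dict],
    score: Callable[[dict], float],
) -> List[Tuple[str, float]]:
    """Score every ligand against one target and sort best (lowest) first."""
    scored = [(name, score(features)) for name, features in ligands.items()]
    return sorted(scored, key=lambda item: item[1])


def toy_score(features: dict) -> float:
    """Hypothetical weighted sum standing in for a real scoring function."""
    return (
        -0.5 * features["h_bonds"]            # reward hydrogen bonds
        - 0.1 * features["contacts"]          # reward close contacts
        + 0.3 * features["rotatable_bonds"]   # penalize flexibility
    )


# Made-up ligand descriptors, for illustration only.
ligands = {
    "ligand_a": {"h_bonds": 4, "contacts": 20, "rotatable_bonds": 2},
    "ligand_b": {"h_bonds": 1, "contacts": 10, "rotatable_bonds": 6},
    "ligand_c": {"h_bonds": 3, "contacts": 30, "rotatable_bonds": 1},
}

ranking = rank_ligands(ligands, toy_score)
for name, energy in ranking:
    print(f"{name}: {energy:.2f}")
```

Passing the scoring function in as a parameter mirrors the point of the slide: the screening loop stays the same while the scoring function can be exchanged.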
Simulations
• Basic data with heterogeneous provenance, e.g. research in malaria
  • Data on weather
  • Data on demography
  • Data on interventions
  • ...
• Mathematical models needing a baseline
• Prediction of interventions
State-of-the-art
• Data-intensive and compute-intensive problems
• Sophisticated tools and methods available
• Distributed data management available
• DCIs (Distributed Computing Infrastructures) available
Why do researchers not use the tools and distributed environments on a large scale?
Open Issues
• Usability of tools often limited
• Complexity of methods
• Lack of graphical user interfaces
Open Issues
• Usability of tools often limited
• Complexity of methods
• Lack of graphical user interfaces
• Workflows
a sequence of connected steps in a defined order based on their control and data dependencies
12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt
12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt
12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct
12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt
12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt
12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt
12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg
12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga
12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc
12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa
12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa
Slide copied from: Stuart Owen, “Workflows with Taverna”
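The workflow definition above (connected steps, ordered by their control and data dependencies) can be sketched in a few lines. This is a minimal illustration using Python's standard `graphlib` module, not how Taverna or any gateway framework implements workflows. The step names and the `STEPS` table are hypothetical; the input is the first line of the sequence excerpt shown on the slide.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# First line of the sequence excerpt from the slide (position number + bases).
RAW_LINE = "12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt"


def parse_sequence(results):
    """Drop the leading position number and join the 10-base blocks."""
    return "".join(RAW_LINE.split()[1:])


def sequence_length(results):
    return len(results["parse"])


def gc_content(results):
    seq = results["parse"]
    return (seq.count("g") + seq.count("c")) / len(seq)


def report(results):
    return f"{results['length']} bases, GC content {results['gc_content']:.2f}"


# Hypothetical workflow: step name -> (function, steps whose output it needs).
# The dictionary order is deliberately scrambled; dependencies decide the order.
STEPS = {
    "report": (report, ["length", "gc_content"]),
    "gc_content": (gc_content, ["parse"]),
    "length": (sequence_length, ["parse"]),
    "parse": (parse_sequence, []),
}


def run_workflow(steps):
    """Run each step after all of its dependencies have produced a result."""
    graph = {name: set(deps) for name, (_, deps) in steps.items()}
    order = list(TopologicalSorter(graph).static_order())
    results = {}
    for name in order:
        func, _ = steps[name]
        results[name] = func(results)
    return order, results


order, results = run_workflow(STEPS)
print(" -> ".join(order))
print(results["report"])
```

The user only declares which step depends on which; the execution order is derived from those data dependencies, which is the part a workflow system takes off the researcher's hands.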
Open Issues
• Usability of tools often limited
• Complexity of methods
• Lack of graphical user interfaces
• Workflows
• Complexity of infrastructures
• Users are generally not IT specialists
Open Issues
• Usability of tools often limited
• Complexity of methods
• Lack of graphical user interfaces
• Workflows
• Complexity of infrastructures
• Users are generally not IT specialists
⇒ User interfaces need to be intuitive and self-explanatory
⇒ Science gateways
Science Gateways
“A Science Gateway is a community-developed set of tools, applications, and data that is integrated via a portal or a suite of applications, usually in a graphical user interface, that is further customized to meet the needs of a specific community.” (TeraGrid/XSEDE)
Community
Web-based Science Gateways
• Single point of entry
• Possibility to customize views and tools
• Store user preferences
• No installation of software on the user’s side
• No firewall issues
Slartibartfast: “I must warn you, we're going to pass through, well, a sort of gateway thing.”
Arthur Dent: “What?”
Slartibartfast: “It may disturb you. It scares the willies out of me.”
(Douglas Adams in “The Hitchhiker's Guide to the Galaxy”)
Goal of Science Gateways
Usability of software
"After all, usability really just means making sure that something works well: that a person … can use the thing - whether it's a Web site, a fighter jet, or a revolving door - for its intended purpose without getting hopelessly frustrated." (Steve Krug in “Don't Make Me Think!: A Common Sense Approach to Web Usability”, 2005)
Re-Usability
• Sharing of knowledge and data
• Re-using of “recipes” and workflows
• Re-usability of software
“The key to productivity is reusability. The easiest way to produce code is obviously to have it already!" (John R. Bourne in “Object-oriented Engineering: Building Engineering Systems Using Smalltalk-80”, 1992)
Re-Usability
Re-inventing is not always necessary...
... but the model should fit the demands of the community
Diverse Approaches
• Science gateway frameworks
  • Static layout
  • Layout extendable
  • Workflow-enabled
• Portal frameworks
• Content management systems
• Libraries for implementation
Galaxy
Python framework
Parametrization
Workflows
Administration
WS-PGRADE
[Architecture diagram, top to bottom:]
• User interface: WS-PGRADE (Liferay)
• High-level middleware service layer: gUSE
• DCI resources middleware layer
Job Configuration
Monitoring
MoSGrid
Molecular Simulation Grid
• Science gateway integrated with underlying compute and data management infrastructure
• Distributed workflow management
• Data repository
Tools
File Management
MoSGrid – Application Areas
Molecular Dynamics
• Study and simulation of molecular motion
Quantum Chemistry
• Study and simulation of the electronic behavior of molecules in relation to their chemical reactivity
Docking
• Main focus on evaluation of ligand-receptor interactions (e.g., for drug design)
MD Portlet
QC Portlet
Docking Portlet
VectorBase
Tools
VECNet
Data
Modeling Platform
Risk Mapper
Usability vs. Re-Usability
• User side
  • Methods
  • Workflows
  • Data
  ⇒ Re-usability increases usability on the user side
• Admin/Developer side
  • Frameworks
  • Libraries
  • Source code
  ⇒ Usability and re-usability depend on support, documentation and scalability
Usability vs. Re-Usability
• User side
  • Layout
  • Visualization
  • Security
  ⇒ Re-used parts may not be sufficient; usability depends on the features the community needs
• Admin/Developer side
  • Integration with computing and data infrastructures
  • Security
  ⇒ Usability and re-usability depend on available infrastructures
New Science Gateway - Checklist
• Demands of the user community on the user interface
• Demands on security
• Demands on computing and data resources
• Workflows
• Performance
• Existing tools and models
• Available underlying infrastructure
• Available documentation and support
• Effort on development and maintenance