sinnott paper

The e-Context of ENROLLER, Prof Richard O. Sinnott, Technical Director, National e-Science Centre, [email protected], 16th April 2010


DESCRIPTION

An Introduction to eScience and the Grid by Prof. Richard Sinnott.

TRANSCRIPT

Page 1: Sinnott Paper

The e-Context of ENROLLER

Prof Richard O. Sinnott, Technical Director, National e-Science Centre

[email protected]

16th April 2010

Page 2: Sinnott Paper

e-Science and e-Research
• Goal: to enable better research in all disciplines
• Method: Develop collaboration supported by advanced distributed computation
  – to generate, curate and analyse rich data resources
    • From experiments, observations and simulations
    • Quality management, preservation and reliable evidence
  – to develop and explore models and simulations
    • Computation and data at all scales
    • Trustworthy, economic, timely and relevant results
  – to enable dynamic distributed collaboration
    • Facilitating collaboration with information and resource sharing
    • Security, trust, reliability, accountability, manageability and agility

The challenge is to develop an integrated approach to all three

Often realised through Grids and Grid infrastructures

Page 3: Sinnott Paper

The Grid Context
• There are many Grids
  – Data Grids, Compute Grids, Information Grids, Enterprise Grids, …
• There are many ways to build Grids
  – Grid middleware (many flavours),
  – Web services,
  – Clouds,
  – Web2.0,
  – internet computing, …
• There are many moving targets
  – changing middleware, changing standards, changing sciences, changing resources, new questions, new funding streams…
• There has been a lot of hype
• There has been a lot of money invested
• There are lots of projects and big scientific challenges
• There is an urgent need to build user communities
• There needs to be much more research pull than middleware push
  – … there are many more things that could go here!

Page 4: Sinnott Paper

UK e-Science Core Programme
• Major cross council initiative
  – AHRC, BBSRC, EPSRC, ESRC, MRC, NERC, PPARC/STFC, …
• Over £250m funding over 7-8 years from 2001
  – Does not include industry monies from
    • Department of Trade and Industry
    • Technology Strategy Board
    • Europe
    • JISC
    • Regional development agencies
    • …
• Programme now completed and reviews/planning for future government spending in this area on-going

Page 5: Sinnott Paper

e-Science in the UK

(Figure: map of UK e-Science centres and infrastructure. Labels: CeSC (Cambridge); e-Science Institute; Grid Operations Support Centre; National Institute for Environmental e-Science; NeSC; 4th Phase Platform Grant; Digital Curation Centre; OMII-UK; NERC e-Science Centre; National Centre for Text Mining; National Centre for e-Social Science; Software Sustainability Institute; Core NGS Nodes + HECTOR + partners/affiliates (HECTOR investment £113m); National Data Centres + UK Federation + International dimension including EGEE/EGI + SuperJanet + Training/Education + …)

Page 6: Sinnott Paper

NeSC Background
• E-Science Hub
  – Externally
    • Glasgow end of NeSC
      – Involved in numerous UK wide activities/projects
  – Internally
    • Focal point for e-Science research/activities at Glasgow
    • Work closely with foundation departments
      – Department of Computing Science
        » Established first UK Grid Computing course
      – Department of Physics & Astronomy
    • Also working with other groups including
      – Bioinformatics Research Centre,
      – Biostatistics
      – Electronics and Electrical Engineering
      – Dept of Public Health, Dept. of Pathology,
      – Dept. of English, Arts & Humanities,
      – University Services,
      – Clinicians & numerous hospitals across Scotland,
        » Yorkhill, Royal Infirmary, Western General, Southern General …
  – NeSC GU now part of University IT Services

(Staff photo captions: J. Jiang; Chris Bayliss; David Martin (ScotGrid sys-admin); C. Millar; Gordon Stewart; J. Mohammad (PhD); T. Doherty VPman; S. Hussain (PhD); M. Sarwar (ENROLLER); Nurazian Mior Dahalan (PhD); Camera Shy)

Page 7: Sinnott Paper

NeSC Glasgow Projects
• National e-Science Centre (NeSC-I, NeSC-II, NeSC-III)
• Dynamic Virtual Organisations for e-Science Education (DyVOSE)
• Biomedical Research Informatics Delivered by Grid Enabled Services (BRIDGES)
• Grid Enabled Microarray Expression Profile Search (GEMEPS)
• GridNet
• Glasgow early adoption of Shibboleth (GLASS)
• Joint Data Standards Survey (JDSS)
• ESP-Grid
• GridNet-2
• HPC Compute cluster award
• Sun industrial sponsorship
• OGC Collision
• OMII-Security Portlets
• OMII-RAVE
• Integrating VOMS and PERMIS for Superior Grid Authorization (VPman)
• NCeSS Technical Management
• CESSDA PPP
• Pharming of Therapeutic RNA
• Grid Enabled Occupational Data Environment (GEODE)
• Towards an e-Infrastructure for e-Science Digital Repositories
• Grid enabled Biochemical Pathway Simulator
• Virtual Organisations for Trials and Epidemiological Studies (VOTES)
• Towards a European e-Infrastructure for e-Science Repositories
• Modelling, Inference and Analysis for Biological Systems up to the Cellular Level
• Drug Discovery Portal
• Advanced Grid Authorisation through Semantic Technologies (AGAST)
• ShinTau (Supporting Multiple Shibboleth Attribute Authorities)
• Grid-enabled Virtual Safe Settings – Security & the State of the Nation
• Scottish Bioinformatics Research Network (SBRN)
• Generation Scotland Scottish Family Health Study
• Meeting the Design Challenges of nanoCMOS Electronics (nanoCMOS)
• EU FW7 Avert-IT
• EU FW7 EuroDSD
• Breast Cancer Tissue Biobank
• Data Management through e-Social Science (DAMES)
• NeSC Research Platform (NRP)
• NeSC Information Network (NIN)
• European Network for Study of Adrenal Tumors
• Scottish Health Informatics Platform for Research (SHIP)
• National E-Infrastructure for Social Simulation (NeISS)
• Enhancing Repositories for Language and Literature Researchers (ENROLLER)
• Proxy Credential Auditing Infrastructure for the NGS
• European Network for Study of Adrenal Tumors Cancer Research Platform
• Diagnostic Identification of Parkinsons (DiPAR)

Legend: Completed, Running

Page 8: Sinnott Paper

Data Grids for High Energy Physics

(Figure: tiered data grid. Nodes: Online System; Offline Processor Farm ~20 TIPS; CERN Computer Centre; FermiLab ~4 TIPS; France Regional Centre; Italy Regional Centre; Germany Regional Centre; Tier2 Centres ~1 TIPS each; Caltech ~1 TIPS; Institutes ~0.25 TIPS; Physics data cache; Physicist workstations. Tier labels: Tier 0, Tier 1, Tier 2, Tier 4. Link rates shown: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec, ~1 MBytes/sec.)

There is a “bunch crossing” every 25 nsecs.

There are 100 “triggers” per second

Each triggered event is ~1 MByte in size

Physicists work on analysis “channels”.

Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server

1 TIPS is approximately 25,000 SpecInt95 equivalents
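The trigger and event-size figures quoted on this slide can be cross-checked with a little arithmetic. The sketch below is only a back-of-envelope illustration: the variable names are ours, and decimal units (1 MByte = 10^6 bytes, 1 Mbit = 10^6 bits) are assumed.

```python
# Back-of-envelope check of the figures quoted on the slide.
# Assumption: decimal units (1 MByte = 10**6 bytes, 1 Mbit = 10**6 bits).

BUNCH_CROSSING_INTERVAL_NS = 25   # one bunch crossing every 25 ns
TRIGGERS_PER_SECOND = 100         # events kept by the trigger each second
EVENT_SIZE_MBYTES = 1             # approximate size of one triggered event
LINK_MBITS_PER_SECOND = 622       # one of the link rates shown in the diagram

# Number of bunch crossings in one second (10**9 ns per second).
crossings_per_second = 1_000_000_000 // BUNCH_CROSSING_INTERVAL_NS

# How many crossings occur for every event the trigger keeps.
crossings_per_trigger = crossings_per_second // TRIGGERS_PER_SECOND

# Data rate leaving the trigger, in MBytes per second.
trigger_output_mbytes_per_second = TRIGGERS_PER_SECOND * EVENT_SIZE_MBYTES

# The 622 Mbits/sec link rate expressed in MBytes per second (8 bits per byte).
link_mbytes_per_second = LINK_MBITS_PER_SECOND / 8

print(f"Bunch crossings per second: {crossings_per_second:,}")
print(f"Crossings per kept event:   {crossings_per_trigger:,}")
print(f"Trigger output:             {trigger_output_mbytes_per_second} MBytes/sec")
print(f"622 Mbits/sec link:         {link_mbytes_per_second} MBytes/sec")
```

The trigger output of 100 events × ~1 MByte agrees with the ~100 MBytes/sec rate shown in the diagram.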

Page 9: Sinnott Paper

Next Generation Transistor Design

(Figure labels: 3D+, Statistical)

Page 10: Sinnott Paper

Inter-disciplinary e-Health Example

(Figure: scale of biological data, from molecules to populations: Nucleotide sequences; Nucleotide structures; Gene expressions; Protein structures; Protein functions; Protein-protein interaction (pathways); Cell; Cell signalling; Tissues; Organs; Physiology; Organisms; Populations)

Security!!!

biologists, bioinformaticians, statisticians, clinicians, pharmacists, physicists, epidemiologists, chemists, geospatial modellers, public health...

+ environmental, social, geographic …

Page 11: Sinnott Paper

Bridges Project

(Figure: CFG Virtual Organisation. Sites, each holding private data: Glasgow, Edinburgh, Leicester, Oxford, London, Netherlands. Publicly curated data: Ensembl, MGI, HUGO, OMIM, SWISS-PROT, RGD, … Components: DATA HUB; Synteny Service; Information Integrator; OGSA-DAI; MagnaVista Service; VO Authorisation; blast)

Page 12: Sinnott Paper

Grid Blast Interface

• Allows ‘genome scale’ blasting

• Transparently uses NGS, ScotGrid, other GU clusters, Condor pools

• Many databases already deployed across nodes

• No user certificates

• Fine grained security at back-end
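One way to picture “genome scale” blasting over several back-ends is to split the query set into batches and hand the batches out to whichever resources are available. The sketch below is an illustration only, not the BRIDGES implementation: the function names `split_into_batches` and `assign_batches` are hypothetical, round-robin assignment is an assumption, and the resource names are simply taken from the slide.

```python
from itertools import cycle


def split_into_batches(sequences, batch_size):
    """Split a list of query sequences into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [sequences[i:i + batch_size]
            for i in range(0, len(sequences), batch_size)]


def assign_batches(batches, resources):
    """Pair each batch with a back-end resource in round-robin order."""
    if not resources:
        raise ValueError("at least one resource is required")
    # zip stops when the batches run out, so cycle() never loops forever.
    return list(zip(cycle(resources), batches))


# Resource names from the slide; the queries are stand-ins for real sequences.
RESOURCES = ["NGS", "ScotGrid", "GU cluster", "Condor pool"]
queries = [f"seq{i}" for i in range(10)]

plan = assign_batches(split_into_batches(queries, 3), RESOURCES)
for resource, batch in plan:
    print(resource, batch)
```

A real scheduler would also weigh queue lengths and which databases are deployed on which node; the point here is only that the user submits one job and never sees the split.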

Page 13: Sinnott Paper

www.nesc.ac.uk

MagnaVista

Page 14: Sinnott Paper

MagnaVista

Page 15: Sinnott Paper

GeneVista

Page 16: Sinnott Paper

E-Security
• Security
  – Key is that it should support
    • seamless access to a heterogeneous variety of “distributed” compute and data (and other) resources
      – Often domain specific – especially data!
    • single sign-on
      – Authenticate once and access numerous distributed resources
  – AAAA (+privacy, confidentiality, integrity…)
    – Authentication
      » (know who “they” are)
    – Authorisation
      » (decide what “they” can do and enforce it)
    – Auditing/accounting
      » (keeping track of who did what/when for security checks/charging etc)

Page 17: Sinnott Paper

Ease of Use
• For Grids/e-Research to be truly successful
  – have to be made as seamless to access and use as the internet
    • Forget training, education for some (most?) users!
  – have to be based on research pull and not middleware push
  – experiences in various projects and across whole e-Science programme have shown that users don’t like digital certificates

Page 18: Sinnott Paper

User Oriented Security
• A _ _ _
  – Federated Authentication, e.g. through Shibboleth

(Figure: Shibboleth log-in flow. Components: User; Service provider (Web site/e-Journal); W.A.Y.F.; Federation; Identity Provider; Home Institution; AuthN; LDAP)

1. User points browser at Grid resource/portal (or non-Grid resource)
2. Shibboleth redirects user to W.A.Y.F. service
3. User selects their home institution
4. Home site authenticates user
5. User accesses resource

Log-in once and roam
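The numbered log-in steps on this slide can be mimicked with a toy simulation. This is a sketch under stated assumptions, not the Shibboleth API or wire protocol (real deployments exchange signed SAML assertions over HTTP): every class name, the institution name and the credentials are hypothetical, and a plain dictionary stands in for the LDAP directory.

```python
from dataclasses import dataclass, field


@dataclass
class IdentityProvider:
    """Home institution that checks credentials (a dict stands in for LDAP)."""
    name: str
    directory: dict  # username -> password; illustrative only

    def authenticate(self, username, password):
        """Step 4: return an assertion on success, None on failure."""
        if self.directory.get(username) == password:
            return {"user": username, "issuer": self.name}
        return None


@dataclass
class WayfService:
    """'Where Are You From' service: maps an institution to its identity provider."""
    providers: dict

    def lookup(self, institution):
        return self.providers[institution]


@dataclass
class ServiceProvider:
    """Resource that only accepts assertions issued by federation members."""
    trusted_issuers: set
    sessions: set = field(default_factory=set)

    def accept(self, assertion):
        """Step 5: grant access if the assertion comes from a trusted issuer."""
        if assertion and assertion["issuer"] in self.trusted_issuers:
            self.sessions.add(assertion["user"])
            return True
        return False


def authenticate_via_federation(wayf, institution, username, password):
    # Steps 1-2: the user reaches a resource and is redirected to the WAYF.
    idp = wayf.lookup(institution)               # Step 3: user picks home institution
    return idp.authenticate(username, password)  # Step 4: home site authenticates


home = IdentityProvider("example-university", {"alice": "secret"})
wayf = WayfService({"example-university": home})
portal = ServiceProvider(trusted_issuers={"example-university"})
journal = ServiceProvider(trusted_issuers={"example-university"})

# Log in once, then roam: the same assertion is accepted by both providers.
assertion = authenticate_via_federation(wayf, "example-university", "alice", "secret")
print(portal.accept(assertion), journal.accept(assertion))
```

The service providers never see the password; they only decide whether to trust the issuer of the assertion.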

Page 19: Sinnott Paper

_ A _ _
• Authorisation
  – Defining what they can do, and defining and enforcing the rules
    • Each site will have different rules/regulations
  – Also known as Virtual Organisations (VO)
    • Collection of distributed resources shared by collection of users from one or more organizations, typically to work on a common research goal
      – Provides conceptual framework for rules and regulations for resources to be offered/shared between VO institutions/members
      – Different domains place greater/lesser emphasis on expression and enforcement of rules and regulations (policies)

(Figure: a VO spanning Org1 … Orgn, each contributing {Resources} and {Users})
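The VO idea, a shared pool of resources and users drawn from several organisations and governed by rules, can be captured in a small data structure. The sketch below is illustrative only: the class and method names are hypothetical, and the policy format (resource name mapped to the set of users allowed to use it) is an assumption chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Organisation:
    """One member organisation contributing users and resources."""
    name: str
    users: set = field(default_factory=set)
    resources: set = field(default_factory=set)


@dataclass
class VirtualOrganisation:
    """Users and resources pooled from member organisations, plus access rules."""
    name: str
    members: list = field(default_factory=list)
    # Assumed policy format: resource name -> set of users allowed to use it.
    policy: dict = field(default_factory=dict)

    def users(self):
        return set().union(*(org.users for org in self.members))

    def resources(self):
        return set().union(*(org.resources for org in self.members))

    def grant(self, resource, user):
        """Record a rule; both the resource and the user must belong to the VO."""
        if resource not in self.resources() or user not in self.users():
            raise ValueError("resource and user must both belong to the VO")
        self.policy.setdefault(resource, set()).add(user)

    def is_allowed(self, user, resource):
        """Deny by default: access needs an explicit rule."""
        return user in self.policy.get(resource, set())


org1 = Organisation("Org1", users={"alice"}, resources={"cluster"})
orgn = Organisation("Orgn", users={"bob"}, resources={"database"})
vo = VirtualOrganisation("example-vo", members=[org1, orgn])

vo.grant("database", "alice")  # Orgn lets a user from Org1 use its database
print(vo.is_allowed("alice", "database"), vo.is_allowed("bob", "cluster"))
```

Membership of the VO alone grants nothing here; each resource needs an explicit rule, which reflects the point that every site keeps its own rules and regulations.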

Page 20: Sinnott Paper

Privileges, Resources, Access Control and Trust

(Figure: Shibboleth-protected Grid portal. Components: User; Service provider; Shib Frontend; Grid Portal; Grid Application; W.A.Y.F.; Federation; Identity Provider; Home Institution; AuthN; LDAP; AuthZ)

1. User points browser at Grid resource/portal
2. Shibboleth redirects user to W.A.Y.F. service
3. User selects their home institution
4. Home site authenticates user and pushes attributes to the service provider
5. Pass authentication info and attributes to authZ function
6. Make final AuthZ decision
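Steps 5 and 6 on this slide, passing the authentication result and attributes to an authorisation function that makes the final decision, might look like the sketch below. It is a minimal illustration under assumptions: the `authorise` function, the `POLICY` table, the `role` attribute and the action names are all hypothetical and do not come from Shibboleth or any Grid middleware.

```python
# Hypothetical policy: portal action -> attribute values that permit it.
POLICY = {
    "submit_job": {"role": {"researcher", "admin"}},
    "view_results": {"role": {"researcher", "admin", "student"}},
}


def authorise(authenticated, attributes, action, policy=POLICY):
    """Final AuthZ decision at the service provider (deny by default).

    Access is granted only if the user was authenticated by the home site
    and every attribute the policy requires carries a permitted value.
    """
    if not authenticated or action not in policy:
        return False
    return all(attributes.get(name) in allowed
               for name, allowed in policy[action].items())


# Attributes as they might be pushed by the home institution in step 4.
student = {"role": "student"}
print(authorise(True, student, "view_results"))  # allowed by the policy
print(authorise(True, student, "submit_job"))    # role not permitted
```

Keeping the decision in one function at the service provider means the home institution only has to vouch for who the user is and which attributes they hold; what those attributes permit stays under local control.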