
Page 1: Slide 1 ASPIRE STAKEHOLDER WORKSHOP Brussels Thursday 13 September 2012  Rosette Vandenbroucke HPC Coordinator rosette.vandenbroucke@vub.ac.be

Slide 1

ASPIRE STAKEHOLDER WORKSHOP

Brussels

Thursday 13 September 2012

www.terena.org/aspire

Rosette Vandenbroucke

HPC Coordinator

rosette.vandenbroucke@vub.ac.be

Middleware and Managing Data and Knowledge in a Data-rich World


Slide 2

ASPIRE Data Panel

› Gill Davies - Online music performances
› Antonella Fresa - DCH
› Jens Jensen - HEP
› Andrew Lyall - Biomed
› Roshene McCool - Astronomy
› Rosette Vandenbroucke


Slide 3

Work method

› Per discipline: List data creation/handling and associated requirements now and in the next 10 years

› Select aspects that are important for the represented disciplines

› Describe important future data and data handling expectations and common requirements

› Formulate recommendations


Slide 4

Aspects and type of data not covered

› Many more data aspects exist
› Not possible to handle them all

› Other scientific disciplines
› Twitter and blog data
› Social sites data
› Logs of mobile phone use
› ...


Slide 5

Data aspects considered

› Networking: bandwidth requirements, storage, mirrors, preservation, disaster recovery, costs
› Middleware
› Metadata
› AAI
› Data policies: availability, replication
› Data origin: authentication of source, integrity


Slide 6

Networking: Bandwidth (1)

› 3 models observed:
  - SKA/HEP model: tier structure
  - HG-DCH model: data transfer between large centers/depositories; very large number of “small” users
  - Musical Performance model: small amount of data; network latency important


Slide 7

Networking: Bandwidth (2)

› Shared general concern: network links below required bandwidth
  - too expensive
  - network link not available where needed
  - no permission to connect to the national research network

Cost issues:
  - bandwidth now available for free may incur tariffs in the future
  - very high bandwidth and/or dedicated lightpath requirements can lead to high costs
  - some regions/countries have more expensive connections
  - last mile


Slide 8

Networking: Storage, mirrors, preservation, disaster recovery

› Not all data can be stored or preserved
› Preservation schemes under study
› Replication of data sometimes inherent in the data structure
› Disaster recovery: not often explicitly addressed


Slide 9

Middleware

› Middleware very much discipline specific
› Expectation for generic solutions


Slide 10

Metadata

› Very important
› Used by all

› Many standards exist!

› Definition and usage per discipline
› No consideration for cross-disciplinary use


Slide 11

AAI

› Everyone agrees about the need for a globally accepted AAI system

› No consensus on how to do it

› e-IRG has made recommendations for such an AAI system

› Federations of authentication and eduGAIN are an excellent move in that direction


Slide 12

Data Policies

› Availability of data
› Policies on data access are discipline specific
› General tendency to move to “open data”

› “Open data” is not always possible, due to:
  - the costs of generating the data
  - the costs of storage and curation
  - data confidentiality


Slide 13

Data origin

› Integrity and source authentication are important
› No general mechanism for data-source authentication
› Metadata can help
› In some disciplines data is only relevant to experts, so considered as quite safe
› Authentication by a unique digital signature at creation

› Source authentication can add costs
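
The integrity and source-authentication points above can be illustrated with a minimal Python sketch. It is not a mechanism proposed by the panel. It assumes the data source holds a secret key and uses a keyed hash (HMAC-SHA256) as a simple stand-in for a digital signature; a true digital signature would use a public/private key pair so that anyone can verify without sharing a secret. The function names and the sample key are hypothetical.

```python
import hashlib
import hmac


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as hex (integrity check only)."""
    return hashlib.sha256(data).hexdigest()


def tag_at_creation(data: bytes, source_key: bytes) -> str:
    """Compute an HMAC-SHA256 tag that binds the data to the holder of source_key.

    Assumption: source_key is a secret known only to the data source and
    to the verifier. This is a symmetric stand-in for a digital signature.
    """
    return hmac.new(source_key, data, hashlib.sha256).hexdigest()


def verify(data: bytes, source_key: bytes, tag: str) -> bool:
    """Check that the data is unchanged and was tagged with source_key."""
    expected = tag_at_creation(data, source_key)
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    key = b"hypothetical-instrument-key"   # illustrative value only
    record = b"observation 42: raw samples"
    tag = tag_at_creation(record, key)
    print(verify(record, key, tag))                # unmodified data
    print(verify(record + b" (edited)", key, tag)) # tampered data
```

The tag has to be computed and stored for every dataset at creation time and the keys have to be managed, which is one concrete way in which source authentication can add costs.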


Slide 14

DATA

› GROWING in every discipline

putting higher requirements on all aspects we have looked at


Slide 15

Recommendation 1: Network related

- Collaboration between user communities and NRENs, GÉANT, ... to understand network requirements associated with the data deluge

- Adequate network services made available in a timely and economically viable way

- All important network parameters have to be studied (speed, throughput, privacy, persistence of connection, cost, ...)


Slide 16

Recommendation 2: Standardisation of datasets and metadata

› Define standardised data sets:

› To profit from economy of scale for cross-discipline middleware

› Define standardised data sets, metadata, middleware and applications

› For easier accessibility of data

› Adopt a common metadata standard that takes into account multi-disciplinary use of data


Slide 17

Recommendation 3: AAI

› Adopt a globally recognised AAI based on standards for the exchange of assertions and security tokens that can be used by all (user communities, e-infrastructure providers, ICT providers, ...)


Slide 18

Recommendation 4: Data origin

› Create common mechanisms and procedures for all disciplines to certify and authenticate data.


Slide 19

Recommendation 5: Preservation, curation

› Facilitate collaboration between disciplines to create common policies, procedures and tools to assist in the curation and preservation of data.
