exploring ‘workspaces’

Post on 24-Feb-2016

Exploring ‘Workspaces’

Tom Visser, SARA compute and networking services, Amsterdam

Garching Workshop 21st September 2010

• Background
• Overview of cases
• Technical possibilities
• Opportunities and risks
• Expected results
• Proposed approach

The CLARIN-NL connection
• Seeking to create an infrastructure for language resources
• Providing access to tools and technologies
• CLARIN-NL and BiG Grid are exploring possibilities
• The WHOLE pipeline
  – Creating
  – Curation
  – Collecting
  – DO SCIENCE
  – Depositing

Already
• SARA has developed a client implementation of a Persistent Identifier Service (HANDLE) and has become an EPIC consortium member
• An instance of the service is currently hosted at SARA
• BiG Grid / SURFNET pilot with a short-lived credential service
• Activities with Computational Linguistics (e.g. Named Entity Recognition) & the forthcoming Computational Humanities institute (KNAW)
• Series of workshops to find common ground between BiG Grid and the CLARIN infrastructure
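As a rough illustration of what a HANDLE-style persistent identifier looks like to a client, the sketch below splits a handle into its prefix and suffix and builds the URLs under which the public Handle.Net proxy (hdl.handle.net) resolves it. This is not SARA's client implementation. The prefix `12345` and the suffix `workspace-demo-001` are hypothetical, and the function names are made up for this example. No network call is made.

```python
from typing import Tuple
from urllib.parse import quote

# Public Handle.Net proxy; an EPIC or SARA instance would use its own host.
HANDLE_PROXY = "https://hdl.handle.net"


def split_handle(handle: str) -> Tuple[str, str]:
    """Split 'prefix/suffix' at the first slash (suffixes may contain slashes)."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a handle of the form prefix/suffix: {handle!r}")
    return prefix, suffix


def resolver_url(handle: str) -> str:
    """URL that redirects a browser to the object the handle points at."""
    prefix, suffix = split_handle(handle)
    return f"{HANDLE_PROXY}/{quote(prefix)}/{quote(suffix, safe='/')}"


def rest_api_url(handle: str) -> str:
    """URL of the proxy's JSON view of the handle record."""
    prefix, suffix = split_handle(handle)
    return f"{HANDLE_PROXY}/api/handles/{quote(prefix)}/{quote(suffix, safe='/')}"


if __name__ == "__main__":
    demo = "12345/workspace-demo-001"  # hypothetical handle, not a real record
    print(resolver_url(demo))
    print(rest_api_url(demo))
```

The identifier stays stable while the location stored in the handle record can change, which is what makes it usable for objects that move between workspaces and archives.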

Questions of today
• When is a user workspace service?
• Why do we need user workspaces?
• What are their characteristics in a distributed environment?
• How do we support processing chains in distributed environments driven by community environments?
• Are there generic frameworks for the execution of distributed processing chains and deployment of web-services?

Core problems
• Where to store
• How to store
• How to access
• How to foster collaboration amongst people
• How to support: Data discovery, exploration and exploitation
• How to realize such a service
• What SLA / service description / responsibilities

What it should be
• A temporary storage place (days, weeks, years)
  – Global home / global scratch
  – A ‘logical mount point’
• Accessible by web services
• Meaningfully accessible by a human
• Autonomy to communities
  – Instantiate
  – Content
  – Control
• Identifiable
• Store digital objects and metadata
• Journaling (register interactions)
• Create
• Read
• Write
• Update
• Grant access to (Authorization)
• List contents
• Search contents
  – Adopting & offering known best practices and services in the ecosystem
• …
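The operations listed above can be sketched as a small in-memory interface. This is only an illustration of the listed characteristics (identifiable, objects plus metadata, authorization, journaling), not a proposed design from the talk. The class and method names are hypothetical, and a real service would sit behind web services and one of the storage back ends under consideration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Set, Tuple


@dataclass
class DigitalObject:
    data: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


class Workspace:
    """In-memory stand-in for one community workspace (hypothetical API)."""

    def __init__(self, identifier: str, owner: str) -> None:
        self.identifier = identifier  # the workspace itself is identifiable
        self._objects: Dict[str, DigitalObject] = {}
        self._acl: Set[str] = {owner}  # users allowed to operate on it
        self.journal: List[Tuple[str, str, str, str]] = []

    def _record(self, user: str, action: str, target: str) -> None:
        # Check authorization, then journal the attempt (even if it fails later).
        if user not in self._acl:
            raise PermissionError(f"{user} may not {action} in {self.identifier}")
        stamp = datetime.now(timezone.utc).isoformat()
        self.journal.append((stamp, user, action, target))

    def create(self, user: str, name: str, data: bytes, **metadata: str) -> None:
        self._record(user, "create", name)
        if name in self._objects:
            raise FileExistsError(name)
        self._objects[name] = DigitalObject(data, dict(metadata))

    def read(self, user: str, name: str) -> bytes:
        self._record(user, "read", name)
        return self._objects[name].data

    def write(self, user: str, name: str, data: bytes) -> None:
        # Replaces the content of an existing object.
        self._record(user, "write", name)
        self._objects[name].data = data

    def update(self, user: str, name: str, **metadata: str) -> None:
        # Merges new metadata into an existing object.
        self._record(user, "update", name)
        self._objects[name].metadata.update(metadata)

    def grant(self, user: str, grantee: str) -> None:
        self._record(user, "grant", grantee)
        self._acl.add(grantee)

    def list_contents(self, user: str) -> List[str]:
        self._record(user, "list", "*")
        return sorted(self._objects)

    def search(self, user: str, key: str, value: str) -> List[str]:
        self._record(user, "search", f"{key}={value}")
        return sorted(n for n, o in self._objects.items()
                      if o.metadata.get(key) == value)


if __name__ == "__main__":
    ws = Workspace("ws-demo", owner="alice")
    ws.create("alice", "minutes.txt", b"agenda", language="nl")
    ws.grant("alice", "bob")
    print(ws.read("bob", "minutes.txt"))
    print(ws.search("bob", "language", "nl"))
    print([entry[2] for entry in ws.journal])
```

Routing every operation through one `_record` step keeps authorization and journaling in a single place, so the journal covers all interactions by construction.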

Considered technical possibilities

• iRODS
• Cloud platform (SNIA/CDMI)
• HADOOP implementation
• AMAZON S3 / OpenCloud / Azure /

Risks and opportunities
• Creating something that is only generic - specific
• Looking uphill, but what will you know when you’ve climbed the hill
• Knowledge of the community
• Epistemological problems
• Bootstrapping
• Trust
• Process focus: we are starting a small scale pilot within 1 month, short iterations, keeping everyone involved.

Approach: BiG Grid and Dutch partners

• Many interesting addressable cases
  – Keyword extraction from the Dutch audio and film institute
  – MPI video repository annotations
  – City of Den Haag government proceedings: minutes and video alignment (feature extraction)
  – OCR & machine learning on Dutch handwriting
• Expected results
  – Common understanding of a workspace service
  – Bootstrap implementation vertically crossing all layers

• When is a user workspace service?
  – When it is used and has become an indispensable tool
• Why do we need user workspaces?
  – To be able to flexibly work with data
  – Initiate collaborations
  – Have a trustable storage resource available
• What are their characteristics in a distributed environment?
  – Clear core functionality, many service providers, integration with identity providers
• How do we support processing chains in distributed environments driven by community environments?
  – By having open, known, and easily accessible services
• Are there generic frameworks for the execution of distributed processing chains and deployment of web-services?
  – Yes!
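The slides answer "Yes!" without naming a framework, so the sketch below only illustrates the idea of a processing chain: a sequence of small, openly callable steps applied to a record, with each step noted as provenance. It runs locally in one process. The step names and the toy keyword rule (words longer than six characters) are assumptions for illustration, loosely modelled on the keyword-extraction case mentioned earlier.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Step = Callable[[Record], Record]


def normalise(record: Record) -> Record:
    """Collapse whitespace and lowercase the text."""
    return {**record, "text": " ".join(record["text"].split()).lower()}


def extract_keywords(record: Record) -> Record:
    """Toy keyword rule: keep distinct words longer than six characters."""
    words = [w.strip(".,;:!?") for w in record["text"].split()]
    return {**record, "keywords": sorted({w for w in words if len(w) > 6})}


def run_chain(record: Record, steps: List[Step]) -> Record:
    """Apply each step in order and append its name to the provenance list."""
    for step in steps:
        record = step(record)
        done = record.get("provenance", []) + [step.__name__]
        record = {**record, "provenance": done}
    return record


if __name__ == "__main__":
    source = {"text": "The  Council discussed   infrastructure and archives."}
    result = run_chain(source, [normalise, extract_keywords])
    print(result["keywords"])
    print(result["provenance"])
```

In a distributed setting each step would be a web service call and the record would live in a workspace, but the contract between steps (record in, record out) stays the same.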

THANK YOU
