Large-scale (meta)Data Aggregators & Infrastructure Requirements: the case of agriculture
Nikos Manouselis, Agro-Know Technologies & ARIADNE Foundation
[email protected]
@eAGE 2012, Dubai, 13/12/12
• Publications, theses, reports, other grey literature
• Educational material and content, courseware
• Primary data:
– Structured, e.g. datasets as tables
– Digitized: images, videos, etc.
• Secondary data (elaborations, e.g. a dendrogram)
• Provenance information, incl. authors, their organizations and projects
• Experimental protocols & methods
• Social data, tags, ratings, etc.
(agricultural) research data
• stats
• gene banks
• gis data
• blogs
• journals
• open archives
• raw data
• technologies
• learning objects
• ………..
educators’ view
• stats
• gene banks
• gis data
• blogs
• journals
• open archives
• raw data
• technologies
• learning objects
• ………..
researchers’ view
• stats
• gene banks
• gis data
• blogs
• journals
• open archives
• raw data
• technologies
• learning objects
• ………..
practitioners’ view
• aim is: promoting data sharing and consumption related to any research activity aimed at improving productivity and quality of crops
ICT for computing, connectivity, storage, instrumentation
data infrastructure for agriculture
[metadata record fields: Publisher, Date, Catalog, Subject, ID, Author, Title]
we actually share metadata
e.g. an educational resource
…metadata reflect the context
…sometimes, data also included
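A minimal sketch of what such a shared record could look like. The field names (Publisher, Date, Catalog, Subject, ID, Author, Title) come from the slide; the types, the `data_url` field and all example values are assumptions made for illustration only.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class MetadataRecord:
    """Illustrative record: field names follow the slide, types are assumed."""
    id: str
    title: str
    author: str
    subject: List[str] = field(default_factory=list)
    publisher: str = ""
    date: str = ""      # kept as a string; date formats vary by provider
    catalog: str = ""   # the catalog/collection the record belongs to
    data_url: Optional[str] = None  # hypothetical: set when the data itself is also shared


# Hypothetical educational resource; every value below is made up.
record = MetadataRecord(
    id="example:0001",
    title="Introduction to composting",
    author="A. Author",
    subject=["organic agriculture", "composting"],
    publisher="Example University",
    date="2012-12-13",
    catalog="Example educational collection",
)

# What gets exchanged is the description, not the resource itself.
shared = asdict(record)
```

Leaving `data_url` empty reflects the common case where only the description travels; setting it covers the "sometimes, data also included" case.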
metadata aggregations
• concerns viewing merged collections of metadata records from different sources
• useful when access is needed to specific supersets or subsets of networked collections
– records actually stored at the aggregator
– or queries distributed to virtually aggregated collections
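The two options (records stored at the aggregator vs. queries distributed to the sources) can be sketched as follows. This is a toy illustration under stated assumptions: the class names, the in-memory "repositories" and the title-keyword search are all hypothetical stand-ins for real remote collections.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Source = Callable[[str], List[Record]]  # a provider that answers a keyword query


def make_source(records: List[Record]) -> Source:
    """Wrap an in-memory list as a searchable provider (stand-in for a remote repository)."""
    def search(keyword: str) -> List[Record]:
        return [r for r in records if keyword.lower() in r["title"].lower()]
    return search


class StoredAggregator:
    """Records are copied to, and searched at, the aggregator."""
    def __init__(self) -> None:
        self.records: List[Record] = []

    def ingest(self, records: List[Record]) -> None:
        self.records.extend(records)

    def search(self, keyword: str) -> List[Record]:
        return [r for r in self.records if keyword.lower() in r["title"].lower()]


class FederatedAggregator:
    """Nothing is stored; each query is distributed to every source."""
    def __init__(self, sources: List[Source]) -> None:
        self.sources = sources

    def search(self, keyword: str) -> List[Record]:
        results: List[Record] = []
        for source in self.sources:
            results.extend(source(keyword))
        return results


# Made-up example collections.
repo_a = [{"id": "a:1", "title": "Soil health basics"}]
repo_b = [{"id": "b:1", "title": "Soil sampling methods"},
          {"id": "b:2", "title": "Crop rotation"}]

stored = StoredAggregator()
stored.ingest(repo_a)
stored.ingest(repo_b)

federated = FederatedAggregator([make_source(repo_a), make_source(repo_b)])
```

Both aggregators return the same merged view here; the difference is where the records live and who pays the cost at query time.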
typically look like this
Ternier et al., 2010
typical problem: computing
typical problem: hosting
an ideal scenario
[diagram: three kinds of data provider feeding a (meta)data aggregator]
• Data provider in need of hosting & storage of small-scale CMS: sets up own CMS instance
• Data provider in need of large-scale hosting & replication CMS: requests space/accounts in large-scale CMS
• Data provider hosting CMS at own or external/commercial infrastructure: interested to expose (meta)data to e-infrastructure
• each provider: registers as data source; shares (meta)data, e.g. through OAI-PMH
• component labels: hosted over cloud (twice), computed over grid
• indexed & available through CIARD RING
• (META)DATA AGGREGATOR supported by scientific gateway; computed & hosted over agINFRA grid/cloud
• further component labels: computed over grid (twice), computed over grid & hosted over cloud (twice)
• ……
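For readers unfamiliar with OAI-PMH, here is a small sketch of the harvesting side: building a `ListRecords` request and reading identifiers, titles and the resumption token from a response. The verb, argument names and XML namespaces are those of OAI-PMH 2.0; the endpoint `http://example.org/oai`, the helper names and the sample response are made up, and no network call is performed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    if resumption_token:
        # resumptionToken is exclusive: it is sent with the verb only
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        records.append({"identifier": identifier, "title": title})
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, (token or None)


# Hand-written sample response (hypothetical repository and record).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Soil health basics</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_list_records(SAMPLE)
next_url = list_records_url("http://example.org/oai", resumption_token=token)
```

A real harvester would fetch `next_url` and repeat until no resumption token is returned, which is part of why harvesting takes time and resources.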
• it's all about efficient metadata management
• storage issues: where components are hosted, how metadata aggregations & their versions are handled/stored, scaling up
• computing issues: harvesting takes time/resources and needs to be invoked often; automatic tagging tasks are demanding
• often recurring, similar workflows are needed (validate, transform, harvest, auto-tag, index)
overall need
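The recurring workflow named above can be sketched as a chain of small steps. This is only an illustration: it takes already-harvested records as input, the step order is an assumption, and the keyword tagger and its three-word vocabulary are hypothetical stand-ins for a real automatic tagging service.

```python
def validate(records):
    """Drop records missing the fields this sketch treats as mandatory."""
    return [r for r in records if r.get("id") and r.get("title")]


def transform(records):
    """Normalise records into a common shape (here: trimmed titles)."""
    return [{**r, "title": str(r["title"]).strip()} for r in records]


def auto_tag(records, vocabulary=("soil", "crop", "water")):
    """Naive keyword tagger; the vocabulary is made up for this example."""
    tagged = []
    for r in records:
        title = r["title"].lower()
        tagged.append({**r, "tags": [t for t in vocabulary if t in title]})
    return tagged


def index(records):
    """Build an inverted index: tag -> list of record ids."""
    idx = {}
    for r in records:
        for tag in r["tags"]:
            idx.setdefault(tag, []).append(r["id"])
    return idx


def run_workflow(harvested):
    """Apply the steps in sequence to a batch of harvested records."""
    return index(auto_tag(transform(validate(harvested))))


# Made-up harvested batch; the second record fails validation (empty id).
harvested = [
    {"id": "a:1", "title": "  Soil health basics "},
    {"id": "", "title": "Record without identifier"},
    {"id": "b:2", "title": "Crop and soil rotation"},
]
result = run_workflow(harvested)
```

Because every data source needs the same chain on every harvest, this kind of batch is a natural unit to schedule on shared computing resources.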
why should you care?
promoting course descriptions
• push course information to various syndication/aggregation sites to allow users to discover them
– OCW search engine (http://www.ocwsearch.com)
– Moodle Hub concept (hub.moodle.org)
including relevant content
• allow course creator/author to find relevant material and resources to enrich course
– Europeana ingestion widget (http://wiki.agroknow.gr/agroknow/index.php/Hack4Europe_2012)
• suggest to learners additional courses and material relevant to what they access
– Eummena’s Moodle Widget (http://www.eummena.org/index.php/labs)
developing more end-user services
• Web portals to support user communities (e.g. thematic, geographical, social, cultural)
– MACE portal (http://portal.mace-project.eu)
– Photodentro Greek school collections portal (http://photodentro.edu.gr)
– VOA3R social platform for researchers (http://voa3r.cc.uah.es)
wrap up
(META)DATA AGGREGATOR
considerations
• easily replicated cloud-hosted software applications (e.g. DSpace instances)
• portal/service owners and software developers to use the infrastructure as a basis
• power up existing data & service networks
interesting: TERENA OER pilot
• interconnecting open educational resource repositories of NRENs
https://confluence.terena.org/pages/viewpage.action?pageId=33751325
interesting: GLOBE
• Global Learning Objects Brokering Exchange Alliance
• http://globe-info.org
thank you!
[email protected]
http://wiki.agroknow.gr
http://ariadne-eu.org