Release of AGRIS 2.0: searching agricultural bibliographic data
Upload: aims-agricultural-information-management-standards-fao-of-the-un
Post on 27-Jan-2015
DESCRIPTION
The objective of this presentation is to introduce the new AGRIS website (released on 2 December 2013) and its functions. The new website merges the AGRIS system with OpenAGRIS and provides a simple access point for searching bibliographic data in the AGRIS database.
Presentations by AIMS are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Fabrizio Celli, Johannes Keizer
AGRIS – exploiting bibliographic records to create rich Linked Open Data pages
AIMS Webinar
http://agris.fao.org
Outline
AGRIS network and dataflow
Data Consumption
• Centralization
• Interlinking
Provenance
AGRIS
The AGRIS database is a collection of more than 7.7 million bibliographic records in the agricultural domain
They are enhanced by the AGROVOC thesaurus, which is extensively used by cataloguers to enrich data indexing in agricultural information systems
AGRIS is an RDF-aware system, a mashup application that allows users to query the AGRIS-RDF content, interlinking all records to external sources of information
7 million bibliographic records become 7 million mashup pages!
AGRIS data consumption
Centralization: bibliographic references in the AGRIS domain (agriculture, forestry, animal husbandry, aquatic sciences and fisheries, and human nutrition)
Interlinking: other kinds of information related to the AGRIS domain (statistics, maps, country profiles, etc.)
Data consuming
AGRIS consumes metadata provided by the community and publishes it as open data
The metadata is captured either by pulling: AGRIS harvests data from clients (e.g. aggregators, institutional repositories) using protocols such as OAI-PMH
or by pushing: clients (e.g. national libraries or journal publishers) send their data to AGRIS
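The pull model can be sketched with the two pieces every OAI-PMH harvester needs: building a ListRecords request and reading the resumptionToken that tells the harvester whether more pages follow. This is a minimal illustration, not the AGRIS harvester; the repository URL in the usage note is a placeholder and the choice of oai_dc as metadata format is an assumption.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: send it without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def next_token(response_xml):
    """Return the resumptionToken of a response, or None when the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()
```

A harvester would call list_records_url("http://repo.example.org/oai") (a placeholder address), fetch the page, and repeat with the value returned by next_token until it is None.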
Interoperability: accept any input format!
AGRIS data flow
Centralization: Data processing
A random sample of the metadata is checked manually for inconsistencies or recurring semantic errors
Input format is mapped to AGRIS RDF
Metadata are converted to AGRIS RDF; the AgroTagger is run when AGROVOC keywords are not available
Before adding metadata to the triplestore and indexing them in the Solr index, duplicates are detected and managed, as the same record may be indexed in multiple collections or be duplicated in the same repository
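The slides do not say how duplicates are detected. As one possible illustration, the sketch below compares records on a normalized title plus publication year; this heuristic and the field names "title" and "year" are assumptions, not the AGRIS procedure.

```python
import re
import unicodedata


def dedup_key(record):
    """Build a comparison key from title and year (an assumed heuristic)."""
    title = unicodedata.normalize("NFKD", record.get("title", ""))
    # Drop accents, lowercase, and collapse punctuation and whitespace.
    title = "".join(ch for ch in title if not unicodedata.combining(ch))
    title = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
    return (title, str(record.get("year", "")))


def split_duplicates(records):
    """Return (unique, duplicates), keeping the first record seen for each key."""
    seen = set()
    unique, duplicates = [], []
    for record in records:
        key = dedup_key(record)
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
            unique.append(record)
    return unique, duplicates
```

Keeping the duplicates in a separate list rather than discarding them leaves room for the "managed" part of the step, for example merging fields or recording which collections hold the same record.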
AgroTagger
Not yet implemented
Maui is named after the Polynesian mythological hero and demi-god, who would transform himself into different kinds of birds to perform many of his exploits.
RDF-ization
bibo:Article
bibo:abstract
bibo:doi
bibo:isbn
bibo:presentedAt -> bibo:Conference -> dct:title
bibo:uri
dct:alternative
dct:creator -> foaf:Organization -> foaf:name
dct:creator -> foaf:Person -> foaf:name
dct:dateSubmitted
dct:description
dct:extent
dct:identifier
dct:language
dct:isPartOf
dct:issued
dct:publisher -> foaf:Organization -> foaf:name
dct:source
dct:subject
dct:title
dct:type
dct:rights
Choice of vocabularies and mapping!
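To make the mapping step concrete, here is a minimal sketch that turns a flat record into N-Triples using a few of the BIBO and Dublin Core properties listed above. The input field names and the record URI are hypothetical, and only literal-valued properties are covered; the actual AGRIS converter is not shown in the slides.

```python
BIBO = "http://purl.org/ontology/bibo/"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical input field names mapped to properties from the slide.
FIELD_TO_PROPERTY = {
    "title": DCT + "title",
    "abstract": BIBO + "abstract",
    "doi": BIBO + "doi",
    "language": DCT + "language",
    "issued": DCT + "issued",
}


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(record_uri, record):
    """Return the N-Triples lines describing one bibliographic record."""
    lines = [f"<{record_uri}> <{RDF_TYPE}> <{BIBO}Article> ."]
    for field, prop in FIELD_TO_PROPERTY.items():
        value = record.get(field)
        if value:
            lines.append(f'<{record_uri}> <{prop}> "{escape_literal(value)}" .')
    return lines
```

Supporting a new input format then only requires a new field-to-property table, which is one way to read "accept any input format".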
RDF/XML snapshot
Provenance
Each AGRIS record has an identifier (ARN), which has a predefined structure and contains information on the data source together with the bibliographic record’s year of creation
“IT 2008 0 00091” refers to a record created in 2008 from a specific AGRIS data provider in Italy, whose progressive number is 91
Information about data providers is stored in the CIARD RING and triplified in the AGRIS centers dataset (each data provider has its own unique URI)
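The ARN structure can be illustrated with a small parser. The pattern below is inferred from the single example “IT 2008 0 00091” (two-letter country code, four-digit year, one character, five-digit progressive number); treating the single middle character as a code for the provider within the country is an assumption.

```python
import re
from typing import NamedTuple


class Arn(NamedTuple):
    country: str  # two-letter country code of the data provider
    year: int     # year the record was created
    centre: str   # single character, assumed to identify the provider
    number: int   # progressive number of the record


# Accepts the spaced form shown on the slide and the same digits without spaces.
ARN_PATTERN = re.compile(r"^([A-Z]{2})\s*(\d{4})\s*([A-Z0-9])\s*(\d{5})$")


def parse_arn(arn):
    """Split an ARN into its provenance components."""
    match = ARN_PATTERN.match(arn.strip())
    if match is None:
        raise ValueError(f"not a recognised ARN: {arn!r}")
    country, year, centre, number = match.groups()
    return Arn(country, int(year), centre, int(number))


print(parse_arn("IT 2008 0 00091"))
# Arn(country='IT', year=2008, centre='0', number=91)
```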
Storage system
AGRIS RDF is stored in Malaysia, at MIMOS (http://www.mimos.my/ )
Triples are managed by the AllegroGraph triplestore (http://www.franz.com/agraph/allegrograph/)
A 90 GB machine is dedicated to the triplestore. Some months ago we used a 32 GB machine, but AllegroGraph went down at least once a month (pending processes, memory problems)
We have run tests with OWLIM and could move to that triplestore, or find another kind of solution
Interlinking
AGROVOC is the backbone
Align AGROVOC to other thesauri (skos:exactMatch, skos:closeMatch)
Discover SPARQL endpoints
Discover web services and APIs
Write the code and interlink!
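As a sketch of the alignment step, the function below builds a SPARQL query that lists the skos:exactMatch and skos:closeMatch targets of one concept. The concept URI is supplied by the caller; which endpoint the query is sent to is left open, since the slides do not name one for this step.

```python
def alignment_query(concept_uri):
    """Build a SPARQL query listing exact and close matches of a SKOS concept."""
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?relation ?target WHERE {\n"
        f"  <{concept_uri}> ?relation ?target .\n"
        "  FILTER (?relation IN (skos:exactMatch, skos:closeMatch))\n"
        "}"
    )
```

Each ?target returned is a concept in another thesaurus, which is what lets a keyword on an AGRIS record be used as a key into an external dataset.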
The IFPRI case
A user queries the system
AGRIS record with AGROVOC keywords
At least one AGROVOC keyword is a country name
The system queries the IFPRI SPARQL endpoint (http://data.ifpri.org/sparql/) to retrieve the Global Hunger Index (GHI) and the child mortality rate for that country
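The last step of this flow can be sketched as a standard SPARQL Protocol request followed by parsing of the JSON results format. The vocabulary used by the IFPRI dataset is not given in the slides, so the ex: prefix and property names in the query template are hypothetical placeholders; only the request and parsing mechanics are meant to carry over.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical vocabulary: the real IFPRI property names are not in the slides.
# The country name is inserted as-is; a real client should escape it first.
QUERY_TEMPLATE = """
PREFIX ex: <http://example.org/indicators#>
SELECT ?ghi ?childMortality WHERE {{
  ?country ex:name "{country}" ;
           ex:globalHungerIndex ?ghi ;
           ex:childMortalityRate ?childMortality .
}}
"""


def sparql_request(endpoint, query):
    """Prepare (without sending) a SPARQL Protocol GET request asking for JSON."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(results_json):
    """Flatten a SPARQL JSON result set into a list of {variable: value} dicts."""
    data = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in data["results"]["bindings"]
    ]
```

The prepared request would be passed to urllib.request.urlopen and the response body to bindings_to_rows; the resulting rows hold the values shown next to the record on the mashup page.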
Some numbers (02/12/2013)
7,636,069 bibliographic records
187,238,716 triples in the AGRIS records dataset (http://202.45.142.113:10035/repositories/agris)
372,462 triples in the AGRIS serials dataset (http://202.45.142.113:10035/repositories/jad)
11,414 triples in the AGRIS centers dataset (http://202.45.142.113:10035/repositories/centers)
AGRIS RDF RECORD
AGROVOC
Thank you!