AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Upload: aims-agricultural-information-management-standards

Post on 16-Aug-2015


Page 1: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data

AGROVOC - FAO’s multilingual thesaurus as a building block for linked open data

Gudrun JOHANNSEN 1, Ahsan MORSHED 1, Sachit RAJBHANDARI 1, Armando Stellato 3, Thomas Baker 2, Margherita SINI 1 and Johannes KEIZER 1

1 FAO of the UN, Italy; 2 W3C SKOS working group; 3 Università di Roma “Tor Vergata”

Page 2: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data

Do we need thesauri?

Yes! And why:

Page 3: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data

dr johannes keizer - FAO of the United Nations - knowledge and capacity for development

UN, DPI UNBIS Thesaurus Team, New York, 2010-09-03

Thesauri in the past and now

• Born as tools to assure consistency in the indexing of library collections

• Thesauri were based on “terms”, but terms already represented concepts in a non-explicit way

• Hierarchical and associative relationships represented generic ontological domain knowledge

• Candidate building blocks for the semantic web

Page 4: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


The Linked Data Universe:

http://www.linkeddata.org


Page 5: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


• The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related data.

• Like the web of hypertext, the web of data is constructed with documents on the web. However, unlike the web of hypertext, where links are relationship anchors in hypertext documents written in HTML, for data the links are between arbitrary things described by RDF. The URIs identify any kind of object or concept. But for HTML or RDF, the same expectations apply to make the web grow:

  • Use URIs as names for things.

  • Use HTTP URIs so that people can look up those names.

  • When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).

  • Include links to other URIs, so that they can discover more things.

• Simple. In fact, though, a surprising amount of data isn't linked in 2006, because of problems with one or more of the steps. This article discusses solutions to these problems, details of implementation, and factors affecting choices about how you publish your data.

Source: http://www.w3.org/DesignIssues/LinkedData.html
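The four rules quoted above can be illustrated with a toy sketch. This is not how any real linked data client works: the "web" is an in-memory dictionary standing in for HTTP lookups, and every URI under example.org, as well as the property names, is hypothetical. It only shows the idea that looking up one URI yields links that lead to further things.

```python
from collections import deque

# A toy "web of data": looking up a URI returns triples about it.
# All example.org URIs and the ex:/rdfs: property names are hypothetical;
# in practice each lookup would be an HTTP GET returning RDF.
WEB = {
    "http://example.org/id/maize": [
        ("http://example.org/id/maize", "rdfs:label", "maize"),
        ("http://example.org/id/maize", "ex:hasPest", "http://example.org/id/corn-borer"),
    ],
    "http://example.org/id/corn-borer": [
        ("http://example.org/id/corn-borer", "rdfs:label", "corn borer"),
        ("http://example.org/id/corn-borer", "ex:studiedIn", "http://example.org/doc/report-1"),
    ],
    "http://example.org/doc/report-1": [
        ("http://example.org/doc/report-1", "rdfs:label", "Pest report"),
    ],
}


def look_up(uri):
    """Rule 3: looking up a URI provides useful information (here, triples)."""
    return WEB.get(uri, [])


def explore(start):
    """Rule 4: follow links to other URIs, breadth first, in order of discovery."""
    seen = [start]
    queue = deque([start])
    while queue:
        uri = queue.popleft()
        for _subject, _predicate, obj in look_up(uri):
            # Only objects that are themselves HTTP URIs can be followed.
            if obj.startswith("http://") and obj not in seen:
                seen.append(obj)
                queue.append(obj)
    return seen
```

Starting from the maize URI, `explore` reaches the pest and then the report, which is the "when you have some of it, you can find other, related data" behaviour described in the quote.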

Page 6: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


RDF as a common format for merging data

• http://www.w3.org/2007/Talks/0221-Bangalore-IH/

Page 7: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Page 8: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Finding things related to “genes” across databases

Source: Joanne Luciano, Mitre, and the W3C HCLS IG

Page 9: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Linking Datasets

• Have you seen Thomson Reuters Open Calais Service? ()

• Standard vocabularies can become the glue between different data sets

• (everything linked through http://aims.fao.org/aos/agrovoc?c_2367#concept)

• Our goal: OpenAgro!
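As a rough illustration of a vocabulary acting as "glue", the sketch below groups records from two unrelated data sets by the concept URI they reference. The concept URI is the one quoted on the slide; the records, their field names, and the `link_by_concept` helper are hypothetical and only serve the example.

```python
from collections import defaultdict

# Concept URI quoted on the slide.
CONCEPT = "http://aims.fao.org/aos/agrovoc?c_2367#concept"

# Hypothetical records from two independent data sets. Each stores the
# shared concept URI under a different field name.
library_records = [
    {"id": "doc-1", "title": "Report A", "subject": CONCEPT},
    {"id": "doc-2", "title": "Report B", "subject": "http://example.org/other"},
]
statistics_records = [
    {"id": "stat-9", "series": "Series X", "about": CONCEPT},
]


def link_by_concept(*datasets):
    """Group record ids from several data sets by the concept URI they use.

    Each argument is a (records, key) pair, where key names the field
    holding the concept URI in that data set.
    """
    index = defaultdict(list)
    for records, key in datasets:
        for record in records:
            index[record[key]].append(record["id"])
    return dict(index)


linked = link_by_concept(
    (library_records, "subject"),
    (statistics_records, "about"),
)
```

Here `linked[CONCEPT]` lists a document and a statistical series side by side, even though neither data set knows about the other. The only agreement needed is the shared URI.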

Page 10: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


AgroTagger

• Name/identity determination in unstructured texts

• Using AGROVOC as a controlled vocabulary

• Structured RDF files that can be used to link data

• Developed by IIT Kanpur for AgroPedia Indica

• Prototype under testing with excellent results
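The slides do not describe how AgroTagger works internally. Purely to illustrate what tagging against a controlled vocabulary means, here is a naive dictionary-matching sketch. The label table and the example.org concept URIs are hypothetical, and a real tagger would need far more (stemming, disambiguation, ranking).

```python
import re

# Hypothetical label -> concept URI table. A real table would be derived
# from the labels of a controlled vocabulary such as AGROVOC.
VOCAB = {
    "maize": "http://example.org/concept/maize",
    "corn": "http://example.org/concept/maize",  # alternative label, same concept
    "cow milk": "http://example.org/concept/cow-milk",
}


def tag(text):
    """Return the sorted concept URIs whose labels occur as whole words in text."""
    lowered = text.lower()
    found = set()
    for label, uri in VOCAB.items():
        # \b keeps "corn" from matching inside "popcorn".
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            found.add(uri)
    return sorted(found)
```

Because both "maize" and "corn" resolve to the same concept, a text using either word receives the same tag, which is what makes the resulting annotations usable for linking.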

Page 11: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


http://agropedia.iitk.ac.in/auto_tagger/callable_auto_tagger.php

Page 12: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data

AGROVOC: From a traditional thesaurus to an Agricultural Concept Scheme

A long and winding road

Page 13: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


…from thesaurus to ontologies…

Page 14: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


AGROVOC since 2004

• We wanted to make a semantic web tool from AGROVOC

• There was no standard model available, so we were on our own

• We spent three years discussing the model to which we wanted to convert AGROVOC

…but in the meantime AGROVOC was used and translated into 20 languages

Page 15: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Semantic Relationships

• Concept to Concept: isA (hierarchy), isPestOf, hasPest

• Concept to Term: has_lexicalization (links concepts to their lexical realizations)

• Term to Term: isSynonymOf, isTranslationOf, hasAcronym, hasAbbreviation

• Term to String: hasSpellingVariant, hasSingular
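The three levels listed above (concept, term, string) can be pictured as a small data model. The sketch below is an illustration only and not the AGROVOC OWL model itself: the class names, field names, and example.org URIs are hypothetical, while the relationship names in the comments come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A lexicalization of a concept in one language."""
    text: str
    lang: str
    # Term to String, e.g. hasSpellingVariant: plain strings, not resources.
    spelling_variants: list = field(default_factory=list)


@dataclass
class Concept:
    """A language-independent unit of meaning."""
    uri: str
    # Concept to Term: has_lexicalization.
    lexicalizations: list = field(default_factory=list)
    # Concept to Concept: (relationship name, target concept URI) pairs.
    relations: list = field(default_factory=list)


# Hypothetical URIs and data, for illustration only.
maize = Concept("http://example.org/c_maize")
maize.lexicalizations += [
    Term("maize", "en"),
    Term("corn", "en"),
    Term("maïs", "fr"),
]
maize.relations.append(("hasPest", "http://example.org/c_corn_borer"))


def labels(concept, lang):
    """All lexicalizations of a concept in the given language."""
    return [term.text for term in concept.lexicalizations if term.lang == lang]
```

Keeping terms separate from concepts is what lets one concept carry many labels per language, at the cost of an extra level of indirection.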

Page 16: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


The AGROVOC OWL model

Page 17: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


More Semantics

Thesaurus relationships (UF, NT):

MAIZE
  UF corn
  NT flint maize
  NT popcorn
  NT sweet corn

MILK
  NT Milk Fat
  NT Colostrum
  NT Cow Milk

International Fund for Agricultural Development
  UF IFAD

Refined relationships:

MAIZE
  synonym corn
  superclass-of flint maize
  used-to-make popcorn
  hybridized-into sweet corn

MILK
  ingredient Milk Fat
  ingredient Colostrum
  superclass-of Cow Milk

International Fund for Agricultural Development
  acronym IFAD

Page 18: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


AGROVOC today

• Around 30,000 concepts

• 600,000 labels in around 20 languages

• One-stop shop for terminological knowledge related to agriculture in general

• A knowledge base of related concepts organized in ontological relationships (hierarchical, associative, equivalence)

• Is a concept/term/string based system

• Concepts may be organized in multiple categories

Page 19: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Evaluation of the Process

• The AGROVOC OWL model needs a revision

  • Ontological overcommitment in many class/subclass hierarchies

  • Unnecessary complications through the concept/term/string hierarchy

• Our push of AGROVOC to the Semantic Web had enormous positive effects, among others:

  • From 4 to 20 language versions

  • De facto standard for indexing in many areas

  • More than 2000 downloads in 2009 alone

  • SKOS incorporated all our requirements

Page 20: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


http://www.w3.org/2004/02/skos/

Page 21: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


The AGROVOC SKOS Model

[Diagram: nodes numbered 8171, 1474, 12332 and 6211 with rdf:type SKOS Concept, connected by skos:broader and tied to the Agrovoc ConceptScheme (rdf:type SKOS ConceptScheme) through skos:inScheme and skos:topConceptOf. Two label nodes, :foo and :bar, with rdf:type SKOS Label, are attached through skosxl:prefLabel and skosxl:altLabel and carry the literal forms “maize” and “corn” through skosxl:literalForm.]

Page 22: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


SKOS and SKOS-XL presentation

[Diagram, SKOS presentation: Ex:FAO points through skos:prefLabel and skos:altLabel directly at the literals “FAO”@en and “Food and Agriculture Organization”@en.]

[Diagram, SKOS-XL presentation: Ex:FAO points through skosxl:prefLabel and skosxl:altLabel at two skosxl:Label resources (Ex: full form and Ex: acronym form). Each label resource carries its literal through skosxl:literalForm: “Food and Agriculture Organization”@en and “FAO”@en.]
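The difference between the two presentations can be sketched as plain triples. The namespaces below are the standard SKOS and SKOS-XL ones; the concept URI, the label URI pattern, the helper names, and the choice of which form is preferred are assumptions made for the example, since the slide residue does not settle them.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def skos_labels(concept, pref, alts):
    """Plain SKOS: labels are literals attached directly to the concept."""
    triples = [(concept, SKOS + "prefLabel", pref)]
    triples += [(concept, SKOS + "altLabel", alt) for alt in alts]
    return triples


def skosxl_labels(concept, pref, alts):
    """SKOS-XL: each label is a resource of its own with a literalForm."""
    triples = []
    entries = [("prefLabel", pref)] + [("altLabel", alt) for alt in alts]
    for i, (prop, text) in enumerate(entries):
        label = f"{concept}/label_{i}"  # hypothetical label URI pattern
        triples.append((concept, SKOSXL + prop, label))
        triples.append((label, RDF_TYPE, SKOSXL + "Label"))
        triples.append((label, SKOSXL + "literalForm", text))
    return triples


# Hypothetical concept URI; which form is "preferred" is an assumption.
FAO = "http://example.org/FAO"
plain = skos_labels(FAO, "Food and Agriculture Organization", ["FAO"])
extended = skosxl_labels(FAO, "Food and Agriculture Organization", ["FAO"])
```

With SKOS-XL the labels have URIs, so a further triple such as `(FAO + "/label_1", "ex:isAcronymOf", FAO + "/label_0")` becomes expressible. In plain SKOS there is nothing to attach such a statement to, because literals cannot be subjects.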

Page 23: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


SKOS-XL output

<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/agrovocScheme">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
  <skos:topConceptOf rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
  <literalForm xmlns="http://www.w3.org/2008/05/skos-xl#" xml:lang="en">subjects</literalForm>
  <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
</rdf:Description>
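To show how such output can be read back, the sketch below parses two of the descriptions from the slide with Python's standard XML parser and lists them as triples. The slide shows only a fragment, so the enclosing `rdf:RDF` element with its namespace declarations is an assumption added here to make the text well-formed. A full RDF/XML parser would handle many more constructs than this.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Two descriptions copied from the slide.
FRAGMENT = """
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
  <literalForm xmlns="http://www.w3.org/2008/05/skos-xl#" xml:lang="en">subjects</literalForm>
  <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
</rdf:Description>
"""

# Assumed wrapper: the root element is not shown on the slide.
DOC = f'<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">{FRAGMENT}</rdf:RDF>'


def triples(xml_text):
    """List (subject, predicate, object) for each rdf:Description child.

    Predicates are returned in ElementTree's "{namespace}local" notation.
    """
    root = ET.fromstring(xml_text)
    result = []
    for description in root.findall(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for child in description:
            # A resource-valued property has rdf:resource; otherwise use the text.
            obj = child.get(f"{{{RDF}}}resource")
            if obj is None:
                obj = child.text
            result.append((subject, child.tag, obj))
    return result


parsed = triples(DOC)
```

The label description yields a `literalForm` triple whose object is the string "subjects", confirming that in SKOS-XL the text of a label sits one step away from the label resource.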

Page 24: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


SKOS and SKOS-XL presentation

In the SKOS presentation there is no need to apply any type of relationship between lexical units.

The SKOS-XL presentation needs to apply relationships between lexical units so that they can be understood clearly. This property can be compared with an OWL inverse property.

SKOS-XL defines labels as resources, as is done for concepts, schemes and collections. This defines a special type of lexical entity that is assigned a literal string, which can be repeated for various units. As a result, prefLabel and altLabel can be distinguished easily in multilingual thesauri.

SKOS-XL gives more flexible semantics than SKOS for modelling vocabularies, especially multilingual thesauri like AGROVOC.

Page 25: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data

The Concept Server Workbench

Page 26: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


The workbench

• Is a web-based working environment for managing the AGROVOC Concept Server

• Facilitates the collaborative editing of multilingual terminology and semantic concept information

• Includes administration and group management features

• Includes workflows for maintenance, validation and quality assurance of the data pool

• The CS is freely accessible to everybody to facilitate collaborative editing

Page 27: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Architecture of the System

Page 28: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Concept/Term Management

Page 29: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Concept Relationship

• Can create concept-to-concept relationships

• The inverse relationship is also created automatically

• Ex: if we create "A affects B", then the relationship "B is affected by A" is also created
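The automatic creation of inverse relationships can be sketched as follows. This is not the workbench's code: the `ConceptGraph` class and the spelling of `affects`/`isAffectedBy` are hypothetical, while `hasPest`/`isPestOf` are relationship names that appear earlier in the slides.

```python
# Relationship name -> name of its inverse. The affects/isAffectedBy
# spellings are hypothetical; hasPest/isPestOf appear in the slides.
INVERSE = {
    "affects": "isAffectedBy",
    "isAffectedBy": "affects",
    "hasPest": "isPestOf",
    "isPestOf": "hasPest",
}


class ConceptGraph:
    """Stores concept-to-concept relationships as a set of triples."""

    def __init__(self):
        self.triples = set()

    def relate(self, a, relation, b):
        """Add 'a relation b' and, if an inverse is known, 'b inverse a'."""
        self.triples.add((a, relation, b))
        inverse = INVERSE.get(relation)
        if inverse is not None:
            self.triples.add((b, inverse, a))


graph = ConceptGraph()
graph.relate("A", "affects", "B")
```

After the single call, the graph holds both `("A", "affects", "B")` and `("B", "isAffectedBy", "A")`, so editors only ever state a relationship once.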

Page 30: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Concept Image

• Name of the image with description

• URL will point to the image which will open in an external

• Provide the source of the image

• Can add more translations in different languages

Page 31: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Concept Definition

• Add a definition to the selected concept

• Add translations in different languages

• Provide the source of the definition

• Creation and modification dates are set automatically

Page 32: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Visualization of concepts and relationships in the Workbench

Page 33: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


AGROVOC Web Services

Page 34: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


The road map

• The AGROVOC concept scheme in its proprietary OWL was published on June 5, 2010 on version 1.0 of the concept server workbench

• At the moment a patch to version 1.0 of the workbench is being developed to make it possible to export AGROVOC SKOS

• AGROVOC SKOS will be published as linked data

• With version 2.0 of the workbench, SKOS will become the native format for AGROVOC

Page 35: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Giving it a try…

A demo version of the AWB, with all functionalities, available to users for testing purposes: http://202.73.13.50:55234/agrovocdevv10d/

Latest stable release, version 1.0 (read/write): http://202.73.13.50:55381/agrovocv10i/

Latest stable release, version 1.0 (read only, for visitors with view privilege only): http://202.73.13.50:55481/agrovocv10i/

Page 36: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Possible collaboration

• Getting UN vocabularies as building blocks for the Linked Data Universe

• UNBIS thesaurus covers a broad range of international policy and development issues

• AGROVOC is very specialized, but they do overlap partly

• Steps:

  • Promoting together UNBIS Thesaurus and AGROVOC

  • Mapping project?

Page 37: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


…and more: http://aims.fao.org

Page 38: AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data


Thank You!