Contextualizing Ontologies with Ontolight: A Pragmatic Approach

Marko Grobelnik, Janez Brank, Blaž Fortuna, Igor Mozetič


Post on 02-Jan-2016



Outline

Ontology
Ontolight
  Definition
  Grounding
  Population
Applications
Integration in OntoGen
Demo

What is an ontology?

An ontology is a data model that represents a set of concepts within a domain and the relationships between those concepts.

Generally it consists of:
  Classes: sets, collections, or types of objects
  Instances: the basic or "ground level" objects
  Relations: ways that objects can be related to one another

It can be used:
  as a schema for a knowledge management system
  to reason about the objects within that domain
  etc.

Sample Ontology

Examples of Real-world Ontologies

AgroVoc
  Multilingual thesaurus for the fields of agriculture, forestry, fisheries, food security and related domains
  Consists of terms in different languages and thesaurus relationships between terms (broader, narrower, related)
ASFA Thesaurus
  Used for annotating bibliographies of aquatic science literature
EuroVoc
  Multilingual thesaurus used by European institutions
  The Acquis Communautaire corpus is annotated with EuroVoc
Cyc
  Knowledge base, a formalization of fundamental human knowledge
Dmoz (The Open Directory Project)
  World's largest directory of the WWW, maintained by volunteer editors

What is Ontolight?

Simple model covering most of the well-known light-weight ontologies
Stores an ontology as a rich graph

Defined as:
  List of languages used for lexical terms (covers multilinguality)
  List of class types (types of nodes in the graph)
  List of classes (nodes in the graph)
  List of relation types (types of links in the graph)
  List of relations (links in the graph)
  Grounding model
    A function which proposes a set of classes for a given instance
    Corresponds to classification in machine learning
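The definition above maps naturally onto a small graph data structure. The following is only a minimal sketch of that idea, not the actual Ontolight implementation: the names `OntoClass`, `Relation`, `Ontolight`, `add_class`, `add_relation` and `related` are hypothetical, and the sample ids and terms are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class OntoClass:
    """A node in the ontology graph."""
    class_id: str
    class_type: str          # expected to be one of Ontolight.class_types
    terms: Dict[str, str]    # language code -> lexical term


@dataclass
class Relation:
    """A typed, directed link between two classes."""
    relation_type: str       # expected to be one of Ontolight.relation_types
    source: str
    target: str


@dataclass
class Ontolight:
    languages: List[str]
    class_types: List[str]
    relation_types: List[str]
    classes: Dict[str, OntoClass] = field(default_factory=dict)
    relations: List[Relation] = field(default_factory=list)
    # Grounding model: any callable mapping an instance to suggested class ids.
    grounding: Optional[Callable[[str], List[str]]] = None

    def add_class(self, cls: OntoClass) -> None:
        if cls.class_type not in self.class_types:
            raise ValueError(f"unknown class type: {cls.class_type}")
        self.classes[cls.class_id] = cls

    def add_relation(self, rel: Relation) -> None:
        if rel.relation_type not in self.relation_types:
            raise ValueError(f"unknown relation type: {rel.relation_type}")
        if rel.source not in self.classes or rel.target not in self.classes:
            raise KeyError("both endpoints must be existing classes")
        self.relations.append(rel)

    def related(self, class_id: str, relation_type: str) -> List[str]:
        """Ids of classes reachable from class_id over one link of the given type."""
        return [r.target for r in self.relations
                if r.source == class_id and r.relation_type == relation_type]


# Toy thesaurus-style example; the ids and terms are made up for illustration.
onto = Ontolight(languages=["en", "sl"],
                 class_types=["descriptor"],
                 relation_types=["broader", "narrower", "related"])
onto.add_class(OntoClass("c1", "descriptor", {"en": "fisheries", "sl": "ribištvo"}))
onto.add_class(OntoClass("c2", "descriptor", {"en": "agriculture", "sl": "kmetijstvo"}))
onto.add_relation(Relation("broader", "c1", "c2"))
```

Keeping the grounding model as a plain callable means any classifier can be plugged in without changing the graph part of the structure.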

Grounding

Multiclass classification model trained on the instances of the ontology
  In the case of Dmoz: web pages
  In the case of EuroVoc: EU legislation

We used a centroid-based classifier
  Calculates a centroid vector for each class
  Uses knowledge of the hierarchy
  Classification is performed by the kNN algorithm
  Highly scalable: can handle hundreds of thousands of classes
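A centroid-based classifier of this kind can be sketched in a few lines. The slides do not say how the hierarchy is used or how the nearest-neighbour search is done, so two assumptions are made here: each training instance also contributes to the centroids of its ancestor classes, and classification returns the k centroids nearest to the instance by cosine similarity. The function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def normalize(vec):
    """Scale a sparse vector (dict term -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def train_centroids(examples, parent):
    """examples: list of (sparse vector, class id);
    parent: dict class id -> parent class id (roots are absent or map to None)."""
    sums = defaultdict(lambda: defaultdict(float))
    for vec, cls in examples:
        unit = normalize(vec)
        node = cls
        while node is not None:      # assumption: propagate up the hierarchy
            for t, w in unit.items():
                sums[node][t] += w
            node = parent.get(node)
    return {cls: normalize(total) for cls, total in sums.items()}


def suggest_classes(vec, centroids, k=3):
    """Return up to k class ids whose centroids are nearest to vec."""
    unit = normalize(vec)
    scored = [(cosine(unit, c), cls) for cls, c in centroids.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cls for score, cls in scored[:k] if score > 0]


# Toy data, made up for illustration.
examples = [({"fish": 2, "sea": 1}, "fisheries"),
            ({"wheat": 2, "farm": 1}, "crops")]
parent = {"fisheries": "agriculture", "crops": "agriculture"}
centroids = train_centroids(examples, parent)
suggestions = suggest_classes({"fish": 1}, centroids)
```

Because each class is summarized by a single vector, classification cost grows with the number of classes rather than the number of training instances, which is one plausible reason the approach scales to very large ontologies.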

Population

Takes an instance as input
Output is a list of suggested classes

Example from EuroVoc
  Instance: “Slovenia and Croatia are having a fishing industry”
  Output:

OntoGen

Ontology construction and learning

Semi-automatic:
  Text-mining methods provide suggestions and insights into the domain
  The user can interact with the parameters of the text-mining methods
  All final decisions are taken by the user

Data-driven:
  Most of the aid provided by the system is based on underlying data provided by the user
  Instances are described by features extracted from the data (e.g. bag-of-words vectors)
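As an illustration of the bag-of-words representation mentioned above, the sketch below turns a piece of text into a sparse term-count vector. The tokenization rule and the optional stop-word list are assumptions made for this example; OntoGen's actual feature extraction may differ (for instance by stemming or TF-IDF weighting).

```python
import re
from collections import Counter


def bag_of_words(text, stop_words=frozenset()):
    """Map a text to term counts, ignoring case and the given stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)


vec = bag_of_words("Slovenia and Croatia are having a fishing industry",
                   stop_words={"and", "are", "a"})
```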

Contextualized ontology generation

Ontolight is integrated with OntoGen
  Helps generate a new ontology by means of existing ontologies
  The user loads Ontolight ontologies into OntoGen at start

Suggestion methods:
  Concept suggestion
    Offers concepts from the loaded Ontolight ontology as possible sub-concepts
  Name suggestion
    Offers names of concepts from Ontolight as possible concept names
  All suggestions are integrated in a semi-automatic manner

Concept suggestion

User selects a concept
User selects an Ontolight ontology
OntoGen classifies each document into the context (the selected Ontolight ontology)
Concepts with the most documents are provided as suggestions to the user
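The steps above amount to classifying every document and tallying the results. A minimal sketch, assuming the grounding model is available as a function that returns a list of class ids per document; `suggest_concepts` and `toy_classify` are hypothetical names, and the keyword rule is only a stand-in for a real classifier.

```python
from collections import Counter


def suggest_concepts(documents, classify, top_n=5):
    """Classify each document into the context ontology and return the
    top_n (class id, document count) pairs, most frequent first."""
    counts = Counter()
    for doc in documents:
        for cls in set(classify(doc)):   # count each class once per document
            counts[cls] += 1
    return counts.most_common(top_n)


def toy_classify(doc):
    """Stand-in for a grounding model: a trivial keyword rule."""
    return ["fisheries"] if "fish" in doc else ["crops"]


docs = ["fish quota", "fish export", "wheat price"]
top = suggest_concepts(docs, toy_classify, top_n=2)
```

Name suggestion could follow the same pattern, with the returned class ids mapped to their concept names before being shown to the user.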

Name suggestion

User selects a concept
OntoGen classifies each document into the context (the loaded Ontolight ontologies)
Names of the concepts with the most classified documents are provided as suggestions to the user

AgroVoc and EuroVoc applied to Yahoo Finance data

Demo