Improving Automatic Semantic Tag Recommendation through Fuzzy Ontologies Panos Alexopoulos, Manolis Wallace 7th International Workshop Semantic and Social Media Adaptation and Personalization Luxembourg, December 3-4, 2012


Improving Automatic Semantic Tag Recommendation through Fuzzy Ontologies

Panos Alexopoulos, Manolis Wallace

7th International Workshop Semantic and Social Media Adaptation and Personalization

Luxembourg, December 3-4, 2012


Agenda

●Introduction

● Problem Definition and Paper Focus

● Approach Overview and Rationale

●Proposed Framework

● Tagging Evidence Model

● Tagging Process

●Framework Evaluation

● Evaluation Process

● Evaluation Results

●Conclusions and Future Work


Problem Definition

Introduction

●Semantic tagging involves identifying the entities that reflect what a text actually talks about and assigning them to it as tags.

●One important challenge in this task is correctly distinguishing between the entities that play a central role in the document’s meaning and those that are merely complementary to it.

●For example, consider the following text:

● “Annie Hall is a much better movie than Deconstructing Harry, mainly because Alvy Singer is such a well formed character and Diane Keaton gives the performance of her life”.

● The text mentions two films, yet the only one it actually talks about is “Annie Hall”, so only this film is an appropriate tag.

Semantic Tagging


Problem Definition

Introduction

●A second challenge is the inference of appropriate tags even when these are not explicitly mentioned within the text.

●For example:

● “In June 1863, Colonel James Montgomery commanded a brigade in operations along the coast resembling his earlier Jayhawk raids. The most famous of his controversial operations was the Raid at Combahee Ferry in which 800 slaves were liberated with the help of Harriet Tubman”.

● The text describes a historical battle which took place in Beaufort County, South Carolina.

● This means that this location is an important geographical tag for the text, yet it is not explicitly mentioned within it.

Semantic Tagging


Paper Focus

Introduction

● In previous work we proposed a framework for semantic tagging through the exploitation of domain ontologies.

●The ontologies describe the domain(s) of the texts to be tagged and their entities serve as a source of possible tags for them.

●The key idea is that a given ontological entity is more likely to represent the text’s meaning (and thus be an accurate tag) when the text contains many entities that are ontologically related to it.

● In this paper we revise and extend the above framework so that it can also exploit fuzzy ontological information.

●Our assumption is that the fuzziness that may characterize some of the ontology’s relations can increase the evidential power of its entities and, consequently, the effectiveness of the tag recommendation process.


Approach Overview and Rationale

Proposed Framework

●Our existing framework targets the task of semantic tagging based on the intuition that a given ontological entity is more likely to represent the meaning of the text when the text contains many entities that are ontologically related to it.

●E.g. in the example text the entities “Alvy Singer” and “Diane Keaton” indicate that the text is about the film “Annie Hall”.

●That is because Alvy Singer is a character in this film and Diane Keaton an actor in it.

●All these entities and the relations between them are derived from one or more domain ontologies.


Approach Overview and Rationale

Proposed Framework

●The extension we propose to this framework has to do with considering, where possible, fuzzy relations between entities rather than crisp ones.

●Fuzzy relations allow the assignment of truth degrees to vague ontological relations in an effort to quantify their vagueness.

●E.g. instead of “Annie Hall is a comedy” we may say that “Annie Hall is a comedy to a degree of 0.7”.

●Similarly, we may say that “Woody Allen is an expert director at human relations to a degree of 0.8”.

●Thus, by using a fuzzy ontology one can represent useful semantic information for the tag recommendation task at a finer level of granularity than with a crisp ontology.


Approach Overview and Rationale

Proposed Framework

●For example, in the film domain, instead of having just the relation hasPlayedInFilm it is more useful to have the fuzzy relation wasAnImportantActorInFilm.

● E.g. “Robert Duvall was an important actor in Apocalypse Now to a degree of 0.6”.

●To see why this is useful, consider the text “Robert Duvall’s brilliant performance in the film showed that his choice by Francis Ford Coppola was wise”.

● If Duvall and Coppola have collaborated on more than one film but Duvall had a major role in only one of them (as captured by the fuzzy degree of his relation to the film), then this film is more likely to be the subject of the text.
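The Duvall example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 0.6 degree for “Apocalypse Now” comes from the slide, while the second film's degree and the helper `best_candidates` are hypothetical.

```python
# Fuzzy degrees for wasAnImportantActorInFilm(Robert Duvall, film).
# Only the 0.6 for "Apocalypse Now" is taken from the slide; the second
# entry is a hypothetical value chosen for illustration.
fuzzy_degrees = {
    "Apocalypse Now": 0.6,
    "The Conversation": 0.1,  # hypothetical: a minor role
}

# Crisp version: hasPlayedInFilm either holds or not, so every degree is 1.
crisp_degrees = {film: 1.0 for film in fuzzy_degrees}


def best_candidates(degrees):
    """Return the films with the highest degree (ties are kept)."""
    top = max(degrees.values())
    return sorted(film for film, degree in degrees.items() if degree == top)


print(best_candidates(crisp_degrees))  # both films tie
print(best_candidates(fuzzy_degrees))  # only "Apocalypse Now" remains
```

With crisp relations both films are equally plausible subjects of the text; the fuzzy degrees break the tie.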


Framework Components

Proposed Framework

●Our proposed framework assumes the availability of a fuzzy ontology for the domain of the texts to be tagged and defines two components:

● A Tag Fuzzy Ontological Evidence Model that contains entities that may serve as tag-related evidence for the application scenario and domain at hand.

●Each entity is assigned evidential power degrees which denote its usefulness as evidence for the tag recommendation task.

● A Tag Recommendation Process that uses the evidence model to determine, for a given text, the ontological entities that potentially represent its content.

●A confidence score for each entity is used to denote the most probable tags.


Tagging Evidence Model

Proposed Framework

●Defines, for each ontology entity, which other entities should be used as evidence for correctly determining the texts’ tags, and to what extent.

●It consists of entity pairs in which one entity provides quantified evidence for another.


Evidence Model Construction

Proposed Framework

●Construction of the evidence model depends on the characteristics of the domain and the texts.

●The first step of the construction is manual and involves:

● The identification of the concepts whose instances are expected to be used as tags (e.g. military conflicts, films etc.)

● The determination, for each of these concepts, of the related concepts whose instances may serve as tag evidence:

●For example, in texts that review films, some concepts whose instances may act as tag evidence are related directors, actors and characters.

● The identification, for each pair of evidence and target concept, of the fuzzy relation paths that link them.


Evidence Model Construction

Proposed Framework

●The result of this first step is a tag evidence concept mapping like the following:

●This mapping is typically small so its manual construction is not difficult.
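The mapping shown on the slide is not reproduced in this transcript. As a rough illustration only, such a mapping could be held in a small Python structure like the one below. The relation names are those listed for the fuzzy film ontology in the evaluation slides; the `EvidenceMapping` type, its field names, and the single-relation paths are assumptions.

```python
from typing import NamedTuple


class EvidenceMapping(NamedTuple):
    """One row of a tag evidence concept mapping (hypothetical layout)."""
    target_concept: str    # concept whose instances are used as tags
    evidence_concept: str  # concept whose instances act as tag evidence
    relation_path: tuple   # fuzzy relations linking evidence to target


# Illustrative mapping for the film review scenario (not the slide's table).
FILM_MAPPING = [
    EvidenceMapping("Film", "Actor", ("wasAnImportantActorInFilm",)),
    EvidenceMapping("Film", "Director", ("isFamousForDirectingFilm",)),
    EvidenceMapping("Film", "Character", ("wasCharacterInFilm",)),
]


def evidence_concepts_for(target, mapping):
    """List the evidence concepts defined for a target (tag) concept."""
    return [row.evidence_concept for row in mapping
            if row.target_concept == target]


print(evidence_concepts_for("Film", FILM_MAPPING))
```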


Evidence Model Construction

Proposed Framework

●Based on such mappings, the second step of the construction is automatic and involves the generation of the tag-evidence entity pairs along with a tag evidential strength.

●This strength is:

● Proportional to the fuzzy degree of the relation linking the evidence entity with the tag.

● Inversely proportional to the evidential entity’s own ambiguity as well as to the number and fuzzy degrees of the other tags it provides evidence for.

●For example, “Woody Allen” provides evidence for the film “Annie Hall” with a strength of only 0.02, because he has directed many other films, while the character “Alvy Singer” has an evidential strength of 1, as it appears only in this film.
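The slides state the proportionalities but not the exact formula. One simple normalisation that satisfies them is sketched below in Python; the formula, the function name `evidential_strength` and the `ambiguity` parameter are assumptions, not the authors' definition, and the 50-film director is a hypothetical entity.

```python
def evidential_strength(degrees_by_tag, tag, ambiguity=1):
    """Strength with which one evidence entity supports `tag`.

    degrees_by_tag: fuzzy degree of the relation between the evidence
                    entity and every tag it provides evidence for.
    ambiguity:      number of distinct entities the term may refer to.

    Assumed formula (not taken from the slides):
        degree(tag) / (ambiguity * sum of all degrees)
    """
    total = sum(degrees_by_tag.values())
    return degrees_by_tag[tag] / (ambiguity * total)


# A character related to a single film keeps the full strength.
alvy_singer = {"Annie Hall": 1.0}
print(evidential_strength(alvy_singer, "Annie Hall"))  # 1.0

# A hypothetical director related to 50 films with degree 1.0 each.
director = {f"Film {i}": 1.0 for i in range(50)}
print(evidential_strength(director, "Film 0"))  # 0.02
```

Under this assumed formula, an entity that supports many tags, or whose name is ambiguous, contributes little to any single one of them.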


Tag Recommendation Process

Proposed Framework

●Step 1: We extract from the text the terms that possibly refer to tag entities.

●Step 2: We extract from the text the terms that possibly refer to evidential entities.

●Step 3: We consider as candidate tag entities not only those found within the text but practically all those that are related to instances of the evidential concepts in the ontology.

● E.g. If we find the term “Woody Allen” then all his films are candidate tag entities.

●Step 4: Using the evidence model of the previous slide we compute for each candidate tag entity the confidence that it actually represents the text’s meaning.

●Note: The evidence model is assumed to have been calculated offline and stored in an index, so as to make the above process more efficient.
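The four steps can be sketched as follows. This is a minimal Python illustration under several assumptions: term extraction is replaced by naive substring matching, the confidence of a candidate is taken to be the sum of the strengths of the evidence found (the slides do not specify the aggregation), and the index contents are hypothetical apart from the 1.0 and 0.02 strengths quoted earlier.

```python
# Hypothetical precomputed evidence model: evidence entity -> {tag: strength}.
EVIDENCE_INDEX = {
    "Alvy Singer": {"Annie Hall": 1.0},
    "Diane Keaton": {"Annie Hall": 0.2, "Manhattan": 0.2},
    "Woody Allen": {"Annie Hall": 0.02, "Manhattan": 0.02,
                    "Deconstructing Harry": 0.02},
}
TAG_ENTITIES = {"Annie Hall", "Manhattan", "Deconstructing Harry"}


def find_mentions(text, names):
    """Steps 1 and 2: naive substring matching stands in for term extraction."""
    lowered = text.lower()
    return {name for name in names if name.lower() in lowered}


def recommend_tags(text):
    """Return (tag, confidence) pairs, most confident first."""
    mentioned_tags = find_mentions(text, TAG_ENTITIES)
    mentioned_evidence = find_mentions(text, EVIDENCE_INDEX)

    # Step 3: candidates are the mentioned tags plus every tag related to
    # an evidential entity found in the text.
    candidates = set(mentioned_tags)
    for entity in mentioned_evidence:
        candidates.update(EVIDENCE_INDEX[entity])

    # Step 4 (assumed scoring): sum the strengths of the evidence found.
    scores = {tag: 0.0 for tag in candidates}
    for entity in mentioned_evidence:
        for tag, strength in EVIDENCE_INDEX[entity].items():
            scores[tag] += strength
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


text = ("Annie Hall is a much better movie than Deconstructing Harry, mainly "
        "because Alvy Singer is such a well formed character and Diane Keaton "
        "gives the performance of her life")
ranking = recommend_tags(text)
print(ranking[0][0])  # Annie Hall
```

In this toy run “Deconstructing Harry” is mentioned but receives no supporting evidence, so it ranks below “Manhattan”, which is never mentioned but is supported by “Diane Keaton”.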


Evaluation Process

Framework Evaluation

●Two tagging scenarios:

● Film reviews.

● Texts describing military conflicts.

●Fuzzy ontologies for both domains, based on the manual fuzzification of a small portion of DBpedia and Freebase semantic data.

●Effectiveness was measured by determining the number of correctly tagged texts, namely texts whose highest ranked tags were the correct ones.
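The effectiveness measure can be expressed as a short Python function. This sketch assumes each text has exactly one correct tag and that the recommender returns a ranked list per text; the function name and the toy data are hypothetical, not evaluation data from the paper.

```python
def count_correctly_tagged(ranked_tags_per_text, correct_tag_per_text):
    """Count texts whose highest ranked tag equals the correct tag."""
    correct = 0
    for text_id, ranked in ranked_tags_per_text.items():
        # A text with no recommendation at all counts as incorrectly tagged.
        if ranked and ranked[0] == correct_tag_per_text[text_id]:
            correct += 1
    return correct


# Toy data for illustration only.
ranked = {
    "review-1": ["Annie Hall", "Manhattan"],
    "review-2": ["Manhattan", "Annie Hall"],
    "review-3": [],
}
gold = {
    "review-1": "Annie Hall",
    "review-2": "Annie Hall",
    "review-3": "Manhattan",
}
print(count_correctly_tagged(ranked, gold))  # 1
```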

Description


Evaluation Process

Framework Evaluation

Film Reviews Scenario

●100 texts describing 20 distinct films that were similar to each other in terms of genre, actors and directors, and thus more difficult to tell apart in a given review.

●Fuzzy Film Ontology:

● Concepts: Film, Actor, Director, Character

● Relations: wasAnImportantActorInFilm, isFamousForDirectingFilm, wasCharacterInFilm

Military Conflicts Scenario

●100 texts describing 100 military conflicts that were similar to each other in terms of participants and places.

●Fuzzy Conflict Ontology:

● Concepts: Location, Military Conflict, Military Person

● Relations: tookPlaceNearLocation, wasAnImportantPartOfConflict, playedMajorRoleInConflict, isNearToLocation


Evaluation Results

Framework Evaluation

●We measured tagging effectiveness by determining the number of correctly tagged texts, namely texts whose highest ranked tags were the correct ones.

●For comparison purposes, we performed the same process using a crisp version of the ontologies.

●Results:


Key Points

Conclusions and Future Work

●We proposed a novel framework that exploits fuzzy semantic information for automatically generating semantic tags for text documents.

●This involved two challenges:

● Distinguishing correctly between the entities that play a central role in the document’s meaning and those that are merely complementary to it.

● Inferring appropriate tags even when these are not explicitly mentioned within the text.

●Our approach is based on the customized utilization of fuzzy domain-specific ontological relations for extracting and evaluating tag “evidence” from within the text.

●The added value that the exploitation of fuzziness brings to the tag recommendation task was verified through experiments in two different domains.


Framework Extensions

Conclusions and Future Work

●One important obstacle for the wider applicability of our approach is the bottleneck of acquiring (through development or reuse) the required fuzzy ontological information for the domain at hand.

●For that reason, our future work will focus on determining automated methods for fuzzifying crisp ontological facts:

● Data mining

● Social network analysis

● Crowdsourcing


Contact iSOCO

Thank you!

Questions?


Dr. Panos Alexopoulos, Senior Researcher
