Guessing Hierarchies and Symbols for Word Meanings through Hyperonyms and Conceptual Vectors Mathieu Lafourcade LIRMM - France http://www.lirmm.fr/~lafourcade

Page 1:

Guessing Hierarchies and Symbols for Word Meanings through Hyperonyms and Conceptual Vectors

Mathieu Lafourcade
LIRMM - France

http://www.lirmm.fr/~lafourcade

Page 2:

Overview & Objectives
lexical semantic representations

conceptual vector model (CVM)
  autonomous learning by the system from a given « semantic space » (ontology)

Constructing taxonomies
  Hierarchical - finding hyperonyms
  Multiple inheritance - views
  ambiguity as noise

towards self-contained WSD annotations
  « I made a deposit at the bank »
  « I made a deposit at the bank<g:money> »

Page 3:

Conceptual vectors
vector space

• An idea = a combination of concepts = a vector

• Idea space = vector space

• A concept = an idea = a vector V with augmentation: V + neighborhood

• Meaning space = vector space + {v}*

Page 4:

2D view of « meaning space »

[Figure; labels shown: “cat”, “product”]

Page 5:

Conceptual vectors
Thesaurus

• H: thesaurus hierarchy with K concepts (Thesaurus Larousse = 873 concepts)

• V(Ci): <a1, …, aj, …, a873>

aj = 1 / (2 ** Dum(H, i, j))

[Figure: example hierarchy; component values 1/4, 1, 1/4, 1/4, 1/16, 1/16, 1/64, 1/64; distances 2, 6, 4]
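The formula above can be tried on a toy hierarchy. The sketch below assumes that Dum(H, i, j) is the number of edges on the path between concepts i and j in the tree, which matches the values in the figure (distance 2 gives 1/4, distance 6 gives 1/64) but is not stated on the slide. The tree `PARENT`, the concept names and the helper functions are hypothetical.

```python
# Toy hierarchy (hypothetical): child -> parent links.
PARENT = {"a1": "A", "a2": "A", "b1": "B", "A": "root", "B": "root"}
# Leaf concepts, in the order of the vector components.
CONCEPTS = ["a1", "a2", "b1"]


def ancestors(parent, node):
    """Return the path from node up to the root, node first."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def tree_distance(parent, a, b):
    """Number of edges between a and b (assumed reading of Dum)."""
    up_a = ancestors(parent, a)
    depth_from_b = {n: d for d, n in enumerate(ancestors(parent, b))}
    for steps_from_a, n in enumerate(up_a):
        if n in depth_from_b:  # first common ancestor
            return steps_from_a + depth_from_b[n]
    raise ValueError("nodes are not in the same tree")


def concept_vector(parent, concepts, i):
    """V(Ci): component j is 1 / 2 ** Dum(H, i, j)."""
    return [1 / 2 ** tree_distance(parent, concepts[i], c) for c in concepts]


print(concept_vector(PARENT, CONCEPTS, 0))
```

With this tree, the vector of `a1` has the value 1 on its own component, 1/4 on its sibling `a2` and 1/16 on the more distant `b1`.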

Page 6:

Conceptual vectors
Concept c4:peace

[Figure: vector of the concept c4:peace; labels shown: peace, hierarchical relations, conflict relations, « The world, manhood, society »]

Page 7:

Conceptual vectors
Term “peace”

[Figure: vector of the term “peace”; label shown: c4:peace]

Page 8:

[Figure; labels shown: finance, profit, exchange]

Page 9:

Angular distance

• DA(x, y) = angle(x, y)
• 0 ≤ DA(x, y) ≤ π
• if 0 then x and y are collinear: same idea
• if π/2 then nothing in common
• if π then DA(x, -x), with -x the anti-idea of x

[Figure: vectors x, x’ and y]

Page 10:

Angular distance

DA(x, y) = acos(sim(x, y))

DA(x, y) = acos(x.y / (|x| |y|))

DA(x, x) = 0

DA(x, y) = DA(y, x)

DA(x, y) + DA(y, z) ≥ DA(x, z)

DA(0, 0) = 0 and DA(x, 0) = π/2 by definition

DA(λx, μy) = DA(x, y) with λμ > 0

DA(λx, μy) = π - DA(x, y) with λμ < 0

DA(x+x, x+y) = DA(x, x+y) ≤ DA(x, y)
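As an illustration, the angular distance and its conventions for the null vector can be written in a few lines of Python. This is a minimal sketch using plain lists; the function name `angular_distance` is chosen here and does not come from the slides.

```python
import math


def angular_distance(x, y):
    """DA(x, y) = acos(x.y / (|x| |y|)), with the null-vector conventions."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 and norm_y == 0:
        return 0.0            # DA(0, 0) = 0 by definition
    if norm_x == 0 or norm_y == 0:
        return math.pi / 2    # DA(x, 0) = pi/2 by definition
    cos = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    # Clamp to [-1, 1] to absorb floating-point rounding before acos.
    return math.acos(max(-1.0, min(1.0, cos)))


x, y = [1.0, 2.0], [3.0, 1.0]
print(angular_distance(x, y))                          # some angle in [0, pi]
print(angular_distance([2 * a for a in x], y))         # same angle: positive scaling
print(angular_distance([-a for a in x], y))            # pi minus that angle
```

The clamp is a practical precaution: rounding can push the cosine of two collinear vectors slightly above 1, which would make `math.acos` raise an error.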

Page 11:

Thematic distance

Examples:

• DA(tit, tit) = 0

• DA(tit, passerine) = 0.4

• DA(tit, bird) = 0.7

• DA(tit, train) = 1.14

• DA(tit, insect) = 0.62

tit = insectivorous passerine bird …

Page 12:

Some vector operations

Addition: Z = X ⊕ Y
  zi = xi + yi
  vector Z is normalized

Term to term mult: Z = X ⊗ Y
  zi = (xi * yi)^(1/2)
  vector Z is not normalized

Weak contextualization: Z = X ⊕ (X ⊗ Y) = Γ(X, Y)

“Z is X augmented by its mutual information with Y”
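The three operations above can be sketched as follows. The sketch assumes vectors with non-negative components (as produced by the 1 / 2 ** Dum construction), since the term-to-term product takes a square root. The function names are chosen here for illustration.

```python
import math


def normalize(v):
    """Scale v to unit length; a null vector is returned unchanged."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm else list(v)


def normed_sum(x, y):
    """Addition: zi = xi + yi, then Z is normalized."""
    return normalize([a + b for a, b in zip(x, y)])


def term_product(x, y):
    """Term-to-term product: zi = sqrt(xi * yi); Z is not normalized."""
    return [math.sqrt(a * b) for a, b in zip(x, y)]


def weak_contextualization(x, y):
    """Z = X + (X * Y): X augmented by what it shares with Y."""
    return normed_sum(x, term_product(x, y))


x = [1.0, 0.0, 1.0]
y = [0.0, 0.0, 1.0]
print(weak_contextualization(x, y))  # third component is reinforced
```

Only the component on which `x` and `y` are both active survives the product, so the contextualized vector leans toward `y` without leaving the region of `x`. When the two vectors share nothing, the product is null and `x` comes back unchanged apart from normalization.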

Page 13:

2D view of weak contextualization

[Figure: vectors X and Y, their combinations (labels « XY »), and the weak contextualizations X ⊕ (X ⊗ Y) and Y ⊕ (X ⊗ Y)]

Page 14:

Autonomous learning 1/2
set of known words K, set of unknown words U

• revise a word w of K OR (try to) learn a word w of U

• From the web: for w, ask for a definition D
  • specific sites (dictionaries, synonym lists, etc.): definition analysis
  • general sites (Google, etc.): corpus analysis

• for each word wd of D
  • if not in K, then add wd to U AND add V0 to V*
  • otherwise get the vector of wd AND add V(wd) to V*

• compute the new vector of w from def(D) and V*

98870 words for 400000 senses (vectors) learned in 3 years (French)
« ever » looping process
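One pass of the loop above can be sketched as follows. This is a simplified sketch under several assumptions: the definition is passed in as a list of words rather than fetched from the web, V0 is taken to be the null vector, and the new vector is a plain normalized sum of V*, whereas the slide only says it is computed "from def(D) and V*". The names `learn_step`, `known` and `unknown` are hypothetical.

```python
import math


def learn_step(word, definition_words, known, unknown, dim):
    """Revise or learn `word` from the words of one definition.

    known:   dict mapping word -> vector (the set K with its vectors)
    unknown: set of words still to learn (the set U)
    """
    collected = []  # V*, the vectors gathered from the definition
    for wd in definition_words:
        if wd in known:
            collected.append(known[wd])
        else:
            unknown.add(wd)                  # queue wd for a later pass
            collected.append([0.0] * dim)    # assumed V0: the null vector
    if collected:
        total = [sum(column) for column in zip(*collected)]
    else:
        total = [0.0] * dim
    norm = math.sqrt(sum(a * a for a in total))
    if norm == 0:
        unknown.add(word)                    # nothing usable yet, retry later
        return total
    vector = [a / norm for a in total]
    known[word] = vector
    unknown.discard(word)
    return vector


known = {"bird": [1.0, 0.0], "passerine": [0.0, 1.0]}
unknown = {"tit"}
print(learn_step("tit", ["insectivorous", "passerine", "bird"], known, unknown, 2))
print(sorted(unknown))
```

Running the step repeatedly over K and U gives the « ever » looping behaviour: each pass may add new unknown words, and revising a word later benefits from the vectors learned in between.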

Page 15:

Autonomous learning 2/2

[Figure: analysis tree of the definition « insectivorous passerine bird … », with node labels ADJ, N, GOV, PH and TXT, a vector V at each node, and the annotations Γ(X,Y) and « Weighted sum »]

Page 16:

Page 17:

Hyperonym identification

• Extraction
  • Trying all terms is too costly and unproductive
  • Extract potential candidates from definitions, co-occurrence lists, etc.

Ex: Cand(emerald) = precious stone, stone, beryl, gem, …

• Evaluation of each candidate c in Cand(m) against the meaning m
  • Contextualize: Γ(c, m) = c ⊕ (c ⊗ m)
  • Retain the c such that Γ(c, m) is the closest to m

• Loop: extracting hyperonyms helps identify meanings
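The evaluation step can be sketched as follows: each candidate is contextualized by the meaning, and the candidate whose contextualized vector has the smallest angular distance to the meaning is retained. The three-component vectors below are invented for illustration only, and the function names are chosen here.

```python
import math


def angular_distance(x, y):
    """Angle between two non-null vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return math.acos(max(-1.0, min(1.0, dot / norms)))


def weak_contextualization(c, m):
    """Gamma(c, m): normalized sum of c and the term-to-term product of c and m."""
    product = [math.sqrt(a * b) for a, b in zip(c, m)]
    total = [a + b for a, b in zip(c, product)]
    norm = math.sqrt(sum(a * a for a in total))
    return [a / norm for a in total]


def best_hyperonym(meaning, candidates):
    """Retain the candidate c whose Gamma(c, m) is the closest to m."""
    return min(
        candidates,
        key=lambda name: angular_distance(
            weak_contextualization(candidates[name], meaning), meaning
        ),
    )


# Invented toy vectors: components stand for (mineral, plant, value).
emerald = [0.8, 0.0, 0.6]
gem_meanings = {
    "gem/precious stone": [0.7, 0.0, 0.7],
    "gem/bud": [0.0, 1.0, 0.1],
    "gem/resin": [0.2, 0.9, 0.1],
}
print(best_hyperonym(emerald, gem_meanings))
```

Because the candidates are word meanings rather than words, picking the closest one also selects a sense of the hyperonym, which is how the extraction feeds back into meaning identification.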

Page 18:

[Figure, shown in two steps: the meanings Émeraude/pierre précieuse (emerald/precious stone) and Émeraude/béryl (emerald/beryl); the candidates béryl, Pierre précieuse, Gemme/pierre précieuse (gem/precious stone), Gemme/bourgeon (gem/bud) and Gemme/résine (gem/resin); a vector v at each node; labels « closest vector » and Émeraude/gemme (emerald/gem)]

Page 19:

[Figure, first panel: Émeraude/pierre précieuse, Émeraude/béryl, béryl, Pierre précieuse, Gemme/pierre précieuse, Gemme/bourgeon, Gemme/résine; values 0.81, 0.9, 0.7, 0.85]

[Figure, second panel: Émeraude/béryl, béryl, Pierre précieuse, Gemme/pierre précieuse; values 0.81, 0.9, 0.85]

[Figure, third panel: Émeraude/vert (emerald/green), Émeraude/couleur (emerald/colour), Vert/couleur des signaux (green/signal colour), Couleur/matière (colour/substance), Couleur/sensation (colour/sensation), Vert/couleur (green/colour)]

Page 20:

[Figure: hierarchy graph with edges labelled « hypo ». Nodes: Voiture/wagon (car/railway carriage), wagon, Moyen de transport (means of transport), véhicule/Moyen de transport (vehicle/means of transport), véhicule/vecteur (vehicle/vector), automobile, Voiture/automobile (car/automobile), Cheval/moyen de transport (horse/means of transport), Cheval/mammifère (horse/mammal), mammifère (mammal), Cheval/viande (horse/meat), Viande/nourriture (meat/food), aliment (foodstuff), nourriture (food), artefact, Cheval/unité de puissance (horse/unit of power), animal]

Page 21:

Last words
Switching of representation

From subsymbolic to symbolic … and vice versa
  readability of symbols … of words

global and local test functions for vector quality assessment
  decision taking about the number of meanings … or views
  detectors when combined with lexical functions (antonymy, etc.)

the basis for self-adjustment toward a vector space of constant density
  WSD as a reduction of noise (in context or out of context)
  unification of ontologies
  self-emergent structuring of terminology