
Page 1: ppt - CSE, IIT Bombay

Computational Lexical Semantics

Om Damani, IIT Bombay

Page 2: ppt - CSE, IIT Bombay

Study of Word Meaning

Word Sense Disambiguation
Word Similarity
WordNet Relations

Do we really know the meaning of 'meaning'? We will just take the dictionary definition as the meaning.

Page 3: ppt - CSE, IIT Bombay

Word Sense Disambiguation (WSD)

WSD Applications: Search, _____, ______

Page 4: ppt - CSE, IIT Bombay

Sense Inventory

WordNet, dictionary, etc. 'Plant' in the English WordNet (#senses ??): Noun senses:

plant, works, industrial plant (buildings for carrying on industrial labor) "they built a large plant to manufacture automobiles"

plant, flora, plant life ((botany) a living organism lacking the power of locomotion)

plant (an actor situated in the audience whose acting is rehearsed but seems spontaneous to the audience)

plant (something planted secretly for discovery by another) "the police used a plant to trick the thieves"; "he claimed that the evidence against him was a plant"

Page 5: ppt - CSE, IIT Bombay

Sense Inventory ..

Plant (Verb Senses):

plant, set (put or set (seeds, seedlings, or plants) into the ground) "Let's plant flowers in the garden"

implant, engraft, embed, imbed, plant (fix or set securely or deeply) "He planted a knee in the back of his opponent"; "The dentist implanted a tooth in the gum"

establish, found, plant, constitute, institute (set up or lay the groundwork for) "establish a new department"

plant (place into a river) "plant fish"

plant (place something or someone in a certain position in order to secretly observe or deceive) "Plant a spy in Moscow"; "plant bugs in the dissident's apartment"

plant, implant (put firmly in the mind) "Plant a thought in the students' minds"

Page 6: ppt - CSE, IIT Bombay

How many senses of सच्चा?

Noun (1):
सत्यवादी, सच्चा, सत्यभाषी, सत्यवक्ता - one who speaks the truth. "Even in modern society there is no shortage of truthful people."

Adjective (6):
सत्यवादी, सच्चा, सत्यभाषी, सत्यवक्ता - who speaks the truth. "Yudhishthira was a truthful person."
ईमानदार, निष्कपट, सच्चा - honest, free of deceit or guile. "An honest person deserves respect."
वास्तविक, यथार्थ, सच्चा, सही, असली - real, actually existing or exactly so. "I have just heard of an unbelievable but real incident."
सच्चा, असली - genuine, not false or counterfeit. "He is a true son of Mother India."
खरा, चोखा, सच्चा - fair, based on honesty, impartiality, and justice. "We should make fair deals."
खरा, सच्चा, सीधा - straightforward, without any excuse or compromise. "He is not as straightforward as he appears."

How do you know these are different senses?
Hint: think translation

Page 7: ppt - CSE, IIT Bombay

How many senses of आदमी?

आदमी, पुरुष, मर्द, नर - a human being of the male sex. "The bodies of men and women are built differently."

मानव, आदमी, इंसान, मनुष्य, नर - the two-legged being that is superior to all creatures because of its intellect, and that includes you, me and everyone; a human being. "Man is superior to all creatures because of his intellect."

व्यक्ति, आदमी, शख़्स, जन, बंदा - any one member of the human race; a person. "Only two people can sit in this car."

नौकर, सेवक, दास, आदमी, मुलाज़िम - one who serves; a servant. "My man has gone home for a week."

पति, मर्द, शौहर, घरवाला, आदमी, स्वामी, नाथ - a woman's wedded man; a husband. "Sheela's husband supports the family by farming."

How do you know these are different senses?
Hint: think translation

Page 8: ppt - CSE, IIT Bombay

WSD: Problem Statement

Given a string of words (sentence, phrase, set of key-words), and a set of senses for each word, decide the appropriate sense for each word.

Example: Translate ‘Where can I get spare parts for textile plant ?’ to Hindi

Solution: ??

Page 9: ppt - CSE, IIT Bombay

Solution Approaches

The solution depends on what resources you have:
Definition/gloss
Topic/category label for each sense definition
Selectional preference for each sense
Sense-marked corpora
Parallel sense-marked corpora

Page 10: ppt - CSE, IIT Bombay

Combinatorial Explosion Problem

"I saw a man who is 98 years old and can still walk and tell jokes."
see(26), man(11), year(4), old(8), can(5), still(4), walk(10), tell(8), joke(3)
26 × 11 × 4 × 8 × 5 × 4 × 10 × 8 × 3 = 4,39,29,600 (about 44 million) sense combinations
Solution: Viterbi ??
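A Viterbi-style dynamic program avoids enumerating all of these combinations by keeping, for every sense of the current word, only the best-scoring path ending in that sense. A minimal sketch in Python, assuming all we have is a pairwise relatedness score between senses of adjacent words; the toy sense inventory and relatedness function below are illustrative assumptions, not the lecture's data:

def viterbi_senses(words, senses, relatedness):
    """Choose one sense per word so that the summed relatedness between
    senses of adjacent words is maximal, without enumerating all combinations."""
    best = {s: 0.0 for s in senses[words[0]]}   # best path score ending in sense s
    back = []                                   # back-pointers, one dict per transition
    for prev_word, word in zip(words, words[1:]):
        scores, ptrs = {}, {}
        for s in senses[word]:
            p = max(senses[prev_word], key=lambda q: best[q] + relatedness(q, s))
            scores[s] = best[p] + relatedness(p, s)
            ptrs[s] = p
        best = scores
        back.append(ptrs)
    last = max(best, key=best.get)              # best sense of the final word
    path = [last]
    for ptrs in reversed(back):                 # follow back-pointers to the start
        path.append(ptrs[path[-1]])
    return list(reversed(path))

# Toy usage: relatedness = 1 if two senses share a "topic", else 0.
senses = {"bass": ["bass_fish", "bass_music"],
          "player": ["player_music", "player_sport"],
          "guitar": ["guitar_music"]}
rel = lambda a, b: 1.0 if a.split("_")[1] == b.split("_")[1] else 0.0
print(viterbi_senses(["bass", "player", "guitar"], senses, rel))
# -> ['bass_music', 'player_music', 'guitar_music']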

Page 11: ppt - CSE, IIT Bombay

Dictionary-Based WSD

Page 12: ppt - CSE, IIT Bombay

Dictionary-Based WSD

"The bank did not give a loan to him though he offered to mortgage his boat."

bank (sense 1)
Gloss: a financial institution that accepts deposits and gives loans
Examples: "he cashed a check at the bank"; "that bank holds the mortgage on my home"

bank (sense 2)
Gloss: the slope beside a body of water
Examples: "they pulled the boat up on the bank"; "he watched the currents from the river bank"

Page 13: ppt - CSE, IIT Bombay

How to improve Lesk further?

Give an example where the algorithm fails, say for bank:
"The bank did not give a loan to him though he offered his boat as collateral."

Problem: collateral is related to bank, but the relation does not come out clearly from gloss overlap.

Solution: see if the definition of bank and the definition of collateral share a term. Collateral: security pledged for loan repayment.

Problem: can you give an example where the new algorithm fails too?

Page 14: ppt - CSE, IIT Bombay

Lesk Algorithm

function Lesk(word, sentence) returns best sense of word
  context := set of words in sentence
  for each sense in senses of word do
    sense.signature := GetSignature(sense)
    sense.relevance := ComputeRelevance(sense.signature, context)
  end
  best-sense := MaxRelevantSense()
  if (best-sense.relevance == 0)
    best-sense := GetDefaultSense(word)
  return best-sense

GetSignature(sense): all words in the example and gloss of the sense
ComputeRelevance(signature, context): number of common words
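A minimal runnable sketch of this schema in Python, using NLTK's WordNet as the sense inventory (the simplified Lesk variant; it assumes nltk and its wordnet corpus are installed, and uses a small illustrative stop-word list):

from nltk.corpus import wordnet as wn

STOP = {"a", "an", "the", "of", "in", "on", "to", "and", "or", "for", "he", "his", "did", "not"}

def get_signature(sense):
    """All words in the gloss and examples of the sense."""
    text = sense.definition() + " " + " ".join(sense.examples())
    return {w.lower().strip('".,;') for w in text.split()} - STOP

def compute_relevance(signature, context):
    """Number of common words."""
    return len(signature & context)

def lesk(word, sentence):
    context = {w.lower().strip('".,;') for w in sentence.split()} - STOP
    senses = wn.synsets(word)
    if not senses:
        return None
    best = max(senses, key=lambda s: compute_relevance(get_signature(s), context))
    if compute_relevance(get_signature(best), context) == 0:
        best = senses[0]   # default sense: WordNet lists the most frequent sense first
    return best

print(lesk("bank", "The bank did not give a loan to him though he offered to mortgage his boat"))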

Page 15: ppt - CSE, IIT Bombay

GetSignature ( sense )

All words in the example and gloss of the sense
All words in the gloss of the sense
All words in the glosses of all words in the gloss of the given sense
All words in the glosses of all words in the glosses of all words in the gloss
…..

Problem: including the right sense of each word in the gloss itself needs WSD; including all senses of all words in the gloss will lead to sense drift.

Possible solution: all context words of the sense in a sense-marked corpus

Page 16: ppt - CSE, IIT Bombay

Ideal Signature

For each word, get a Vector of all the words in the language

Work with a |V|x|V| matrix
Iterate over it till it converges

Page 17: ppt - CSE, IIT Bombay

ComputeRelevance(signature, context)

Number of common words: favors longer definitions
|Set intersection| / |Set union|
Define relevance between two words: synonyms and specialization/generalization have to be accounted for (canoe and boat); sum the relevance over all word pairs
Weigh different terms differently, maybe based on TF-IDF score
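A sketch of two of these variants, assuming the signature and context are plain word collections; the toy IDF values are illustrative assumptions:

def jaccard_relevance(signature, context):
    """|set intersection| / |set union|, so longer definitions are not rewarded."""
    s, c = set(signature), set(context)
    return len(s & c) / len(s | c) if s | c else 0.0

def weighted_relevance(signature, context, idf):
    """Weigh shared terms by IDF so very common words count for less."""
    shared = set(signature) & set(context)
    return sum(idf.get(w, 0.0) for w in shared)

idf = {"loan": 2.3, "boat": 2.0, "mortgage": 2.5, "the": 0.01}   # toy IDF values
print(jaccard_relevance(["slope", "beside", "body", "water"], ["boat", "water", "bank"]))
print(weighted_relevance(["deposits", "loan"], ["loan", "the", "bank"], idf))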

Page 18: ppt - CSE, IIT Bombay

GetDefaultSense ( word )

The most frequent sense
The most frequent sense in a given domain
The most frequent sense as per the topic of the document

Page 19: ppt - CSE, IIT Bombay

Power of the LESK Schema

Signature can even be a topic/domain code: finance, poetry, geo-physics

All variations of ComputeRelevance function are still applicable

Page 20: ppt - CSE, IIT Bombay

Possible Improvements

LESK gives equal weightage to all senses; the 'right' sense should be given more weight
In an iterative fashion: one word at a time, most certain first
Page-Rank-like algorithm
Give more weightage to the gloss than to the example

Page 21: ppt - CSE, IIT Bombay

Page-Rank-LESK
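A minimal sketch of one possible PageRank-style Lesk: build a graph whose nodes are the candidate senses of all context words, weight edges by gloss overlap, run PageRank, and keep the highest-ranked sense of each word. The damping factor, iteration count, and gloss dictionary are illustrative assumptions, not the exact construction used in the lecture.

def gloss_overlap(g1, g2):
    return len(set(g1.split()) & set(g2.split()))

def pagerank_lesk(word_senses, glosses, d=0.85, iters=30):
    """word_senses: {word: [sense ids]}; glosses: {sense id: gloss string}."""
    nodes = [s for ss in word_senses.values() for s in ss]
    # Edge weights: gloss overlap between senses of *different* words.
    w = {(a, b): gloss_overlap(glosses[a], glosses[b])
         for wa, sa in word_senses.items() for a in sa
         for wb, sb in word_senses.items() if wa != wb for b in sb}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {}
        for n in nodes:
            incoming = 0.0
            for m in nodes:
                if w.get((m, n), 0) > 0:
                    out = sum(w.get((m, k), 0) for k in nodes)   # total weight leaving m
                    incoming += rank[m] * w[(m, n)] / out
            new[n] = (1 - d) / len(nodes) + d * incoming
        rank = new
    # The 'right' sense of each word is the one with the highest rank.
    return {word: max(ss, key=rank.get) for word, ss in word_senses.items()}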

Page 22: ppt - CSE, IIT Bombay

Fundamental Limitation of Dictionary-Based Methods

Depends too much on the exact words used
Another dictionary may use a different gloss and example
Use the context words from a tagged corpus as the signature instead

Page 23: ppt - CSE, IIT Bombay

Supervised Learning

Lesk-like methods depend too much on the exact words
Another dictionary may use a different gloss and example
Use a sense-tagged corpus
Employ a machine learning algorithm

Page 24: ppt - CSE, IIT Bombay

Supervised Learning

Machine can only learn what we ask it to.

Collocational features: relative position (e.g., 2 words to the left), words and POS.
"An electric guitar and bass player stand off to one side, not really part of the scene, ..."
[wi−2, POSi−2, wi−1, POSi−1, wi+1, POSi+1, wi+2, POSi+2] = [guitar, NN, and, CC, player, NN, stand, VB]

Bag-of-words feature: over the vocabulary [fishing, big, sound, player, fly, rod, pound, double, runs, playing, guitar, band], the vector is [0,0,0,1,0,0,0,0,0,0,1,0]
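A sketch of extracting both feature types for a target word, assuming the sentence is already tokenized and POS-tagged (the tags below are just hand-assigned for the example):

def collocational_features(tokens, tags, i, window=2):
    """[w(i-2), POS(i-2), ..., w(i+2), POS(i+2)] around the target index i."""
    feats = []
    for j in range(i - window, i + window + 1):
        if j == i:
            continue
        if 0 <= j < len(tokens):
            feats += [tokens[j], tags[j]]
        else:
            feats += ["<pad>", "<pad>"]
    return feats

def bag_of_words_features(tokens, vocab):
    """Binary vector over a fixed vocabulary of informative context words."""
    present = {t.lower() for t in tokens}
    return [1 if w in present else 0 for w in vocab]

tokens = "An electric guitar and bass player stand off to one side".split()
tags = ["DT", "JJ", "NN", "CC", "NN", "NN", "VB", "RP", "TO", "CD", "NN"]
vocab = ["fishing", "big", "sound", "player", "fly", "rod",
         "pound", "double", "runs", "playing", "guitar", "band"]
i = tokens.index("bass")
print(collocational_features(tokens, tags, i))   # [guitar, NN, and, CC, player, NN, stand, VB]
print(bag_of_words_features(tokens, vocab))      # [0,0,0,1,0,0,0,0,0,0,1,0]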

Page 25: ppt - CSE, IIT Bombay

Still the data sparsity problem remains.
Assumption: features are conditionally independent given the word sense.

A simple binary bag-of-words vector defined over a vocabulary of 20 words would have --- possible feature vectors.

Naïve Bayes Classifier

Page 26: ppt - CSE, IIT Bombay

Computing Naïve Bayes Probabilities

If a collocational feature such as [wi−2 = guitar] occurred 3 times for sense bass1, and sense bass1 itself occurred 60 times in training, the MLE estimate is P(fj | s) = 3/60 = 0.05.
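A sketch of the MLE training and the resulting classifier, assuming training data comes as (feature list, sense) pairs; smoothing is reduced to a small epsilon here for brevity, though real systems need proper smoothing:

import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (features, sense). Returns MLE estimates of P(s) and P(f|s)."""
    sense_count = Counter(s for _, s in examples)
    feat_count = defaultdict(Counter)
    for feats, s in examples:
        feat_count[s].update(feats)
    total = sum(sense_count.values())
    p_sense = {s: c / total for s, c in sense_count.items()}
    p_feat = {s: {f: c / sense_count[s] for f, c in fc.items()}
              for s, fc in feat_count.items()}
    return p_sense, p_feat

def classify(feats, p_sense, p_feat, eps=1e-6):
    """argmax over s of log P(s) + sum over f of log P(f|s)."""
    def score(s):
        return math.log(p_sense[s]) + sum(math.log(p_feat[s].get(f, eps)) for f in feats)
    return max(p_sense, key=score)

# E.g. if the feature "w-2=guitar" occurs 3 times with sense bass1 and bass1
# occurs 60 times in total, train_naive_bayes gives P(w-2=guitar | bass1) = 3/60 = 0.05.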

It is hard for humans to examine Naïve Bayes's workings and understand its decisions. Hence, use decision lists.

Page 27: ppt - CSE, IIT Bombay

Decision Lists

Rule ⇒ Sense

fish within window ⇒ bass1

striped bass ⇒ bass1

guitar within window ⇒ bass2

bass player ⇒ bass2

piano within window ⇒ bass2

tenor within window ⇒ bass2

sea bass ⇒ bass1

Page 28: ppt - CSE, IIT Bombay

How to Create Decision Lists

Which feature has the most discrimination power?

Seems the same as max P(Sense | f)

Need Decision Trees
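One standard way to build the list (the Yarowsky-style decision list used in textbook treatments of WSD) is to score every feature by the absolute log ratio of the two sense probabilities given that feature and sort the rules from most to least discriminative. A minimal sketch for the two-sense case; the smoothing constant alpha is an illustrative assumption:

import math
from collections import Counter

def build_decision_list(examples, alpha=0.1):
    """examples: list of (features, sense) over exactly two senses.
    Returns (feature, predicted sense, score) rules sorted by discrimination power."""
    senses = sorted({s for _, s in examples})
    counts = {s: Counter() for s in senses}
    for feats, s in examples:
        counts[s].update(feats)
    rules = []
    for f in set(counts[senses[0]]) | set(counts[senses[1]]):
        p0 = counts[senses[0]][f] + alpha            # smoothed counts
        p1 = counts[senses[1]][f] + alpha
        score = abs(math.log(p0 / p1))
        rules.append((f, senses[0] if p0 > p1 else senses[1], score))
    return sorted(rules, key=lambda r: -r[2])

def apply_decision_list(feats, rules, default):
    for f, sense, _ in rules:
        if f in feats:
            return sense             # the first matching rule decides
    return default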

Page 29: ppt - CSE, IIT Bombay

Selectional Restrictions and Selectional Preferences

"In our house, everybody has a career and none of them includes washing dishes," he says.
In her kitchen, Ms. Chen works efficiently, cooking several simple dishes.

wash [+WASHABLE], cook [+EDIBLE]
Used more often for elimination than selection
Problem: "The Gold-rush fell apart in 1931, perhaps because people realized you can't eat gold for lunch if you're hungry."
Solution: use these preferences as features/probabilities

Page 30: ppt - CSE, IIT Bombay

Selectional Preference Strength

eat ⇒ EDIBLE; be ⇒ ??
Strength: how much P(c|v) differs from P(c), measured by Kullback-Leibler divergence (relative entropy):
S(v) = D( P(c|v) || P(c) ) = Σc P(c|v) log( P(c|v) / P(c) )

Selectional association: the contribution of a class c to the general selectional preference of the verb:
A(v,c) = P(c|v) log( P(c|v) / P(c) ) / S(v)
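A sketch of both quantities, assuming the class prior P(c) and the class distribution P(c|v) for a verb are already estimated (the toy numbers below are illustrative):

import math

def preference_strength(p_c, p_c_given_v):
    """S(v) = D( P(c|v) || P(c) ) = sum over c of P(c|v) log( P(c|v) / P(c) )."""
    return sum(pv * math.log2(pv / p_c[c])
               for c, pv in p_c_given_v.items() if pv > 0)

def selectional_association(c, p_c, p_c_given_v):
    """A(v,c): the share of the verb's preference strength contributed by class c."""
    s = preference_strength(p_c, p_c_given_v)
    pv = p_c_given_v[c]
    return (pv * math.log2(pv / p_c[c])) / s if s > 0 else 0.0

p_c = {"FOOD": 0.1, "ARTIFACT": 0.4, "PERSON": 0.5}               # toy P(c)
p_c_given_eat = {"FOOD": 0.8, "ARTIFACT": 0.15, "PERSON": 0.05}   # toy P(c|eat)
print(preference_strength(p_c, p_c_given_eat))
print(selectional_association("FOOD", p_c, p_c_given_eat))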

Page 31: ppt - CSE, IIT Bombay

Selectional Association

a probabilistic measure of the strength of association between a predicate and a class dominating the argument to the predicate

Verb    Semantic Class   Assoc    Semantic Class   Assoc
read    WRITING          6.80     ACTIVITY         -0.20
write   WRITING          7.26     COMMERCE          0
see     ENTITY           5.79     METHOD           -0.01

Page 32: ppt - CSE, IIT Bombay

How do we use selectional association for WSD?
Use it as a relatedness model: select the sense with the highest selectional association between one of its ancestor hypernyms and the predicate.

Page 33: ppt - CSE, IIT Bombay

Minimally Supervised WSD

Supervised: needs sense-tagged corpora
Dictionary-based: needs large examples and glosses
Supervised approaches do better but are much more expensive
Can we get the best of both worlds?

Page 34: ppt - CSE, IIT Bombay

Bootstrapping

Seed-set L0 of labeled instances, and a much larger unlabeled corpus V0
Train a decision-list classifier on the seed-set L0
Use this classifier to label the corpus V0
Add to the training set the examples in V0 that it is confident about
Iterate { retrain the decision-list classifier }
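A sketch of the loop with the classifier abstracted away as train/predict functions, so any decision-list trainer can be plugged in; the confidence threshold and round limit are illustrative assumptions:

def bootstrap(seed_labeled, unlabeled, train, predict, threshold=0.9, rounds=10):
    """Yarowsky-style bootstrapping.
    seed_labeled: list of (features, sense); unlabeled: list of feature sets.
    train(labeled) -> classifier; predict(clf, features) -> (sense, confidence)."""
    labeled = list(seed_labeled)
    remaining = list(unlabeled)
    for _ in range(rounds):
        clf = train(labeled)                    # retrain on the current labeled set
        newly_labeled, still_unlabeled = [], []
        for feats in remaining:
            sense, confidence = predict(clf, feats)
            if confidence >= threshold:         # keep only confident labels
                newly_labeled.append((feats, sense))
            else:
                still_unlabeled.append(feats)
        if not newly_labeled:                   # nothing confident left to add
            break
        labeled += newly_labeled
        remaining = still_unlabeled
    return train(labeled)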

Page 35: ppt - CSE, IIT Bombay

Bootstrapping Success Depends On

Choosing the initial seed-set
One sense per collocation
One sense per discourse

Samples of bass sentences extracted from the WSJ using the simple correlates play and fish:
"We need more good teachers; right now, there are only a half a dozen who can play the free bass with ease."
"And it all started when fishermen decided the striped bass in Lake Mead were too skinny."

Choosing the ‘confidence’ criterion

Page 36: ppt - CSE, IIT Bombay

WSD: Summary

It is a hard problem
In part because it is not a well-defined problem
Or it cannot be well-defined
Because making sense of 'Sense' is hard

Page 37: ppt - CSE, IIT Bombay

Hindi Wordnet

WordNet - a lexical database
Inspired by the English WordNet
Built conceptually
Synset (synonym set) is the basic building block.

Page 38: ppt - CSE, IIT Bombay

Entry in Hindi Wordnet

Synset {गाय, गऊ, गैया, धेनु}
{gaaya, gauu, gaiyaa, dhenu}, Cow

Gloss (text definition)

सींगवाला एक शाकाहारी मादा चौपाया (siingwaalaa eka shaakaahaarii maadaa choupaayaa)

(a horned, herbivorous, four-legged female animal)

Example sentence

हिन्दू लोग गाय को गौ माता कहते हैं एवं उसकी पूजा करते हैं। (hinduu loga gaaya ko go maataa kahate hain evam usakii puujaa karate hain)

(Hindus call the cow "gau mata", the mother cow, and worship it.)

Page 39: ppt - CSE, IIT Bombay

Subgraph for Noun

गाय, गऊ (gaaya, gauu) - cow

Gloss: सींगवाला एक शाकाहारी मादा चौपाया (siingwaalaa eka shaakaahaarii maadaa choupaayaa) - a horned, herbivorous, four-legged female animal

Hypernym: चौपाया, पशु (chaupaayaa, pashu) - four-legged animal
Hyponyms: कामधेनु (kaamadhenu), मैनी गाय (mainii gaaya) - kinds of cow
Meronyms: थन (thana) - udder; पूँछ (puunchh) - tail
Antonym: बैल (baila) - ox
Ability verb: पगुराना (paguraanaa) - to ruminate
Attribute: शाकाहारी (shaakaahaarii) - herbivorous

Page 40: ppt - CSE, IIT Bombay

Subgraph for Verb

रोना, रुदन करना (ronaa, rudan karanaa) - to weep

Hypernym: भावाभिव्यक्ति करना (bhaavaabhivyakti karanaa) - to express
Antonym: हँसना (hansanaa) - to laugh
Entailment: आँसू बहाना (aansuu bahaanaa) - to shed tears
Troponym: सिसकना (sisakanaa) - to sob
Causative verb: रुलाना (rulaanaa) - to make someone cry

Page 41: ppt - CSE, IIT Bombay

Marathi Wordnet (Noun)

झाड, वृक्ष, तरू (jhaaD, vriksha, taruu) - tree

Gloss: मुळे, खोड, फांद्या, पाने इत्यादींनी युक्त असा वनस्पतिविशेष - a plant having roots, trunk, branches, leaves, etc. "झाडे पर्यावरण शुद्ध करण्याचे काम करतात" (trees do the work of purifying the environment)

Hypernymy: वनस्पती (vanaspatii) - plant
Hyponymy: आंबा (aambaa) - mango; लिंबू (limbuu) - lime
Meronymy: खोड (khoD) - trunk; मूळ (muuL) - root
Holonymy: रान (raan) - forest; बाग (baag) - garden

Page 42: ppt - CSE, IIT Bombay

Word Similarity

In Lesk and other algorithms, we need to measure how related two words are.

Simplest measure: path length, the number of edges in the shortest path between sense nodes c1 and c2

sim(c1,c2) = −log pathlen(c1,c2)

wordsim(w1,w2) = max over c1 ∈ senses(w1), c2 ∈ senses(w2) of sim(c1,c2)
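A minimal sketch of these two formulas over WordNet using NLTK (requires nltk with the wordnet corpus); it follows the slide's formula literally, so identical senses and unconnected senses need special handling:

import math
from nltk.corpus import wordnet as wn

def sense_sim(c1, c2):
    """sim(c1,c2) = -log pathlen(c1,c2), pathlen = #edges on the shortest path."""
    d = c1.shortest_path_distance(c2)
    if d is None:                 # no connecting path in the hierarchy
        return None
    if d == 0:                    # identical senses: maximally similar
        return float("inf")
    return -math.log(d)

def word_sim(w1, w2):
    """wordsim(w1,w2) = max over c1 in senses(w1), c2 in senses(w2) of sim(c1,c2)."""
    sims = [sense_sim(c1, c2)
            for c1 in wn.synsets(w1) for c2 in wn.synsets(w2)]
    sims = [s for s in sims if s is not None]
    return max(sims) if sims else None

print(word_sim("nickel", "coin"))
print(word_sim("nickel", "standard"))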

Page 43: ppt - CSE, IIT Bombay

Path Length: Limitations

All edges are not equal: the single edge between medium of exchange and standard covers far more semantic distance than the edge between coin and nickel.

Need a distance measure on edges.

Page 44: ppt - CSE, IIT Bombay

Information Content Word Similarity

LCS(c1,c2) = the lowest common subsumer, i.e., the lowest node in the hierarchy that subsumes (is a hypernym of) both c1 and c2

sim (c1,c2) = −log P(LCS(c1,c2))
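NLTK ships information-content versions of these measures; a minimal sketch (requires the wordnet and wordnet_ic corpora; the specific sense numbers below are assumptions to be checked with wn.synsets):

from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

brown_ic = wordnet_ic.ic("ic-brown.dat")          # P(c) estimated from the Brown corpus
nickel = wn.synset("nickel.n.02")                 # the coin sense (assumed sense number)
coin = wn.synset("coin.n.01")
print(nickel.res_similarity(coin, brown_ic))      # Resnik: -log P(LCS(c1,c2))
print(nickel.lin_similarity(coin, brown_ic))      # Lin: also accounts for the concepts' own IC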

Page 45: ppt - CSE, IIT Bombay

IC Similarity: Limitations

A concept is not similar to itself using the previous defn

Word similarity is not about Information Contents. It is about commanality vs differences:

Page 46: ppt - CSE, IIT Bombay

Overlap based Similarity

Previous methods may not work for words belonging to different classes: car and petrol

similarity(A,B) = overlap(gloss(A), gloss(B)) + overlap(gloss(hypo(A)), gloss(hypo(B)))+ overlap(gloss(A), gloss(hypo(B))) + overlap(gloss(hypo(A)),gloss(B))

Page 47: ppt - CSE, IIT Bombay

Word Similarity: Distributional Methods

Pointwise mutual information
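The association measure meant here is pointwise mutual information between a word and a context feature, PMI(w,f) = log2( P(w,f) / (P(w) P(f)) ). A minimal sketch from raw co-occurrence counts; the counts below are illustrative:

import math

def pmi(count_wf, count_w, count_f, total):
    """PMI(w,f) = log2( P(w,f) / (P(w) * P(f)) ), from co-occurrence counts."""
    p_wf = count_wf / total
    p_w, p_f = count_w / total, count_f / total
    return math.log2(p_wf / (p_w * p_f))

# Toy counts from a hypothetical corpus of 1,000,000 (word, feature) events.
print(pmi(count_wf=150, count_w=2500, count_f=1800, total=1_000_000))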

Page 48: ppt - CSE, IIT Bombay

Similarity using Feature Vectors

Page 49: ppt - CSE, IIT Bombay

Cosine Distance

Dot product favors long vectors
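A sketch of the fix: normalize the dot product by the two vector lengths, so doubling a vector's length leaves the similarity unchanged (the toy vectors are illustrative):

import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """cos(u,v) = u.v / (|u| |v|): a length-normalized dot product."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

u = [0.0, 3.0, 1.0, 2.0]
v = [0.0, 6.0, 2.0, 4.0]        # v = 2u: same direction, twice as long
print(dot(u, v), cosine(u, v))  # the dot product grows with length, cosine stays 1.0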

Page 50: ppt - CSE, IIT Bombay

Conclusion

A lot of care is needed in defining similarity measures.

Impressive results can be obtained once similarity is carefully defined.

Page 51: ppt - CSE, IIT Bombay

Backup

Page 52: ppt - CSE, IIT Bombay

OrgLESK: Taking the Signature of Context Words into Account for Relatedness

Disambiguating "pine cone"

Neither 'pine' nor 'cone' appears in the other's definition.

pine
1. an evergreen tree with needle-shaped leaves and solid wood
2. waste away through sorrow or illness

cone
1. solid body which narrows to a point
2. something of this shape, whether solid or hollow
3. fruit of certain evergreen trees

Page 53: ppt - CSE, IIT Bombay

Does the Improvement Really Work?

Problem: collateral has not one but many senses:

Noun: collateral (a security pledged for the repayment of a loan)

Adjective: collateral, indirect (descended from a common ancestor but through different lines) "cousins are collateral relatives"; "an indirect descendant of the Stuarts"

Adjective: collateral, confirmative, confirming, confirmatory, corroborative, corroboratory, substantiating, substantiative, validating, validatory, verificatory, verifying (serving to support or corroborate) "collateral evidence"

Adjective: collateral (accompanying, concomitant) "collateral target damage from a bombing run"

Adjective: collateral (situated or running side by side) "collateral ridges of mountains"

Solution ??