CS626/449 : Speech, NLP and the Web/Topics in AI Programming (Lecture 4: Word Sense Disambiguation; Wordnet) Pushpak Bhattacharyya CSE Dept., IIT Bombay

Post on 11-Jan-2016

Page 1:

CS626/449 : Speech, NLP and the Web/Topics in AI Programming

(Lecture 4: Word Sense Disambiguation; Wordnet)

Pushpak Bhattacharyya, CSE Dept., IIT Bombay

Page 2:

Word Sense Disambiguation

• WSD is a well-known difficult problem

• Questions: should the approach be
  – Knowledge based
  – Statistical
  – Combined

• Resources
  – Sense-marked (annotated) corpora
  – Sense repository

• Training
  – Unsupervised
  – Supervised
  – Semi-supervised

Page 3:

Synonym

Distribution principle:

Words A and B are called ‘synonyms’ if their distribution in a corpus is identical, that is, if they can replace each other in any context. (This is a strong requirement, an ideal.)

Pure synonyms: If A and B can replace each other in all contexts, they are pure synonyms. Pure synonyms have proved very difficult to find.

Question: How to ensure replaceability in
  – Syntax
  – Semantics
  – Pragmatics
  – Discourse

Page 4:

Example of replaceability

Consider {mother, mummi, amma}

1. Syntax – yes: mother, mummi, amma are all nouns, ex. Mother smiles.
   1. Constituent Parse Tree
   2. Dependency Parse

2. Semantics (Semantic Roles): replaceable

3. Pragmatics: register (fails)
   1. A formal situation, ex. Dear Sir, Grant me leave for one day as my mother has to undergo an operation
   2. A proverb, ex. Mother makes the nation

Register is linguistic memory specific to a situation

[Figure: constituent parse tree of "Mother smiles" with S at the root, and a dependency structure in which "mother" is the agent of "smiles"]

Page 5:

Relational and Componential Semantics

Relational Semantics (words can disambiguate each other) vs. Componential Semantics (words need features for disambiguation)

Example: cat, which can mean an animal or an expert

Possible features: Animate, Human, Carnivorous, Small, Moving

Componential Semantics
Semantic feature vector for
  cat (animal): <1,0,1,1,1>
  cat (expert): <1,1,U,U,1>

Relational Semantics
  cat (animal): {cat, feline}
  cat (expert): {cat, expert}
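The feature-vector idea can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: it assumes that "U" means "unspecified", encodes it as None, and treats a sense as ruled out when one of its specified features contradicts what the context tells us. The function names are hypothetical.

```python
# Feature order follows the slide: Animate, Human, Carnivorous, Small, Moving.
FEATURES = ("Animate", "Human", "Carnivorous", "Small", "Moving")

# None stands for the slide's "U" (assumed here to mean "unspecified").
SENSES = {
    "cat (animal)": (1, 0, 1, 1, 1),
    "cat (expert)": (1, 1, None, None, 1),
}


def compatible(sense_vec, observed):
    """True if no specified feature of the sense contradicts the observation."""
    for sense_value, observed_value in zip(sense_vec, observed):
        if sense_value is None or observed_value is None:
            continue  # unspecified on either side: no evidence against the sense
        if sense_value != observed_value:
            return False
    return True


def candidate_senses(observed):
    """Senses of 'cat' that survive the features observed in context."""
    return [name for name, vec in SENSES.items() if compatible(vec, observed)]


# The context tells us only that the referent is human (Human = 1).
print(candidate_senses((None, 1, None, None, None)))  # ['cat (expert)']
```

With no contradicting feature in the context, both senses remain candidates, which is exactly the situation in which componential features alone cannot disambiguate.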

Page 6:

What is Wordnet

Page 7:

Wordnet

• A lexical knowledge base built on conceptual lookup

• Organizes concepts in a semantic network

• Organizes lexical information in terms of word meaning, rather than word form

• Wordnet can also be used as a thesaurus

Page 8:

Psycholinguistic Theory

• Human lexical memory for nouns is organized as a hierarchy.
• Can a canary sing? – Pretty fast response.
• Can a canary fly? – Slower response.
• Does a canary have skin? – Slowest response.

Hierarchy (each property is attached to the level where it applies):
  Animal (can move, has skin)
    Bird (can fly)
      canary (can sing)

Wordnet: a lexical reference system based on psycholinguistic theories of human lexical memory.
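One way to read the canary example is that a property stored higher in the hierarchy takes more is-a steps to reach, which matches the slower responses. The sketch below is an illustration under that reading; the data structure and function name are assumptions, not taken from the lecture.

```python
# is-a links from the slide's hierarchy: canary -> bird -> animal.
PARENT = {"canary": "bird", "bird": "animal", "animal": None}

# Properties stored at the level where they apply.
PROPERTIES = {
    "animal": {"can move", "has skin"},
    "bird": {"can fly"},
    "canary": {"can sing"},
}


def levels_to_property(concept, prop):
    """Number of is-a links climbed before prop is found, or None if absent."""
    steps = 0
    node = concept
    while node is not None:
        if prop in PROPERTIES.get(node, set()):
            return steps
        node = PARENT[node]
        steps += 1
    return None


for question in ("can sing", "can fly", "has skin"):
    print(question, levels_to_property("canary", question))
# can sing 0
# can fly 1
# has skin 2
```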

Page 9:

Lexical Matrix

Page 10:

Wordnet - Lexical Matrix (with examples)

Rows are word meanings (M1 … Mm); columns are word forms (F1 … Fn). An entry Ei,j means that word form Fj can express meaning Mi.

Word        Word Forms
Meanings    F1              F2            F3            …                   Fn
M1          (depend) E1,1   (bank) E1,2   (rely) E1,3
M2                          (bank) E2,2                 (embankment) E2,…
M3                          (bank) E3,2   E3,3
…                                                       …
Mm                                                                          Em,n
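The matrix can be read in two directions: along a row to collect the word forms that share a meaning, and down a column to collect the meanings of one word form. A minimal Python sketch, using only the cells that carry a word in the slide (the unlabeled entries are left out, and the function names are hypothetical):

```python
from collections import defaultdict

# (meaning, word form) pairs for the labeled cells of the slide's matrix.
ENTRIES = [
    ("M1", "depend"), ("M1", "bank"), ("M1", "rely"),
    ("M2", "bank"), ("M2", "embankment"),
    ("M3", "bank"),
]

by_meaning = defaultdict(list)  # row view: meaning -> word forms
by_form = defaultdict(list)     # column view: word form -> meanings
for meaning, form in ENTRIES:
    by_meaning[meaning].append(form)
    by_form[form].append(meaning)


def forms_for(meaning):
    """Word forms in one row: they are synonyms for that meaning."""
    return by_meaning.get(meaning, [])


def meanings_of(form):
    """Meanings in one column: more than one means the form is polysemous."""
    return by_form.get(form, [])


print(forms_for("M1"))      # ['depend', 'bank', 'rely']
print(meanings_of("bank"))  # ['M1', 'M2', 'M3']
```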

Page 11:

Wordnet: International Scenario

• Wordnet is a network of words linked by lexical and semantic relations.

• The first wordnet in the world was for English, developed at Princeton over 15 years.

• EuroWordNet, a linked structure of European-language wordnets, was built in 1998 over 3 years with funding from the EC as a mission-mode project.

• Wordnets for Hindi and Marathi, being built at IIT Bombay, are amongst the first Indian language (IL) wordnets.

• All these are proposed to be linked into the IndoWordnet, which will eventually be linked to the English and Euro wordnets.

Page 12:

Linked Wordnets in India

[Figure: wordnets linked to one another]
• Hindi Wordnet
• Marathi Wordnet
• Sanskrit Wordnet
• English Wordnet
• Bengali Wordnet
• Punjabi Wordnet
• Konkani Wordnet
• Dravidian Language Wordnets
• North East Language Wordnet

Page 13:

Great Linguistic Diversity

• Major streams
  – Indo-European
  – Dravidian
  – Sino-Tibetan
  – Austro-Asiatic

• Some languages rank within the top 20 in the world by number of speakers
  – Hindi and Urdu: 5th (~500 million)
  – Bangla: 7th (~300 million)
  – Marathi: 14th (~70 million)

Page 14:

Major Language Processing Initiatives

• Mostly from the Government: Ministry of IT, Ministry of Human Resource Development, Department of Science and Technology

• Recently, a great drive from industry: NLP efforts with Indian languages in focus
  – Google
  – Microsoft
  – IBM Research Lab
  – Yahoo
  – TCS

Page 15:

Fundamental Design Question

• Syntagmatic vs. paradigmatic relations?
• Psycholinguistics is the basis of the design.
• When we hear a word, many words come to our mind by association.
• For English, about half of the associated words are syntagmatically related and half are paradigmatically related.
• For cat
  – animal, mammal: paradigmatic
  – mew, purr, furry: syntagmatic

Page 16:

Stated Fundamental Application of Wordnet: Sense Disambiguation

Determination of the correct sense of the word

The crane ate the fish vs. The crane was used to lift the load

bird vs. machine
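The lecture does not prescribe an algorithm at this point. As one possible illustration, the sketch below uses a simplified gloss-overlap heuristic (in the spirit of the Lesk algorithm, which is swapped in here and is not named in the slides): the sense whose gloss shares the most words with the sentence wins. The two glosses and the stopword list are made up for the example and are not taken from any wordnet.

```python
import re

# Hypothetical glosses for the two senses of "crane" (illustrative only).
SENSE_GLOSSES = {
    "bird": "large long-necked wading bird that eats fish and frogs",
    "machine": "machine used to lift and move a heavy load",
}

STOPWORDS = {"the", "a", "an", "to", "was", "and", "that", "of"}


def tokens(text):
    """Lowercased content words of a text, as a set."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def disambiguate(sentence):
    """Pick the sense whose gloss overlaps most with the sentence."""
    context = tokens(sentence)
    scores = {
        sense: len(context & tokens(gloss))
        for sense, gloss in SENSE_GLOSSES.items()
    }
    return max(scores, key=scores.get)


print(disambiguate("The crane ate the fish"))               # bird
print(disambiguate("The crane was used to lift the load"))  # machine
```

The sketch does no stemming, so "ate" does not match "eats"; the first sentence is resolved by "fish" alone.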

Page 17:

The problem of Sense tagging

• Given a corpus, assign the correct sense to each word.

• This is sense tagging. It needs Word Sense Disambiguation (WSD).

• Highly important for Question Answering, Machine Translation, and Text Mining tasks.

Page 18:

Basic Principle

• Words in natural languages are polysemous.
• However, when synonymous words are put together, a unique meaning often emerges.
• Use is made of Relational Semantics.
• Componential Semantics, where each word is a bundle of semantic features (as in the Schankian Conceptual Dependency system or Lexical Componential Semantics), is to be examined as a viable alternative.

Page 19:

Componential Semantics

• Consider cat and tiger. Decide on componential attributes.

Attributes: Furry, Carnivorous, Heavy, Domesticable

• For cat: (Y, Y, N, Y)
• For tiger: (Y, Y, Y, N)

Complete and correct attributes are difficult to design.
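To see which attributes actually separate the two words, the vectors can be compared position by position. A small sketch, with the attribute order taken from the slide and a hypothetical helper name:

```python
ATTRIBUTES = ("Furry", "Carnivorous", "Heavy", "Domesticable")

VECTORS = {
    "cat": ("Y", "Y", "N", "Y"),
    "tiger": ("Y", "Y", "Y", "N"),
}


def distinguishing_attributes(a, b):
    """Attributes on which the two words take different values."""
    return [
        name
        for name, value_a, value_b in zip(ATTRIBUTES, VECTORS[a], VECTORS[b])
        if value_a != value_b
    ]


print(distinguishing_attributes("cat", "tiger"))  # ['Heavy', 'Domesticable']
```

Furry and Carnivorous carry no information for this pair, which hints at why a complete and correct attribute set is hard to design: every new word may require new attributes.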

Page 20:

Semantic relations in wordnet

1. Synonymy
2. Hypernymy / Hyponymy
3. Antonymy
4. Meronymy / Holonymy
5. Gradation
6. Entailment
7. Troponymy

1, 3 and 5 are lexical (word to word); the rest are semantic (synset to synset).

Page 21:

Synset: the foundation (house)

1. house -- (a dwelling that serves as living quarters for one or more families; "he has a house on Cape Cod"; "she felt she had to get out of the house")
2. house -- (an official assembly having legislative powers; "the legislature has two houses")
3. house -- (a building in which something is sheltered or located; "they had a large carriage house")
4. family, household, house, home, menage -- (a social unit living together; "he moved his family to Virginia"; "It was a good Christian household"; "I waited until the whole house was asleep"; "the teacher asked how many people made up his home")
5. theater, theatre, house -- (a building where theatrical performances or motion-picture shows can be presented; "the house was full")
6. firm, house, business firm -- (members of a business organization that owns or operates one or more establishments; "he worked for a brokerage house")
7. house -- (aristocratic family line; "the House of York")
8. house -- (the members of a religious community living together)
9. house -- (the audience gathered together in a theatre or cinema; "the house applauded"; "he counted the house")
10. house -- (play in which children take the roles of father or mother or children and pretend to interact like adults; "the children were playing house")
11. sign of the zodiac, star sign, sign, mansion, house, planetary house -- ((astrology) one of 12 equal areas into which the zodiac is divided)
12. house -- (the management of a gambling house or casino; "the house gets a percentage of every bet")

Page 22:

Synset: DSF format (1/2)

• Synset ID: a unique number identifying a synset
• Category: POS category of the words
• Concept: the part of the gloss that gives a brief summary of what the synset represents
• Example: one or more examples of the words in the synset being used in sentences
• Synset: the set of synonymous words that make up the synset

Page 23:

Synset - DSF format (2/2)

ID :: 121
CATEGORY :: NOUN
CONCEPT :: अपने से छोटों के प्रति हृदय में उठनेवाला प्रेम (love felt in the heart for those younger than oneself)
EXAMPLE :: “चाचा नेहरू को बच्चों से बहुत ही स्नेह था” (“Chacha Nehru had great affection for children”)
SYNSET :: स्नेह, नेह, लगाव, ममता
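A record in this layout is easy to read mechanically, since every line is a field name, the separator "::", and a value. The sketch below is an assumption about how such a record might be parsed, not the Hindi Wordnet's own tooling. The field names come from the slides; the English field values are stand-ins borrowed from the "house" example, and the ID is reused from the slide purely for illustration.

```python
# An illustrative record in the "FIELD :: value" layout shown on the slide.
RECORD = """\
ID :: 121
CATEGORY :: NOUN
CONCEPT :: a social unit living together
EXAMPLE :: "he moved his family to Virginia"
SYNSET :: family, household, house, home, menage
"""


def parse_dsf(text):
    """Parse one DSF-style record into a dict with typed fields."""
    entry = {}
    for line in text.splitlines():
        if "::" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("::", 1)
        entry[key.strip().lower()] = value.strip()
    entry["id"] = int(entry["id"])
    entry["example"] = entry["example"].strip('"')
    entry["synset"] = [word.strip() for word in entry["synset"].split(",")]
    return entry


entry = parse_dsf(RECORD)
print(entry["id"], entry["category"])  # 121 NOUN
print(entry["synset"])  # ['family', 'household', 'house', 'home', 'menage']
```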

Page 24:

Creation of Synsets

Three principles:
• Minimality
• Coverage
• Replaceability

Page 25:

Synset creation (continued)

Home
  – John’s home was decorated with lights on the occasion of Christmas.
  – Having worked for many years abroad, John returned home.

House
  – John’s house was decorated with lights on the occasion of Christmas.
  – Mercury is situated in the eighth house of John’s horoscope.

Page 26:

Synsets (continued)

{house} is ambiguous.
{house, home} has the sense of a social unit living together. Is this the minimal unit?
{family, house, home} will make the unit completely unambiguous.

For coverage:
{family, household, house, home}, ordered according to frequency.

Replaceability of the most frequent words is a requirement.
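Minimality can be checked mechanically: keep adding words until only one sense contains all of them. The sketch below uses a toy sense inventory that is an assumption made for illustration (three senses, with "home" also allowed in the dwelling sense, as the Christmas-lights sentences suggest); it is not the actual wordnet inventory.

```python
# Toy sense inventory (hypothetical): sense label -> words that can express it.
SYNSETS = {
    "dwelling": {"house", "home"},
    "social unit": {"family", "household", "house", "home"},
    "zodiac division": {"house", "mansion"},
}


def senses_containing(words):
    """Senses whose word set includes every one of the given words."""
    wanted = set(words)
    return [name for name, members in SYNSETS.items() if wanted <= members]


print(senses_containing(["house"]))
# ['dwelling', 'social unit', 'zodiac division']
print(senses_containing(["house", "home"]))
# ['dwelling', 'social unit']
print(senses_containing(["family", "house", "home"]))
# ['social unit']
```

In this toy inventory {house} matches three senses, {house, home} still matches two, and adding "family" leaves exactly one, mirroring the progression on the slide.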

Page 27:

Synset creation

From first principles
  – Pick all the senses from good standard dictionaries.
  – Obtain synonyms for each sense.
  – Needs long hours of hard work.

Page 28:

Synset creation (continued)

From the wordnet of another language in the same family
  – Pick the synset and obtain the sense from the gloss.
  – Get the words of the target language.
  – Often the same words can be used, especially for tatsama words.
  – Translation, insertion and deletion.

Hindi Synset: अनुभवी, जानकार, मँजा हुआ (experienced person)

Marathi Synset: अनुभवी, तज्ञ, जाणता, ज्ञाता

Page 29:

Gloss and Example

Crucially needed for concept explication, wordnet building using another wordnet, and wordnet linking.

{earthquake, quake, temblor, seism} -- (shaking and vibration at the surface of the earth resulting from underground movement along a fault plane or from volcanic activity)

Page 30:

Semantic Relations

• Hypernymy and Hyponymy
  – Relation between word senses (synsets)
  – X is a hyponym of Y if X is a kind of Y
  – Hyponymy is transitive and asymmetrical
  – Hypernymy is the inverse of hyponymy

(lion -> animal -> animate entity -> entity)
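Transitivity and asymmetry can both be seen by walking the slide's chain. A minimal sketch, assuming each synset has a single direct hypernym (real wordnets allow several) and using hypothetical function names:

```python
# Direct hypernym links from the slide's example chain.
HYPERNYM = {
    "lion": "animal",
    "animal": "animate entity",
    "animate entity": "entity",
}


def hypernym_chain(word):
    """The word followed by all of its hypernyms, nearest first."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def is_hyponym_of(x, y):
    """True if x is a kind of y, directly or through intermediate links."""
    return y in hypernym_chain(x)[1:]


print(hypernym_chain("lion"))
# ['lion', 'animal', 'animate entity', 'entity']
print(is_hyponym_of("lion", "entity"))  # True  (transitive)
print(is_hyponym_of("entity", "lion"))  # False (asymmetrical)
```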

Page 31:

Semantic Relations (continued)

• Meronymy and Holonymy
  – Part-whole relation: a branch is a part of a tree
  – X is a meronym of Y if X is a part of Y
  – Holonymy is the inverse relation of meronymy

{kitchen} ………………………. {house}
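Because holonymy is just meronymy read in the other direction, one set of part-whole pairs is enough to answer both kinds of question. The sketch below stores the two pairs mentioned on the slide; the function names are hypothetical.

```python
# (part, whole) pairs from the slide: X is a meronym of Y.
PART_OF = [("kitchen", "house"), ("branch", "tree")]


def holonyms_of(part):
    """Wholes that the given part belongs to."""
    return [whole for p, whole in PART_OF if p == part]


def meronyms_of(whole):
    """Parts of the given whole (the inverse lookup)."""
    return [part for part, w in PART_OF if w == whole]


print(holonyms_of("kitchen"))  # ['house']
print(meronyms_of("tree"))     # ['branch']
```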

Page 32:

Lexical Relation

• Antonymy
  – Oppositeness in meaning
  – Relation between word forms
  – Often determined by phonetics, word length etc.

({rise, ascend} vs. {fall, descend})

Page 33:

Troponym and Entailment

• Entailment {snoring – sleeping}

• Troponym {limp, strut – walk} {whisper – talk}

Page 34:

Entailment

Snoring entails sleeping.
Buying entails paying.

• Proper temporal inclusion. Inclusion can be in any way.
  – Sleeping temporally includes snoring.
  – Buying temporally includes paying.

• Co-extensiveness (troponymy)
  – Limping is a manner of walking.

Page 35:

Opposition among verbs

• Examples
  – {rise, ascend} vs. {fall, descend}
  – Tie - untie (do - undo)
  – Walk - run (slow, fast)
  – Teach - learn (same activity, different perspective)
  – Rise - fall (motion upward or downward)

• Opposition and entailment (backward presupposition)
  – Hit or miss (both entail aim)
  – Succeed or fail (both entail try)

Page 36:

The causal relationship

Show - see.
Give - have.

Causation and entailment: giving entails having; feeding entails eating.

Page 37:

Page 38:

Kinds of Antonymy

Size          Small - Big
Quality       Good - Bad
State         Warm - Cool
Personality   Dr. Jekyll - Mr. Hyde
Direction     East - West
Action        Buy - Sell
Amount        Little - A lot
Place         Far - Near
Time          Day - Night
Gender        Boy - Girl

Page 39:

Kinds of Meronymy

Component-object     Head - Body
Stuff-object         Wood - Table
Member-collection    Tree - Forest
Feature-activity     Speech - Conference
Place-area           Palo Alto - California
Phase-state          Youth - Life
Resource-process     Pen - Writing
Actor-act            Physician - Treatment

Page 40:

Gradation

State          Childhood, Youth, Old age
Temperature    Hot, Warm, Cold
Action         Sleep, Doze, Wake

Page 41:

WordNet Sub-Graph (English)

[Figure: sub-graph centred on the synset {house, home}]
• Gloss: a place that serves as the living quarters of one or more families
• Hypernymy: {dwelling, abode}
• Hyponymy: hermitage, cottage
• Meronymy: kitchen, bedroom, study, guestroom, veranda, backyard

Page 42:

WordNet Sub-Graph: Hindi

[Figure: sub-graph centred on the synset गाय, गऊ (gaaya, gauu): cow]
• Gloss: सींगवाला एक शाकाहारी मादा चौपाया (siingwaalaa eka shaakaahaarii maadaa chaupaayaa): a horned, herbivorous, four-legged female animal
• Hypernym: चौपाया, पशु (chaupaayaa, pashu): four-legged animal
• Hyponym: कामधेनु (kaamadhenu): a kind of cow; मैनी गाय (mainii gaaya): a kind of cow
• Meronym: थन (thana): udder; पूँछ (puunchh): tail
• Attribute: शाकाहारी (shaakaahaarii): herbivorous
• Ability verb: पगुराना (paguraanaa): ruminate
• Antonym: बैल (baila): ox

Page 43:

Wordnet Subgraph (Marathi)

[Figure: sub-graph centred on the synset झाड, वृक्ष, तरू (tree)]
• Gloss: मुळे, खोड, फांद्या, पाने इत्यादींनी युक्त असा वनस्पतिविशेष: "झाडे पर्यावरण शुद्ध करण्याचे काम करतात" (a kind of plant that has roots, a trunk, branches, leaves, etc.: "Trees do the work of purifying the environment")
• Hypernymy: वनस्पती (plant)
• Hyponymy: आंबा (mango), लिंबू (lemon)
• Meronymy: मूळ (root), खोड (trunk)
• Holonymy: रान (forest), बाग (garden)

Page 44:

Pan-India Dictionary Standard

Each row is a sense; each language column holds that language's words for the sense.

Senses | Hindi | Marathi | Bengali | Oriya | Tamil
(W1, W2, W3, W4, W5, W6) | (W1, W2, W3, W4, W5, W6) | (W1, W2, W3) | (W1, W2, W3) | (W1, W2, W3, W4) | (W1, W2, W3)
(sun) | (सूर्य, सूरज, भानु, भास्कर, प्रभाकर, दिनकर, अंशुमान, अंशुमाली) | (सूर्य, भानु, दिवाकर, भास्कर, रवि, दिनेश, दिनमणी) | ... | ... | ...
(cub, lad, laddie, sonny, sonny boy) | (लड़का, बालक, बच्चा, छोकड़ा, छोरा, छोकरा, लौंडा) | (मुलगा, पोरगा, पोर, …) | … | … | …
(son, boy) | (पुत्र, बेटा, लड़का, लाल, सुत, बच्चा, नंदन, पूत, चिरंजीव, चिरंजी) | (मुलगा, पुत्र, लेक, चिरंजीव, तनय) | … | … | …

Page 45:

Sanskrit Wordnet: a new effort. A column in the concept-based multilingual dictionary

Concepts | L1 (English) | L2 (Hindi) | L3 (Sanskrit)
Concept ID: Concept description | (W1, W2, W3, ..) | (W4, W5, W6, ..) | (W7, W8, W9, ..)
4066: any of various long-tailed primates (excluding the prosimians) | (monkey) | (बंदर, बन्दर, बानर, वानर, कीश, कपि, मर्कट, ..) | (वानरः, कपिः, प्लवङ्गः, प्लवगः, शाखामृगः, वलीमुखः, मर्कटः, ..)
2186: a typical star that is the source of light and heat for the planets in the solar system | (sun) | (सूर्य, सूरज, भानु, दिवाकर, भास्कर, प्रभाकर, दिनकर, रवि, ..) | (सूर्यः, सविता, आदित्यः, मित्रः, अरुणः, भानुः, पूषा, अर्कः, ..)
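A concept-based dictionary of this shape lets the concept ID act as a pivot between languages: find the concepts a word belongs to, then read off the target-language column. The sketch below is an illustration with hypothetical names; it keeps the two concept IDs from the slide but only a couple of words per language.

```python
# Concept ID -> description and per-language word lists (abridged).
CONCEPTS = {
    4066: {
        "description": "any of various long-tailed primates",
        "english": ["monkey"],
        "hindi": ["बंदर", "वानर"],
        "sanskrit": ["वानरः", "कपिः"],
    },
    2186: {
        "description": "a typical star that is the source of light and heat",
        "english": ["sun"],
        "hindi": ["सूर्य", "सूरज"],
        "sanskrit": ["सूर्यः", "आदित्यः"],
    },
}


def translate(word, source, target):
    """(concept ID, target-language words) for every concept containing word."""
    return [
        (concept_id, entry[target])
        for concept_id, entry in CONCEPTS.items()
        if word in entry[source]
    ]


print(translate("sun", "english", "sanskrit"))
# [(2186, ['सूर्यः', 'आदित्यः'])]
```

Adding a new language then only requires filling one more column per concept, without touching the pairwise links between existing languages.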

Page 46:

Summary

• Synsets: basic units
• Principles of creation: minimality, coverage, replaceability
• Semantic relations (main ones): hypernymy (is-a), meronymy (part-of), antonymy, troponymy (manner-of)