23- november-091 wordnet and extended wordnet sriram rajaraman


  • WordNet and Extended WordNet

    Sriram Rajaraman

    WordNet and eXtended WordNet

    Objective
    Introduce the idea of a semantic lexicon ontology, especially WordNet and eXtended WordNet

    Focus
    - Introduction
    - WordNet
    - eXtended WordNet
    - Summary

    References
    [1] WordNet: http://wordnet.princeton.edu/
    [2] eXtended WordNet: http://xwn.hlt.utdallas.edu/
    [3] Christiane Fellbaum (ed.), WordNet: An Electronic Lexical Database, MIT Press, 1999, c1998.
    [4] George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine Miller, Introduction to WordNet: An On-line Lexical Database, core working paper.
    [5] Rada Mihalcea and Dan I. Moldovan, eXtended WordNet: Progress Report, Proceedings of the NAACL Workshop on WordNet and Other Lexical Resources, 2001.
    [6] Sanda M. Harabagiu, George A. Miller, and Dan I. Moldovan, WordNet 2 - A Morphologically and Semantically Enhanced Resource, SIGLEX 1999.


    Introduction
    Traditional dictionary: what is available
    - spelling
    - pronunciation
    - inflected and derivative forms
    - etymology
    - part of speech
    - definitions
    - illustrative uses of alternative senses
    - synonyms and antonyms
    - special usage notes

    Tree
    Ref: http://www.merriam-webster.com/dictionary/Tree
    Main Entry: tree
    Pronunciation: \ˈtrē\
    Function: noun
    Etymology: Middle English, from Old English trēow; akin to Old Norse trē tree, Greek drys, Sanskrit dāru wood
    Date: before 12th century
    - a woody perennial plant having a single usually elongate main stem generally with few or no branches on its lower part

    Drawbacks of a traditional dictionary
    What is missing:
    - It does not say, for example, that trees have roots, that they consist of cells with cellulose walls, or even that they are living organisms
    - Which sense of the superordinate term (hypernym) is meant: a living plant or an industrial plant
    - Coordinate terms (bushes, shrubs, ...)
    - Hyponyms, i.e. types of trees (pine, tropical, deciduous, ...)
    - Information assumed to be known to everyone (trees have bark and leaves, they grow from seeds, they make their own food by photosynthesis), which probably belongs in an encyclopedia

    How can we improve?
    - The missing information is structural: every word points upward to its superordinate (hypernym), but not sideways to its coordinates or downward to its hyponyms.
    - These restrictions stem from alphabetical ordering and from budget and size constraints, all of which an electronic lexical database can overcome.
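
    The structural point can be illustrated with a minimal sketch. It assumes a toy table in which each word stores only its upward (hypernym) pointer, as in a printed dictionary; the entries and the helper names are illustrative and are not taken from WordNet. Inverting the table once recovers the downward (hyponym) and sideways (coordinate) links that an electronic database can store directly.

```python
from collections import defaultdict

# Toy "dictionary-style" data: each word points only upward to its hypernym.
# The entries are illustrative assumptions, not WordNet data.
HYPERNYM = {
    "pine": "tree",
    "tree": "plant",
    "bush": "plant",
    "shrub": "plant",
}

def build_hyponym_index(hypernym):
    """Invert the upward pointers so each word also knows its hyponyms."""
    index = defaultdict(list)
    for word, parent in hypernym.items():
        index[parent].append(word)
    return {parent: sorted(children) for parent, children in index.items()}

def coordinates(word, hypernym, hyponyms):
    """Coordinate terms are the other words that share the same hypernym."""
    parent = hypernym.get(word)
    if parent is None:
        return []
    return [w for w in hyponyms[parent] if w != word]

HYPONYMS = build_hyponym_index(HYPERNYM)
print(HYPONYMS["tree"])                         # downward links
print(coordinates("tree", HYPERNYM, HYPONYMS))  # sideways links
```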


    What is WordNet?
    WordNet is a lexical database for the English language.

    WordNet 2.1 has [1]:
    - 117,097 nouns (the average noun has 1.23 senses)
    - 11,488 verbs (the average verb has 2.16 senses)
    - 22,141 adjectives
    - 4,601 adverbs

    Created and maintained at the Cognitive Science Laboratory of Princeton University

    Accessible online at http://wordnetweb.princeton.edu/perl/webwn (also downloadable)

    Interfaces are available in C, .NET, Java, Perl, PHP, Python, SQL, etc. (JWNL, WordNet.Net, RiTa WordNet, PyWordNet, ...)

    WordNet Structure
    Words are organized as synsets in WordNet.
    There are four disjoint kinds of synsets, containing either:
    - nouns
    - verbs
    - adjectives
    - adverbs

    What is a synset?
    - The basic unit of WordNet
    - A group of synonymous words that refer to a common semantic concept
    - A word may belong to more than one synset; its first sense is the most frequent sense
    - Words also include collocations (eye contact, mix up)

    Synset example
    - car, as in {car, auto, automobile, machine, motorcar} and {car, railcar, railway car, railroad car}
    - chocolate, as in: (figure not preserved in this transcript)
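
    As a minimal sketch of the idea, the two "car" synsets can be stored as a table and indexed by word. The integer IDs and function names below are hypothetical; real WordNet identifies synsets by its own offsets.

```python
# Toy synset table built from the two "car" synsets on the slide.
# The integer IDs are hypothetical placeholders.
SYNSETS = {
    1: ["car", "auto", "automobile", "machine", "motorcar"],
    2: ["car", "railcar", "railway car", "railroad car"],
}

def build_word_index(synsets):
    """Map each word (or collocation) to the IDs of the synsets containing it."""
    index = {}
    for synset_id, words in synsets.items():
        for word in words:
            index.setdefault(word, []).append(synset_id)
    return index

def synonyms(word, synsets, index):
    """Return, per sense, the other members of each synset containing `word`."""
    return [[w for w in synsets[sid] if w != word] for sid in index.get(word, [])]

WORD_INDEX = build_word_index(SYNSETS)
print(WORD_INDEX["car"])          # "car" belongs to both synsets
print(WORD_INDEX["railway car"])  # a collocation is indexed like any word
print(synonyms("auto", SYNSETS, WORD_INDEX))
```

    A word with several senses simply appears in several synsets, which is why the index maps a word to a list of IDs rather than to a single one.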

    How are synsets related?
    Each synset carries a list of pointers that express its relationships to other synsets.
    WordNet defines 17 relations:
    - 10 between synsets
    - 5 between word senses
    - "gloss" (between a synset and a sentence, i.e. a textual definition for each synset)
    - "frame" (between a synset and a verb construction pattern)

    WordNet relations (figure not preserved in this transcript)

    Applications of WordNet
    - Information extraction
    - Information retrieval
    - Question answering
    - Word sense disambiguation
    - Text inference
    - Coreference, coherence and metonymy
    - Knowledge acquisition
    - Internet search engines

    Limitations of WordNet
    - Designed as a semantic lexicon, not a knowledge base
    - Limited connections between topically related words
    - Lack of morphological relations (a separate algorithm handles morphology)
    - Lack of selectional restrictions
    - And more [6]


    eXtended WordNet [2]
    A project at the Human Language Technology Research Institute at The University of Texas at Dallas (http://xwn.hlt.utdallas.edu)

    Provides several important enhancements (over WordNet 2.0) intended to remedy the present limitations of WordNet

    Current Version: eXtended WordNet 2.0 (xwn 2.0-1.1)

    Objectives of eXtended WordNet
    - Exploit the rich information available in synset glosses (a gloss is a sentence, i.e. a textual definition for each synset)
    - Add semantic and logical enhancements to WordNet
    - Increase the connectivity among the synsets by at least one order of magnitude
    - Enable access to a broader context for each concept

    What does eXtended WordNet do? [5]
    - Preprocessing and parsing: glosses are separated into definitions and examples, tokenized, and scanned for compound words
    - Word sense disambiguation: all words in a gloss are tagged with the appropriate senses and linked to the corresponding synsets
    - Logical form transformation: glosses are transformed into logical forms
    - Topical relations: connections are established between words based on context/topic
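
    The preprocessing step can be sketched as follows. This is a simplified illustration, not the project's actual code: it assumes the usual WordNet gloss layout (definition first, then quoted examples separated by semicolons), and the sample gloss, the compound list and the function names are made up for the example.

```python
import re

def split_gloss(gloss):
    """Split a gloss into its definition and its quoted example sentences.

    Assumes the definition comes first, followed by zero or more
    examples in double quotes, separated by semicolons.
    """
    examples = re.findall(r'"([^"]*)"', gloss)
    definition = gloss.split('"', 1)[0].strip().rstrip(";").strip()
    return definition, examples

def tokenize(text, compounds):
    """Simple tokenizer that merges known two-word compounds into one token."""
    words = re.findall(r"[A-Za-z]+", text.lower())
    tokens, i = [], 0
    while i < len(words):
        pair = " ".join(words[i:i + 2])
        if pair in compounds:
            tokens.append(pair)
            i += 2
        else:
            tokens.append(words[i])
            i += 1
    return tokens

# Hypothetical gloss written for this example, not copied from WordNet.
gloss = 'a vehicle that takes people to and from hospitals; "the ambulance arrived quickly"'
definition, examples = split_gloss(gloss)
print(definition)
print(examples)
print(tokenize("make eye contact with the driver", {"eye contact"}))
```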

    eXtended WordNet (figure not preserved in this transcript)

    eXtended WordNet format
    Consists of four XML files, one for each part of speech:
    - noun
    - verb
    - adjective
    - adverb
    The XML tags contain attributes that specify the relationships.
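
    Reading such a file might look like the sketch below. The element and attribute names (xwn, gloss, synsetID, wf, lemma, sense) are assumptions chosen for illustration and should not be taken as the documented eXtended WordNet schema; only the general idea, attributes on tags carrying the sense links, comes from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the element and attribute names are placeholders
# for illustration, not the documented eXtended WordNet schema.
SAMPLE = """
<xwn pos="NOUN">
  <gloss synsetID="n1">
    <text>a vehicle that takes people to and from hospitals</text>
    <wf lemma="vehicle" pos="NN" sense="1">vehicle</wf>
    <wf lemma="take" pos="VBZ" sense="1">takes</wf>
    <wf lemma="hospital" pos="NNS" sense="1">hospitals</wf>
  </gloss>
</xwn>
"""

def tagged_words(xml_text):
    """Return (synset ID, surface form, lemma, sense number) for each tagged word."""
    root = ET.fromstring(xml_text)
    rows = []
    for gloss in root.iter("gloss"):
        for wf in gloss.iter("wf"):
            rows.append((gloss.get("synsetID"), wf.text, wf.get("lemma"), int(wf.get("sense"))))
    return rows

for row in tagged_words(SAMPLE):
    print(row)
```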

    eXtended WordNet: Applications
    Core knowledge base for applications such as:
    - Question answering
    - Information retrieval
    - Information extraction
    - Summarization
    - Natural language generation
    - Inferences
    - Other knowledge-intensive applications


    Further Reading
    - W3C RDF/OWL Representation of WordNet: http://www.w3.org/TR/wordnet-rdf/
    - eXtended WordNet format/algorithm: http://xwn.hlt.utdallas.edu/wsd.html
    - Current research at Princeton: http://wordnet.cs.princeton.edu/projects.html
    - Related projects (APIs, web interfaces, extensions): http://wordnet.princeton.edu/wordnet/related-projects/

    Backup slides

    WordNet Statistics
    Ref: http://wordnet.princeton.edu/wordnet/man2.1/wnstats.7WN.html

    POS          Unique Strings   Synsets   Word-Sense Pairs
    Noun                 117097     81426             145104
    Verb                  11488     13650              24890
    Adjective             22141     18877              31302
    Adverb                 4601      3644               5720
    Totals               155327    117597             207016
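
    The totals row and the average-senses figures quoted earlier (1.23 for nouns, 2.16 for verbs) can be checked from these counts. The sketch below assumes the average is word-sense pairs divided by unique strings, truncated rather than rounded to two decimals, since that is what reproduces the quoted values.

```python
import math

# (unique strings, synsets, word-sense pairs) per part of speech,
# copied from the statistics slide.
STATS = {
    "Noun": (117097, 81426, 145104),
    "Verb": (11488, 13650, 24890),
    "Adjective": (22141, 18877, 31302),
    "Adverb": (4601, 3644, 5720),
}

def totals(stats):
    """Column-wise sums over all parts of speech."""
    return tuple(sum(col) for col in zip(*stats.values()))

def average_senses(strings, pairs):
    """Word-sense pairs per unique string, truncated to two decimals (an assumption)."""
    return math.floor(100 * pairs / strings) / 100

print(totals(STATS))
for pos, (strings, _synsets, pairs) in STATS.items():
    print(pos, average_senses(strings, pairs))
```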


    WordNet relations-0
    Relations between synsets:
    - Synonymy
    - Hypernymy (superordination)
    - Hyponymy (subordination)
    - Holonymy (whole to part relation)
    - Meronymy (part to whole relation)
    - Antonymy
    - Troponymy (particular way to do something)

    WordNet relations-1
    Antonymy relation:
    - (sweet) Definition: having a pleasant taste (as of sugar)
    - Has the antonym: (sour) Definition: having a sharp biting taste
    Troponymy relation:
    - (dream) Definition: experience while sleeping
    - Has the troponym: (fantasize) Definition: have fantasies

    WordNet relations-2
    - Synonymy relation: (motor vehicle, automotive vehicle) Definition: a self-propelled wheeled vehicle that does not run on rails
    - Hypernymy relation: (vehicle) Definition: a conveyance that transports people or objects
    - Hyponymy relation: (ambulance) Definition: a vehicle that takes people to and from hospitals
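
    Because hypernymy is transitive, following the pointers upward answers "is a" questions. The sketch below uses only the three synsets named on this slide as a toy chain; the real WordNet hierarchy has intermediate synsets between them, and the function names are illustrative.

```python
# Toy hypernym pointers using the three synsets named on the slide.
# The real hierarchy contains intermediate synsets; this chain is simplified.
HYPERNYM = {
    "ambulance": "motor vehicle",
    "motor vehicle": "vehicle",
}

def hypernym_chain(synset, hypernym):
    """Follow hypernym pointers upward and return every ancestor in order."""
    chain = []
    seen = {synset}
    while synset in hypernym:
        synset = hypernym[synset]
        if synset in seen:  # guard against accidental cycles in the data
            break
        seen.add(synset)
        chain.append(synset)
    return chain

def is_a(child, ancestor, hypernym):
    """True if `ancestor` is reachable from `child` via hypernym pointers."""
    return ancestor in hypernym_chain(child, hypernym)

print(hypernym_chain("ambulance", HYPERNYM))
print(is_a("ambulance", "vehicle", HYPERNYM))
print(is_a("vehicle", "ambulance", HYPERNYM))
```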

    WordNet relations-3
    Holonymy relation:
    - (bicycle wheel) Definition: the wheel of a bicycle
    - Has the holonym: (bicycle, bike, wheel) Definition: has two wheels; moved by foot pedals
    Meronymy relation:
    - (bicycle wheel) Definition: the wheel of a bicycle
    - Has the meronym: (spoke, radius) Definition: a radial member of a wheel joining the hub to the rim


    Example: limb

    WordNet Task Force
    - Aims to support the deployment of WordNet in RDF/OWL
    - Proposes including RDF or OWL versions of wordnets and lexical ontologies in the official distributions
    - Integrates existing data models to provide a unified OWL vocabulary for RDF versions of wordnets
    - Distills the most agreed-upon practices for developing ontologies out of wordnets and includes them in a set of recommendations

    Conversion (contd.)
    g(Synset_ID,Gloss).

    The g operator specifies the gloss for a synset.

    Gloss is a string.

    Maps to: wn:gloss(Synset_ID, Gloss)

    hyp(Synset_ID_A,Synset_ID_B).

    The hyp operator specifies that the second synset is a hypernym of the first synset.

    This relation holds for nouns and verbs. The reflexive operator, hyponym, implies that the first synset is a hyponym of the second synset.
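
    A rough sketch of reading facts in the g/2 and hyp/2 shape is given below. The synset IDs and gloss strings are invented for illustration, the parser ignores quote escaping inside glosses, and the relation labels "hyponym_of" and "hypernym_of" are placeholder names rather than the vocabulary defined by the conversion.

```python
import re

# Hypothetical facts in the g/2 and hyp/2 shape described on the slide.
# The synset IDs and gloss text are made up for illustration.
FACTS = """
g(100000001,'a conveyance that transports people or objects').
g(100000002,'a self-propelled wheeled vehicle that does not run on rails').
hyp(100000002,100000001).
"""

G_RE = re.compile(r"^g\((\d+),'(.*)'\)\.$")
HYP_RE = re.compile(r"^hyp\((\d+),(\d+)\)\.$")

def parse_facts(text):
    """Collect glosses by synset ID and hypernym pairs (first, second)."""
    glosses, hypernyms = {}, []
    for line in text.splitlines():
        line = line.strip()
        g_match = G_RE.match(line)
        hyp_match = HYP_RE.match(line)
        if g_match:
            glosses[int(g_match.group(1))] = g_match.group(2)
        elif hyp_match:
            hypernyms.append((int(hyp_match.group(1)), int(hyp_match.group(2))))
    return glosses, hypernyms

def to_triples(hypernyms):
    """hyp(A, B): B is a hypernym of A, so A is a hyponym of B."""
    triples = []
    for first, second in hypernyms:
        triples.append((first, "hyponym_of", second))   # placeholder label
        triples.append((second, "hypernym_of", first))  # placeholder label
    return triples

GLOSSES, HYPERNYMS = parse_facts(FACTS)
print(GLOSSES[100000001])
print(to_triples(HYPERNYMS))
```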
