Framester: A Wide Coverage Linguistic Linked Data Hub

Aldo Gangemi 1, Mehwish Alam 1, Luigi Asprino 2,3, Valentina Presutti 3, Diego Reforgiato Recupero 4

1. Université Paris 13, Paris, France
2. University of Bologna, Bologna, Italy
3. National Research Council (CNR), Rome, Italy
4. University of Cagliari, Cagliari, Italy

23rd November 2016, International Conference on Knowledge Engineering and Knowledge Management, 2016


Linguistic Linked Data (LLD) Framester Semantics Resource Generation Experimentation Conclusion

Existing State of LLD

Figure: Current state of Linguistic Linked Data and connections to other resources. Blue, red, green, and yellow represent role-oriented lexical resources, fact-oriented data, wordnet-like lexical resources, and ontology schemas respectively.


Framester - Linguistic Linked Data Hub

Figure: Framester Cloud. Red represents the main hub, i.e., Framester; purple represents the links to data sets for Sentiment Analysis. Black and orange arrows represent the existing and Framester-specific links respectively.


FrameNet


Frames as Eventuality Schema

Hagrid rolled up a note

roll(Hagrid, note)

Hagrid rolled up a note for Harry in Hogwarts

roll(Hagrid, note, Harry, Hogwarts)

Note: the arguments of the predicate are ambiguous, e.g., it is not explicit that Harry is a person and Hogwarts is a location.

Adding eventualities (Neo-Davidsonian events)

roll(e, Hagrid, note, Harry, Hogwarts)

Adding semantic roles

roll(e, Hagrid, note, Harry, Hogwarts) ∧ agent(e, Hagrid) ∧ theme(e, note) ∧ recipient(e, Harry) ∧ location(e, Hogwarts)

Adding semantic types

roll(e, Hagrid, note, Harry, Hogwarts) ∧ agent(e, Hagrid) ∧ theme(e, note) ∧ recipient(e, Harry) ∧ location(e, Hogwarts) ∧ Person(Hagrid) ...
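The progression from a flat predicate to an event with roles and types can be sketched in Python. This is only an illustration, not code from Framester: the `Atom` type and the `neo_davidsonian` helper are hypothetical names, and only the role and type labels come from the example sentence.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    """One logical atom, e.g. agent(e, Hagrid)."""
    predicate: str
    args: tuple

    def __str__(self) -> str:
        return f"{self.predicate}({', '.join(self.args)})"


def neo_davidsonian(verb, roles, types=None, event="e"):
    """Return the conjuncts for an event variable, its roles, and optional types."""
    # The event variable becomes the first argument of the verb predicate.
    atoms = [Atom(verb, (event, *roles.values()))]
    # One binary atom per semantic role: role(e, filler).
    atoms += [Atom(role, (event, filler)) for role, filler in roles.items()]
    # One unary atom per semantic type: Type(entity).
    atoms += [Atom(type_name, (entity,)) for entity, type_name in (types or {}).items()]
    return atoms


roles = {"agent": "Hagrid", "theme": "note", "recipient": "Harry", "location": "Hogwarts"}
types = {"Hagrid": "Person"}
formula = " ∧ ".join(str(atom) for atom in neo_davidsonian("roll", roles, types))
print(formula)
```

Keeping each conjunct as a separate atom makes it straightforward to add or drop roles and types without touching the verb predicate, which is the point of introducing the event variable.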


N-ary Relations in RDF


Original RDF triple: e1 p e2 .

Triple with reified eventuality:

e rdf:subject e1 .
e rdf:predicate p .
e rdf:object e2 .

N-ary relation as an event:

event type Reshaping .
event agent Hagrid .
event patient note .
event recipient Harry .
event location Hogwarts .

Figure: graph of the event node, linked by type to Reshaping, by agent to Hagrid, by patient to note, by recipient to Harry, and by location to Hogwarts.
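The two encodings on this slide can be sketched with plain Python tuples standing in for triples. The helper names `reify`, `nary_event`, and `to_lines` are hypothetical and no RDF library is assumed; the property and node names are the ones shown on the slide.

```python
def reify(subject, predicate, obj, node="e"):
    """Turn one triple into three triples about a new node (RDF reification)."""
    return [
        (node, "rdf:subject", subject),
        (node, "rdf:predicate", predicate),
        (node, "rdf:object", obj),
    ]


def nary_event(node, frame, roles):
    """Describe an n-ary relation as a typed event node with one triple per role."""
    return [(node, "type", frame)] + [(node, role, filler) for role, filler in roles.items()]


def to_lines(triples):
    """Render triples in the 's p o .' layout used on the slide."""
    return [f"{s} {p} {o} ." for s, p, o in triples]


print("\n".join(to_lines(reify("e1", "p", "e2"))))

event = nary_event(
    "event",
    "Reshaping",
    {"agent": "Hagrid", "patient": "note", "recipient": "Harry", "location": "Hogwarts"},
)
print("\n".join(to_lines(event)))
```

The reified form always needs exactly three triples per statement, while the event form grows by one triple per participant, which is why it suits relations with more than two arguments.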


Framester Schema

Figure: Framester schema diagram. Node labels: Situation, FrameOccurrence, Framester Role, Description, FrameClass, FrameProjection, R1; frame instances f1 and f2; role instances r1.f1, r2.f1, r3.f1, and r1.f2. Edge labels: hasFrameProj., subClassOf, occurrenceOf, inheritsFrom, subsumedUnder, subsumedUnder+. Legend: frame instance, role instance, optional, necessary, external.


Framester Frames Representation


Improving FrameNet Coverage


WordNet/BabelNet-FrameNet Extensions

FrameNet Original Mappings (Profile-F)

Framester Base (Profile-B)
eXtended WordFrameNet [de Lacalle et al., 2014]
FrameBase [Rouces et al., 2015]

DirectX (Profile-D)

TransX (Profile-T)

Figure: a WN/BN synset linked to FrameNet through eXtended WordFrameNet and FrameBase, with neighbouring synsets attached by the relations pertainym, instance, verb-group, participle, hyponym, and adj-sim; the TransX build adds one more synset reached through a further hyponym link.
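One way to read the DirectX and TransX extensions is as graph expansion over synset relations. The sketch below is an illustration under stated assumptions, not the Framester implementation: the synset identifiers and the toy edge list are hypothetical, and so is the choice to propagate a frame one step for DirectX and along chains of hyponym links for TransX.

```python
# Toy base mapping (hypothetical identifiers): synset -> frames.
BASE = {"fireplace-n-1": {"Architectural_part"}}

# Toy relation edges (hypothetical): (source synset, relation, target synset).
EDGES = [
    ("fireplace-n-1", "hyponym", "hearth-n-1"),
    ("hearth-n-1", "hyponym", "inglenook-n-1"),
]

# Relation labels taken from the figure on this slide.
DIRECT_RELATIONS = {"pertainym", "instance", "verb-group", "participle", "hyponym", "adj-sim"}


def direct_x(base, edges):
    """Copy frames from each mapped synset to its one-step neighbours."""
    extended = {synset: set(frames) for synset, frames in base.items()}
    for source, relation, target in edges:
        if relation in DIRECT_RELATIONS and source in base:
            extended.setdefault(target, set()).update(base[source])
    return extended


def trans_x(base, edges):
    """Extend the one-step result by following hyponym edges until nothing changes."""
    extended = direct_x(base, edges)
    changed = True
    while changed:
        changed = False
        for source, relation, target in edges:
            if relation == "hyponym" and source in extended:
                before = len(extended.get(target, set()))
                extended.setdefault(target, set()).update(extended[source])
                changed |= len(extended[target]) != before
    return extended


print(sorted(direct_x(BASE, EDGES)))
print(sorted(trans_x(BASE, EDGES)))
```

Broader expansion reaches more synsets, which is consistent with the higher recall and lower precision reported for Profile-D and Profile-T in the experiments.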


Links to Other Resources

Figure labels: WordNet Synset, FrameNet, Resource URI/OWL-Class; edge labels: skos:closeMatch, some-relation.

DOLCE-Zero [Nuzzolese et al., 2012]

[ a framenet:fnwnd0Detour ;
  framenet:forSynset wn30instances:synset-fireplace-noun-1 ;
  framenet:hasFoundational d0:Location ;
  framenet:onFrame frame:Architectural_part ] .


Word Frame Disambiguation


Framework for Frame Detection¹.

Tokenisation, POS tagging, lemmatization, word sense disambiguation, and frame detection by detour.

WSD algorithms:
Babelfy [Moro et al., 2014]
UKB [Agirre and Soroa, 2009]

The word senses returned by WSD are matched against the word-sense-to-frame alignment, based on 4 profiles.
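The order of the steps can be sketched end to end in Python. Everything below is a toy stand-in under stated assumptions: the lemma table, the sense inventory, the first-sense heuristic used in place of Babelfy or UKB, and the sense-to-frame alignment are all hypothetical, and POS tagging is omitted for brevity. Only the sequence of steps follows the slide.

```python
import re

LEMMAS = {"rolled": "roll", "notes": "note"}           # hypothetical lemma table
SENSES = {"roll": ["roll-v-1"], "note": ["note-n-1"]}  # hypothetical sense inventory
ALIGNMENT = {"roll-v-1": {"Reshaping"}}                # hypothetical sense -> frame map


def tokenise(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[A-Za-z]+", text.lower())


def lemmatise(token):
    """Look the token up in the toy lemma table, falling back to the token itself."""
    return LEMMAS.get(token, token)


def disambiguate(lemma):
    """Stand-in for a WSD system: pick the first listed sense, if any."""
    senses = SENSES.get(lemma)
    return senses[0] if senses else None


def detect_frames(text, alignment=ALIGNMENT):
    """Frame detection by detour: word -> word sense -> frames."""
    detected = {}
    for token in tokenise(text):
        sense = disambiguate(lemmatise(token))
        if sense in alignment:
            detected[token] = sorted(alignment[sense])
    return detected


print(detect_frames("Hagrid rolled up a note"))
```

Swapping the `alignment` argument is how a different profile would be plugged in under this sketch: the rest of the pipeline stays the same and only the sense-to-frame table changes.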

¹ http://lipn.univ-paris13.fr/framester/


Experiment#1: FrameNet Corpus

WFD-API was used for evaluation purposes.

FrameNet annotated corpus v1.5 (gold standard)

78 documents with 170,000 manually annotated sentences.

                    UKB                                        Babelfy
Framester Profiles  Recall  Precision  F1     New Annotations  Recall  Precision  F1     New Annotations
eXtended WFN        0.511   0.810      0.627  832              0.580   0.820      0.680  8129
FrameBase           0.719   0.714      0.716  1132             0.621   0.71       0.661  11035
Profile-F           0.688   0.777      0.702  1148             0.673   0.749      0.704  10962
Profile-B           0.671   0.799      0.729  1251             0.662   0.780      0.715  11661
Profile-D           0.750   0.641      0.690  1929             0.790   0.569      0.660  20382
Profile-T           0.860   0.520      0.648  2728             0.870   0.444      0.588  26108

Table: Results

         Recall  Precision  F1-Measure  New Annotations
Semafor  0.76    0.96       0.85        16520

Table: Results for the baseline (Semafor).
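The F1 column is assumed here to be the usual harmonic mean of precision and recall. The short check below recomputes three of the reported values from their precision and recall; the function name `f1_score` is just an illustrative choice.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the tables for this experiment.
rows = {
    "eXtended WFN (UKB)": (0.810, 0.511),
    "Profile-T (UKB)": (0.520, 0.860),
    "Semafor": (0.96, 0.76),
}

for name, (precision, recall) in rows.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two inputs, a profile that gains recall at a large cost in precision, as Profile-T does, can end up with a lower F1 than a more balanced profile.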


Experiment#2: Independent Corpus

An independent corpus of 100 heterogeneous texts taken from New York Times news, tweets, Wikipedia definitions, and scientific articles.

Annotation using the Framester profiles and Semafor.

Two experts judged the correctness of the detected frames and the missing detections.

Judgements: Valid, Metaphorical², or Invalid.

Inter-rater agreement measured with weighted Cohen’s kappa (WKAPPA): 0.532.

A third expert decided the cases in which the two raters disagreed.
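Weighted Cohen's kappa can be computed as in the sketch below. The slide does not say which weighting scheme was used, so linear disagreement weights and the ordinal order Valid, Metaphorical, Invalid are assumptions here, and the ratings are invented for illustration rather than taken from the study.

```python
from collections import Counter

CATEGORIES = ["Valid", "Metaphorical", "Invalid"]  # assumed ordinal order


def weighted_kappa(rater_a, rater_b, categories=CATEGORIES):
    """Linearly weighted Cohen's kappa for two raters over ordinal categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty item list")
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(rater_a)
    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marginal_a = Counter(index[a] for a in rater_a)
    marginal_b = Counter(index[b] for b in rater_b)

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * (marginal_a[i] / n) * (marginal_b[j] / n)

    if expected_disagreement == 0:
        return 1.0  # both raters used one identical category throughout
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical judgements for four detected frames.
first = ["Valid", "Valid", "Metaphorical", "Invalid"]
second = ["Valid", "Metaphorical", "Metaphorical", "Invalid"]
kappa = weighted_kappa(first, second)
print(f"weighted kappa = {kappa:.3f}")
```

With linear weights, a Valid versus Metaphorical disagreement counts half as much as a Valid versus Invalid one, so near misses lower the score less than outright contradictions.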

              Precision  Recall  F1
eXtended WFN  0.770      0.277   0.523
FrameBase     0.703      0.359   0.531
Profile-B     0.776      0.366   0.571
Profile-D     0.705      0.622   0.663
Profile-T     0.644      0.781   0.713
Profile-F     0.750      0.377   0.564
Semafor       0.794      0.334   0.564

Table: Results

² Frame Travel in "Our love traveled distances"


Conclusion

more than 40 million triples, including new LOD versions of many linguistic/factual resources, links among them, and links to Framester

formal schema interoperability across datasets

full revision of WordNet-FrameNet mappings

new semantic role taxonomy, from localised roles all the way up to abstract roles and dependencies

alignment of frames, roles and types to foundational ontologies

Word Frame Disambiguation service

Ongoing Work:

Frame vectors (frame2vec).

Frame clustering and complex frame discovery.

Semantic Relatedness between Frames.


References I

Agirre, E. and Soroa, A. (2009). Personalizing PageRank for Word Sense Disambiguation. In Lascarides, A., Gardent, C., and Nivre, J., editors, EACL 2009, 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference, Athens, Greece, March 30 - April 3, 2009, pages 33–41. The Association for Computer Linguistics.

Baccianella, S., Esuli, A., and Sebastiani, F. (2010). SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining. In Calzolari, N., Choukri, K., Maegaard, B., Mariani, J., Odijk, J., Piperidis, S., Rosner, M., and Tapias, D., editors, Proceedings of the International Conference on Language Resources and Evaluation, LREC 2010, 17-23 May 2010, Valletta, Malta. European Language Resources Association.


References II

Cuadros, M., Padro, L., and Rigau, G. (2012). Highlighting relevant concepts from topic signatures. In Calzolari, N., Choukri, K., Declerck, T., Dogan, M. U., Maegaard, B., Mariani, J., Odijk, J., and Piperidis, S., editors, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC-2012), Istanbul, Turkey, May 23-25, 2012, pages 3841–3848. European Language Resources Association (ELRA).

Das, D., Chen, D., Martins, A. F. T., Schneider, N., and Smith, N. A. (2014). Frame-semantic parsing. Computational Linguistics, 40(1):9–56.

de Lacalle, M. L., Laparra, E., and Rigau, G. (2014). Predicate Matrix: extending SemLink through WordNet mappings. In Calzolari, N., Choukri, K., Declerck, T., Loftsson, H., Maegaard, B., Mariani, J., Moreno, A., Odijk, J., and Piperidis, S., editors, Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC-2014), Reykjavik, Iceland, May 26-31, 2014, pages 903–909. European Language Resources Association (ELRA).


References III

Gangemi, A. and Mika, P. (2003). Understanding the semantic web through descriptions and situations. In [Meersman et al., 2003], pages 689–706.

Gangemi, A., Presutti, V., Recupero, D. R., Nuzzolese, A. G., Draicchio, F., and Mongiovi, M. (2016). Semantic Web Machine Reading with FRED. Semantic Web.

Meersman, R., Tari, Z., and Schmidt, D. C., editors (2003). On The Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE - OTM Confederated International Conferences, CoopIS, DOA, and ODBASE 2003, Catania, Sicily, Italy, November 3-7, 2003, volume 2888 of Lecture Notes in Computer Science. Springer.

Moro, A., Raganato, A., and Navigli, R. (2014). Entity Linking meets Word Sense Disambiguation: a Unified Approach. Transactions of the Association for Computational Linguistics (TACL), 2:231–244.


References IV

Nuzzolese, A. G., Gangemi, A., Presutti, V., Ciancarini, P., and Musetti, A. (2012). Automatic Typing of DBpedia Entities. In Proc. of the International Semantic Web Conference (ISWC), Boston, MA, US.

Rouces, J., de Melo, G., and Hose, K. (2015). Framebase: Representing n-ary relations using semantic frames. In Gandon, F., Sabou, M., Sack, H., d’Amato, C., Cudre-Mauroux, P., and Zimmermann, A., editors, The Semantic Web. Latest Advances and New Domains - 12th European Semantic Web Conference, ESWC 2015, Portoroz, Slovenia, May 31 - June 4, 2015. Proceedings, volume 9088 of Lecture Notes in Computer Science, pages 505–521. Springer.

Staiano, J. and Guerini, M. (2014). Depeche Mood: a Lexicon for Emotion Analysis from Crowd Annotated News. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014, June 22-27, 2014, Baltimore, MD, USA, Volume 2: Short Papers, pages 427–433. The Association for Computer Linguistics.
