Question Answering Based on Semantic Graphs

Post on 22-Feb-2016

TRANSCRIPT

Question Answering Based on Semantic Graphs

Lorand Dali – lorand.dali@ijs.si
Delia Rusu – delia.rusu@ijs.si
Blaž Fortuna – blaz.fortuna@ijs.si
Dunja Mladenić – dunja.mladenic@ijs.si
Marko Grobelnik – marko.grobelnik@ijs.si

Overview

• Motivation
• System Overview
• Question Answering
• Document Overview
  - Facts
  - Semantic Graph
  - Document Summary
• Conclusions

Motivation

Triplets
• Facts stated in the text
• The core of the sentence (subject, verb, object)

System Overview

• Extract facts (triplets) from text
• Index triplets to enable structured search on them
• Analyze questions to obtain the queries for the triplet search
• Retrieve the answer and the document containing it
• Browse the document overview
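The indexing and retrieval steps can be illustrated with a small in-memory index. This is a minimal sketch, not the system's actual index: the `Triplet` and `TripletIndex` names, the slot-wise inverted index, and the sample triplets are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    subject: str
    verb: str
    obj: str
    doc_id: str  # document the triplet was extracted from


class TripletIndex:
    """Toy inverted index over the three triplet slots (hypothetical design)."""

    def __init__(self):
        self._by_slot = {
            "subject": defaultdict(set),
            "verb": defaultdict(set),
            "obj": defaultdict(set),
        }

    def add(self, triplet):
        # Register the triplet under the lower-cased value of each slot.
        for slot, table in self._by_slot.items():
            table[getattr(triplet, slot).lower()].add(triplet)

    def search(self, subject=None, verb=None, obj=None):
        """Return triplets matching every slot that is not None."""
        constraints = {"subject": subject, "verb": verb, "obj": obj}
        candidates = None
        for slot, value in constraints.items():
            if value is None:
                continue
            hits = self._by_slot[slot].get(value.lower(), set())
            candidates = hits if candidates is None else candidates & hits
        if candidates is None:
            return []  # an empty query matches nothing
        return sorted(candidates, key=lambda t: (t.doc_id, t.subject, t.verb, t.obj))


index = TripletIndex()
index.add(Triplet("animals", "eat", "fruit", "doc1"))
index.add(Triplet("animals", "eat", "leaves", "doc2"))
index.add(Triplet("tigers", "live", "Sumatra", "doc3"))

answers = index.search(subject="animals", verb="eat")
print([(t.obj, t.doc_id) for t in answers])  # [('fruit', 'doc1'), ('leaves', 'doc2')]
```

Each hit carries its `doc_id`, so the answer can be shown together with the document that supports it.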

Question Answering

Question types:
• Yes/No questions (Do animals eat fruit?)
• List questions (What do animals eat?)
• Reason questions (Why do animals eat fruit?)
• Quantity questions (How much fruit do animals eat?)
• Location questions (Where do animals eat?)
• Time questions (When do animals eat?)
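One way to turn questions of these predefined forms into triplet queries is pattern matching on the question's opening words. The sketch below is an assumption about how such an analysis could look, not the system's own analyzer; the regular expressions and the `analyze` function are hypothetical and cover only questions shaped like the examples above.

```python
import re

# One pattern per question type; each captures the triplet slots it can fill.
PATTERNS = [
    ("quantity", re.compile(
        r"^how (?:much|many) (?P<obj>.+?) (?:do|does) (?P<subj>.+?) (?P<verb>\w+)\?$", re.I)),
    ("reason", re.compile(
        r"^why (?:do|does) (?P<subj>.+?) (?P<verb>\w+) (?P<obj>.+?)\?$", re.I)),
    ("location", re.compile(
        r"^where (?:do|does) (?P<subj>.+?) (?P<verb>\w+)\?$", re.I)),
    ("time", re.compile(
        r"^when (?:do|does) (?P<subj>.+?) (?P<verb>\w+)\?$", re.I)),
    ("list", re.compile(
        r"^what (?:do|does) (?P<subj>.+?) (?P<verb>\w+)\?$", re.I)),
    ("yes/no", re.compile(
        r"^(?:do|does) (?P<subj>.+?) (?P<verb>\w+) (?P<obj>.+?)\?$", re.I)),
]


def analyze(question):
    """Return the question type and a triplet query, or None if no form matches."""
    normalized = " ".join(question.split())
    for qtype, pattern in PATTERNS:
        match = pattern.match(normalized)
        if match:
            slots = match.groupdict()
            return {
                "type": qtype,
                "subject": slots.get("subj"),
                "verb": slots.get("verb"),
                "object": slots.get("obj"),  # None when the object is what is asked for
            }
    return None


print(analyze("What do animals eat?"))
print(analyze("How much fruit do animals eat?"))
```

A slot left as `None` marks the part of the triplet that the search has to fill in.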

Question Answering

Analyze the document containing the answer:

Highlight facts described by subject – verb – object triplets (identified in the Penn Treebank parse tree)
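To make the idea of reading a triplet off a parse tree concrete, here is a deliberately simplified sketch. It assumes a Penn Treebank style tree written by hand as nested tuples, and it picks the first noun of the subject NP, the first verb of the VP, and the first noun of an NP or PP inside the VP. These heuristics and the function names are illustrative assumptions, not the extraction rules used by the system.

```python
def leaves(tree):
    """Yield (tag, word) pairs from a tree written as nested tuples."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        yield label, children[0]  # pre-terminal node such as ("NN", "fruit")
    else:
        for child in children:
            yield from leaves(child)


def first_word(tree, tag_prefix):
    """Return the first word whose tag starts with tag_prefix, or None."""
    return next((word for tag, word in leaves(tree) if tag.startswith(tag_prefix)), None)


def extract_triplet(sentence):
    """Extract (subject, verb, object) from an S node; None if NP or VP is missing."""
    _label, *children = sentence
    np = next((c for c in children if c[0] == "NP"), None)
    vp = next((c for c in children if c[0] == "VP"), None)
    if np is None or vp is None:
        return None
    subject = first_word(np, "NN")
    verb = first_word(vp, "VB")
    obj = next((first_word(c, "NN") for c in vp[1:] if c[0] in ("NP", "PP")), None)
    return (subject, verb, obj)


# Hand-written parse of "animals eat fruit" (not produced by a parser here).
tree = ("S",
        ("NP", ("NNS", "animals")),
        ("VP", ("VBP", "eat"), ("NP", ("NN", "fruit"))))
print(extract_triplet(tree))  # ('animals', 'eat', 'fruit')
```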

Obtain the document semantic graph

View the automatic document summary

Document Overview

Semantic Graph

Document (plain text format)

Named entity extraction

Co-reference resolution

According to traditional Chinese medical belief, mental problems, laziness, malaria, epilepsy, toothache and lack of sexual appetite can be treated with tiger parts, leading to rampant poaching of the animal in Asia, the World Wide Fund (WWF) said.

Named entities:
• Asia - location
• World Wide Fund - organization
• WWF - organization

Co-reference
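The slides do not say how co-references are resolved. As one plausible ingredient, the sketch below links an abbreviation to a longer entity of the same type when the abbreviation equals the initials of the longer name, which would connect WWF to World Wide Fund in the example sentence. The heuristic and the function names are assumptions for illustration only.

```python
def initials(name):
    """Upper-cased first letters of the words in a name."""
    return "".join(word[0].upper() for word in name.split() if word[0].isalpha())


def link_acronyms(entities):
    """Map each acronym to the multi-word entity of the same type it abbreviates.

    entities: list of (surface form, entity type) pairs.
    """
    links = {}
    for short, short_type in entities:
        for full, full_type in entities:
            same_type = short_type == full_type
            if short != full and same_type and " " in full and initials(full) == short:
                links[short] = full
    return links


entities = [
    ("Asia", "location"),
    ("World Wide Fund", "organization"),
    ("WWF", "organization"),
]
print(link_acronyms(entities))  # {'WWF': 'World Wide Fund'}
```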

S – V – O triplet extraction

Triplet enhancement

Semantic Graph
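A semantic graph can be assembled by merging triplets: subjects and objects become nodes, and each verb becomes a labelled edge from subject to object. The sketch below assumes that representation and an alias table from co-reference resolution; the triplets are hand-written for illustration and are not output of the system.

```python
from collections import defaultdict


def build_semantic_graph(triplets, aliases=None):
    """Merge (subject, verb, object) triplets into a directed, edge-labelled graph.

    aliases maps a co-referring mention to its canonical name, so that
    mentions of the same entity end up in a single node.
    """
    aliases = aliases or {}
    graph = defaultdict(list)
    for subject, verb, obj in triplets:
        source = aliases.get(subject, subject)
        target = aliases.get(obj, obj)
        graph[source].append((verb, target))
        graph.setdefault(target, [])  # keep nodes that have no outgoing edge
    return dict(graph)


# Illustrative triplets written by hand.
triplets = [
    ("tiger parts", "treat", "malaria"),
    ("WWF", "report", "poaching"),
    ("World Wide Fund", "protect", "tiger"),
]
graph = build_semantic_graph(triplets, aliases={"WWF": "World Wide Fund"})
for node, edges in graph.items():
    print(node, "->", edges)
```

Because of the alias table, both organization mentions contribute edges to the single node `World Wide Fund`.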

Document Summary

Feature Extractor

Features: linguistic, document, graph

Linear SVM

Linear Model

The Kerinci conservation project, an area of around three million hectares (7.4 million acres) in west Sumatra, was being supported by funds from the World Bank, Subijanto said. [10.0912]

Subijanto, a spokesman for the Forestry Ministry, said Indonesia was committed to protecting the tigers, which live within Sumatra's four designated conservation areas. [9.4155]

Ranking
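Ranking with a linear model amounts to scoring each sentence as a weighted sum of its features and keeping the highest-scoring ones. The sketch below assumes that the weights come from a trained linear SVM and that the selected sentences are shown in document order; the feature names, weights, and values are invented for illustration.

```python
def score(weights, features, bias=0.0):
    """Linear model: bias plus the weighted sum of the feature values."""
    return bias + sum(weights[name] * value for name, value in features.items())


def summarize(sentences, weights, k):
    """Pick the k best-scoring sentences and return them in document order."""
    ranked = sorted(sentences, key=lambda s: score(weights, s["features"]), reverse=True)
    top = ranked[:k]
    return [s["text"] for s in sorted(top, key=lambda s: s["position"])]


# Hypothetical weights and feature values.
weights = {"similarity_with_centroid": 2.0, "named_entities": 0.5}
sentences = [
    {"position": 0, "text": "Sentence A.",
     "features": {"similarity_with_centroid": 0.2, "named_entities": 1}},
    {"position": 1, "text": "Sentence B.",
     "features": {"similarity_with_centroid": 0.9, "named_entities": 2}},
    {"position": 2, "text": "Sentence C.",
     "features": {"similarity_with_centroid": 0.5, "named_entities": 4}},
]
print(summarize(sentences, weights, k=2))  # ['Sentence B.', 'Sentence C.']
```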

Document Summary

"There are people wanting tiger products who didn't want them before," Ron Lilley, coordinator for species conservation at the WWF in Jakarta, told Reuters.

Subijanto, a spokesman for the Forestry Ministry, said Indonesia was committed to protecting the tigers, which live within Sumatra's four designated conservation areas.

The Kerinci conservation project, an area of around three million hectares (7.4 million acres) in west Sumatra, was being supported by funds from the World Bank, Subijanto said.

Conclusions

Enhanced question answering system
• Question answering, where the answer is supported by documents
• Document browsing
  - Facts
  - Document semantic graph
  - Automatic document summary

Future work
• System extensions: triplet extraction, named entity recognition
• Expand the search to look for answers in ontologies
• Relax the requirement that the questions have a predefined form
• Improve the document overview functionality by integrating external resources

Thank you!

Questions are guaranteed in life, answers aren’t.

Document Summary

Extracted features:

Linguistic Attributes (13)
• Logical form tag
• Treebank tag
• Part of speech tag
• Depth of linguistic node
• 8 semantic tags for named entities

Document Attributes (11)
• Sentence related, e.g. location of sentence within the document
• Triplet related, e.g. frequency of triplet element in sentence, in document, …

Graph Attributes (9)
• Authority and Hub weight, PageRank
• Node degree
• Size of weakly connected component
• Size of max length chain
• Frequency of verbs among edges
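Some of the graph attributes can be computed directly from the list of labelled edges. The sketch below shows node degree, the size of each node's weakly connected component, and a basic power-iteration PageRank. It is an illustration under assumptions: the edge list is made up, and the damping factor and iteration count are conventional defaults rather than values taken from the system.

```python
from collections import defaultdict, deque


def node_degrees(edges):
    """Number of edges touching each node, ignoring direction."""
    degree = defaultdict(int)
    for source, _verb, target in edges:
        degree[source] += 1
        degree[target] += 1
    return dict(degree)


def weak_component_sizes(edges):
    """Size of the weakly connected component each node belongs to."""
    neighbours = defaultdict(set)
    for source, _verb, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)
    sizes, seen = {}, set()
    for start in neighbours:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:  # breadth-first search over undirected neighbours
            node = queue.popleft()
            for nxt in neighbours[node] - component:
                component.add(nxt)
                queue.append(nxt)
        seen |= component
        for node in component:
            sizes[node] = len(component)
    return sizes


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; rank of dangling nodes is spread uniformly."""
    nodes = sorted({n for s, _v, t in edges for n in (s, t)})
    out = defaultdict(list)
    for source, _verb, target in edges:
        out[source].append(target)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for source in nodes:
            for target in out[source]:
                new_rank[target] += damping * rank[source] / len(out[source])
        rank = new_rank
    return rank


# Made-up edges: (subject, verb, object).
edges = [
    ("tigers", "live in", "Sumatra"),
    ("WWF", "protects", "tigers"),
    ("animals", "eat", "fruit"),
]
print(node_degrees(edges))
print(weak_component_sizes(edges))
print(pagerank(edges))
```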

Document Summary

[Figure: features ranked by information gain. Features shown: Object - Word; Subject - Word; Verb - Word; Location Of Sentence In Document; Similarity With Centroid; Number Of Locations In Sentence; Number Of Named Entities In Sentence; Authority Weight Object; Hub Weight Subject; Size Weakly Conn Comp Object. Axis: Rank (Information Gain).]
