MOQA: Meaning Oriented Question Answering. An AQUAINT Project from ILIT


Page 1: MO Q A Meaning Oriented Question Answering

MOQA

Meaning Oriented Question Answering

An AQUAINT Project from

ILIT

Page 2: MO Q A Meaning Oriented Question Answering

• CRL is a research department in the School of Arts and Sciences at NMSU

• Director: Jim Cowie

• Currently has a staff of 10 PhDs

• Mainly focuses on language engineering research

• Languages include – Arabic, Farsi, Turkish, Spanish, Chinese, Japanese, Korean

Contact: Jim Cowie – [email protected]

Page 3: MO Q A Meaning Oriented Question Answering

• Advanced-technology company in Ithaca, New York

• Founded in 1990 by Dr. Richard Kittredge, Dr. Tanya Korelsky, and Dr. Owen Rambow

• Goal is to transform results from research in natural language processing into practical software applications

• Has developed a core set of text generation tools

• Current focus is on expanding the range of applications for this technology, with a particular focus on the Web

Contact: Tanya Korelsky – [email protected]

Page 4: MO Q A Meaning Oriented Question Answering

• The Institute for Language and Information Technologies at University of Maryland Baltimore County

• Sergei Nirenburg, Director

• Opened September 2002 with a team of 3 senior personnel

– Sergei Nirenburg

– Stephen Beale

– Marge McShane

Contact: Sergei Nirenburg – [email protected]

ILIT

Page 5: MO Q A Meaning Oriented Question Answering

Meaning-Oriented Question-Answering with Ontological Semantics

• Domain: travel and meetings

– question understanding and interpretation;

– determining the answer and

– presenting the answer

• two kinds of data source

– open text (in English, Arabic and Persian)

– Structured Fact Repository containing instances of ontological entities

Page 6: MO Q A Meaning Oriented Question Answering

Project Tasks

• Design and Implementation of System Architecture

• Knowledge Acquisition

• Question Understanding

• Question Interpretation

• Answer Determination

• Answer Formulation

• Documentation; User and Evaluator Training; Testing; and System Evaluation

Page 7: MO Q A Meaning Oriented Question Answering

[Figure: MOQA system architecture. Labels recovered from the diagram:]

Input: User Question in English

Output: System Response in English

Processing Modules and Intermediate Results

• Question Understanding

• Basic Text Meaning Representation (TMR)

• Question Interpretation: task context, dialog context, user profile, analyst team profile

• Extended TMR: adds a statement of active goals, plans and scripts in the system

• Task-Oriented Answer Determination from Fact Database: IE from Fact Database

• NL Query Generation: in English, Arabic and one of Persian, Russian, Spanish

• NL Query

• Answer Determination from open text: IR, IE, Production of TMRs for Textual Fillers of IE Templates

• Dialog and Self-Awareness-related Answer Determination (for running commentary and workflow and context-related communication)

• System Response in TMR

• Answer Formulation and Presentation

System Working Memory

• Goal and Plan Processing Manager

• Goal Attainment and Plan Execution Agenda

Knowledge Sources

• FACT REPOSITORY: including instances of goals, plans, scripts

• ONTOLOGY: including goals, plans, scripts

• LEXICONS: Each Language in System, including names and phrases

Using XML

Page 8: MO Q A Meaning Oriented Question Answering

Development Methodology

• Rapid Prototyping

• Using pre-existing components

• Grow the various development activities toward an integrated system

• Today – look at one example of each

[Figure labels: User Interaction; Analysis; Resources; XML - TMR]

Page 9: MO Q A Meaning Oriented Question Answering

Deliverables

• A QA system in the domain of travel and meetings, with a capability to search for information in open texts in three languages and in a structured, ontology-based Fact DB;

• an enhanced text analysis system for each of the languages;

• a question interpretation module that takes into account user goals and the context of the dialog;

• an integrated IR/IE module working on open text in three languages, on the basis of ontologically defined extraction templates;

Page 10: MO Q A Meaning Oriented Question Answering

Deliverables (Cont.)

• an ontology of about 6,500 concepts;

• a Fact DB of about 100,000 facts;

• a system for automating the acquisition of the Fact DB;

• a semantic lexicon for each of the languages in the system, at about 20,000 entries;

• a decision-making module that determines the answer(s) and system action(s) at each step of the dialog/task processing;

• an intuitive and intelligent multi-modal user interface, which uses natural language generation in answers and for query validation

Page 11: MO Q A Meaning Oriented Question Answering

Project Status

• Approval to spend granted as of August 21, 2002

• UMBC – Ontology and English Lexicon Improvement, Development of Scripts, Meaning Based Text Analysis

• CoGenTex – Interface design, Human Factors, Text Generation

• NMSU – Text-preprocessing, Arabic and Farsi resources and analysis, data collection, system integration.

Page 12: MO Q A Meaning Oriented Question Answering

Corpora Being Collected

• Arabic:

– http://www.aljazirah.net

– http://news.bbc.co.uk/hi/arabic/news/

– http://www.irna.com/ar/index.shtml

• English:

– http://www.cnn.com

– http://www.bbc.co.uk/worldservice/index.shtml

– http://www.irna.com/en/

• Persian:

– http://www.hamshahri.net/

– http://www.bbc.co.uk/persian/index.shtml

– http://www.irna.com/pe/index.shtml

Page 13: MO Q A Meaning Oriented Question Answering

Ontology

• An ontology is a formally and semantically defined repository of concepts and relations about the world.

– Including knowledge about events, objects, and work flow scripts

• Linked to the ontology are:

– fact repositories, including facts about actual events, objects, places, personalities, etc.

– lexica, defining words in a language in ontological terms

– “onomastica”, or multilingual proper name lists

Page 14: MO Q A Meaning Oriented Question Answering

Structured Common Fact Repository

• Uniform organization for all kinds of data

• Support for multiple applications and tools

• Semantically anchored in general ontology

• Constantly updated; today, manually; tomorrow, semi-automatically; long-term, automatically

• Supports both domain knowledge and workflow specification

Page 15: MO Q A Meaning Oriented Question Answering

Populating the Fact Repository

Original Text

CIA Report 02 [15 October, 2002, from a source, said to be credible, in Jordan]:

A man named Majed H., using a faked Jordanian passport and travel visa, traveled from Aman, Jordan to Chicago, Illinois on 12 July, 2002.

Majed H. is now known to have resided in Afghanistan for two years (1996-1997) and has been identified as a member of Al-Qaeda.

Gloss

An unnamed source, who is very reliable, informed the CIA sometime between July 12, 2002 and October 15, 2002, of a travel-event by Majed H. on July 12, 2002, from Aman (Amman) Jordan to Chicago Illinois.

In order to take the trip, Majed H. used a fake passport and visa issued by Jordan.

Majed H. was located in Afghanistan from January 1, 1996, through December 31, 1997, and is a member of Al-Qaeda.

Page 16: MO Q A Meaning Oriented Question Answering

Populating the Fact Repository (2)

Gloss

An unnamed source, who is very reliable, informed the CIA sometime between July 12, 2002 and October 15, 2002, of a travel-event by Majed H. on July 12, 2002, from Aman (Amman) Jordan to Chicago Illinois.

In order to take the trip, Majed H. used a fake passport and visa issued by Jordan.

Majed H. was located in Afghanistan from January 1, 1996, through December 31, 1997, and is a member of Al-Qaeda.

Facts (12 Total)

INFORM-1
  AGENT: SOCIAL-ROLE-4
  BENEFICIARY: ORGANIZATION-1
  THEME: TRAVEL-EVENT-0
  TIME: <> 07/12/2002 10/15/2002
  MODALITY-EPISTEMIC: > 0.6

……………

NATION-2
  HAS-NAME: "Jordan"

CITY-1
  HAS-NAME: "Amman"
  IN-NATION: NATION-2
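
As a rough illustration of how facts like the ones above could be held in memory, the following Python sketch stores each instance as a frame of slots and follows a slot from one instance to another. The `Fact` class, the `repository` dictionary and the `resolve` helper are hypothetical names introduced here, not part of MOQA. Reading `<>` as a date interval and `> 0.6` as a lower bound is likewise an assumption based on the gloss.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Fact:
    """One repository entry: an ontological instance and its slots (hypothetical)."""
    instance: str                      # e.g. "CITY-1"
    slots: dict = field(default_factory=dict)


# Hypothetical in-memory repository keyed by instance name.
repository = {
    f.instance: f
    for f in [
        Fact("NATION-2", {"HAS-NAME": "Jordan"}),
        Fact("CITY-1", {"HAS-NAME": "Amman", "IN-NATION": "NATION-2"}),
        Fact("INFORM-1", {
            "AGENT": "SOCIAL-ROLE-4",
            "BENEFICIARY": "ORGANIZATION-1",
            "THEME": "TRAVEL-EVENT-0",
            # Assumption: "<>" on the slide denotes a date interval.
            "TIME": (date(2002, 7, 12), date(2002, 10, 15)),
            # Assumption: "> 0.6" is a lower bound on epistemic modality.
            "MODALITY-EPISTEMIC": (">", 0.6),
        }),
    ]
}


def resolve(instance, slot):
    """Return a slot filler; if it names a stored instance, return that fact."""
    filler = repository[instance].slots[slot]
    if isinstance(filler, str) and filler in repository:
        return repository[filler]
    return filler


nation = resolve("CITY-1", "IN-NATION")
print(nation.slots["HAS-NAME"])  # prints: Jordan
```

Fillers that name instances not stored here (for example SOCIAL-ROLE-4) are simply returned as strings, which mirrors the elided facts on the slide.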

Page 17: MO Q A Meaning Oriented Question Answering

LEXICON: English lexical entry mapped to concept “EXIT”

Page 18: MO Q A Meaning Oriented Question Answering

LEXICON: Chinese lexical entry mapped to concept “EXIT”

Page 19: MO Q A Meaning Oriented Question Answering

Using Ontology to Support Retrieval

• Documents need to be retrieved using the language of the document

• The representation of queries in the system is in terms of ontological concepts and “facts”

• We will use the ontology to support retrieval in all three languages

• Current experiment uses Chinese and Spanish

– Ontology-Language lexicons exist for these languages
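
The retrieval idea in the bullets above (query words mapped to ontological concepts, then concepts mapped to words in the language of the documents) can be sketched as follows. This is a minimal toy example: the two lookup tables, their entries, and the function names are illustrative assumptions, not the project's actual lexicons or code.

```python
# Hypothetical toy lexicons. Real ontology-language lexicons would be far larger.
ENGLISH_TO_CONCEPT = {
    "leave": "EXIT",
    "exit": "EXIT",
    "trip": "TRAVEL-EVENT",
}

CONCEPT_TO_WORDS = {
    "es": {"EXIT": ["salir", "salida"], "TRAVEL-EVENT": ["viaje"]},
    "zh": {"EXIT": ["出口", "离开"], "TRAVEL-EVENT": ["旅行"]},
}


def query_to_concepts(query):
    """Map each known query word to its concept, keeping first-seen order."""
    concepts = []
    for word in query.lower().split():
        concept = ENGLISH_TO_CONCEPT.get(word)
        if concept and concept not in concepts:
            concepts.append(concept)
    return concepts


def concepts_to_terms(concepts, lang):
    """Expand concepts into search terms in the language of the documents."""
    lexicon = CONCEPT_TO_WORDS[lang]
    return [term for c in concepts for term in lexicon.get(c, [])]


concepts = query_to_concepts("leave trip")
print(concepts)                           # ['EXIT', 'TRAVEL-EVENT']
print(concepts_to_terms(concepts, "es"))  # ['salir', 'salida', 'viaje']
```

Because the query is held as concepts rather than words, the same query can be expanded for each document language without being retranslated.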

Page 20: MO Q A Meaning Oriented Question Answering

User Specified Query

Page 21: MO Q A Meaning Oriented Question Answering

Mapped to Associated Concepts

Page 22: MO Q A Meaning Oriented Question Answering

Concepts Map to Language of Documents

Page 23: MO Q A Meaning Oriented Question Answering

Concept-Word Mappings

Page 24: MO Q A Meaning Oriented Question Answering

Further Mapping of Concepts

Page 25: MO Q A Meaning Oriented Question Answering

Generation Tasks

Months 1-6

• Subtask 1: First prototype of intelligent question answering user interface, involving hypertext generation (December demo)

• Subtask 2: Gathering of end-user feedback on the interface functionality, look-and-feel and user customization

• Subtask 3: Design of extensions to cover broader collection of concepts from the ontology

• Milestone at next 6 months: Report on user feedback

Page 26: MO Q A Meaning Oriented Question Answering

MOQA User Interface

• Support for both natural language queries and structured queries

• Intuitive web-based multi-modal interface for answers

• Tables, text, maps, time line, and social network graphs are interconnected by hyperlinks

• Natural language generation used in answers and for query validation

• Implemented using XML-based technology

• Positive reviews at the kick-off from an HCI expert and program management

Page 27: MO Q A Meaning Oriented Question Answering

MOQA User Interface – Query Page

• Support for NL-based queries and structured queries

• Structured query validation with automatically generated NL paraphrase

• WYSIWYM editing of structured queries
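
To make the idea of validating a structured query through a generated paraphrase concrete, here is a minimal template-based sketch. It is a stand-in for illustration only and not the text generation tools used in the project. The slot names (`agent`, `source`, `destination`, `after`) and the sentence template are assumptions.

```python
def paraphrase(query):
    """Render a structured travel query as one English sentence (hypothetical template)."""
    parts = ["Find travel events"]
    # (slot name, preposition) pairs, in the order they appear in the sentence.
    for slot, prep in [("agent", "by"), ("source", "from"),
                       ("destination", "to"), ("after", "after")]:
        if slot in query:
            parts.append(f"{prep} {query[slot]}")
    return " ".join(parts) + "."


query = {"agent": "Majed H.", "source": "Amman", "destination": "Chicago"}
print(paraphrase(query))  # Find travel events by Majed H. from Amman to Chicago.
```

The user reads the paraphrase and confirms or corrects the structured query before it is run.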

Page 28: MO Q A Meaning Oriented Question Answering

MOQA User Interface – Underlying XML Queries

• Uses standard XML technology (e.g. XML-compliant browser, XML parsers, etc.)

• Supports modularity – the XML representation is viewable and exchangeable between subsystems

• Assures automatic validation of query instances using a query class hierarchy described by XML schemas

• Uses logical expressions in XML to support complex queries
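
A small sketch of what a query with logical expressions in XML might look like, and how it could be checked and evaluated. The element names (`query`, `and`, `or`, `not`, `match`), the slot names and the sample record are all hypothetical. The Python standard library does not validate against XML Schema, so a hand-written structural check stands in for schema validation here.

```python
import xml.etree.ElementTree as ET

# Hypothetical query document; the slides do not show the actual format.
QUERY_XML = """
<query class="travel-query">
  <and>
    <match slot="DESTINATION" value="Chicago"/>
    <or>
      <match slot="SOURCE" value="Amman"/>
      <not><match slot="AGENT" value="unknown"/></not>
    </or>
  </and>
</query>
""".strip()

ALLOWED = {"and", "or", "not", "match"}


def validate(node):
    """Stand-in for schema validation: check element names and arity only."""
    if node.tag not in ALLOWED:
        raise ValueError(f"unexpected element <{node.tag}>")
    children = list(node)
    if node.tag == "match":
        if children or "slot" not in node.attrib or "value" not in node.attrib:
            raise ValueError("<match> needs slot and value attributes and no children")
    elif node.tag == "not":
        if len(children) != 1:
            raise ValueError("<not> takes exactly one child")
    elif not children:
        raise ValueError(f"<{node.tag}> needs at least one child")
    for child in children:
        validate(child)


def evaluate(node, record):
    """Evaluate a validated logical expression against a dict of slot values."""
    if node.tag == "match":
        return record.get(node.get("slot")) == node.get("value")
    results = [evaluate(child, record) for child in node]
    if node.tag == "and":
        return all(results)
    if node.tag == "or":
        return any(results)
    return not results[0]  # node.tag == "not"


expression = ET.fromstring(QUERY_XML)[0]   # the single child of <query>
validate(expression)
record = {"AGENT": "Majed H.", "SOURCE": "Amman", "DESTINATION": "Chicago"}
print(evaluate(expression, record))  # True
```

Keeping the query as an XML tree means the same document can be validated, displayed and passed between subsystems without a separate internal format.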

Page 29: MO Q A Meaning Oriented Question Answering

MOQA User Interface – Results Page (kick-off concept demo version)

• Concept demo helped to perform requirements analysis

• Demonstrated integrated display using tables, textual summary, map and time line

• Illustrated filtering table data by using hyperlinks in text

Page 30: MO Q A Meaning Oriented Question Answering

MOQA User Interface - Details Page (concept demo)

• Demonstrated display of additional types of information including social networks and source document extracts

• Illustrated “drill-down” by hyperlinks and typical follow-up queries based on underlying ontology

Page 31: MO Q A Meaning Oriented Question Answering

MOQA User Interface – Research Plans

• Presentation of partially understood natural language queries

• Personalization of answer presentation both content-wise (based on user expertise) and form-wise (based on user presentation preferences)

• Intelligent maintenance of session history based on typical work flow and collaboration patterns within groups

• Interface portability between subject domains

• Incremental evolution based on validation by domain experts