Light Semantic Processing for QA
AQUAINT 18-Month Workshop
Language Technologies Institute, Carnegie Mellon
B. Van Durme, Y. Huang, A. Kupsc and E. Nyberg
Towards Light Semantic Processing for Question Answering
Overview of This Talk
• Motivation
• Components of the Approach
– Logical Form
– Similarity Measure
– Unification Strategy
• Incorporation into JAVELIN
• Future Work / Next Steps
Example of Extraction Error
• Question: “When was Wendy’s founded?”
• Passage candidate:
– “The renowned Murano glassmaking industry, on an island in the Venetian lagoon, has gone through several reincarnations since it was founded in 1291. Three exhibitions of 20th-century Murano glass are coming up in New York. By Wendy Moonan.”
• Statistical extractor: 20th-century
Basic Idea
Q: “xxx xxxx xxxx xxxx xxxxxxxxxx xx xxxxx?” → extract → A(?,C)
P: “xxx xxxx xxxx xxxx xxxxx xx xxxxx.” → extract → A(B,C)
? = B
Unification on simple predicates representing basic argument structure will provide a more accurate way to match questions with appropriate answer(s)
Two Challenges:
* Where do predicates come from?
* Flexibility in interpretation…
partial interpretation
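The matching step on this slide can be sketched in a few lines of Python. This is only an illustration of exact unification on flat predicates, assuming variables are written with a leading "?"; the names `Predicate`, `is_variable`, and `unify` are hypothetical and not taken from JAVELIN.

```python
from typing import Dict, Optional, Tuple

Predicate = Tuple[str, Tuple[str, ...]]  # (predicate name, arguments)


def is_variable(term: str) -> bool:
    # Assumption: a term starting with "?" is a variable, as on the slide.
    return term.startswith("?")


def unify(question: Predicate, passage: Predicate) -> Optional[Dict[str, str]]:
    """Return variable bindings if the predicates match exactly, else None."""
    q_name, q_args = question
    p_name, p_args = passage
    if q_name != p_name or len(q_args) != len(p_args):
        return None
    bindings: Dict[str, str] = {}
    for q_term, p_term in zip(q_args, p_args):
        if is_variable(q_term):
            # A variable binds to whatever the passage supplies, but it
            # must bind to the same term if it occurs more than once.
            if bindings.setdefault(q_term, p_term) != p_term:
                return None
        elif q_term != p_term:
            return None
    return bindings


print(unify(("A", ("?", "C")), ("A", ("B", "C"))))  # {'?': 'B'}
print(unify(("A", ("?", "C")), ("A", ("B", "D"))))  # None
```

Exact matching like this fails as soon as the question and passage use different words for the same thing, which is the "flexibility in interpretation" challenge listed above.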
Associating Tokens with Concepts
• Imprecise Reference, e.g.:
“John W. was greeted by William Clinton”
“Bill greeted Mr. Wright”
• Definite Description, e.g. “Mr. Bush” vs. “the president”
• Anaphoric Reference
UNIFY( {GREET(“William Clinton”, “John W.”)} , {GREET(“Bill”, “Mr. Wright”)} )
Interpretation of tokens must be:
• Approximate, not exact
• Context-sensitive
Language Processing Tools
• BBN IdentiFinder (BBN, 2000)
• Link Grammar parser (Grinberg et al., 1995)
• KANTOO parser (Nyberg & Mitamura, 2000)
• Brill part-of-speech tagger (Brill, 1995)
• WordNet (Fellbaum, 1998)
• Lexical Conceptual Structure (LCS) Database (Dorr, 2001)
Representation
• Formula: a set of literals
• Literal: a predicate, plus two terms
• Extrinsic literal: a relation mapping a label to a label
– SUBJECT(x1,x2)
• Intrinsic literal: a relation mapping a label to a value
– ROOT(x1,|Benjamin|)
• Value: EVENT, past, +, |Mary Smith|,…
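The definitions above map naturally onto a small data structure. The following is a minimal sketch, not the authors' implementation; the class names `Label`, `Value`, and `Literal` and the `Formula` alias are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Label:
    name: str  # e.g. "x1"


@dataclass(frozen=True)
class Value:
    text: str  # e.g. "Benjamin", "EVENT", "past", "+"


@dataclass(frozen=True)
class Literal:
    predicate: str               # e.g. "SUBJECT", "ROOT"
    first: Label                 # the first term is always a label
    second: Union[Label, Value]  # label -> extrinsic, value -> intrinsic

    @property
    def is_extrinsic(self) -> bool:
        return isinstance(self.second, Label)

    @property
    def is_intrinsic(self) -> bool:
        return isinstance(self.second, Value)


Formula = FrozenSet[Literal]  # a formula is a set of literals

formula: Formula = frozenset({
    Literal("SUBJECT", Label("x1"), Label("x2")),
    Literal("ROOT", Label("x1"), Value("Benjamin")),
})
extrinsic = sorted(lit.predicate for lit in formula if lit.is_extrinsic)
intrinsic = sorted(lit.predicate for lit in formula if lit.is_intrinsic)
print(extrinsic, intrinsic)  # ['SUBJECT'] ['ROOT']
```

Frozen dataclasses are hashable, so literals can be stored in a set, which matches the "set of literals" definition of a formula.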
Example
Q = Who killed Jefferson?
ROOT(x1,?a0), ROOT(x2,|kill|), ROOT(x3,|Jefferson|), TYPE(x2,|event|), TYPE(x1,|person|), TYPE(x3,|person|), SUBJECT(x2,x1), OBJECT(x2,x3), ANS(?a0)
P = Benjamin murdered Jefferson.
ROOT(y1,|Benjamin|), ROOT(y2,|murder|), ROOT(y3,|Jefferson|), TYPE(y2,|event|), TYPE(y1,|person|), TYPE(y3,|person|), SUBJECT(y2,y1), OBJECT(y2,y3)
Graphically
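[Figure: the two formulae drawn as graphs. Question labels x1, x2, x3 and passage labels y1, y2, y3 are nodes; ROOT edges link them to ?a0, kill, Jefferson and to Benjamin, murder, Jefferson; TYPE edges link them to person or event; SUBJECT and OBJECT edges link each event label to its arguments.]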
Similarity Functions
• A zero-to-one function that returns a value representing similarity between the formulae for the question and the passage
• Unification requires similarity measurement between literal values
• sim(“Who killed Jefferson?”, “Benjamin murdered Jefferson.”) = 0.9
sim(formula0,formula1)
Given two formulae, we define the similarity to be the geometric mean of the similarity between the separate extrinsic literals.
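Written out, the definition might look as follows. This is a reconstruction from the sentence above, assuming the n extrinsic literals of the first formula have already been paired with those of the second; the symbols F_0, F_1 and the pairing e_i^{(0)}, e_i^{(1)} are notation introduced here, and the text does not say how the pairs are chosen.

```latex
\mathrm{sim}(F_0, F_1) \;=\; \left( \prod_{i=1}^{n} \mathrm{sim}\!\left(e_i^{(0)},\, e_i^{(1)}\right) \right)^{1/n}
```

A single poorly matching pair of literals pulls the whole score down, since any factor near zero drives the product toward zero.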
sim(extrinsicLiteral0,extrinsicLiteral1)
To measure the similarity between two extrinsic literals, we take the square root of the product of the similarity between each of the two pairs of labels.
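As a small worked example of this rule, the sketch below computes the score for one pair of extrinsic literals. The function name `sim_extrinsic` and the input values are illustrative assumptions, not figures from the talk.

```python
import math


def sim_extrinsic(sim_first_labels: float, sim_second_labels: float) -> float:
    """Square root of the product of the two label-pair similarities."""
    return math.sqrt(sim_first_labels * sim_second_labels)


# Hypothetical label similarities for SUBJECT(x2,x1) vs. SUBJECT(y2,y1):
# sim(x2,y2) = 0.8 and sim(x1,y1) = 1.0 are illustrative inputs only.
score = sim_extrinsic(0.8, 1.0)
print(round(score, 3))  # 0.894
```

This is the geometric mean of the two label similarities, so the score stays in the zero-to-one range whenever both inputs do.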
sim(label0,label1)
To measure the similarity of two labels, we find the maximum possible value of taking the geometric mean of the similarity of each pairwise combination of intrinsic literals that are shared by the two labels.
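One possible reading of this definition is sketched below; it is an interpretation, not necessarily what the authors implemented. It assumes each label carries intrinsic literals grouped by predicate name, that "shared" means the predicate appears on both labels, and that labels with nothing shared score 0. The names `Intrinsics`, `sim_label`, and `toy_sim` are hypothetical.

```python
import math
from typing import Callable, Dict, List

# predicate name -> values attached to one label, e.g. {"ROOT": ["kill"]}
Intrinsics = Dict[str, List[str]]


def sim_label(a: Intrinsics, b: Intrinsics,
              sim_value: Callable[[str, str], float]) -> float:
    shared = sorted(set(a) & set(b))
    if not shared:
        return 0.0  # assumption: no shared intrinsic literals, no match
    # For each shared predicate, keep the best-scoring pair of values.
    best_per_predicate = [
        max(sim_value(va, vb) for va in a[p] for vb in b[p])
        for p in shared
    ]
    return math.prod(best_per_predicate) ** (1.0 / len(shared))


def toy_sim(u: str, v: str) -> float:
    # Hypothetical value similarity, for illustration only.
    return 1.0 if u == v else 0.25


x2 = {"ROOT": ["kill"], "TYPE": ["event"]}
y2 = {"ROOT": ["murder"], "TYPE": ["event"]}
print(sim_label(x2, y2, toy_sim))  # 0.5
```

Because the geometric mean grows with each of its factors, picking the best pair for every shared predicate also maximizes the mean as a whole.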
sim(intrinsicLiteral0,intrinsicLiteral1)
The similarity between two intrinsic literals is measured by similarity of the paired words, times the weight of the first literal.
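In symbols, the sentence above might be written as follows. The notation is introduced here for illustration: I_0 and I_1 are the two intrinsic literals, v_0 and v_1 are their values (the paired words), and w(I_0) is the weight of the first literal. How the weights are assigned is not stated in this text.

```latex
\mathrm{sim}(I_0, I_1) \;=\; w(I_0) \cdot \mathrm{sim}(v_0, v_1)
```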
sim(word0,word1)
• sim(|kill|,|murder|) = 0.8
– via WordNet distance function
• sim(?a0,|Benjamin|) = 1.0
– zero cost for variable binding
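The two cases above can be sketched as a single function. A real system would call a WordNet distance function; here a tiny hand-written table stands in for it, and only the kill/murder value of 0.8 comes from the slide. The names `TOY_SIMILARITY` and `sim_word`, and the default score of 0.0 for unknown pairs, are assumptions.

```python
from typing import Dict, FrozenSet

# Stand-in for a WordNet distance function (hypothetical lookup table).
TOY_SIMILARITY: Dict[FrozenSet[str], float] = {
    frozenset({"kill", "murder"}): 0.8,
}


def is_variable(term: str) -> bool:
    # Assumption: variables are written with a leading "?", e.g. "?a0".
    return term.startswith("?")


def sim_word(w0: str, w1: str) -> float:
    if is_variable(w0) or is_variable(w1):
        return 1.0  # zero cost for variable binding
    if w0 == w1:
        return 1.0  # assumption: identical words match perfectly
    # Assumption: word pairs missing from the table score 0.0.
    return TOY_SIMILARITY.get(frozenset({w0, w1}), 0.0)


print(sim_word("kill", "murder"))   # 0.8
print(sim_word("?a0", "Benjamin"))  # 1.0
```

Keying the table on an unordered pair makes the toy measure symmetric, which a distance-based similarity would normally be.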
Example
Answer
• Find the maximum possible similarity score, return the term bound to ?a0
• ?a0/|Benjamin|
• sim(Q,P) = 0.9
• Answer = Benjamin, 0.9
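The selection step can be sketched as picking the highest-scoring unification and returning its binding. This assumes each candidate unification has already been scored; the `Candidate` type, the `best_answer` function, and the second candidate's score are hypothetical, and only the Benjamin score of 0.9 comes from the slide.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    binding: str  # term bound to ?a0
    score: float  # sim(Q, P) for this unification


def best_answer(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the candidate with the maximum similarity score, if any."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.score)


candidates = [
    Candidate("Benjamin", 0.9),   # value from the slide
    Candidate("Jefferson", 0.4),  # hypothetical competing binding
]
print(best_answer(candidates))  # Candidate(binding='Benjamin', score=0.9)
```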
Current Status, Future Work
• First version implemented, testing now
• Short Term: Test “NLP IX” against statistical extraction module on factoid questions
• Longer Term:
– Support simple reasoning about questions and passages
– Investigate approach in narrower domains
  • Question answering based on CNS data on terrorism and weapons of mass destruction
– Extend similarity metric at word level
  • Word co-occurrence information
  • Distance metrics on ontologies other than WordNet
– Incorporate LCS Lexicon
Summary
• We believe complex question answering requires more than statistical extraction methods
• Knowledge bottleneck forces compromise in depth of language processing
• Robust unification based on heuristic measure of similarity offers short-term solution
Additional Resources
• Paper available:
B. Van Durme, Y. Huang, A. Kupsc and E. Nyberg (2003). “Towards Light Semantic Processing for Question Answering”, presented at the HLT/NAACL 2003 Workshop on Text Meaning.
• This and other papers at the JAVELIN web site:
http://www.lti.cs.cmu.edu/Research/JAVELIN
Questions?