Automatic Set Expansion for List Question Answering
Richard C. Wang, Nico Schlaefer, William W. Cohen, and Eric Nyberg
Language Technologies Institute
Carnegie Mellon University
Pittsburgh, PA 15213 USA
Task
Automatically improve answers generated by Question Answering systems for list questions, by using a Set Expansion system.
For example: Name cities that have Starbucks.
  QA Answers: Boston, Seattle, Carnegie-Mellon, Aquafina, Google, Logitech
  Expanded Answers: Seattle, Boston, Chicago, Pittsburgh, Carnegie-Mellon, Google
Better!
Outline
Introduction
  Question Answering
  Set Expansion
Proposed Approach
  Aggressive Fetcher
  Lenient Extractor
  Hinted Expander
Experimental Results
  QA System: Ephyra
  Other QA Systems
Conclusion
Question Answering (QA)
Question Answering task:
  Retrieve answers to natural language questions
Different question types:
  Factoid questions
  List questions
  Definitional questions
  Opinion questions
Major QA evaluations:
  Text REtrieval Conference (TREC): English
  NTCIR: Japanese, Chinese
  CLEF: European languages
Typical QA Pipeline
Stages: Question Analysis → Query Generation & Search → Candidate Generation → Answer Scoring
Data flow: Question String → Analyzed Question → Search Results → Candidate Answers → Scored Answers
Resources: Knowledge Sources
Example:
  Question String: “Who invented the smiley?”
  Analyzed Question: Answer type: Person; Keywords: invented, smiley ...
  Search Results: The two original text smileys were invented on September 19, 1982 by Scott E. Fahlman ...
  Candidate Answers: smileys; September 19, 1982; Scott E. Fahlman
  Scored Answers:
    Candidate            Score
    Scott E. Fahlman     0.853
    smileys              0.418
    September 19, 1982   0.239
QA System: Ephyra (Schlaefer et al., TREC 2007)
History:
  Developed at University of Karlsruhe, Germany and Carnegie Mellon University, USA
  TREC participations in 2006 (13th out of 27 teams) and 2007 (7th out of 21 teams)
  Released into open source in 2008
Different candidate generators:
  Answer type classification
  Regular expression matching
  Semantic parsing
Available for download at: http://www.ephyra.info/
Set Expansion (SE)
For example,
  Given a query: {“survivor”, “amazing race”}
  Answer is: {“american idol”, “big brother”, ...}
More formally,
  Given a small number of seeds: $x_1, x_2, \ldots, x_k$ where each $x_i \in S_t$
  Answer is a listing of other probable elements: $e_1, e_2, \ldots, e_n$ where each $e_i \in S_t$
A well-known example of a web-based set expansion system is Google Sets™
  http://labs.google.com/sets
SE System: SEAL (Wang & Cohen, ICDM 2007)
Features
  Independent of human/markup language
    Support seeds in English, Chinese, Japanese, Korean, ...
    Accept documents in HTML, XML, SGML, TeX, WikiML, …
  Does not require pre-annotated training data
    Utilize readily-available corpus: World Wide Web
Based on two research contributions
  Automatically construct wrappers for extracting candidate items
  Rank extracted items using random graph walk
Try it out for yourself: http://rcwang.com/seal
SEAL’s SE Pipeline
Fetcher: downloads web pages from the Web
Extractor: learns wrappers from web pages
Ranker: ranks entities extracted by wrappers
Example input: Canon, Nikon, Olympus
Example output: Pentax, Sony, Kodak, Minolta, Panasonic, Casio, Leica, Fuji, Samsung, …
Challenge
SE systems require relevant (non-noisy) seeds, but answers produced by QA systems are often noisy.
How can we integrate those two systems together?
We propose three extensions to SEAL:
  Aggressive Fetcher
  Lenient Extractor
  Hinted Expander
Original Fetcher
Procedure:
1. Compose a search query by concatenating all seeds
2. Use Google to request top 100 web pages
3. Fetch web pages and send to the Extractor
Seeds: Boston, Seattle, Carnegie-Mellon
Query: Boston Seattle Carnegie-Mellon
Proposed Fetcher
Aggressive Fetcher (AF)
  Sends a two-seed query for every possible pair of seeds to the search engines
  More likely to compose queries containing only relevant seeds
Seeds: Boston, Seattle, Carnegie-Mellon
Queries:
  Boston Seattle
  Boston Carnegie-Mellon
  Seattle Carnegie-Mellon
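The difference between the two fetchers comes down to how the query strings are composed. The following is a minimal sketch of that step only, not SEAL's actual code: the function names are hypothetical, seeds are joined with plain spaces (any quoting used by the real system is not stated on the slides), and the search-engine request itself is left out.

```python
from itertools import combinations


def original_query(seeds):
    """Original Fetcher: a single query made by concatenating all seeds."""
    return " ".join(seeds)


def aggressive_queries(seeds):
    """Aggressive Fetcher: one two-seed query for every unordered pair of seeds."""
    return [" ".join(pair) for pair in combinations(seeds, 2)]


seeds = ["Boston", "Seattle", "Carnegie-Mellon"]
print(original_query(seeds))
for query in aggressive_queries(seeds):
    print(query)
```

With one noisy seed out of three (Carnegie-Mellon here), the single concatenated query always contains the noise, while one of the three pair queries (Boston Seattle) contains only relevant seeds.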
Original Extractor
A wrapper is a pair of L and R context strings
  Maximally-long contextual strings that bracket at least one instance of every seed
  Extracts strings between L and R
Learn wrappers from web pages and seeds on the fly
  Utilize semi-structured documents
  Wrappers defined at character level
    No tokenization required (language-independent)
    However, very page specific (page-dependent)
<li class="honda"><a href="http://www.curryauto.com/"> <img src="/common/logos/honda/logo-horiz-rgb-lg-dkbg.gif" alt="4"></a> <ul><li><a href="http://www.curryhonda-ga.com/"> <span class="dName">Curry Honda Atlanta</span>...</li> <li><a href="http://www.curryhondamass.com/"> <span class="dName">Curry Honda</span>...</li> <li class="last"><a href="http://www.curryhondany.com/"> <span class="dName">Curry Honda Yorktown</span>...</li></ul> </li>
<li class="acura"><a href="http://www.curryauto.com/"> <img src="/curryautogroup/images/logo-horiz-rgb-lg-dkbg.gif" alt="5"></a> <ul><li class="last"><a href="http://www.curryacura.com/"> <span class="dName">Curry Acura</span>...</li></ul> </li>
<li class="toyota"><a href="http://www.curryauto.com/"> <img src="/common/logos/toyota/logo-horiz-rgb-lg-dkbg.gif" alt="7"></a> <ul><li class="last"><a href="http://www.geisauto.com/toyota/"> <span class="dName">Curry Toyota</span>...</li></ul> </li>
<li class="nissan"><a href="http://www.curryauto.com/"> <img src="/common/logos/nissan/logo-horiz-rgb-lg-dkbg.gif" alt="6"></a> <ul><li class="last"><a href="http://www.geisauto.com/"> <span class="dName">Curry Nissan</span>...</li></ul> </li>
<li class="ford"><a href="http://www.curryauto.com/"> <img src="/common/logos/ford/logo-horiz-rgb-lg-dkbg.gif" alt="3"></a> <ul><li class="last"><a href="http://www.curryauto.com/"> <span class="dName">Curry Ford</span>...</li></ul> </li>
Proposed Extractor
Lenient Extractor (LE)
  Maximally-long contextual strings that bracket at least one instance of a minimum of two seeds
  More likely to find useful contexts that bracket only relevant seeds
Text
... in Boston City Hall ...
... in Seattle City Hall ...
... at Boston University ...
... at Seattle University ...
... at Carnegie-Mellon University ...
Learned Wrapper (w/o LE)
at <blah> University
Learned Wrappers (w/ LE)
at <blah> University
in <blah> City Hall
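The City Hall / University example can be reproduced with a small character-level sketch. This is a simplified illustration, not SEAL's wrapper-learning algorithm: it enumerates one occurrence per seed by brute force, and the function names, the `min_seeds` parameter, and the filler words around each snippet (standing in for the slide's "...") are all assumptions made for the example.

```python
import re
from itertools import combinations, product


def common_prefix(strings):
    """Longest string that every input starts with."""
    prefix = strings[0]
    for s in strings[1:]:
        n = 0
        while n < min(len(prefix), len(s)) and prefix[n] == s[n]:
            n += 1
        prefix = prefix[:n]
    return prefix


def common_suffix(strings):
    """Longest string that every input ends with."""
    return common_prefix([s[::-1] for s in strings])[::-1]


def seed_contexts(snippets, seed):
    """All (left, right) character contexts around occurrences of one seed."""
    return [(snippet[:m.start()], snippet[m.end():])
            for snippet in snippets
            for m in re.finditer(re.escape(seed), snippet)]


def learn_wrappers(snippets, seeds, min_seeds=None):
    """Return maximal (L, R) pairs that bracket at least `min_seeds` seeds.

    min_seeds=None requires every seed (original Extractor);
    min_seeds=2 corresponds to the Lenient Extractor.
    """
    if min_seeds is None:
        min_seeds = len(seeds)
    contexts = {seed: seed_contexts(snippets, seed) for seed in seeds}
    found = set()
    for size in range(max(min_seeds, 1), len(seeds) + 1):
        for subset in combinations(seeds, size):
            # Pick one occurrence per seed and keep what their contexts share.
            for choice in product(*(contexts[s] for s in subset)):
                left = common_suffix([l for l, _ in choice])
                right = common_prefix([r for _, r in choice])
                if left and right:
                    found.add((left, right))
    # Maximally long only: drop a pair that is contained in a longer pair.
    return sorted(
        (l, r) for l, r in found
        if not any((l, r) != (l2, r2) and l2.endswith(l) and r2.startswith(r)
                   for l2, r2 in found))


def extract(snippets, wrapper):
    """Apply a wrapper: return the strings found between L and R."""
    left, right = wrapper
    pattern = re.compile(re.escape(left) + r"(.+?)" + re.escape(right))
    return [m.group(1) for s in snippets for m in pattern.finditer(s)]


# Hypothetical snippets mirroring the slide's example text.
snippets = [
    "Meet in Boston City Hall today",
    "Vote in Seattle City Hall now",
    "Study at Boston University soon",
    "Teach at Seattle University here",
    "Work at Carnegie-Mellon University too",
]
seeds = ["Boston", "Seattle", "Carnegie-Mellon"]

print(learn_wrappers(snippets, seeds))               # every seed required
print(learn_wrappers(snippets, seeds, min_seeds=2))  # lenient setting
```

Requiring all three seeds leaves only the "at ... University" context, which is the one the noisy seed also fits. Relaxing the requirement to two seeds additionally recovers the "in ... City Hall" context, which brackets only the two relevant seeds.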
Hinted Expander (HE)
Utilizes contexts in the question to constrain SEAL’s search space on the Web
  Extract up to three keywords from the question using Ephyra’s keyword extractor
  Append the keywords to the search query
Example: Name cities that have Starbucks.
More likely to find documents containing desired set of answers
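The query-building step of the Hinted Expander might look like the sketch below. Keyword extraction belongs to Ephyra and is not reproduced here, so the keywords are passed in directly. The function name is hypothetical, and the keywords "cities" and "Starbucks" are an assumed extraction for the example question, not output shown on the slides.

```python
def add_hint(query, keywords, max_keywords=3):
    """Append at most `max_keywords` question keywords to a seed query."""
    hint = list(keywords)[:max_keywords]
    return " ".join([query, *hint])


# Assumed keywords for "Name cities that have Starbucks."
keywords = ["cities", "Starbucks"]
print(add_hint("Boston Seattle", keywords))
```

The same function applies to any seed query, so it composes with either the original single query or the two-seed queries of the Aggressive Fetcher.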
Experiment #1: Ephyra
Evaluate on TREC 13, 14, and 15 datasets
  55, 93, and 89 list questions respectively
Use SEAL to expand top four answers from Ephyra
  Outputs a list of answers ranked by confidence scores
For each dataset, we report:
  Mean Average Precision (MAP)
    Mean of average precision for each ranked list
  Average F1 with Optimal Per-Question Threshold
    For each question, cut off the list at a threshold which maximizes the F1 score for that particular question
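Both measures can be computed per question and then averaged over a dataset. The sketch below makes two assumptions that the slides do not spell out: average precision is normalized by the total number of correct answers for the question, and cutting the ranked list at every possible length stands in for sweeping a confidence threshold (equivalent when scores are distinct). The answer lists are invented for illustration.

```python
def average_precision(ranked, relevant):
    """Sum of precision@k at each rank k holding a correct answer,
    divided by the number of correct answers (assumed normalization)."""
    hits, total = 0, 0.0
    for k, answer in enumerate(ranked, start=1):
        if answer in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0


def f1(returned, relevant):
    """Harmonic mean of precision and recall; answers are assumed distinct."""
    correct = len(set(returned) & set(relevant))
    if correct == 0:
        return 0.0
    precision = correct / len(returned)
    recall = correct / len(relevant)
    return 2 * precision * recall / (precision + recall)


def optimal_f1(ranked, relevant):
    """Best F1 over all cut-off points of one question's ranked list."""
    return max((f1(ranked[:k], relevant) for k in range(1, len(ranked) + 1)),
               default=0.0)


def mean_over_questions(metric, questions):
    """Average a per-question metric over (ranked, relevant) pairs."""
    return sum(metric(r, rel) for r, rel in questions) / len(questions)


# Invented example: one question with four ranked answers.
ranked = ["Seattle", "Boston", "Google", "Chicago"]
relevant = {"Seattle", "Boston", "Chicago", "Pittsburgh"}
print(average_precision(ranked, relevant))
print(optimal_f1(ranked, relevant))
```

Because the cut-off is chosen separately for each question using the gold answers, the optimal F1 is an upper bound on what a fixed threshold can achieve.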
Experiment #1: Ephyra
[Figure: Mean Average Precision. Mean Avg. Precision (%) by TREC dataset (TREC 13, TREC 14, TREC 15) for Ephyra, Ephyra's Top 4, SEAL, SEAL+LE, SEAL+LE+AF, and SEAL+LE+AF+HE.]
[Figure: F1 with Optimal Per-Question Threshold. Avg. Optimal F1 (%) by TREC dataset (TREC 13, TREC 14, TREC 15) for Ephyra, Ephyra's Top 4, SEAL, SEAL+LE, SEAL+LE+AF, and SEAL+LE+AF+HE.]
Experiment #2: Ephyra
In practice, thresholds are unknown
For each dataset, do 5-fold cross validation:
  Train: Find one optimal threshold for four folds
  Test: Use the threshold to evaluate the fifth fold
Introduce a fourth dataset: All
  Union of TREC 13, 14, and 15
Introduce another system: Hybrid
  Intersection of original answers from Ephyra and expanded answers from SEAL
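The trained-threshold protocol and the Hybrid system can be sketched as follows. Several details are assumptions, since the slides do not give them: questions are assigned to folds round-robin, candidate thresholds are the confidence scores seen in the training folds, the reported number is the mean of the per-fold averages, and Hybrid keeps SEAL's scores and order for the answers both systems share. The toy data is invented.

```python
from statistics import mean


def f1(returned, relevant):
    """Harmonic mean of precision and recall; answers are assumed distinct."""
    correct = len(set(returned) & relevant)
    if correct == 0:
        return 0.0
    precision, recall = correct / len(returned), correct / len(relevant)
    return 2 * precision * recall / (precision + recall)


def avg_f1(questions, threshold):
    """Mean F1 when every question's answer list is cut at one threshold."""
    return mean([f1([a for a, s in scored if s >= threshold], relevant)
                 for scored, relevant in questions])


def train_threshold(questions):
    """Single threshold that maximizes mean F1 on the training questions."""
    candidates = sorted({s for scored, _ in questions for _, s in scored})
    return max(candidates, key=lambda t: avg_f1(questions, t))


def cross_validate(questions, k=5):
    """k-fold CV; needs at least k questions, each with scored answers."""
    folds = [questions[i::k] for i in range(k)]
    fold_scores = []
    for i, test in enumerate(folds):
        train = [q for j, fold in enumerate(folds) if j != i for q in fold]
        fold_scores.append(avg_f1(test, train_threshold(train)))
    return mean(fold_scores)


def hybrid(qa_answers, expanded):
    """Expanded (answer, score) pairs that the QA system also returned."""
    keep = set(qa_answers)
    return [(a, s) for a, s in expanded if a in keep]


# Invented data: each question is (scored answers, set of correct answers).
questions = [([("a", 0.9), ("b", 0.2)], {"a"}) for _ in range(5)]
print(train_threshold(questions))
print(cross_validate(questions))
print(hybrid(["Boston", "Seattle", "Aquafina"],
             [("Seattle", 0.8), ("Chicago", 0.6), ("Boston", 0.5)]))
```

Unlike the per-question optimum, the threshold here never sees the gold answers of the fold it is evaluated on, which is why these scores are the more realistic ones.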
Experiment #2: Ephyra
[Figure: F1 with Trained Threshold. Avg. F1 (%) by TREC dataset (TREC 13, TREC 14, TREC 15, All) for Ephyra, SEAL+LE+AF+HE, and Hybrid.]
Experiment: Other QA Systems
Top five QA systems that perform the best on list questions in TREC 15 evaluation
1. Language Computer Corporation (lccPA06)
2. The Chinese University of Hong Kong (cuhkqaepisto)
3. National University of Singapore (NUSCHUAQA1)
4. Fudan University (FDUQAT15A)
5. National Security Agency (QACTIS06C)
For each QA system, train thresholds for SEAL and Hybrid on the union of TREC 13 and 14
Expand top four answers from the QA systems on TREC 15, and apply the trained threshold
Experiment: Top QA Systems
[Figure: F1 with Trained Threshold. Average F1 (%) per QA system, with lccPA06 on its own panel (axis 30% to 46%) and cuhkqaepisto, NUSCHUAQA1, FDUQAT15A, QACTIS06C on a second panel (axis 12% to 22%); series: Baseline, Top 4 Ans., Google Sets, SEAL+LE+AF+HE, and Hybrid.]
Conclusion
A feasible method for integrating an SE approach into any QA system
Proposed SE approach is effective
  Improves QA systems on list questions by using only a few top answers as seeds
Proposed hybrid system is effective
  Improves Ephyra and (most) top five QA systems