question answering what's next?mandl/neu/rijkehildesheim2005_bw.pdf · documents, spoken...
Question Answering What's Next?
Maarten de Rijke
Informatics Institute
University of Amsterdam
Focused Retrieval
• What is it?
• Why?
• How?
Information
• Avalanche
  • Internet
  • Intelligence
  • Science (astronomy, biomedical, …)
  • Desktop
• Growing at an increasing pace
  • 1999: 250 MB for each person on earth
  • 2002: 800 MB for each person on earth
Challenges
• Pinpointing
  • Highly relevant, highly specific retrieval
• Retrieving not just documents
  • People, objects, places, services, answers, …
• Multiple languages, tasks, information needs, types of users
Facing the challenge
• Language understanding?
• Method
  • Mix theory, experiment, application
• Task-driven
  • Real users, real information needs, real data, or at least “approximations”
QA as pinpointing
• Given a document collection and a question, return the answer
• Attractive task
  • Opportunities for language processing in information access?
  • Opportunities for knowledge representation in information access?
• Combine IR, NLP, AI
• People ask questions!
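People ask questions
• Search engine log file, Sep 22, 2004–Jan 9, 2005: 10M searches
• [Chart: queries vs. questions in the log; slice values 1% and 99%]
• [Chart: distribution of questions from the log over How, What, When, Where, Why, Who, Other; slice values in extraction order: 9%, 4%, 4%, 5%, 3%, 26%, 48%]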
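A distribution like the one on this slide could be computed by bucketing each logged search by its leading question word. The sketch below is a minimal illustration under that assumption, not the procedure used for the talk: the sample log, the function names, and the "not a question" label are hypothetical, and questions that do not start with one of the six words (the chart's "Other" slice) are not detected by this heuristic.

```python
from collections import Counter
from typing import List, Optional

# Question words taken from the chart legend on this slide.
QUESTION_WORDS = ("how", "what", "when", "where", "why", "who")


def leading_question_word(query: str) -> Optional[str]:
    """Return the query's first token if it is a question word, else None."""
    tokens = query.lower().split()
    if tokens and tokens[0] in QUESTION_WORDS:
        return tokens[0]
    return None


def question_word_distribution(queries: List[str]) -> Counter:
    """Count searches per leading question word; the rest go to one bucket."""
    counts: Counter = Counter()
    for query in queries:
        counts[leading_question_word(query) or "not a question"] += 1
    return counts


# Hypothetical mini-log: three questions quoted later in the talk,
# plus one invented keyword query.
sample_log = [
    "how to speed up xp",
    "what is pink noise",
    "where does moss grow",
    "cheap flights amsterdam",
]
distribution = question_word_distribution(sample_log)
```

Dividing each count by the total number of searches gives the shares that a pie chart would display.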
QA history
• Around since the 1960s
  • Natural language front-end for DBs
  • LUNAR
• Focus on open-domain since 1990s
  • MURAX (Kupiec 1993)
  • Hirschman: reading comprehension
  • TREC QA: 1999–…
TREC
• Text REtrieval Conference
  • Launched in 1992
  • Large general full text files
  • Information analyst needs
• Zillions of tasks (over the years)
  • Ad-hoc retrieval, routing/filtering, cross-language, scanned documents, spoken documents, video, NLP, web, question answering, novelty, robust, genomics, HARD, terabyte, enterprise
QA at TREC
• Launched at TREC-8, 1999
• Recent years
  • Focus on factoids (“NE retrieval”)
    • Single exact answer with supporting doc
  • Variations on factoids (“lists”)
  • Definitions (2003)
    • Return “nuggets”
  • “Other” questions (2004)
Factoid examples
• Factoids
  • Who discovered Oxygen?
  • When did Hawaii become a state?
  • Where is Ayers Rock located?
  • What team won the World Series in 1992?
• Lists
  • List the names of chewing gums
  • What governments still officially recognize and support International Labor Day?
Definitions/“Others”
• Definition
  • Who is Aaron Copland?
  • What is a quasar?
• “Other” organized around a target
  • Target: AmeriCorps
  • Factoid: How many volunteers work for it?
  • List: What activities are its volunteers involved in?
  • Other: Tell me other interesting things about this target I didn’t know enough to ask directly
QA dimensions
• Nature of the information
  • Structured, semistructured, unstructured
• Nature of the questions
  • Factoids, definitionoids, procedures, opinions
• Nature of the answer
  • Extracted snippet, generated text
• Nature of the technique
  • Linguistically sophisticated, data-driven
So?
Let’s step back
TREC QA ingredients
a. Questions: general purpose
b. Data sources: open domain
c. Answers: single, one shot, no follow-up
d. Users/user models/…
Issues
• What’s the nature of the user’s need?
• What’s the appropriate response?
• What’s the appropriate data source?
Assumptions …
• … about questions
  • Forget the information analyst (who comes with a rich context, set of tools, scenario, task, …, none of which is modeled in the TREC setting)
  • “Ad hoc” setting: any topic, any user, general purpose functionality
What do people ask?
• Back to the log files…
• Respectable amount of TREC factoids, but not the majority
  • how tall is christina aquilera
  • what does mig welding stand for
  • where does sean hannity live?
  • how long does ritalin stay in your bloodstream?
  • where does moss grow
• Lots of procedural questions
  • how to repair scratches in leather
  • how do you transfer money from one bank to another?
  • how to speed up xp
  • how to cook a sweet potatoe
  • how to report a fraudulant bankruptcy claim
  • jet engine how it works
• Large amount of definitionoids
  • what is pink noise
  • what is a catapult
  • what is a pictograph
  • define social justice
  • what is a rational number?
• Gems, many gems
  • how to understand women
  • how to stop your dog from pooping on your stairs
  • almost everyone sees me without noticing me, for what is beyond is what he or she seeks. what am i?
  • how smart are blonds
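The three main groups on this slide can be told apart, roughly, by surface patterns. The sketch below is only an illustration under that assumption: the regular expressions and the labels are hypothetical, were written to fit the example queries quoted on the slide, and say nothing about how the log was actually analyzed.

```python
import re

# Hypothetical surface patterns, checked in order; the first match wins.
PATTERNS = [
    ("procedural", re.compile(r"^how (to|do you|do i)\b|\bhow it works$")),
    ("definition", re.compile(r"^(what is|what are|define)\b")),
    ("factoid", re.compile(r"^(who|when|where|how (tall|long|many|much|fast)|what)\b")),
]


def classify_question(query: str) -> str:
    """Assign a coarse question type based on the query's wording."""
    text = query.lower().strip().rstrip("?").strip()
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return "other"


# Example queries quoted on the slide (spelling as in the log).
examples = [
    "how tall is christina aquilera",
    "jet engine how it works",
    "define social justice",
    "how smart are blonds",
]
labels = [classify_question(q) for q in examples]
```

Pattern order matters here: "what is pink noise" would also match the broad factoid pattern, so the definition pattern is tried first.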
Recommendation
• If you’re interested in being able to answer questions that Real People ask
  • Do factoids
  • Do definitionoids
  • Do procedural questions
Assumptions …
• … about answers
  • Answers need not match to be helpful
  • Users apply their own selection
  • Answers are not end-points, but stages in a berry-picking process
Example
• Where is the Rijksmuseum located?
  • In a building designed by Cuijpers
  • Across Museumplein from the Concertgebouw
  • You can take Tram 5 to get there
  • Jan Luijkenstraat in Amsterdam
  • The Rijksmuseum van Oudheden (RMO) is the national museum of antiquities at Leiden
• What is the right answer?
Value of unsought info
• What’s the average rent for an apartment in Amsterdam?
  • Try this site: http://www.expatriates.com/classifieds/amst/
• Who was Theo van Gogh?
  • Van Gogh made the movie ‘Submission’ in collaboration with Ayaan Hirsi Ali, a Dutch politician and former Somali refugee
Recommendations
• Return ranked lists of answers
• Answers in context so that users (assessors) can determine validity
• Continue with “other” questions
• Work on novelty and importance models
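The first two recommendations could be sketched as a small data structure: each candidate answer carries the passage it came from, and the system returns the top few candidates rather than a single string. Everything below is a hypothetical illustration; the `Answer` fields, the scores, and the passage placeholders are assumptions, with only the answer strings taken from the Rijksmuseum example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Answer:
    text: str      # the candidate answer string
    context: str   # passage shown so a user (or assessor) can judge validity
    score: float   # hypothetical confidence; higher is better


def rank_answers(candidates: List[Answer], k: int = 3) -> List[Answer]:
    """Return the k highest-scoring candidates, best first."""
    return sorted(candidates, key=lambda a: a.score, reverse=True)[:k]


# Answer strings from the Rijksmuseum example; contexts and scores are invented.
candidates = [
    Answer("You can take Tram 5 to get there", "hypothetical passage A", 0.4),
    Answer("Jan Luijkenstraat in Amsterdam", "hypothetical passage B", 0.9),
    Answer("Across Museumplein from the Concertgebouw", "hypothetical passage C", 0.7),
]
ranked = rank_answers(candidates, k=2)
```

Returning a list leaves the final selection to the user, which fits the view of answers as stages in a berry-picking process.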
Assumptions …
• … about documents
  • Whatever we define our user or user model to be, the data sources should be “appropriate” for the intended questions
• Huh?
What data source?
• To look up an angioprim?
• To find out how fast cheetahs run?
• To find out about UK economic growth?
• To determine when Mozart was born?
• To repair a cold air leak in Mazda?
Data sources
• The web
• Or heterogeneous
  • dictionary for definitions
  • manuals/FAQs for procedural
  • encyclopedia/scientific papers/… for general knowledge questions
  • newspapers for current events
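With a heterogeneous collection, choosing the source becomes a step of its own. The sketch below shows one minimal way to express the pairing listed on this slide as a lookup table. The type labels, the function name, and the choice to fall back to the web for unknown types are assumptions made for illustration; the slide presents the web as an alternative, not as a fallback.

```python
from typing import Dict, List

# Pairing of question type and source as listed on the slide.
# The type labels themselves are hypothetical.
SOURCES_BY_TYPE: Dict[str, List[str]] = {
    "definition": ["dictionary"],
    "procedural": ["manuals", "FAQs"],
    "general knowledge": ["encyclopedia", "scientific papers"],
    "current events": ["newspapers"],
}

# Assumption: unknown question types are sent to the web.
FALLBACK_SOURCES: List[str] = ["the web"]


def select_sources(question_type: str) -> List[str]:
    """Return the data sources to search for a given question type."""
    return SOURCES_BY_TYPE.get(question_type, FALLBACK_SOURCES)


procedural_sources = select_sources("procedural")
unknown_sources = select_sources("opinion")
```

A real system would need a question classifier in front of this table, and mistakes at that stage propagate directly into resource selection.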
Recommendations
• Avoid document-question mismatch
• Do Web QA
• Or do QA against a heterogeneous collection, where resource selection is a key part of the task
Question Answering
What’s Next?
QA evaluations in 2005
• TREC
  • Pretty much the same as in 2004
• CLEF
  • Debating scenario, doc collection, type of questions, answer format, …
• INEX
  • …
INEX
• What is INEX?
  • INitiative for the Evaluation of XML retrieval
  • Launched in 2002
• Two types of information needs
  • CO: content only
  • CAS: content and structure
• IEEE computer science journals
• Provide focused access by returning elements, not documents
• $1M question: Does structure help?
QA at INEX
• Organized around a user model
  • Undergraduate student writing essays
• Questions
  • Factoids, definitions, procedural, …
• Documents
  • Semistructured, scientific articles, encyclopedia, dictionaries
• Answers highlighted in context
QA at UAms
What’s Next?
QA at UAms
• Participation
  • TREC, CLEF, INEX
• Coordination
  • INEX, Dutch QA for CLEF
• Research
  • Multi-stream architecture for QA
  • …
• QA research at UAms
  • …
  • NERC and IE for QA (emphasis on Dutch)
  • QA against Wikipedia and similar resources
  • FAQ mining and retrieval
  • Developing novelty and importance models
  • …
• Considering open-sourcing a “dressed down” QA system
Question Answering
What’s Next?
• Lots
• And it will be great
• And you should be part of it