Question Answering: From Zero to Hero
Elena Eneva, 11 Oct 2001, Advanced IR Seminar
Sources
- TREC-9. 2001. http://la.lti.cs.cmu.edu/Javelin
- V: E. Voorhees. "The Overview of the TREC-9 Question Answering track."
- P: J. Prager, E. Brown, A. Coden and D. Radev. "Question answering by predictive annotation." SIGIR '00.
- C: C.L.A. Clarke, G.V. Cormack and T.R. Lynam. "Exploiting redundancy in question answering." In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2001.
Question Answering
- IR (information retrieval)
  - Successful in large-scale text search problems
  - Retrieves full documents
- IE (information extraction)
  - Successful in extracting very precise answers from text
  - Works on pre-specified domains
- Combining the strengths
QA track in TREC
- Collection of unstructured documents (Table 1 in V)
- Short factual questions in English ("Why can't ostriches fly?", "Where did Bill Gates go to college?"), also Figure 1 in V
- Return the answer as a ranked list of 5 document fragments (2 categories: 50 and 250 bytes)
Evaluation
- Judged by people
- Reciprocal rank of the first correct answer, or 0 if no correct answer is returned
- Percentage of questions for which an answer was found
- Strict and lenient scores (supported and unsupported judgment)
- Short and long version
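The scoring above can be sketched in a few lines of Python. This is an illustrative sketch, not the official TREC scoring code: the function names and the boolean-list input format are assumptions, and averaging the reciprocal rank over all questions is assumed here as the way per-question scores are combined.

```python
def reciprocal_rank(judgments):
    """Score one question.

    `judgments` holds one boolean per returned fragment, best-ranked first
    (at most 5 in the track setup). Returns 1/rank of the first correct
    fragment, or 0.0 if none is correct.
    """
    for rank, correct in enumerate(judgments, start=1):
        if correct:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(all_judgments):
    """Average the per-question reciprocal ranks (assumed aggregation)."""
    return sum(reciprocal_rank(j) for j in all_judgments) / len(all_judgments)


def fraction_answered(all_judgments):
    """Fraction of questions with at least one correct fragment returned."""
    return sum(any(j) for j in all_judgments) / len(all_judgments)


# Hypothetical judgments for three questions.
runs = [
    [False, True, False, False, False],   # first correct answer at rank 2
    [True, False, False, False, False],   # first correct answer at rank 1
    [False, False, False, False, False],  # no correct answer
]
print(mean_reciprocal_rank(runs), fraction_answered(runs))
```

Strict and lenient scores would use the same functions, only with a different set of judgments as input.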
2 QA TREC systems
- Question Answering by Predictive Annotation: Prager, Brown, Coden (IBM) and Radev (U of Michigan)
- Exploiting Redundancy in Question Answering: Clarke, Cormack, Lynam (U of Waterloo)
- Ranking: Table 2 in V
Exploiting Redundancy in Question Answering
- Question ->
  - a query for submission to a passage retrieval component
  - a set of selection rules that guide the process of extracting answers from the passages (answer category)
- Get a list of k passages
- Identify possible answers
- Rank the possible answers
- Figure 1 in C
- Question analysis – IR – IE
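The first two stages of this flow (question analysis, then retrieval of k passages) can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the wh-word table, the stopword list, the category names, and the overlap-based ranking are hypothetical and far simpler than the parser and passage retrieval described in C.

```python
import re

# Hypothetical wh-word -> answer category table; the slides do not list the real rules.
CATEGORY_BY_WH_WORD = {"who": "PERSON", "where": "PLACE", "when": "DATE"}

# Tiny illustrative stopword list, not taken from the source.
STOPWORDS = {"who", "where", "when", "what", "why", "how", "did", "do", "does",
             "is", "was", "the", "a", "an", "to", "of", "in"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def analyze_question(question):
    """Turn a question into (query terms, answer category)."""
    tokens = tokenize(question)
    category = "OTHER"
    for tok in tokens:
        if tok in CATEGORY_BY_WH_WORD:
            category = CATEGORY_BY_WH_WORD[tok]
            break
    query = [t for t in tokens if t not in STOPWORDS]
    return query, category


def retrieve_passages(query, passages, k):
    """Return the k passages sharing the most distinct terms with the query."""
    def overlap(passage):
        return len(set(query) & set(tokenize(passage)))
    return sorted(passages, key=overlap, reverse=True)[:k]


query, category = analyze_question("Where did Bill Gates go to college ?")
passages = [
    "Bill Gates attended Harvard College.",
    "Ostriches cannot fly.",
    "Gates founded Microsoft.",
]
print(category, query, retrieve_passages(query, passages, k=2))
```

The answer category would then drive the third stage, where candidate answers are extracted from the retrieved passages and ranked.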
3 features with greatest contribution
- Flexibility of the parser
- Passage retrieval technique (high quality passages)
- Redundancy in the answer selection component: contribution of evidence from multiple passages to identify the most likely answer
Passage Retrieval techniques
- Each document D is an ordered sequence of terms: $D = d_1 d_2 d_3 \ldots d_m$
- Extent $(u, v)$: the subsequence $d_u \ldots d_v$ (minimal)
- Query Q generated from the question: $Q = \{q_1, q_2, q_3, \ldots\}$
- Compute the score for each extent $(u, v)$ that is a cover for a term set $T \subseteq Q$
- Higher scores go to passages whose probability of occurrence is lower
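A minimal Python sketch of the extent and cover ideas above. The scoring formula is an assumption: the slide only says that less probable passages score higher, so the sketch uses the self-information of an extent under the simplifying assumption that each term occurs independently with probability $f_t / N$ per position, which gives $\sum_{t \in T} \log(N / f_t) - |T| \log(v - u + 1)$. The function names and the parameters `freq` (corpus frequency per term) and `n_tokens` (corpus size N) are hypothetical.

```python
import math


def minimal_covers(tokens, terms):
    """Find all minimal extents (u, v) of `tokens` containing every term in `terms`.

    An extent is minimal when no shorter extent inside it also contains
    every term. Positions are 0-based and inclusive.
    """
    last_seen = {}  # term -> most recent position
    covers = []
    prev_start = -1
    for v, tok in enumerate(tokens):
        if tok not in terms:
            continue
        last_seen[tok] = v
        if len(last_seen) == len(terms):
            u = min(last_seen.values())
            # The start only moves right; a new start means a new minimal cover.
            if u > prev_start:
                covers.append((u, v))
                prev_start = u
    return covers


def cover_score(u, v, terms, freq, n_tokens):
    """Assumed score: rarer terms and shorter extents give higher values."""
    length = v - u + 1
    rarity = sum(math.log(n_tokens / freq[t]) for t in terms)
    return rarity - len(terms) * math.log(length)


doc = "a x x b a b".split()
T = {"a", "b"}
freq = {"a": 10, "b": 10}  # made-up corpus frequencies
for u, v in minimal_covers(doc, T):
    print((u, v), cover_score(u, v, T, freq, n_tokens=1000))
```

Running this over every subset T of the query terms and keeping the best-scoring covers would give a ranked list of passages; that outer loop is left out to keep the sketch short.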
Redundancy
- Each candidate term t is assigned a weight that takes into account the number of distinct passages in which the term appears, as well as the relative frequency of the term in the database
- $W_t = C_t \log(N / f_t)$
- $C_t$ is the number of distinct passages in which t appears; $f_t / N$ is the relative frequency of t in the database
- Score a candidate answer by summing the weights of all its terms
- Select the top-scoring candidate as the first answer, reduce the weights of its terms to 0, and repeat until there are 5 answers
- Figure 2 in C
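The weighting and the select-then-zero loop can be sketched in Python as follows. This is a sketch under stated assumptions, not the system's implementation: passages are modeled as token lists, candidate answers as tuples of terms, and the function names, toy passages, and corpus frequencies are all made up for illustration.

```python
import math
from collections import Counter


def term_weights(passages, corpus_freq, corpus_size):
    """W_t = C_t * log(N / f_t), with C_t counted over distinct passages."""
    passage_counts = Counter()
    for passage in passages:
        passage_counts.update(set(passage))  # count each term once per passage
    return {t: c * math.log(corpus_size / corpus_freq[t])
            for t, c in passage_counts.items()}


def select_answers(candidates, weights, k=5):
    """Greedily pick up to k candidates, zeroing the weights of chosen terms."""
    weights = dict(weights)  # work on a copy
    remaining = list(candidates)
    answers = []
    while remaining and len(answers) < k:
        best = max(remaining, key=lambda cand: sum(weights.get(t, 0.0) for t in cand))
        answers.append(best)
        remaining.remove(best)
        for t in best:
            weights[t] = 0.0  # these terms no longer contribute to later picks
    return answers


# Toy data: three retrieved passages and made-up corpus statistics.
passages = [
    ["abraham", "lincoln", "president"],
    ["lincoln", "was", "president"],
    ["john", "wilkes", "booth", "shot", "lincoln"],
]
corpus_freq = {"abraham": 1000, "lincoln": 1000, "president": 50000, "was": 500000,
               "john": 20000, "wilkes": 100, "booth": 500, "shot": 5000}
candidates = [("abraham", "lincoln"), ("lincoln", "president"), ("john", "wilkes", "booth")]

weights = term_weights(passages, corpus_freq, corpus_size=1_000_000)
print(select_answers(candidates, weights))
```

Zeroing the weights keeps the ranked list from filling up with near-duplicates: once "lincoln" has been used in the first answer, a second candidate containing it has to earn its rank from its other terms.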
Exploiting redundancy
- "Who" questions
- 100 GB corpus
- K depth, W width
- Figure 2 in C
Who wants to be a Millionaire?
- Real life example
- 70% correct overall
- Figure 5 in C
Question answering by predictive annotation
- IBM system
- Shallow NLP
- System structure: Figure 1 in P
- Annotation Indexing