UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl) UvA-DARE (Digital Academic Repository) From document retrieval to question answering Monz, C. Publication date 2003 Link to publication Citation for published version (APA): Monz, C. (2003). From document retrieval to question answering. Institute for Logic, Language and Computation. General rights It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons). Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible. Download date:10 Jul 2021

    https://dare.uva.nl/personal/pure/en/publications/from-document-retrieval-to-question-answering(1581f2db-87bf-4496-aa14-201a48aff256).html

  • E. Agichtein, S. Lawrence, and L. Gravano. Learning search engine specific query transformations for question answering. In Proceedings of the 10th International World Wide Web Conference (WWW10), pages 169-178, 2001. 121

    E. Agichtein, S. Lawrence, and L. Gravano. Learning to find answers to questions on the web. ACM Transactions on Internet Technology, to appear. 121

    AltaVista, http://www.altavista.com/. 1

    I. Androutsopoulos, G. Ritchie, and P. Thanisch. Natural language interfaces to databases—An introduction. Natural Language Engineering, 1(1):29-81, 1995. 2

    Ask Jeeves, http://www.askjeeves.com/. 12

    G. Attardi, A. Cisternino, F. Formica, M. Simi, and A. Tommasi. PiQASso: Pisa question answering system. In E. Voorhees and D. Harman, editors, Proceedings of the Tenth Text REtrieval Conference (TREC 2001), pages 633-641. NIST Special Publication 500-250, 2001. 33

    G. Attardi, A. Cisternino, F. Formica, M. Simi, and A. Tommasi. Web suggestions and robust validation for QA. In Notebook of the 11th Text REtrieval Conference (TREC 2002), pages 714-723. NIST Publication, 2002. 32

    R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. Addison Wesley, 1999. 36

    N. Belnap and T. Steel. The Logic of Questions and Answers. Yale University Press, 1976. 20

    D. Bikel, R. Schwartz, and R. Weischedel. An algorithm that learns what's in a name. Machine Learning, 1(3):211-231, 1999. 31

    F. Black. A Deductive Question-Answering System. PhD thesis, Division of Engineering and Applied Physics, Harvard University, 1964. 36

    E. Breck, J. Burger, L. Ferro, L. Hirschman, D. House, M. Light, and I. Mani. How to evaluate your question answering system every day and still get real work done. In Proceedings of the 2nd Conference on Language Resources and Evaluation (LREC-2000), 2000. 84

  • Bibliography

    L. Breiman. Bagging predictors. Machine Learning, 24(2):123-140, 1996. 117

    L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Wadsworth and Brooks, 1984. 109, 113

    E. Brill, S. Dumais, and M. Banko. An analysis of the AskMSR question-answering system. In Proceedings of Empirical Methods in Natural Language Processing (EMNLP 2002), pages 257-264, 2002. 90, 91

    E. Brill, J. Lin, M. Banko, S. Dumais, and A. Ng. Data-intensive question answering. In E. Voorhees and D. Harman, editors, Proceedings of the 10th Text REtrieval Conference (TREC 2001), pages 393-400. NIST Special Publication 500-250, 2001. 32

    W. Bronnenberg, H. Bunt, J. Landsbergen, R. Scha, W. Schoenmakers, and E. van Utteren. The question answering system PHLIQA1. In L. Bolc, editor, Natural Language Question Answering Systems, pages 217-305. MacMillan, 1980. 2, 29

    S. Buchholz and W. Daelemans. Complex answers: a case study using a WWW question answering system. Natural Language Engineering, 7(4):301-323, 2001. 32

    C. Buckley, G. Salton, J. Allan, and A. Singhal. Automatic query expansion using SMART: TREC 3. In D. Harman, editor, Proceedings of the Third Text REtrieval Conference (TREC-3), pages 69-80. NIST Special Publication 500-215, 1994. 90

    C. Buckley, A. Singhal, and M. Mitra. New retrieval approaches using SMART: TREC 4. In D. Harman, editor, Proceedings of the Fourth Text REtrieval Conference (TREC-4), pages 25-48. NIST Special Publication 500-236, 1995. 57, 58, 68, 71

    C. Buckley and E. Voorhees. Evaluating evaluation measure stability. In N. Belkin, P. Ingwersen, and M. Leong, editors, Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 33-40, 2000. 48, 49

    C. Buckley and J. Walz. SMART in TREC 8. In E. Voorhees and D. Harman, editors, Proceedings of the Eighth Text REtrieval Conference (TREC-8), pages 577-582. NIST Special Publication 500-246, 1999. 47

    J. Burger, F. Ferro, W. Greiff, J. Henderson, S. Mardis, A. Morgan, and M. Light. MITRE's Qanda at TREC-11. In E. Voorhees and L. Buckland, editors, Proceedings of the Eleventh Text REtrieval Conference (TREC 2002), pages 457-466. NIST Special Publication 500-251, 2002. 43

    R. Burke, K. Hammond, V. Kulyukin, S. Lytinen, N. Tomuro, and S. Schoenberg. Question answering from frequently-asked question files: Experiences with the FAQ Finder system. AI Magazine, 18(2):57-66, 1997. 12

    J. Callan. Passage-retrieval evidence in document retrieval. In B. Croft and C. van Rijsbergen, editors, Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 302-310, 1994. 51, 63

    H. Chen, G. Shankaranarayanan, L. She, and A. Iyer. A machine learning approach to inductive query by examples: An experiment using relevance feedback, ID3, genetic algorithms, and simulated annealing. Journal of the American Society for Information Science, 49(8):693-705, 1998. 91

    J. Chu-Carroll, J. Prager, C. Welty, K. Czuba, and D. Ferrucci. A multi-strategy and multi-source approach to question answering. In Notebook of the 11th Text REtrieval Conference (TREC 2002), pages 124-133. NIST Publication, 2002. 50

    C. Clarke and G. Cormack. Shortest substring ranking and retrieval. ACM Transactions on Information Systems, 18(1):44-78, 2000. 70

    C. Clarke, G. Cormack, G. Kemkes, M. Laszlo, T. Lynam, E. Terra, and P. Tilke. Statistical selection of exact answers (MultiText experiments for TREC 2002). In Notebook of the 11th Text REtrieval Conference (TREC 2002), pages 162-170. NIST Publication, 2002a. 10, 50, 70

    C. Clarke, G. Cormack, D. Kisman, and T. Lynam. Question answering by passage selection (MultiText experiments for TREC-9). In E. Voorhees and D. Harman, editors, Proceedings of the Ninth Text REtrieval Conference (TREC-9), pages 673-683. NIST Special Publication 500-249, 2000a. 50, 70

    C. Clarke, G. Cormack, M. Laszlo, T. Lynam, and E. Terra. The impact of corpus size on question answering performance. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 369-370, 2002b. 10

    C. Clarke, G. Cormack, and E. Tudhope. Relevance ranking for one to three term queries. Information Processing and Management, 36(2):291-311, 2000b. 69

    C. Clarke and E. Terra. Passage retrieval vs. document retrieval for factoid question answering. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 427-428, 2003. 45, 149

    P. Cohen. Empirical Methods for Artificial Intelligence. MIT Press, 1995. 55

    A. Colmerauer. Metamorphosis grammars. In Natural Language Communication with Computers, pages 33-189. Springer, 1978. 37

    W. Conover. Practical Nonparametric Statistics. John Wiley and Sons, 2nd edition, 1980. 54

    W. Cooper. Some inconsistencies and misidentified modeling assumptions in probabilistic information retrieval. ACM Transactions on Information Systems, 13(1):100-111, 1995. 94

    W. Cooper, A. Chen, and F. Gey. Full text retrieval based on probabilistic equations with coefficients fitted by logistic regression. In D. Harman, editor, Proceedings of the 2nd Text REtrieval Conference (TREC-2), pages 57-66. NIST Special Publication 500-215, 1993. 91

    G. Cormack, C. Clarke, C. Palmer, and D. Kisman. Fast automatic passage ranking. In E. Voorhees and D. Harman, editors, Proceedings of the Eighth Text REtrieval Conference (TREC-8), pages 735-742. NIST Special Publication 500-246, 1999. 70

    N. Craswell and D. Hawking. Overview of the TREC 2002 web track. In Notebook of the 11th Text REtrieval Conference (TREC 2002), pages 248-257. NIST Publication, 2002. 87, 152

    A. Davison and D. Hinkley. Bootstrap Methods and Their Application. Cambridge University Press, 1997. 55

    A. Davison and D. Kuonen. An introduction to the bootstrap with applications in R. Statistical Computing and Graphics Newsletter, 13(1):6-11, 2002. 55

    O. de Kretser and A. Moffat. Effective document presentation with a locality-based similarity heuristic. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 113-120, 1999a. 69, 70

    O. de Kretser and A. Moffat. Locality-based information retrieval. In Proceedings of the 10th Australasian Database Conference, pages 177-188, 1999b. 69, 70

    T. Dietterich. Machine learning research: Four current directions. AI Magazine, 18(4):97-136, 1997. 117

    P. Domingos and M. Pazzani. On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29(2-3):103-130, 1997. 109

    R. Duda and P. Hart. Pattern Classification and Scene Analysis. John Wiley and Sons, 1973. 107

    S. Dumais, M. Banko, E. Brill, J. Lin, and A. Ng. Web question answering: is more always better? In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 291-298, 2002. 10

    B. Efron. Bootstrap methods: Another look at the jackknife. Annals of Statistics, 7(1):1-26, 1979. 54

    B. Efron and R. Tibshirani. An Introduction to the Bootstrap. Chapman and Hall, 1993. 54

    F. Elbe, Y. Wang, S. Inglis, G. Holmes, and I. Witten. Using model trees for classification. Machine Learning, 32(1):63-76, 1998. 109

    Encarta. Encarta encyclopedia, http://encarta.msn.com/. 12

    Excite. Excite search engine, http://www.excite.com/. 12

    R. Fano. Transmissions of Information: A Statistical Theory of Communications. MIT Press, 1961. 33

    U. Fayyad and K. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-93), pages 1022-1029, 1993. 108

    O. Ferret, B. Grau, M. Hurault-Plantet, G. Illouz, C. Jacquemin, N. Masson, and P. Lecuyer. QALC—The question-answering system of LIMSI-CNRS. In E. Voorhees and D. Harman, editors, Proceedings of the Ninth Text REtrieval Conference (TREC-9), pages 235-244. NIST Special Publication 500-249, 2000. 122

    C. Fillmore. The Case for Case. Holt, Rinehart and Winston, 1968. 34, 42

    M. Fleischman, E. Hovy, and A. Echihabi. Offline strategies for online question answering: Answering questions before they are asked. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL-2003), pages 1-7, 2003. 139

    E. Frank and M. Hall. A simple approach to ordinal classification. In L. De Raedt and P. Flach, editors, Proceedings of the 12th European Conference on Machine Learning (ECML 2001), Lecture Notes in Artificial Intelligence 2167, pages 145-156. Springer, 2001. 108

    E. Frank, L. Trigg, G. Holmes, and I. Witten. Naive bayes for regression. Machine Learning, 41(1):5-25, 2000. 109

    Y. Freund and R. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. In Proceedings of the 2nd European Conference on Computational Learning Theory, pages 23-37. Springer, 1995. 117

    Y. Freund and R. Schapire. Experiments with a new boosting algorithm. In L. Saitta, editor, Proceedings of the 13th International Conference on Machine Learning, pages 148-156. Morgan Kaufmann, 1996. 117

    J. Ginzburg. Interrogatives: Questions, facts and dialogue. In S. Lappin, editor, Handbook of Contemporary Semantic Theory, pages 385-422. Blackwell, 1995. 18

    Google, http://www.google.com/. 1

    A. Graesser and T. Murachver. Symbolic procedures of question answering. In A. Graesser and J. Black, editors, The Psychology of Questions, pages 15-88. Erlbaum, 1985. 22, 23, 24, 26, 41

    A. Graesser, N. Person, and J. Huber. Mechanisms that generate questions. In T. Lauer, E. Peacock, and A. Graesser, editors, Questions and Information Systems, pages 167-187. Lawrence Erlbaum Associates, 1992. 26

    B. Green, A. Wolf, C. Chomsky, and K. Laughery. Baseball: An automatic question answerer. In E. Feigenbaum and J. Feldman, editors, Computers and Thought, pages 207-216. McGraw-Hill, 1963. 2, 27

    M. Greenwood, I. Roberts, and R. Gaizauskas. The University of Sheffield TREC 2002 Q&A system. In Notebook of the 11th Text REtrieval Conference (TREC 2002), pages 724-733. NIST Publication, 2002. 50

    J. Groenendijk and M. Stokhof. Questions. In J. van Benthem and A. ter Meulen, editors, Handbook of Logic and Language, chapter 19, pages 1055-1124. Elsevier/MIT Press, 1997. 18

    Grolier. The Academic American Encyclopedia. Grolier Electronic Publishing, 1990. 35

    C. Hamblin. Questions. Australasian Journal of Philosophy, 36:159-168, 1958. 18

    S. Harabagiu and D. Moldovan. TextNet—a text-based intelligent system. Natural Language Engineering, 3(2):171-190, 1997. 27

    S. Harabagiu, D. Moldovan, M. Paşca, R. Mihalcea, M. Surdeanu, R. Bunescu, R. Girju, V. Rus, and P. Morarescu. The role of lexico-semantic feedback in open-domain textual question-answering. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (ACL-2001), pages 274-281, 2001. 9, 10, 33, 36

    D. Harman. Ranking algorithms. In R. Baeza-Yates and W. Frakes, editors, Information Retrieval: Data Structures & Algorithms, chapter 14, pages 363-392. Prentice Hall, 1992. 75

    D. Harrah. The logic of questions. In D. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, volume II, chapter 12, pages 715-764. Kluwer Academic Publishers, 1984. 18

    D. Hawking and P. Thistlewaite. Proximity operators—So near and yet so far. In D. Harman, editor, Proceedings of the Fourth Text REtrieval Conference (TREC-4), pages 131-143. NIST Special Publication 500-236, 1995. 69

    D. Hawking and P. Thistlewaite. Relevance weighting using distance between term occurrences. Technical Report TR-CS-96-08, Department of Computer Science, Australian National University, 1996. 69, 70

    D. Hays. Automatic language data processing. In H. Borko, editor, Computer Applications in the Behavioral Sciences, pages 394-423. Prentice-Hall, 1962. 32

    M. Hearst. Automated discovery of wordnet relations. In C. Fellbaum, editor, WordNet: An Electronic Lexical Database, chapter 5, pages 131-151. MIT Press, 1998. 139

    U. Hermjakob. Parsing and question classification for question answering. In Proceedings of the Workshop on Open-Domain Question Answering at ACL-2001, 2001. 6

    J. Hertz, R. Palmer, and A. Krogh. Introduction to the Theory of Neural Computation. Addison-Wesley, 1991. 107

    M. Hollander and D. Wolfe. Nonparametric Statistical Methods. John Wiley and Sons, 1973. 54

    E. Hovy, L. Gerber, U. Hermjakob, M. Junk, and C. Lin. Question answering in Webclopedia. In E. Voorhees and D. Harman, editors, Proceedings of the Ninth Text REtrieval Conference (TREC-9), pages 655-664. NIST Special Publication 500-249, 2000. 43

    E. Hovy, L. Gerber, U. Hermjakob, C. Lin, and D. Ravichandran. Toward semantics-based answer pinpointing. In Proceedings of the DARPA Human Language Technology conference (HLT-2001), pages 339-345, 2001. 122

    D. Hull. Using statistical testing in the evaluation of retrieval experiments. In Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 329-338, 1993. 54

    D. Hull. Stemming algorithms—a case study for detailed evaluation. Journal of the American Society for Information Science, 47(1):70-84, 1996. 49

    A. Ittycheriah, M. Franz, and S. Roukos. IBM's statistical question answering system—TREC-10. In E. Voorhees and D. Harman, editors, Proceedings of the Tenth Text REtrieval Conference (TREC 2001), pages 258-264. NIST Special Publication 500-250, 2001. 65

    B. Jansen, A. Spink, and T. Saracevic. Real life, real users, and real needs: A study and analysis of user queries on the web. Information Processing and Management, 36(2):207-227, 2000. 69

    L. Karttunen. Syntax and semantics of questions. Linguistics and Philosophy, 1(1):3-44, 1977. 19

    M. Kaszkiel and J. Zobel. Passage retrieval revisited. In Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 178-185, 1997. 50

    M. Kaszkiel and J. Zobel. Term-ordered query evaluation versus document-ordered query evaluation for large document databases. In B. Croft, A. Moffat, C. van Rijsbergen, R. Wilkinson, and J. Zobel, editors, Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 343-344, 1998. 75

    M. Kaszkiel and J. Zobel. Effective ranking with arbitrary passages. Journal of the American Society for Information Science and Technology, 52(4):344-364, 2001. 50

    B. Katz, J. Lin, and S. Felshin. Gathering knowledge for a question answering system from heterogeneous information sources. In Proceedings of the ACL 2001 Workshop on Human Language Technology and Knowledge Management, 2001. 33

    E. Keen. Term position ranking: Some new test results. In N. Belkin, P. Ingwersen, A. Pejtersen, and E. Fox, editors, Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 66-76, 1992. 69, 70

    S. Keenan, A. Smeaton, and G. Keogh. The effect of pool depth on system evaluation in TREC. Journal of the American Society for Information Science and Technology, 52(7):570-574, 2001. 47

    M. Kendall. A new measure of rank correlation. Biometrika, 30(1-2):81-93, 1938. 52

    L. Kitchens. Exploring Statistics: A Modern Introduction to Data Analysis and Inference. Brooks/Cole Publishing Company, 2nd edition, 1998. 54

    R. Kohavi and M. Sahami. Error-based and entropy-based discretization of continuous features. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pages 114-119, 1996. 108

    W. Kraaij and R. Pohlmann. Viewing stemming as recall enhancement. In Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 40-48, 1996. 59, 148

    J. Kupiec. MURAX: A robust linguistic approach for question answering using an on-line encyclopedia. In Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 181-190, 1993. 35

    C. Kwok, O. Etzioni, and D. Weld. Scaling question answering to the web. In Proceedings of the 10th World Wide Web Conference (WWW'10), pages 150-161, 2001a. 90

    K. Kwok, L. Grunfeld, N. Dinstl, and M. Chan. TREC-9 cross language, web and question answering track experiments using PIRCS. In E. Voorhees and D. Harman, editors, Proceedings of the Ninth Text REtrieval Conference (TREC-9), pages 419-429. NIST Special Publication 500-249, 2000. 70

    K. Kwok, L. Grunfeld, N. Dinstl, and M. Chan. TREC2001 question-answer, web and cross language experiments using PIRCS. In E. Voorhees and D. Harman, editors, Proceedings of the Tenth Text REtrieval Conference (TREC 2001), pages 452-465. NIST Special Publication 500-250, 2001b. 43

    W. Lehnert. The Process of Question Answering: A Computer Simulation of Cognition. Lawrence Erlbaum Associates, 1978. 24, 26, 38, 39, 41

    W. Lehnert. A computational theory of human question answering. In A. Joshi, B. Webber, and I. Sag, editors, Elements of Discourse Understanding, pages 145-176. Cambridge University Press, 1981. 38

    W. Lehnert. Cognition, computers and car bombs: How Yale prepared me for the 90's. In R. Schank and E. Langer, editors, Beliefs, Reasoning, and Decision Making: Psycho-logic in Honor of Bob Abelson, pages 143-173. Lawrence Erlbaum Associates, 1994. 41

    S. Levinson. Pragmatics. Cambridge University Press, 1983. 22

    W. Li. Question classification using language modeling. Technical Report IR-259, Center for Intelligent Information Retrieval, University of Massachusetts, 2002. 6

    X. Li and D. Roth. Learning question classifiers. In Proceedings of the 19th International Conference on Computational Linguistics (COLING 2002), pages 556-562, 2002. 151

    D. Lin. Dependency-based evaluation of minipar. In Proceedings of the Workshop on the Evaluation of Parsing Systems, 1998. 97, 137

    D. Lin and P. Pantel. Discovery of inference rules for question-answering. Natural Language Engineering, 7(4):343-360, 2001. 97

    J. Lin, A. Fernandes, B. Katz, G. Marton, and S. Tellex. Extracting answers from the web using data annotation and data mining techniques. In Notebook of the 11th Text REtrieval Conference (TREC 2002), pages 474-482. NIST Publication, 2002. 32

    J. Lin, D. Quan, V. Sinha, K. Bakshi, D. Huynh, B. Katz, and D. Karger. What makes a good answer? The role of context in question answering. In Proceedings of the Ninth IFIP TC13 International Conference on Human-Computer Interaction (INTERACT-2003), 2003. 151

    F. Llopis, A. Ferrandez, and J. Vicedo. Passage selection to improve question answering. In Proceedings of the COLING 2002 Workshop on Multilingual Summarization and Question Answering, 2002. 45, 52, 63, 68

    H. Luhn. The automatic creation of literature abstracts. IBM Journal of Research and Development, 2(2):159-165, 1958. 68, 70

    B. Magnini, M. Negri, R. Prevete, and H. Tanev. Is it the right answer? Exploiting web redundancy for answer validation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL-2002), pages 425-432, 2002. 10

    B. Magnini and R. Prevete. Exploiting lexical expansions and boolean compositions for web querying. In ACL Workshop on Recent Advances in Natural Language Processing and Information Retrieval, 2000. 120, 121

    G. Mann. Fine-grained proper noun ontologies for question answering. In Proceedings of SemaNet'02: Building and Using Semantic Networks, 2002. 139

    M. Maybury. Toward a question answering roadmap. Technical report, MITRE, 2002. 84

    G. Miller. WordNet: A lexical database for English. Communications of the ACM, 38(11):39-41, 1995. 9, 27, 38, 135

    M. Mitra, A. Singhal, and C. Buckley. Improving automatic query expansion. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 206-214, 1998. 50, 148

    D. Moldovan, M. Paşca, S. Harabagiu, and M. Surdeanu. Performance issues and error analysis in an open-domain question answering system. ACM Transactions on Information Systems (ToIS), 21(2):133-154, 2003. 45, 134

    D. Moldovan, M. Paşca, S. Harabagiu, and M. Surdeanu. Performance issues and error analysis in an open-domain question answering system. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pages 33-40, 2002. 45, 134

    C. Monz. Document retrieval in the context of question answering. In F. Sebastiani, editor, Proceedings of the 25th European Conference on Information Retrieval Research (ECIR-03), Lecture Notes in Computer Science 2633, pages 571-579. Springer, 2003. 64

    C. Monz and M. de Rijke. Tequesta: The University of Amsterdam's textual question answering system. In Proceedings of Tenth Text Retrieval Conference (TREC-10), pages 513-522, 2001a. 5, 133

    C. Monz and M. de Rijke. The University of Amsterdam at CLEF 2001. In Proceedings of the Cross Language Evaluation Forum Workshop (CLEF 2001), pages 165-169, 2001b. 49, 136

    C. Monz and M. de Rijke. Shallow morphological analysis in monolingual information retrieval for Dutch, German and Italian. In C. Peters, M. Braschler, J. Gonzalo, and M. Kluck, editors, Proceedings of the 2nd Workshop of the Cross-Language Evaluation Forum (CLEF 2001), LNCS 2406, pages 262-277. Springer Verlag, 2002. 49, 136

    C. Monz, J. Kamps, and M. de Rijke. The University of Amsterdam at TREC 2002. In E. Voorhees and L. Buckland, editors, Proceedings of the Eleventh Text REtrieval Conference (TREC 2002), pages 603-614. NIST Special Publication 500-251, 2002. 5, 133

    C. Mooney and R. Duval. Bootstrapping: A Nonparametric Approach to Statistical Inference. Sage Quantitative Applications in the Social Science Series No. 95. Sage Publications, 1993. 55

    MSN Search, http://search.msn.com/. 12

    D. Musser and A. Saini. STL Tutorial and Reference Guide: C++ Programming with the Standard Template Library. Addison Wesley, 1996. 75

    S. Na, I. Kang, S. Lee, and J. Lee. Question answering approach using a wordnet-based answer type taxonomy. In E. Voorhees and L. Buckland, editors, Proceedings of the Eleventh Text REtrieval Conference (TREC 2002), pages 512-519. NIST Special Publication 500-251, 2002. 43

    M. Paşca. High-Performance Open-Domain Question Answering from Large Text Collections. PhD thesis, Southern Methodist University, 2001. 45, 91, 95, 108, 116

    D. Palmer and M. Hearst. Adaptive sentence boundary disambiguation. In Proceedings of the Fourth Conference on Applied Natural Language Processing, pages 78-83, 1994. 82

    H. Peat and P. Willett. The limitations of term co-occurrence data for query expansion in document retrieval systems. Journal of the American Society for Information Science (JASIS), 42(5):378-383, 1991. 90

    C. Peters, M. Braschler, J. Gonzalo, and M. Kluck, editors. Proceedings of the 2nd Workshop of the Cross-Language Evaluation Forum (CLEF 2001), Lecture Notes in Computer Science 2406, 2002. Springer Verlag. 48

    A. Phillips. A question-answering routine. Memo. 16, Artificial Intelligence Project, 1960. 4, 31

    M. Porter. An algorithm for suffix stripping. Program, 14(3):130-137, 1980. 7, 49, 136

    J. Prager, E. Brown, A. Coden, and D. Radev. Question-answering by predictive annotation. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 184-191, 2000. 35, 122

    J. Prager, D. Radev, E. Brown, A. Coden, and V. Samn. The use of predictive annotation for question answering in TREC8. In E. Voorhees and D. Harman, editors, Proceedings of the Eighth Text REtrieval Conference (TREC-8), pages 399-410. NIST Special Publication 500-246, 1999. 122

    W.. Press,, B. Flannery, S. Teukolsky, and W. Vetterling. Numerical Recipes In C. Cam-bridgee University Press, 1988. 107

    Prise.. Z39.50/Prise 2.0. www. i t l . n i s t. gov/ iaui/894.02/works/papers/zp2/zp2. html,, Accessed in February 2003. 44, 47

    J.. Quinlan. Learning with continuous classes. In Proceedings of the 5th Australian Joint ConferenceConference on Artificial Intelligence (AY92), pages 343-348,1992. 109

    J.. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993. 107,109

    B.. Raphael. SIR: A Computer Program for Semantic Information Retrieval. PhD thesis, MIT,, Mathematics Department, 1964. 37

    Y.. Rasolofo and j . Savoy. Term proximity scoring for key word-bases retrieval sys-tems.. In F. Sebastiani, editor, Proceedings of the 25th European Conference on Informa-tiontion Retrieval Research (ECIR-03), Lecture Notes in Computer Science 2633, pages 207-218.. Springer, 2003. 70

    D.. Ravichandran and E. Hovy. Learning surface text patterns for a question an-sweringg system. In Proceedings of the 40th Annual Meeting of the Association for ComputationalComputational Linguistics (ACL-2002), pages 41-47, 2002. 9

    D.. Ravichandran, A. Ittycheriah, and S. Roukos. Automatic derivation of surface textt patterns for a maximum entropy based question answering system. In Pro-ceedingsceedings of the 3rd Human Language Technology Conference and the 3rd Meeting of the NorthNorth American Chapter of the Association for Computational Linguistics (HLT-NAACL 2003),2003), pages 85-87, 2003. 9

    J.. Reynar and A. Ratnaparkhi. A maximum entropy approach to identifying sen-tencee boundaries. In Proceedings of the Fifth Conference on Applied Natural Language Processing,Processing, pages 16-19,1997. 82

    T. Rietveld and R. van Hout. Statistical Techniques for the Study of Language and Language Behaviour. Mouton de Gruyter, 1993. 55

    I. Roberts. Information retrieval for question answering. Master's thesis, University of Sheffield, 2002. 45, 52

    S. Robertson. The probability ranking principle in IR. Journal of Documentation, 33(4):294-304, 1977. 94

    S. Robertson and K. Sparck Jones. Relevance weighting of search terms. Journal of the American Society for Information Science, 27(3):129-146, 1976. 94

    S. Robertson and S. Walker. Okapi/Keenbow at TREC-8. In E. Voorhees and D. Harman, editors, Proceedings of the Eighth Text REtrieval Conference (TREC-8), pages 151-162. NIST Special Publication 500-246, 1999. 45, 50, 148

    S. Robertson, S. Walker, and M. Beaulieu. Okapi at TREC-7: Automatic ad hoc, filtering, VLC and interactive. In The 7th Text REtrieval Conference (TREC 7), pages 253-264. NIST Special Publication 500-242, 1998. 45, 70

    M. Robnik-Šikonja and I. Kononenko. An adaptation of Relief for attribute estimation in regression. In D. Fisher, editor, Proceedings of 14th International Conference on Machine Learning ICML'97, 1997. 114

    M. Robnik-Šikonja and I. Kononenko. Theoretical and empirical analysis of ReliefF and RReliefF. Machine Learning, to appear, 2003. 114

    P. Roget. Roget's International Thesaurus. Thomas Y. Crowell Publisher, 1946. 33

    A. Roventini, A. Alonge, F. Bertagna, N. Calzolari, B. Magnini, and R. Martinelli. ItalWordNet, a large semantic database for Italian. In Proceedings of the 2nd Conference on Language Resources and Evaluation (LREC-2000), pages 783-790, 2000. 121

    G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. Information Processing and Management, 24(5):513-523, 1988. 8, 57, 58, 68, 126

    G. Salton, C. Buckley, and C. Yu. An evaluation of term dependence models in information retrieval. In Proceedings of the 5th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 51-173, 1982. 94

    G. Salton and M. McGill. Introduction to Modern Information Retrieval. McGraw-Hill, 1983. 80

    G. Sampson. English for the Computer: The SUSANNE Corpus and Analytic Scheme. Clarendon Press, 1995. 97

    M. Sanderson. Retrieving with good sense. Information Retrieval, 2(1):49-69, 2000. 121

    B. Santorini. Part-of-speech tagging guidelines for the Penn Treebank. Department of Computer Science, University of Pennsylvania, 3rd revision, 2nd printing edition, 1990. 95

    J. Savoy. Statistical inference in retrieval effectiveness evaluation. Information Processing and Management, 33(4):495-512, 1997. 54

    R. Scha. Logical Foundations for Question Answering. PhD thesis, University of Groningen, 1983. 2, 21, 29

    R. Schank. Conceptual Information Processing. Elsevier Science Publishers, 1975. 39

    H. Schmid. Probabilistic part-of-speech tagging using decision trees. In Proceedings of International Conference on New Methods in Language Processing, 1994. 49, 82, 95, 135

    S. Scott and R. Gaizauskas. University of Sheffield TREC-9 Q&A system. In E. Voorhees and D. Harman, editors, Proceedings of the Ninth Text REtrieval Conference (TREC-9), pages 635-XX. NIST Special Publication 500-249, 2000. 45

    J. Searle. Speech Acts. Cambridge University Press, 1969. 22

    S. Siegel and N. Castellan. Nonparametric Statistics for the Behavioral Sciences. McGraw-Hill, 2nd edition, 1988. 54, 129

    R. Simmons. Answering English questions by computer: A survey. Communications of the ACM, 8(1):53-70, 1965. 17, 27

    R. Simmons. Natural language question answering systems: 1969. Communications of the ACM, 13(1):15-30, 1969. 27

    R. Simmons, S. Klein, and K. McConlogue. Indexing and dependency logic for answering English questions. American Documentation, 15(3):196-204, 1963. 4, 32

    A. Singhal, G. Salton, M. Mitra, and C. Buckley. Document length normalization. Information Processing & Management, 32(5):619-633, 1996. 57, 58, 69

    M. Soubbotin and S. Soubbotin. Patterns of potential answer expressions as clues to the right answer. In E. Voorhees and D. Harman, editors, Proceedings of the Tenth Text REtrieval Conference (TREC 2001), pages 134-143. NIST Special Publication 500-250, 2001. 9

    M. Soubbotin and S. Soubbotin. Use of patterns for detection of likely answer strings: A systematic approach. In Notebook of the 11th Text REtrieval Conference (TREC 2002), pages 175-182. NIST Special Publication, 2002. 9

    K. Sparck Jones. Automatic indexing. Journal of Documentation, 30(4):393-432, 1974. 54

    K. Sparck Jones. Automatic language and information processing: Rethinking evaluation. Natural Language Engineering, 7(1):29-46, 2001. 51

    K. Sparck Jones and C. van Rijsbergen. Report on the need for and provision of an ideal information retrieval test collection. Technical Report British Library Research and Development Report 5266, Computer Laboratory, University of Cambridge, 1975. 46

    R. Srihari and W. Li. Information extraction supported question answering. In E. Voorhees and D. Harman, editors, Proceedings of the Eighth Text REtrieval Conference (TREC-8), pages 185-196. NIST Special Publication 500-246, 1999. 3

    J. Suzuki, H. Taira, Y. Sasaki, and E. Maeda. Question classification using HDAG kernel. In Proceedings of the ACL 2003 Workshop on Multilingual Summarization and Question Answering, 2003. 6, 151

    S. Tellex. Pauchok: A modular framework for question answering. Master's thesis, Massachusetts Institute of Technology, 2003. 45, 134

    S. Tellex, B. Katz, J. Lin, A. Fernandes, and G. Marton. Quantitative evaluation of passage retrieval algorithms for question answering. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 41-47, 2003. 45, 134

    J. Thorne. Automatic language analysis. ASTIA 297381, Final Technical Report, Arlington, Va., 1962. 4, 33

    C. van Rijsbergen. Information Retrieval. Butterworths, 2nd edition, 1979. 54

    J. Vicedo, L. Llopis, and A. Ferrandez. University of Alicante experiments at TREC 2002. In Notebook of the 11th Text REtrieval Conference (TREC 2002), pages 557-564. NIST Special Publication, 2002. 50

    E. Voorhees. Query expansion using lexical-semantic relations. In Proceedings of the 17th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, pages 61-69, 1994. 90

    E. Voorhees. Overview of the TREC-9 question answering track. In E. Voorhees and D. Harman, editors, Proceedings of the Ninth Text REtrieval Conference (TREC-9), pages 71-80. NIST Special Publication 500-249, 2000a. 12, 20, 84, 85, 86

    E. Voorhees. Variations in relevance judgments and the measurement of retrieval effectiveness. Information Processing and Management, 36(5):697-716, 2000b. 48

    E. Voorhees. Evaluation by highly relevant documents. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 74-82, 2001a. 48

    E. Voorhees. Overview of the TREC 2001 question answering track. In E. Voorhees and D. Harman, editors, Proceedings of the Tenth Text REtrieval Conference (TREC 2001), pages 42-51. NIST Special Publication 500-250, 2001b. 12, 86

    E. Voorhees. The TREC question answering track. Natural Language Engineering, 7(4):361-378, 2001c. 4, 71

    E. Voorhees. Overview of the TREC 2002 question answering track. In Notebook of the 11th Text REtrieval Conference (TREC 2002), pages 115-123. NIST, 2002. 12, 21, 40, 86

    E. Voorhees. Evaluating the evaluation: A case study using the TREC 2002 question answering track. In Proceedings of the 3rd Human Language Technology Conference and the 3rd Meeting of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2003), pages 260-267, 2003. 49

    E. Voorhees and C. Buckley. The effect of topic set size on retrieval experiment error. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 316-323, 2002. 49

    E. Voorhees and D. Harman. Overview of the sixth text retrieval conference (TREC-6). In E. Voorhees and D. Harman, editors, Proceedings of the Sixth Text REtrieval Conference (TREC 6), pages 1-24. NIST Special Publication 500-240, 1997. 79

    E. Voorhees and D. Harman. Overview of the seventh text retrieval conference (TREC-7). In E. Voorhees and D. Harman, editors, Proceedings of the Seventh Text REtrieval Conference (TREC-7), pages 1-23. NIST Special Publication 500-242, 1998. 48, 52

    E. Voorhees and D. Harman. Overview of the eighth text retrieval conference (TREC-8). In E. Voorhees and D. Harman, editors, Proceedings of the Eighth Text REtrieval Conference (TREC-8), pages 1-24. NIST Special Publication 500-246, 1999. 48

    E. Voorhees and D. Tice. Building a question answering test collection. In Proceedings of the 23rd Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, pages 200-207, 2000a. 12, 84, 140

    E. Voorhees and D. Tice. The TREC-8 question answering track evaluation. In E. Voorhees and D. Harman, editors, Proceedings of the Eighth Text REtrieval Conference (TREC-8), pages 83-105. NIST Special Publication 500-246, 2000b. 12

    P. Vossen, editor. EuroWordNet: A Multilingual Database with Lexical Semantic Networks. Kluwer Academic Publishers, 1998. 121

    Y. Wang and I. Witten. Induction of model trees for predicting continuous classes. In Proceedings of the Poster Papers of the European Conference on Machine Learning (ECML), pages 128-137, 1997. 109

    E. Wendlandt and J. Driscoll. Incorporating a semantic analysis into a document retrieval strategy. In Proceedings of the 14th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 270-279, 1991. 34, 35

    J. Wilbur. Non-parametric significance tests of retrieval performance comparisons. Journal of Information Science, 20(4):270-284, 1994. 54

    R. Wilkinson, J. Zobel, and R. Sacks-Davis. Similarity measures for short queries. In D. Harman, editor, The Fourth Text REtrieval Conference (TREC-4), pages 277-286. NIST Special Publication 500-236, 1995. 81

    E. Williams. On the notions 'lexically related' and 'head of a word'. Linguistic Inquiry, 12:245-274, 1981. 97

    I. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, 1999. 109

    I. Witten, A. Moffat, and T. Bell. Managing Gigabytes: Compressing and Indexing Documents and Images. Morgan Kaufmann Publishing, 2nd edition, 1999. 47, 75

    W. Woods. Transition network grammars for natural language analysis. Communications of the ACM, 13(10):591-606, 1970. 28

    W. Woods. Lunar rocks in natural English: Explorations in natural language question answering. In A. Zampolli, editor, Linguistic Structures Processing, pages 521-569. Elsevier North-Holland, 1977. 2, 28

    H. Xu and H. Zhang. ICT experiments in TREC 11 QA main task. In Notebook of the 11th Text REtrieval Conference (TREC 2002), pages 296-298. NIST Publication, 2002. 50

    J. Xu and B. Croft. Query expansion using local and global document analysis. In Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 4-11, 1996. 65, 90, 148, 151

    J. Xu, A. Licuanan, J. May, S. Miller, and R. Weischedel. TREC 2002 QA at BBN: Answer selection and confidence estimation. In Notebook of the 11th Text REtrieval Conference (TREC 2002), pages 290-295. NIST Publication, 2002. 32

    H. Yang and T. Chua. The integration of lexical knowledge and external resources for question answering. In Notebook of the 11th Text REtrieval Conference (TREC 2002), pages 155-161. NIST Publication, 2002. 121

    H. Yang, T. Chua, S. Wang, and C. Koh. Structured use of external knowledge for event-based open domain question answering. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 33-40, 2003. 121

    D. Zhang and W. Lee. Question classification using support vector machines. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 26-32, 2003. 6, 151

    J. Zobel. How reliable are the results of large-scale information retrieval experiments? In B. Croft, A. Moffat, C. van Rijsbergen, R. Wilkinson, and J. Zobel, editors, Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 307-314, 1998. 47

    J. Zobel and A. Moffat. Exploring the similarity space. ACM SIGIR Forum, 32(1):18-34, 1998. 8, 58