A Trainable Multi-factored QA System
Radu Ion, Dan Ştefănescu, Alexandru Ceauşu, Dan Tufiş, Elena Irimia,
Verginica Barbu-Mititelu
Research Institute for Artificial Intelligence, Romanian Academy
ResPubliQA
• We participated in the Romanian-Romanian ResPubliQA task
• 500 juridical questions to be answered from the Romanian JRC Acquis (10714 docs)
• Questions were translated from other languages, which makes the QA task harder: the translated terms are not necessarily the ones used in the actual Romanian documents
Corpus processing and indexing
• POS tagging, lemmatization, chunking.
• Only the ‘body’ part of a document was indexed (no annexes, no headers)
• We have two Lucene indexes: a document index and a paragraph index
• Index contents: lemmas, plus paragraph classes in the paragraph index
QA flow
• Web services based:
– Question preprocessing using TTL (http://ws.racai.ro/ttlws.wsdl)
– Question classification using a maximum entropy (ME) classifier (http://shadow.racai.ro/JRCACQCWebService/Service.asmx)
– Query generation (2 types: TFIDF and chunk based) (http://shadow.racai.ro/QADWebService/Service.asmx)
– Search engine interrogation (http://www.racai.ro/webservices/search.asmx)
– Paragraph relevance score computation and paragraph reordering
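The flow above can be read as a chain of five stages. The sketch below only illustrates that chaining: `Pipeline`, its field names, and the toy stand-in functions are hypothetical and do not call the actual web services listed on the slide.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Pipeline:
    """Hypothetical container for the five stages of the QA flow."""
    preprocess: Callable[[str], List[str]]         # question -> lemmas
    classify: Callable[[List[str]], str]           # lemmas -> question class
    make_query: Callable[[List[str]], str]         # lemmas -> search query
    search: Callable[[str], List[str]]             # query -> candidate paragraphs
    score: Callable[[List[str], str, str], float]  # (lemmas, class, paragraph) -> relevance

    def answer(self, question: str) -> List[str]:
        """Run all stages and return the candidates, most relevant first."""
        lemmas = self.preprocess(question)
        q_class = self.classify(lemmas)
        candidates = self.search(self.make_query(lemmas))
        return sorted(candidates,
                      key=lambda par: self.score(lemmas, q_class, par),
                      reverse=True)


# Toy stand-ins (assumptions for illustration only, not the real services).
toy = Pipeline(
    preprocess=lambda q: q.lower().rstrip("?").split(),
    classify=lambda lemmas: "definition" if "what" in lemmas else "other",
    make_query=lambda lemmas: " ".join(lemmas),
    search=lambda query: ["fees are paid yearly", "a regulation is a binding act"],
    score=lambda lemmas, q_class, par: sum(w in par.split() for w in lemmas),
)
```

Calling `toy.answer("What is a regulation?")` reorders the two toy paragraphs by word overlap with the question.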
The combined QA system
• In order to account for NOA (no answer) strings, which raise the overall performance measure when returned instead of a wrong answer, we decided to combine 2 results:
– The QA system using the TFIDF query
– The QA system using the chunk query
• When the same paragraph was returned among the top K (=3) paragraphs by the two systems, that paragraph was returned as the answer
• Otherwise, we returned the NOA string
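A minimal sketch of this agreement rule, assuming each system returns a ranked list of paragraph ids. The name `combine_answers`, the `NOA` constant, and the tie-breaking choice (prefer the shared paragraph ranked highest by the TFIDF-query system) are assumptions; the slides do not say which paragraph wins when several are shared.

```python
from typing import Sequence

NOA = "NOA"  # placeholder for the "no answer" string (assumed spelling)


def combine_answers(tfidf_ranked: Sequence[str],
                    chunk_ranked: Sequence[str],
                    k: int = 3) -> str:
    """Return a paragraph id present in the top k of both rankings, else NOA."""
    top_chunk = set(chunk_ranked[:k])
    # Assumption: among shared paragraphs, take the one ranked highest
    # by the TFIDF-query system.
    for par_id in tfidf_ranked[:k]:
        if par_id in top_chunk:
            return par_id
    return NOA
```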
Paragraph relevance
• s1 to s5 are paragraph relevance scores
• λi are weights trained by iteratively computing MRR scores on a 200-question test set, using sets of weights whose sum is 1
• Keeping the weights that yield the highest MRR amounts to a MERT-like training procedure
• Increment step was 0.01
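The training loop can be sketched as an exhaustive grid search. This assumes the combined relevance is the weighted sum of the individual scores (the formula itself is not in the transcript); all function names and the data layout are hypothetical. `steps=100` corresponds to the 0.01 increment.

```python
from typing import Iterator, List, Sequence, Tuple

# One candidate paragraph: (paragraph id, its relevance scores s1..sn).
Candidate = Tuple[str, Sequence[float]]
# One training question: (candidate paragraphs, id of the correct paragraph).
Question = Tuple[List[Candidate], str]


def mrr(questions: List[Question], weights: Sequence[float]) -> float:
    """Mean reciprocal rank of the correct paragraph under the given weights."""
    total = 0.0
    for candidates, gold_id in questions:
        ranked = sorted(candidates,
                        key=lambda c: sum(w * s for w, s in zip(weights, c[1])),
                        reverse=True)
        for rank, (par_id, _) in enumerate(ranked, start=1):
            if par_id == gold_id:
                total += 1.0 / rank
                break
    return total / len(questions)


def compositions(total: int, parts: int) -> Iterator[Tuple[int, ...]]:
    """All tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def weight_grid(n_scores: int, steps: int) -> Iterator[Tuple[float, ...]]:
    """All weight vectors on a 1/steps grid whose components sum to 1."""
    for combo in compositions(steps, n_scores):
        yield tuple(c / steps for c in combo)


def train_weights(questions: List[Question], n_scores: int,
                  steps: int = 100) -> Tuple[Tuple[float, ...], float]:
    """Return the grid point with the highest MRR, together with that MRR."""
    best = max(weight_grid(n_scores, steps), key=lambda w: mrr(questions, w))
    return best, mrr(questions, best)
```

With five scores and a 0.01 step the grid has a few million points, so a plain Python loop like this one is slow but still feasible for a 200-question set.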
Relevance scores
• Lucene scores for the document and paragraph retrieval
• One BLEU-like relevance score, which is high if a candidate paragraph contains the question keywords in much the same order as in the question
• One indicator variable that is 1 if the candidate paragraph has the same class as the question (0 otherwise)
• One lexical-chain-based score (a real-valued measure of the semantic distance between the question and the candidate paragraph)
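To illustrate what an order-sensitive keyword score can look like, here is a stand-in based on the longest common subsequence (LCS) of lemmas. It is not the authors' BLEU-like formula, which the slides do not give; `order_score` and its normalization are assumptions.

```python
from typing import Sequence


def order_score(question_keywords: Sequence[str],
                paragraph_lemmas: Sequence[str]) -> float:
    """LCS length of the two lemma sequences over the number of question keywords.

    The score is 1.0 when every question keyword occurs in the paragraph in
    question order, and drops when keywords are missing or reordered.
    """
    if not question_keywords:
        return 0.0
    prev = [0] * (len(paragraph_lemmas) + 1)
    for q_word in question_keywords:
        cur = [0]
        for j, p_word in enumerate(paragraph_lemmas, start=1):
            if q_word == p_word:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1] / len(question_keywords)
```

A score like this would be one of the si values, alongside the two Lucene scores, the class indicator, and the lexical chain score.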
Evaluations
• Official results
• Second run: query contained the question class
Post CLEF2009 Evaluations
• Results with all questions (500) answered (no NOA strings)
• With trained parameters for every question class, we obtain an overall accuracy of 0.5774 (29 additional correctly answered questions)
Post CLEF2009 Evaluations (II)
• Some other informative measures:
– Answering precision: correct / answered
– Rejection precision: (1 – correct) / unanswered
• AP(icia092roro) = 75.58%
• RP(icia092roro) = 86.53%
• So the system withholds wrong answers at a high rate, which is a merit in itself (revealed by the c@1 computation), even though it could not have matched its answering precision on the questions it left unanswered
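These measures can be sketched in a few lines. The c@1 formula is the standard one for this kind of evaluation, which credits unanswered questions in proportion to overall accuracy. The rejection precision below follows one reading of the slide (the share of unanswered questions whose withheld candidate would have been wrong); that reading, and all function and parameter names, are assumptions.

```python
def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
    """c@1 = (n_R + n_U * n_R / n) / n."""
    return (n_correct + n_unanswered * n_correct / n_total) / n_total


def answering_precision(n_correct: int, n_answered: int) -> float:
    """Share of the answered questions that were answered correctly."""
    return n_correct / n_answered


def rejection_precision(n_unanswered: int, n_withheld_correct: int) -> float:
    """Share of unanswered questions whose withheld candidate was wrong.

    Assumed reading of "(1 - correct) / unanswered" on the slide.
    """
    return (n_unanswered - n_withheld_correct) / n_unanswered
```

With no unanswered questions, c@1 reduces to plain accuracy, so it only differs from accuracy when the system actually returns NOA.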
Conclusions
• A multi-factored QA system may be easily extended with new paragraph relevance scores
• It’s also easily adaptable to new domains and/or languages
• Update: better correlation between documents and paragraph relevance scores
• Future plans: to develop the English QA system along the same lines and combine the En-Ro outputs
Conclusions (II)
• Competition drives innovation but let’s not forget that these tools are there to help users
• Useful requirement: QA systems should be available on the Web
• Ours is at http://www2.racai.ro/sir-resdec/