![Page 1: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/1.jpg)
Effects of Query Expansion for Spoken Document Passage Retrieval
Tomoyosi Akiba, Koichiro Honda
INTERSPEECH 2011
Pei-Ning Chen
NTNU CSIE SLP Lab
![Page 2: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/2.jpg)
Outline
• Introduction
• Passage Retrieval for Spoken Document
• Query Expansion for SDR
• Experiments
• Conclusions
![Page 3: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/3.jpg)
Introduction
• Because confirming the content of a spoken document requires playing back its audio data, browsing speech data is much more difficult and time-consuming than browsing textual data.
• The authors apply relevance models, a query expansion method, to the spoken document passage retrieval task. They adapt the original relevance model to passage retrieval and extend it to draw on massive collections of Web documents for query expansion.
![Page 4: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/4.jpg)
Retrieval Methods for Passage Retrieval
• Using the Neighboring Context to Index the Passage
– Passages from the same lecture may be related to each other in the passage retrieval task, whereas the target documents are considered to be independent of each other in a conventional document retrieval task.
• Penalizing Neighboring Retrieval Results
– In applying context indexing, neighboring passages are liable to be retrieved at the same time because they share the same indexing words.
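The two ideas on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' formulation: the `window` and `penalty` parameters, the multiplicative form of the penalty, and the greedy re-ranking loop are all hypothetical choices, and the scores are assumed to be non-negative.

```python
from collections import Counter


def context_index(passages, window=1):
    """Index each passage with its own words plus those of its neighbors.

    `passages` is a list of token lists from one lecture, in temporal order.
    `window` (hypothetical) is how many passages on each side count as context.
    """
    indexed = []
    for i in range(len(passages)):
        lo = max(0, i - window)
        hi = min(len(passages), i + window + 1)
        bag = Counter()
        for tokens in passages[lo:hi]:
            bag.update(tokens)
        indexed.append(bag)
    return indexed


def rerank_with_neighbor_penalty(scores, penalty=0.5, window=1):
    """Greedy ranking that down-weights passages next to ones already picked.

    `scores` maps passage position -> non-negative retrieval score.
    `penalty` (hypothetical) multiplies the score of each remaining neighbor.
    """
    remaining = dict(scores)
    ranking = []
    while remaining:
        # Highest score first; ties go to the earlier passage.
        best = max(remaining, key=lambda i: (remaining[i], -i))
        ranking.append(best)
        del remaining[best]
        for j in list(remaining):
            if abs(j - best) <= window:
                remaining[j] *= penalty
    return ranking
```

With scores `{0: 1.0, 1: 0.9, 2: 0.6, 3: 0.5}`, passage 1 is pushed down the list once its neighbor 0 has been returned, so near-duplicate neighbors no longer occupy the top ranks together.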
![Page 5: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/5.jpg)
Query Expansion for SDR
• Relevance Models
• Extending Relevance Models to Context Indexing
• Extending Relevance Models using Web
![Page 6: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/6.jpg)
• Linear interpolation
– The two models are linearly interpolated.
• Document weighting
– The Web model is used to weight the target documents.
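As a rough illustration of the linear interpolation variant, the sketch below estimates a relevance model from weighted feedback documents and mixes a target-collection model with a Web model. The estimator follows the general relevance-model idea (a weighted mixture of document language models); the function names, the `lam` mixing weight, and the use of unsmoothed maximum-likelihood document models are assumptions made for the example, not details taken from the slides.

```python
from collections import Counter


def relevance_model(ranked_docs):
    """Estimate P(w | R) from feedback documents.

    `ranked_docs` is a list of (tokens, weight) pairs with non-empty token
    lists; `weight` stands in for the query likelihood P(Q | D) of a
    top-ranked document (assumed positive in total).
    """
    total_weight = sum(weight for _, weight in ranked_docs)
    model = Counter()
    for tokens, weight in ranked_docs:
        counts = Counter(tokens)
        length = len(tokens)
        for word, count in counts.items():
            # Unsmoothed P(w | D) times the normalized document weight.
            model[word] += (count / length) * (weight / total_weight)
    return dict(model)


def interpolate(target_model, web_model, lam=0.5):
    """Linearly interpolate two word distributions with weight `lam`."""
    vocab = set(target_model) | set(web_model)
    return {
        w: lam * target_model.get(w, 0.0) + (1 - lam) * web_model.get(w, 0.0)
        for w in vocab
    }


target = relevance_model([(["speech", "retrieval"], 1.0)])
web = relevance_model([(["speech", "web"], 3.0), (["web", "web"], 1.0)])
expanded = interpolate(target, web, lam=0.5)
```

Because both inputs are probability distributions, the interpolated model also sums to one; words that appear only in the Web model (here `web`) enter the expanded query with weight scaled by `1 - lam`.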
![Page 7: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/7.jpg)
Experiments
![Page 8: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/8.jpg)
Experiments
![Page 9: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/9.jpg)
Conclusions
• They applied relevance models for the spoken document passage retrieval task.
• They also extended it to take advantage of the massive collection of Web documents for query expansion.
• To improve the performance of their Web extension of relevance models, filtering out noisy Web documents might be necessary.
• In future work, they plan to apply Web document filtering methods to select only the documents most related to the target documents.
![Page 10: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/10.jpg)
Speech Indexing Using Semantic Context Inference
Chien-Lin Huang, Bin Ma, Haizhou Li and Chung-Hsien Wu
INTERSPEECH 2011
![Page 11: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/11.jpg)
Outline
• Introduction
• Semantic Context Inference
• Experiments
• Conclusions
![Page 12: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/12.jpg)
Introduction
• The indexing techniques of text-based information retrieval have been widely adopted in spoken document retrieval.
• However, because of imperfect speech recognition results, out-of-vocabulary words, and ambiguity in homophones and word tokenization, conventional text-based indexing techniques are not always appropriate for spoken document retrieval.
![Page 13: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/13.jpg)
Semantic Context Inference (SCI)
• The authors propose the semantic context inference representation, which finds the semantic relations between terms and suggests semantic term expansion for speech indexing.
![Page 14: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/14.jpg)
Semantic relation matrix
• A spoken document database comprises an accumulation of spoken documents from which the document-by-term matrix is constructed.
![Page 15: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/15.jpg)
SCI for indexing
• Summing all the semantic inference vectors for the spoken document d yields the semantic context inference vector.
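A minimal sketch of this summation follows. The slides do not preserve how the semantic relation matrix is computed, so the example assumes a stand-in: cosine similarity between term columns of the document-by-term matrix. The function names and that choice of relation measure are hypothetical; only the final step (summing one relation-weighted vector per term in the document) mirrors the bullet above.

```python
import math


def term_relation_matrix(doc_term):
    """Term-by-term relation matrix (assumed: cosine between term columns).

    `doc_term` is a list of rows, one per document, each a list of term weights.
    """
    n_terms = len(doc_term[0])
    columns = [[row[t] for row in doc_term] for t in range(n_terms)]
    norms = [math.sqrt(sum(x * x for x in col)) for col in columns]
    relation = [[0.0] * n_terms for _ in range(n_terms)]
    for i in range(n_terms):
        for j in range(n_terms):
            if norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(columns[i], columns[j]))
                relation[i][j] = dot / (norms[i] * norms[j])
    return relation


def sci_vector(doc_row, relation):
    """Sum the per-term inference vectors of one document.

    Each term t with weight w contributes w * relation[t] to the result.
    """
    n_terms = len(relation)
    vector = [0.0] * n_terms
    for t, weight in enumerate(doc_row):
        if weight == 0:
            continue
        for j in range(n_terms):
            vector[j] += weight * relation[t][j]
    return vector


doc_term = [[1, 1, 0], [0, 0, 1]]
relation = term_relation_matrix(doc_term)
expanded = sci_vector([1, 0, 0], relation)
```

In this toy case a document that contains only term 0 also receives weight on term 1, because the two terms always co-occur in the collection. That is the kind of term expansion that can compensate for a word the recognizer missed.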
![Page 16: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/16.jpg)
Retrieval model
• For spoken document retrieval, the authors adopt the vector space model, which is widely used in information retrieval because it offers highly efficient retrieval with a feature vector representation of each document.
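Vector space retrieval can be sketched as ranking documents by the similarity of their feature vectors to the query vector. The slide does not state which similarity measure is used, so cosine similarity is assumed here as the common default; `rank_documents` and the document ids are hypothetical names for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(query_vec, doc_vecs):
    """Return document ids sorted by decreasing similarity to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    # Sort by score descending, then by id for a deterministic order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


docs = {"d1": [1, 0, 0], "d2": [1, 1, 0], "d3": [0, 0, 1]}
ranking = rank_documents([1, 1, 0], docs)
```

The same function works whether the document vectors are plain term weights or expanded indexing vectors, since only the vector contents change.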
![Page 17: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/17.jpg)
Experiments
• To measure both the accuracy of the retrieved documents and the ranking positions of the relevant documents, the authors evaluate with mean average precision (MAP).
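For reference, mean average precision can be computed as below. This is the standard uninterpolated definition; whether the paper uses exactly this variant (for example, with a rank cutoff) is not stated on the slide, and the function names are chosen for the example.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant ids."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked list, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


runs = [
    (["a", "b", "c"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
score = mean_average_precision(runs)
```

Average precision rewards placing relevant documents early: the first query scores (1/1 + 2/3) / 2 and the second scores 1/2, and MAP is the mean of the two.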
![Page 18: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/18.jpg)
![Page 19: Pei- Ning Chen NTNU CSIE SLP Lab](https://reader035.vdocuments.site/reader035/viewer/2022062310/568161e6550346895dd20cbb/html5/thumbnails/19.jpg)
Conclusions
• The proposed semantic context inference explores latent semantic information and extends speech indexing with semantically related terms. The semantic context inference vector can be regarded as a re-weighted indexing vector, which acts as a form of query expansion to overcome speech recognition errors.