(spoken) dialogue and information retrieval antoine raux dialogs on dialogs group 10/24/2003
(Spoken) Dialogue and Information Retrieval
Antoine Raux
Dialogs on Dialogs Group
10/24/2003
Outline
Interactive Information Retrieval Systems (Belkin et al)
EUREKA: Dialogue-based IR for Low Bandwidth Devices
Voice Access to IR
Cases, Scripts, and Information-Seeking Strategies
Belkin, Cool (Rutgers)
Stein, Thiel (GMD-IPSI)
Long journal article (1995)
From the IR community (Expert Systems)
IR as Interaction
Traditional IR research focuses on document/query representation and comparison
Need to focus on the user
Represent IR as a dialogue between an information seeker and an information provider
Information-Seeking Strategies
Represent information-seeking behavior along 4 dimensions:
  Method of Interaction (scanning vs searching)
  Goal of Interaction (learning vs selecting)
  Mode of Retrieval (recognition vs specification)
  Resource Considered (information vs meta-info)
Binary values → 16 strategies (ISS)
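Since each of the four dimensions takes one of two values, the 16 strategies are simply the Cartesian product of the value pairs. A minimal Python sketch of that enumeration follows. The dictionary layout, the short key names, and the numbering order are assumptions made for illustration, not the numbering used by Belkin et al.

```python
from itertools import product

# The four binary dimensions, as listed on the slide.
# Key names are illustrative shorthand, not terms from the paper.
DIMENSIONS = {
    "method": ("scanning", "searching"),
    "goal": ("learning", "selecting"),
    "mode": ("recognition", "specification"),
    "resource": ("information", "meta-information"),
}


def enumerate_iss():
    """Return every combination of dimension values as a dict."""
    names = list(DIMENSIONS)
    return [dict(zip(names, values)) for values in product(*DIMENSIONS.values())]


strategies = enumerate_iss()
print(len(strategies))  # 16
for number, strategy in enumerate(strategies, start=1):
    print(number, strategy)
```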
Dialogue Structures for Information Seeking
Mix of different formalisms:
  Recursive state-based schemas (COR)
    e.g. Request → Promise → Inform → Be contented
  Scripts: prototypical interaction for each ISS
  Goal trees
[Figure: goal tree with nodes "Retrieve Specified Items", "Specify Characteristic", "Recognize Desired Items", "Offer choice", "Select and Specify"]
Deriving Scripts from Data
Case-based approach: problem solving using previously stored solved instances
Match a sequence of actions to a state-based schema
Extract goal tree
Identify goal (which ISS?)
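The matching step can be illustrated with the linear COR example given earlier (Request, Promise, Inform, Be contented). The sketch below is hypothetical: it flattens the schema to a plain sequence, whereas real COR schemas are recursive, and it assumes that extra moves between schema states are simply skipped. The action label "clarify" is made up for the usage example.

```python
# Linear schema taken from the COR example on the earlier slide.
# Assumption: the schema is flat; the recursive structure is not modeled.
SCHEMA = ["request", "promise", "inform", "be_contented"]


def match_schema(actions, schema=SCHEMA):
    """Return how many schema states the actions match, in order.

    Actions that are not the next expected state are skipped, so the
    matcher tolerates extra moves between schema states.
    """
    position = 0
    for action in actions:
        if position < len(schema) and action == schema[position]:
            position += 1
    return position


def is_complete(actions, schema=SCHEMA):
    """True when the actions contain the whole schema as a subsequence."""
    return match_schema(actions, schema) == len(schema)


observed = ["request", "promise", "clarify", "inform", "be_contented"]
print(is_complete(observed))           # True
print(match_schema(["request", "inform"]))  # 1
```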
The MERIT System
Theory vs Practice…
  Graphical interface (not NL dialogue)
  User does case selection (for eventual case-based reasoning)
  Example task is relational database (not free text IR): uses form filling (!)
Discussion
Contribution to IR: user-centered view, application of many non-IR theories (discourse, CBR)
BUT: too complicated for the user?
Discussion
Contribution to Dialogue Systems: difficult task (not often dealt with in DS), CBR (can we learn dialogue structure from data?)
BUT: lacks a good, unified, practical framework (too many different paradigms applied…)
Dialogue-based IR: Why?
Google-like interface still predominant (despite MERIT)
Why?
  Users receive a lot of information (document titles, summaries) and use it as they want
  Very simple to learn
  Very flexible
BUT: works on high bandwidth devices
Dialogue-based IR: Why?
For low bandwidth devices (PDA, phone), information-rich interfaces don’t work
Only small pieces of information exchanged at a time
System has to select
Less information, more interaction
EUREKA: Idea
Use dialogue to submit queries to a web search engine, browse through the hierarchically clustered results, perform query reformulation/refinement, etc…
EUREKA: Overview
Backend: Vivisimo (through web scraper)
Dialogue Management: RavenClaw (successor of CMU Communicator)
Language Understanding: Light Open Vocabulary Parser
NLG/TTS: template-based & Festival
Backend: Vivisimo
Available clustering meta-search engine: www.vivisimo.com
Hand-written Perl web scraper (hope Vivisimo doesn’t change their page design by the end of the semester…)
LOV Parser
Problem: traditional NL parsers require a dictionary → not applicable to open domain IR
Solution (implemented in C++):
  fix a small number of one-word commands (new_query, open, list_clusters)
  parse each line as “[command] [arguments]” or “[command]” or “[arguments]”
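The three line shapes above can be sketched in a few lines. This is a Python stand-in for illustration only, not the C++ implementation: the whitespace tokenization, the lowercasing of the first word, and the `(command, arguments)` return shape are all assumptions.

```python
# Command set as listed on the slide; the actual prototype may define more.
COMMANDS = {"new_query", "open", "list_clusters"}


def parse_line(line, commands=COMMANDS):
    """Split a line into (command, arguments).

    command is None when the first word is not a known command;
    arguments is None when nothing follows the command.
    """
    words = line.strip().split()
    if not words:
        return None, None
    first = words[0].lower()
    if first in commands:
        rest = " ".join(words[1:])
        return first, (rest or None)
    # No leading command: the whole line is open-vocabulary arguments.
    return None, " ".join(words)


print(parse_line("new_query speech recognition"))  # ('new_query', 'speech recognition')
print(parse_line("list_clusters"))                 # ('list_clusters', None)
print(parse_line("dialogue systems"))              # (None, 'dialogue systems')
```

Because only the first word is checked against a fixed list, the arguments never need to be in any dictionary, which is the point of the open-vocabulary design.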
Dialogue Management: RavenClaw
Hierarchical agent architecture:
[Figure: agent tree rooted at EUREKA, with nodes Greet User, Prompt Query, New Query, Open Cluster, …, Submit Query, Get Cluster List, Get Doc List, Inform Results, Close Cluster]
NLG/TTS
Template-based Language Generation (e.g. “I found <n_doc> documents.”)
General purpose Festival voice for TTS
NB: Browsing through lists is not efficient with speech, even for lists of clusters
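Template-based generation amounts to slot filling, as in the "I found <n_doc> documents." example. The sketch below assumes a plain `<slot>` placeholder syntax and a dictionary of templates. The template names and the second template are made up for illustration and do not come from EUREKA.

```python
import re

# First template is from the slide; the second is a hypothetical example.
TEMPLATES = {
    "inform_results": "I found <n_doc> documents.",
    "inform_clusters": "I found <n_cluster> clusters.",
}


def generate(template_name, **slots):
    """Fill every <slot> placeholder in the named template."""
    template = TEMPLATES[template_name]

    def fill(match):
        # A KeyError here means the caller did not supply that slot.
        return str(slots[match.group(1)])

    return re.sub(r"<(\w+)>", fill, template)


print(generate("inform_results", n_doc=42))  # I found 42 documents.
```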
Already Implemented
Working prototype
Commands:
  new_query
  list_clusters, list_documents
  open, close (cluster)
  more, back (list of clusters/documents)
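The more/back commands imply that a list of clusters or documents is presented a few items at a time. A possible sketch of that paging behavior follows. The class name, the page size, and the choice to stay on the last page when there is nothing more are assumptions, not details of the prototype.

```python
class ListBrowser:
    """Present a long list a few items at a time (hypothetical helper)."""

    def __init__(self, items, page_size=3):
        self.items = list(items)
        self.page_size = page_size  # assumed value; small pages suit speech output
        self.start = 0

    def current(self):
        """Items on the page currently being presented."""
        return self.items[self.start:self.start + self.page_size]

    def more(self):
        """Move to the next page if one exists, then return the current page."""
        if self.start + self.page_size < len(self.items):
            self.start += self.page_size
        return self.current()

    def back(self):
        """Move to the previous page (or stay on the first), then return it."""
        self.start = max(0, self.start - self.page_size)
        return self.current()


browser = ListBrowser(["a", "b", "c", "d", "e"], page_size=2)
print(browser.current())  # ['a', 'b']
print(browser.more())     # ['c', 'd']
print(browser.more())     # ['e']
print(browser.back())     # ['c', 'd']
```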
Demo
Future Work
Add more functionalities (query refinement, summarization…)
Make clever use of the dialogue (not only command and control + browsing)
  System can provide advice to user on search strategies (e.g. “you need to refine the query”)
  User and system can negotiate to specify the user’s information need (cf Belkin: overview vs specific document)
Future Work/Discussion
Advantage of dialogue: more feedback from the user
How can dialogue improve the efficiency of low bandwidth IR?
Do we need to tailor IR techniques (e.g. clustering) for dialogue, or even design new techniques?
Vocal Access to IR
Problem: ASR introduces a lot of erroneous words in a spoken query (for an open domain, speaker independent system)
However, in an IR system: access to many text documents to help language modeling…
Vocal Access to a Newspaper Archive (Crestani 02)
Presents studies for a full voice-controlled IR system
No dialogue: user query → list of summaries
Focuses on issues of:
  TTS: can user make relevance judgments when they hear document descriptions synthesized over the phone? (answer: yes)
  ASR: how does IR perform with recognized queries?
Using IR Techniques to Deal with Recognition Errors
WER does have an impact on precision, although not much variation for WER in 27%-47%
Relevance feedback: use documents judged relevant by the user as query
Use prosodic stress to estimate information content of query terms
Include semantically/phonetically close terms in the query
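The relevance feedback idea (documents judged relevant become the query) can be sketched as follows. Raw term counts, whitespace tokenization, and cosine similarity are assumptions chosen to keep the example short. The slide does not say which weighting or ranking function was actually used.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts with naive whitespace tokenization."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def feedback_query(relevant_docs):
    """Build the new query as the summed term counts of the relevant documents."""
    query = Counter()
    for doc in relevant_docs:
        query.update(term_vector(doc))
    return query


def rerank(relevant_docs, candidates):
    """Order candidates by similarity to the feedback query, best first."""
    query = feedback_query(relevant_docs)
    return sorted(candidates, key=lambda d: cosine(query, term_vector(d)), reverse=True)


relevant = ["spoken dialogue systems"]
candidates = ["stock market news", "spoken dialogue for retrieval"]
print(rerank(relevant, candidates)[0])  # spoken dialogue for retrieval
```

The appeal for spoken queries is that the feedback query is built from clean document text, so it no longer depends on the words the recognizer got wrong.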
Improving ASR (Fujii et al 02)
Fujii et al propose LM adaptation based on the IR corpus:
  Offline “adaptation”: train on the whole corpus
  Online adaptation: adapt on the top retrieved documents (then reperform ASR and IR)
Good results with offline trained LM (WER < 20%, AP loss of 20-30% from text IR)
No evaluation of online adaptation…
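The difference between the two adaptation modes can be shown with a toy language model. This is a sketch under strong assumptions: a unigram model stands in for the n-gram models a real recognizer would use, and linear interpolation is one common way to adapt a model, not necessarily the method of Fujii et al. The corpus strings and the 0.5 weight are made up.

```python
from collections import Counter


def train_unigram(texts):
    """Maximum-likelihood unigram model: word -> probability."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolate(background, adapted, weight=0.5):
    """Mix two unigram models; weight is the share given to the adapted model."""
    vocabulary = set(background) | set(adapted)
    return {
        word: (1 - weight) * background.get(word, 0.0) + weight * adapted.get(word, 0.0)
        for word in vocabulary
    }


corpus = ["the election results", "the football match results"]
top_retrieved = ["football match report"]

# Offline: train once on the whole document collection.
offline_lm = train_unigram(corpus)
# Online: shift probability mass toward the top retrieved documents.
online_lm = interpolate(offline_lm, train_unigram(top_retrieved))

print(offline_lm["football"] < online_lm["football"])  # True
```

In the online setting the adapted model would then be used for a second recognition pass over the same spoken query, followed by a second retrieval.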
Vocal Access to IR: Discussion
Seems to work ok for some tasks
Clever use of IR techniques
BUT queries are not spontaneous nor natural (maybe)
LM for Web queries??
What about dialogue?