![Page 1: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/1.jpg)
Preparing for the 2008 Beijing Olympics :
The LingTour and KNOWLISTICS projects
MAO Yuhang, DING Xiao-Qing, NI Yang, LIN Shiuan-Sung, Laurence LIKFORMAN,
Christian BOITET, Gérard CHOLLET, Alain GOYE, Eric LECOLINET, Jacques PRADO
Presented here by Gérard CHOLLET, GET-ENST/CNRS-LTCI
http://www.tsi.enst.fr/~chollet
![Page 2: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/2.jpg)
Outline
• Rationale of the proposal
• Objectives
  • The Beijing 2008 Olympics
• Approaches
  • Multimedia, multilingual information server
  • Information kiosk
  • Intelligent Camera
  • Bilingual Voice Communicator
• Needs and relevance
  • A PDA for tourists and travelling businessmen
• Conclusions and Perspectives
![Page 3: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/3.jpg)
Rationale for the IP-KNOWLISTICS
• Logistics for knowledge in a specific domain (OG)
• Language-independent knowledge representation and management
• Multimedia (text, speech, image, video)
• Multimodal access (text, speech, pen, visual I/O)
• Distributed multilingual, multimedia server accessible from mobile terminals (phone, PDA, PC,…) and kiosks
• Initially targeted at tourist applications
• 2008 Beijing Olympics as a field trial
![Page 4: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/4.jpg)
Technical developments
• Language-independent knowledge representation (using conceptual graphs and an Intermediate Representation Language like the ‘Universal Networking Language’)
• Tools for enconversion and evaluation
• Generation in 12 target languages
• Multilingual Speech Synthesis and Recognition
• VoiceUNL-based interactive dialog agent
• ‘Intelligent camera’ with Chinese character recognition
• Cross-language ‘Multimodal communicator’ on a PDA
• Cross-language lexical access
![Page 5: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/5.jpg)
Chinese character recognition
![Page 6: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/6.jpg)
Intelligent camera from Tsinghua Univ.
[Figure labels: capture, reco, translation]
![Page 7: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/7.jpg)
Extracting text from scene images
• Complex color images
• Uncontrolled illumination
• Variations: size, fonts, orientation, texture
• Complex backgrounds, shadows
![Page 8: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/8.jpg)
Text extraction
• Searching for character regions (text has uniform color)
• Multi-channel decomposition
• Connected components analysis
• Grouping of components
• Alignment analysis (number of horizontally or vertically aligned components)
• Text identification (language-independent features: size, alignment,…)
• Detection rate: 84 %
• False alarm rate: 5.6 %
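The connected components and alignment steps listed above can be illustrated with a minimal sketch. It assumes the image has already been binarised (for example one mask per colour channel), uses 4-connectivity, and treats components as aligned when their vertical centres differ by at most `tolerance` pixels. The function names and the `tolerance` and `min_aligned` thresholds are hypothetical and are not taken from the slides.

```python
from collections import deque


def connected_components(mask):
    """Return bounding boxes (top, left, bottom, right) of 4-connected regions."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def horizontally_aligned(boxes, tolerance=1):
    """Group boxes whose vertical centres lie within `tolerance` of a group's first box."""
    groups = []
    for box in sorted(boxes, key=lambda b: b[1]):  # scan left to right
        centre = (box[0] + box[2]) / 2
        for group in groups:
            reference = (group[0][0] + group[0][2]) / 2
            if abs(centre - reference) <= tolerance:
                group.append(box)
                break
        else:
            groups.append([box])
    return groups


def text_candidates(mask, min_aligned=3):
    """Keep only groups with at least `min_aligned` aligned components."""
    groups = horizontally_aligned(connected_components(mask))
    return [group for group in groups if len(group) >= min_aligned]
```

A row of three similar strokes would be kept as a text candidate, while an isolated blob elsewhere in the image would be discarded. Vertical alignment could be handled the same way by swapping the roles of rows and columns.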
![Page 9: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/9.jpg)
Cross-language Multimodal Communicator
Use of a visual display (e.g. on a PDA) to mediate the dialogue between 2 persons speaking different languages.
Recognition of short utterances, display of a word graph, selection of keywords, visualisation (and synthesis) of the translation of key words and groups of words.
Specialised lexicon for dialog acts in typical tourist situations (in a restaurant, at the hotel, medical assistance, in the street, in public transport, about the Olympic Games,…)
UMTS access to an information server offering maps, photographs, video sequences, web browsing, …
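The keyword selection and translation step of the communicator might look roughly like the sketch below. It simplifies the word graph to a list of slots, each holding scored alternatives, and uses a tiny English-to-French lexicon grouped by situation. The lexicon entries, the data layout and both function names are illustrative assumptions, not the project's actual resources.

```python
# Hypothetical situation-specific lexicon (English -> French), for illustration only.
LEXICON = {
    "restaurant": {"menu": "carte", "bill": "addition", "water": "eau"},
    "hotel": {"room": "chambre", "key": "clé"},
}


def keywords_from_word_graph(word_graph, situation):
    """For each slot, keep the best-scoring alternative that the lexicon knows."""
    vocabulary = LEXICON[situation]
    selected = []
    for alternatives in word_graph:
        known = [(score, word) for word, score in alternatives if word in vocabulary]
        if known:
            selected.append(max(known)[1])
    return selected


def translate_keywords(keywords, situation):
    """Pair each selected keyword with its translation for display or synthesis."""
    vocabulary = LEXICON[situation]
    return [(word, vocabulary[word]) for word in keywords]
```

Restricting the choice to words in the situation lexicon lets a lower-scoring but in-domain hypothesis ("bill") win over a higher-scoring out-of-domain one ("pill"). In the real system the user would confirm the keywords on the display.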
![Page 10: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/10.jpg)
Automatic Speech Recognition in Multiple Languages
Sharing of acoustic models between languages to simplify extensibility to other languages.
Combination of phone models and adaptation from small amounts of data in new languages.
Model adaptation to user and environmental situations.
[Figure labels: French, Chinese, shared acoustic models, language-specific models]
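The slides do not say which adaptation scheme is used. As one common possibility, the sketch below applies MAP (maximum a posteriori) adaptation to the mean of a single Gaussian: the shared model supplies the prior mean, and a few feature frames from the new language or user pull it towards the observed data. The function name and the prior weight `tau` are assumptions.

```python
def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP update of a Gaussian mean: (tau * prior + sum of frames) / (tau + n).

    prior_mean: mean vector of the shared acoustic model (list of floats).
    frames: adaptation feature vectors assigned to this Gaussian.
    tau: prior weight; larger values trust the shared model more.
    """
    n = len(frames)
    dim = len(prior_mean)
    sums = [sum(frame[d] for frame in frames) for d in range(dim)]
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]
```

With no adaptation data the shared mean is returned unchanged, and as more frames arrive the estimate moves towards their average, which fits the case of small amounts of data in a new language.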
![Page 11: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/11.jpg)
Knowledge representation
A formal language for representing the meaning of natural language sentences.
UNL (Universal Networking Language) introduced to describe natural language semantics.
Language-independent context indexing for cross-language information retrieval.
Use of conceptual hierarchy of UNL to address the inherent ambiguity of natural languages.
A set of semantic relations (linking concepts together) for a structured information pattern.
![Page 12: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/12.jpg)
UNL representation
“The cat drank the milk” can be encoded by:
agt(drink(icl>do,agt>thing, obj>liquid).@past.@entry,cat(icl>mammal>animal).@def)
obj(drink(icl>do,agt>thing, obj>liquid).@past.@entry,milk(icl>beverage>food).@def)
agt, obj are binary semantic relations
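To make the structure of these binary relations concrete, here is a minimal parsing sketch. It only handles the `label(source,target)` shape shown on this slide, not the full UNL syntax. `Relation`, `parse_relation`, `headword` and `attributes` are hypothetical helper names.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    label: str   # e.g. "agt" or "obj"
    source: str  # first Universal Word, with its attributes
    target: str  # second Universal Word, with its attributes


def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parse_relation(line):
    """Parse one 'label(source,target)' line into a Relation."""
    label, _, rest = line.strip().partition("(")
    if not rest.endswith(")"):
        raise ValueError("expected a relation of the form label(source,target)")
    source, target = split_top_level(rest[:-1])
    return Relation(label.strip(), source, target)


def headword(node):
    """Return the headword before the constraint list, e.g. 'drink'."""
    return node.split("(", 1)[0]


def attributes(node):
    """Return the @attributes that follow the constraint list, e.g. ['@def']."""
    tail = node.rsplit(")", 1)[-1]
    return [part for part in tail.split(".") if part.startswith("@")]
```

Parsing the two lines above yields two relations sharing the same source node (`drink`, marked `@past` and `@entry`), which is what makes the encoding a graph rather than a flat list.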
![Page 13: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/13.jpg)
Role of semantic contents representation in indexing
[Figure labels: digital audio, video and textual content; UNL encoding; cross-lingual multimedia platform; user’s request; UNL decoding; user-specific information]
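Language-independent indexing can be sketched as an inverted index keyed by concepts rather than by words. In the sketch below, documents and queries in any language are first mapped to Universal Words through small per-language tables. The tables, the `ConceptIndex` class and the whitespace tokenisation are simplifying assumptions. A real system would rely on full UNL enconversion.

```python
from collections import defaultdict

# Hypothetical word -> Universal Word tables, for illustration only.
LEXICONS = {
    "en": {"cat": "cat(icl>mammal>animal)", "milk": "milk(icl>beverage>food)"},
    "fr": {"chat": "cat(icl>mammal>animal)", "lait": "milk(icl>beverage>food)"},
}


def to_concepts(text, lang):
    """Map the known words of a text to their language-independent concepts."""
    table = LEXICONS[lang]
    return {table[word] for word in text.lower().split() if word in table}


class ConceptIndex:
    """Inverted index from concepts to document identifiers."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text, lang):
        for concept in to_concepts(text, lang):
            self._postings[concept].add(doc_id)

    def search(self, query, lang):
        """Return documents containing every concept found in the query."""
        concepts = to_concepts(query, lang)
        if not concepts:
            return set()
        return set.intersection(*(self._postings.get(c, set()) for c in concepts))
```

Because both "cat" and "chat" map to the same concept, a French query retrieves an English document without any word-level translation of the document itself.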
![Page 14: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/14.jpg)
Application architecture
[Figure labels: UMTS server; speech synthesis; access information; a word graph + a list of keywords; translation]
![Page 15: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/15.jpg)
Digital Olympic Multi-Language Information Network Service System Project
![Page 16: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/16.jpg)
From VoiceXML to VoiceUNL and MultimediaUNL.
Presented here by Gérard CHOLLET, ENST/CNRS-LTCI
http://www.tsi.enst.fr/~chollet
With the contribution of Christian BOITET, Mutsuko TOMOKIYO and Catherine PELACHAUD
![Page 17: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/17.jpg)
Outline
• Rationale of the proposition
• Objectives
  • Promotion of a new standard, demonstrations
• Approaches
  • An extra layer of VoiceXML
• Need and relevance
  • Multilingual Vocal Servers
• Integration and structuring effect
• Conclusions and Perspectives
![Page 18: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/18.jpg)
Rationale for VoiceUNL
• Need for language-independent vocal servers
• Need for a language-independent knowledge representation and management formalism
• Principle of proposed solution:
  • Start from UNL graphs augmented with voice-oriented semantic marks (special UWs, attributes)
  • Generate in the target language
  • Voice-oriented marks become prosodic markers
  • Final conversion to VoiceXML
• 2008 Beijing Olympics as a field trial
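The last two steps of the proposed solution can be illustrated with a small sketch. It assumes a (hypothetical) generator has already produced target-language words, each carrying the UNL attributes it inherited, and it maps `@emphasis` to the SSML `<emphasis>` element that VoiceXML 2.0 allows inside `<prompt>`. That mapping, the token format and the function name `to_voicexml` are assumptions, since VoiceUNL itself is only a proposal in these slides.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def to_voicexml(tokens, lang):
    """Build a one-prompt VoiceXML document from (word, attributes) pairs.

    tokens: output of a hypothetical UNL generator, e.g. [("Pékin", ["@emphasis"])].
    lang: language tag for the prompt, e.g. "fr-FR".
    """
    ET.register_namespace("", VXML_NS)  # serialise with a default namespace
    vxml = ET.Element(f"{{{VXML_NS}}}vxml", {"version": "2.0"})
    form = ET.SubElement(vxml, f"{{{VXML_NS}}}form")
    block = ET.SubElement(form, f"{{{VXML_NS}}}block")
    prompt = ET.SubElement(block, f"{{{VXML_NS}}}prompt", {XML_LANG: lang})

    last = None  # most recent <emphasis> element; plain text after it goes in its tail
    for word, attrs in tokens:
        if "@emphasis" in attrs:
            last = ET.SubElement(prompt, f"{{{VXML_NS}}}emphasis")
            last.text = word
            last.tail = " "
        elif last is None:
            prompt.text = (prompt.text or "") + word + " "
        else:
            last.tail = (last.tail or "") + word + " "
    return ET.tostring(vxml, encoding="unicode")
```

Other voice-oriented marks (emotion, speaker characteristics) would need their own mapping, for instance to SSML `<prosody>` or `<voice>`, once their UNL representation is defined.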
![Page 19: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/19.jpg)
What is VoiceXML ?
• A recommendation of W3C (WWW Consortium)
• An extension of XML for vocal information servers
• A set of normalised markup tags
• Current tags concern language identification, voice prompting, speech synthesis, form filling, barge-in, echo cancelling,…
• No provision to access a semantically encoded database
• Need for a UNL-type front-end
• Compatibility with MPEG4-SNHC (talking head)
![Page 20: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/20.jpg)
Applications
![Page 21: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/21.jpg)
Prosodic information in UNL
• Attributes that can influence the grammatical and the prosodic structure of a sentence already exist: @emphasis, @qfocus
• Representations should be defined, concerning:
  • Emotion: @angry, @bored, @relaxed…?
  • Focus: grouping words to emphasize in a scope?
  • Passivity: @passive?
  • Speaker: @age, @sex, special UWs for voice characteristics…?
  • Expression (for face and gesture animation): special UWs/constructs?
![Page 22: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/22.jpg)
Conclusions and Perspectives
• Demonstrations to be prepared within the LingTour, Normalangue and KNOWLISTICS projects
• First target is the Beijing 2008 Olympics
• Some concept-oriented formalism (such as Sowa's conceptual graphs) may be used to store knowledge before building in UNL "interlingual prelinguistic, communicative content"
![Page 23: Preparing for the 2008 Beijing Olympics : The LingTour and KNOWLISTICS projects](https://reader035.vdocuments.site/reader035/viewer/2022062808/568154b2550346895dc2bbf9/html5/thumbnails/23.jpg)
Conclusions and Perspectives
UNL representation of meaning of natural language sentences directly available for retrieval, indexing and knowledge extraction.
UNL with multimedia contents (text, speech, image, video) and multimodal access (text, speech, visual I/O) to enrich the service for communication.
Comprehensive and extensive information service on PDAs with access to UMTS and wireless LAN.