가상 어시스턴트를 위한 다국어 지원
April 2020
Soporte multilenguaje para asistentes virtuales
对虚拟助手的多语言支持
Supporto multilingue per assistenti virtuali
پشتیبانی چند زبانه برای دستیاران مجازی
Prise en charge multilingue pour les assistants virtuels
Suporte em vários idiomas para assistentes virtuais
Multi Language Support for Virtual Assistants
वर्चुअल असिस्टेंट के लिए मल्टी लैंग्वेज सपोर्ट
仮想アシスタントの多言語サポート
Overview
Goals:
• Extending the current capabilities of Almond to other languages in a cost- and time-efficient manner
• Avoiding template development for each new language
Solution:
Data collection strategy:
• Using neural machine translation models to produce translated sentences
• Improving translation quality using domain-dependent rules
Training strategies:
• Joint and sequential training
• Enforcing low variance across the encoder outputs for the same sentence in different languages
Data Collection method
English Dataset
Sentence: display all review descriptions authored by Jennifer .
Program: now => [description] of @restaurant.review, author == " Jennifer ") => notify
Pre-Processing
Pre-processing rules:
- Detokenize punctuation
- Replace NUMBER with actual values
- Lower case all parameter values
- …
Neural Machine Translation Model (e.g. Google Translate)
muestra todas las descripciones de las reseñas creadas por " Jennifer ".
Post-Processing
Post-processing rules:
- Replace verbs with their imperative form
- Insert missing prepositions
- Replace translated parameter values with real values from target language
- …
Parameter Matching
Feedback Collection & Rule Generation
Dataset in target language
Sentence: muestra todas las descripciones de las reseñas escritas por juan .
Program: now => [description] of @restaurant.review, author == " juan ") => notify
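The pre-processing, post-processing, and parameter-matching steps above can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the function names, the quoting convention used to protect the parameter value, and the replacement value are hypothetical and not taken from the Almond codebase. The translation call itself is left out, so the translated sentence is passed in directly.

```python
import re


def detokenize_punctuation(sentence: str) -> str:
    """Pre-processing rule: re-attach punctuation that tokenization split off."""
    return re.sub(r"\s+([.,!?;:])", r"\1", sentence)


def quote_parameter(sentence: str, value: str) -> str:
    """Wrap a parameter value in quotes so it can be located after translation.

    The quoting convention is an assumption for this sketch.
    """
    return sentence.replace(value, f'" {value} "', 1)


def match_and_replace(translated: str, program: str, value: str, new_value: str):
    """Post-processing: swap the quoted parameter for a target-language value
    in both the translated sentence and the program."""
    quoted = f'" {value} "'
    if quoted not in translated or quoted not in program:
        raise ValueError(f"parameter {value!r} not found in sentence and program")
    new_value = new_value.lower()  # rule: lower case all parameter values
    sentence = translated.replace(quoted, new_value, 1)
    new_program = program.replace(quoted, f'" {new_value} "', 1)
    return sentence, new_program


english = "display all review descriptions authored by Jennifer ."
program = 'now => [description] of @restaurant.review, author == " Jennifer ") => notify'

# Input that would be sent to the translation model.
mt_input = detokenize_punctuation(quote_parameter(english, "Jennifer"))

# Stand-in for the translation model output shown on the slide.
translated = 'muestra todas las descripciones de las reseñas creadas por " Jennifer ".'

target_sentence, target_program = match_and_replace(
    translated, program, "Jennifer", "Juan"
)
```

The sketch covers only the parameter swap. The verb change from "creadas" to "escritas" in the final dataset example would come from a separate rule, which is not modeled here.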
Naive Training
display all review descriptions authored by Jennifer .
muestra todas las descripciones de las reseñas creadas por Jennifer .
显示Jennifer撰写的所有评论描述。
Encoder
Decoder
now => [description] of @restaurant.review, author == " Jennifer ") => notify
Decoder Loss
We are not using the “knowledge” that these sentences are semantically equivalent
Training with sentence batching
display all review descriptions authored by Jennifer .
muestra todas las descripciones de las reseñas creadas por Jennifer .
显示Jennifer撰写的所有评论描述。
Batching
Encoder
Encoder Loss
Decoder
now => [description] of @restaurant.review, author == " Jennifer ") => notify
Decoder Loss
We now use both losses to guide the training
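One way to read the two-loss setup, sketched in plain Python: translations of the same sentence share a program, so grouping examples by program puts them in one batch, and an encoder loss can then penalize how far their encoded vectors spread. The grouping key, the fixed-size sentence vectors, the variance formulation, and the weight `alpha` are all assumptions for illustration. The slides do not specify the exact encoder loss or how the two losses are combined.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def batch_by_program(examples: Sequence[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (sentence, program) pairs so translations of one sentence share a batch."""
    batches: Dict[str, List[str]] = defaultdict(list)
    for sentence, program in examples:
        batches[program].append(sentence)
    return dict(batches)


def encoder_variance_loss(encodings: Sequence[Sequence[float]]) -> float:
    """Mean squared distance of each encoding from the batch centroid.

    The value is zero when every language maps the sentence to the same vector.
    """
    n, dim = len(encodings), len(encodings[0])
    centroid = [sum(vec[d] for vec in encodings) / n for d in range(dim)]
    squared = sum(
        (vec[d] - centroid[d]) ** 2 for vec in encodings for d in range(dim)
    )
    return squared / n


def total_loss(decoder_loss: float, encoder_loss: float, alpha: float = 1.0) -> float:
    """Weighted sum of the two losses; alpha is a hypothetical hyperparameter."""
    return decoder_loss + alpha * encoder_loss
```

For example, calling `batch_by_program` on the English, Spanish, and Chinese sentences above, all paired with the same program, yields a single batch of three sentences whose encodings can be passed to `encoder_variance_loss`.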
Experiment results (Farsi)
[Figure: bar chart of Exact Match Accuracy (y-axis from 0% to 90%) for Translated, Verified, New Params, and Test]
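For reference, exact match accuracy counts a prediction as correct only when the generated program is identical to the gold program. A minimal sketch, assuming programs are compared as whitespace-separated token sequences (this normalization is an assumption; the slides do not say how programs are compared):

```python
from typing import Sequence


def exact_match_accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predicted programs that equal the gold program token for token."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    matches = sum(p.split() == g.split() for p, g in zip(predictions, gold))
    return matches / len(gold)
```

Under this metric a program that differs in a single token, such as a wrong parameter value, scores zero for that example.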
Challenges
• Google Translate is not perfect
• Identifying language-specific traits (singular/plural, missing prepositions, ...)
• Closing the gap between evaluation accuracy and test (real data) accuracy
• Automating and improving collection of natural parameter values for each language
• ...
Bonus:
• Starter code is available free of charge!
• 18/6 project technical support
• Optional happy hours to celebrate our results
• Will be featured as a contributor in our EMNLP paper