Page 1: Crowd-sourcing platform for large-scale speech data collectiondownload.microsoft.com/download/A/0/B/A0B1A66A-5EBF-4CF3-945… · Crowd-sourcing platform for large-scale speech data

Crowd-sourcing platform for large-scale speech data collection

João Freitas, António Calado, Daniela Braga, Pedro Silva, Miguel Sales Dias

Outline

• Motivation

• Crowd-sourcing

• System description

• Quiz Game and Personalized TTS

• Media and user feedback

• Results

• Conclusions and Future work


Motivation

• ASR systems based on statistical models require vast amounts of speech data

• Corpora are expensive

• Database quality issues:

– bad recording conditions

– inconsistent sample rates

– missing transcriptions

– etc.


Previous Data Collections


YourSpeech

• What is YourSpeech?

– A platform that aims to collect desktop speech data at negligible cost for any language.

– Users receive an entertainment-based reward in exchange for their speech.


Crowd-sourcing

• Act of outsourcing tasks to a community (crowd)

• Collaborative model

• An entity publishes a problem; the crowd finds the solution

• Reward

• Task characteristics:

– Hard to automate

– Vast

– Expensive


Crowd-sourcing examples


Quiz Game


Personalized TTS


System description

[Architecture diagram. Labeled components: Internet; YourSpeech Server; YourSpeech Database; Recording Services; HTTPS Handler; Handle wave files; Handle Sessions; TTS Generation Server; TTS Queue; Recording Platform; Recording Application; ActiveX Recording Control]
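The diagram names session handling, wave-file handling and a TTS queue on the server side. The following is only a minimal sketch of how those pieces might interact, assuming that a Personalized TTS session is queued for voice generation once every expected prompt has been recorded. The class names, method names, the "quiz" / "tts" labels and the in-memory storage are all hypothetical and are not taken from the YourSpeech implementation.

```python
from dataclasses import dataclass, field
from queue import Queue


@dataclass
class Session:
    """Hypothetical record for one recording session."""
    session_id: str
    kind: str                      # assumed labels: "quiz" or "tts"
    prompts_expected: int
    wave_files: dict = field(default_factory=dict)  # prompt index -> raw bytes

    @property
    def complete(self) -> bool:
        # A session counts as complete once every prompt has a recording.
        return len(self.wave_files) == self.prompts_expected


class RecordingServer:
    """Hypothetical stand-in for the session and wave-file handlers."""

    def __init__(self):
        self.sessions = {}
        self.tts_queue = Queue()   # sessions waiting for TTS generation

    def open_session(self, session_id, kind, prompts_expected):
        self.sessions[session_id] = Session(session_id, kind, prompts_expected)

    def handle_wave_file(self, session_id, prompt_index, data):
        """Store one uploaded recording and report whether the session is complete."""
        session = self.sessions[session_id]
        was_complete = session.complete
        session.wave_files[prompt_index] = data
        # Queue only on the transition to complete, so a re-upload of an
        # already recorded prompt does not queue the session twice.
        if session.complete and not was_complete and session.kind == "tts":
            self.tts_queue.put(session_id)
        return session.complete
```

In this sketch a quiz session is stored but never queued, since only the Personalized TTS flow needs a generation step.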

Personalized TTS (2)


Media and User feedback

• Dissemination and advertisement are essential

• Positive feedback

• People in general liked the initiative

[Images: MSN site, national TV, tech magazine, tech blogs]

Results

                       Quiz Game   Personalized TTS   Total
Pure speech (hours)    3.87        21.4               25.27
Total audio (hours)    11.9        48                 59.9
Completed sessions     473         94                 567
Incomplete sessions    205         223                428
Utterances             18300       9463               27763
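Two ratios follow directly from these figures: the share of recorded audio that is pure speech, and the share of started sessions that were completed. The sketch below derives both from the reported numbers. The dictionary layout and the function name are illustrative, and it assumes that completed plus incomplete sessions account for every started session.

```python
# Figures copied from the results table; key names are illustrative.
results = {
    "Quiz Game": {
        "pure_speech_h": 3.87, "total_audio_h": 11.9,
        "completed": 473, "incomplete": 205,
    },
    "Personalized TTS": {
        "pure_speech_h": 21.4, "total_audio_h": 48.0,
        "completed": 94, "incomplete": 223,
    },
}


def summarize(row):
    """Return (pure-speech share of audio, session completion rate)."""
    speech_share = row["pure_speech_h"] / row["total_audio_h"]
    started = row["completed"] + row["incomplete"]
    completion_rate = row["completed"] / started
    return speech_share, completion_rate


for name, row in results.items():
    share, rate = summarize(row)
    print(f"{name}: {share:.1%} pure speech, {rate:.1%} of sessions completed")
```

Under that assumption, the Quiz Game comes out at about 32.5% pure speech with about 69.8% of sessions completed, and Personalized TTS at about 44.6% pure speech with about 29.7% of sessions completed.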

Results (2)

                 Quiz Game   Personalized TTS   Total
Words            2010        40119              42129
Insertions       79          46                 125
Deletions        92          103                195
Substitutions    36          47                 83
WER              10.3%       0.05%              1%
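The WER row can be cross-checked against the error counts. This is a minimal sketch, assuming the standard definition WER = (insertions + deletions + substitutions) / reference words and assuming the "Words" row is the reference word count. The slides do not state the formula used, and the function name is illustrative.

```python
def word_error_rate(insertions, deletions, substitutions, reference_words):
    """Standard word error rate: total errors divided by reference words."""
    if reference_words <= 0:
        raise ValueError("reference_words must be positive")
    return (insertions + deletions + substitutions) / reference_words


# (insertions, deletions, substitutions, words) copied from the table.
counts = {
    "Quiz Game": (79, 92, 36, 2010),
    "Personalized TTS": (46, 103, 47, 40119),
    "Total": (125, 195, 83, 42129),
}

for name, row in counts.items():
    print(f"{name}: WER = {word_error_rate(*row):.2%}")
```

With these counts the Quiz Game column reproduces the reported 10.3% and the Total column gives about 0.96%, in line with the reported 1%. The Personalized TTS counts give about 0.49%, which does not match the 0.05% shown on the slide.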

Ongoing campaigns

• “Doar a voz”:

http://www.doaravoz.com/

• YourSpeech deployment in 10 other languages


Future work

• Platform expansion to other languages

• Transcribe and annotate all the collected corpora

• Retrain existing acoustic models with the collected data

• Measure any resulting change in ASR accuracy

• Increase the number of questions available in the quiz

• Improve UX

• Create content-specific games

– Focus on certain groups of words (e.g. city names, numbers, etc.) in order to have acoustic models specialized in specific grammar types


Conclusions

• Crowd-sourcing can be used to expand speech resources at negligible cost

• Motivation (reward) and good dissemination are essential

• Media and users create a “snowball” effect

• Games can be used to attract users

• Personalized TTS acts as a “quid pro quo” service for speech technology

• Positive results (1% total WER)


Thank you very much for your attention!

Crowd-sourcing platform for large-scale speech data collection

www.microsoft.com/portugal/mldc

Questions?

FALA 2010

12th November 2010, Vigo, Spain

PT-pt YourSpeech: www.pt.yourspeech.net