Evan Macmillan, Gridspace // How Machines Enter The Conversation
![Page 1: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/1.jpg)
How Machines Enter The Conversation
Evan Macmillan CEO and Co-founder
![Page 2: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/2.jpg)
Product & Engineering team
![Page 3: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/3.jpg)
Downloaded the talks from YouTube
![Page 4: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/4.jpg)
250K spoken words
DataDriven talks included in sample set
![Page 5: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/5.jpg)
What past speakers covered
• Data and data sources! (860 mentions / 1MM words)
  – Hooray, we can “use these new data sources…”
  – Can you imagine “all these different data sources…”
• Infrastructure for scale (723 mentions / 1MM words)
  – Are you ready for “massive scale?”
  – This is how you “operate infrastructure…”
![Page 6: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/6.jpg)
…And what they didn’t cover
• Sales and marketing (<90 mentions / 1MM words)
  – These terms were used 1/10th as much as data
  – Nobody was talking about competition
• Applied machine learning (12.7 mentions / 1MM words)
  – We are “cleverly building machine learning algorithms…”
  – We need “training data to do good machine learning…”
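The "mentions / 1MM words" figures on these slides are rates normalized to one million words. A minimal sketch of that normalization follows; the helper names `count_mentions` and `mentions_per_million` are hypothetical, and the raw count of 215 is an illustrative assumption chosen to match the 860 rate for a 250K-word sample, not a number from the talk.

```python
import re


def count_mentions(text: str, phrase: str) -> int:
    """Count case-insensitive, whole-phrase occurrences of `phrase` in `text`."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def mentions_per_million(mentions: int, total_words: int) -> float:
    """Normalize a raw mention count to a rate per 1,000,000 words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return mentions * 1_000_000 / total_words


# Illustrative only: 215 raw mentions in a 250,000-word sample
# works out to 860 mentions per million words.
rate = mentions_per_million(215, 250_000)
print(rate)
```

Normalizing to a fixed corpus size is what lets topics be compared across sample sets of different lengths.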
![Page 7: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/7.jpg)
How Machines Enter the Conversation
• Why voice data is unique
• Evolution of voice processing
• Task-oriented interfaces
• Collaborating with machines
![Page 8: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/8.jpg)
15,942 spoken words per day
161 emailed words per day
Voice data is abundant
These are just averages; some folks type and talk way more…
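Using only the two averages on this slide, the gap between spoken and emailed words can be worked out directly. The variable names below are just for illustration.

```python
# Per-day averages quoted on the slide.
spoken_words_per_day = 15_942
emailed_words_per_day = 161

# How many spoken words there are for every emailed word.
ratio = spoken_words_per_day / emailed_words_per_day
print(f"about {ratio:.0f}x more spoken than emailed words")
```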
![Page 9: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/9.jpg)
And it comes with lots of labels…
![Page 10: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/10.jpg)
The closed captioning on ‘Kathy Lee Live’
Professional transcripts
![Page 11: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/11.jpg)
These speaker IDs help with contextual dictionaries
Speaker identities
![Page 12: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/12.jpg)
Really support the recognition task
Business outcomes
![Page 13: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/13.jpg)
The rise of voice processing
Abundant voice recordings
Lots of labeled voice data
Enterprise computing capacity
But… hard processing pipelines!
![Page 14: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/14.jpg)
Analyzing corn.jpg vs. corn.wav

| | Satellite photo of corn field | Audio of corn crop report |
|---|---|---|
| Output | 100 corns/pixel | Sentences or figures |
| Boundaries | Most corn is in the NE field | Ambiguous |
| Signal | Clouds, optical distortions | Echo, background noise |
| Pipeline | CNN, spreadsheet | Transcription, NLP, ??? |
![Page 15: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/15.jpg)
Voice processing recipe:
• Transcriptions (speech -> words)
• Natural language processing (words -> meaning)
• Human Software Interfaces (meaning -> users)
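The three-step recipe above can be sketched as a chain of functions. This is a toy illustration under loud assumptions, not Gridspace's pipeline: every name here (`transcribe`, `extract_meaning`, `render`, `run_pipeline`, `Insight`) is hypothetical, the "audio" is just UTF-8 text standing in for a real ASR step, and "meaning" is reduced to simple keyword counts.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Insight:
    topic: str
    mentions: int


def transcribe(audio: bytes) -> str:
    """Step 1 (speech -> words). Placeholder: a real system would run ASR here.

    Assumption: the input bytes are already UTF-8 text.
    """
    return audio.decode("utf-8")


def extract_meaning(transcript: str, topics: List[str]) -> List[Insight]:
    """Step 2 (words -> meaning). Placeholder NLP: count topic keywords."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    return [Insight(t, tokens.count(t.lower())) for t in topics]


def render(insights: List[Insight]) -> str:
    """Step 3 (meaning -> users). Placeholder interface: a plain-text summary."""
    return "\n".join(f"{i.topic}: {i.mentions}" for i in insights)


def run_pipeline(audio: bytes, topics: List[str]) -> str:
    """Chain the three steps in order."""
    return render(extract_meaning(transcribe(audio), topics))


print(run_pipeline(b"Corn yield is up. Corn prices are down.", ["corn", "wheat"]))
```

Keeping each step behind its own function boundary means any one of them can be swapped for a real component without touching the others.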
![Page 16: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/16.jpg)
Loud and clear progress
• Progress on ASR
  – Bell Labs detects one speaker saying numbers in the 1930s
  – New statistical methods in the 1980s
  – HMM -> DNN-GMM -> End-to-end DNNs (future?)
• Progress on NLP and HCI
  – Recalibrating the bar for ASR
  – Helping companies adopt and train new systems
![Page 17: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/17.jpg)
Output representations
From Wordlens NLP visualizer
A modern ox is a couple of feet in front of the hay wall. It is cloudy. The ground is shiny grass. The huge hamburger is on the ox. An enormous gold chicken is behind the wall.
![Page 18: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/18.jpg)
Single voice interaction?
![Page 19: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/19.jpg)
…Or many voice interactions?
![Page 20: Evan Macmillan, Gridspace // How Machines Enter The Conversation](https://reader034.vdocuments.site/reader034/viewer/2022042722/58a64ac01a28ab6e368b5991/html5/thumbnails/20.jpg)
How Machines Enter The Conversation