Deep Learning and its applications to Speech


Page 1: Deep Learning and its applications to Speech

Deep Learning and its applications to Speech

EE 225D - Audio Signal Processing in Humans and Machines

Oriol Vinyals, UC Berkeley

Page 2: Deep Learning and its applications to Speech

Disclaimer

● This is my biased view of deep learning and, more generally, of past and current machine learning research!

Page 3: Deep Learning and its applications to Speech

Why this talk?

● It’s a hot topic… isn’t it?
● http://deeplearning.net

Page 4: Deep Learning and its applications to Speech

Let’s step back to a ML formulation

● Let x be a signal (or features, in machine learning jargon); we want to find a function f that maps x to an output y:
  ● Waveform “x” to sentence “y” (ASR)
  ● Image “x” to face detection “y” (CV)
  ● Weather measurements “x” to forecast “y” (…)
● Machine learning approach:
  ● Get as many (x,y) pairs as possible, and find f minimizing some loss over the training pairs
  ● Supervised
  ● Unsupervised
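As a rough illustration of "find f minimizing some loss over the training pairs", the sketch below fits a one-dimensional linear f(x) = w*x + b by gradient descent on the mean squared error. The linear form of f, the squared loss, the learning rate, and the toy pairs are all assumptions made for this example; the slide does not prescribe any of them.

```python
from typing import List, Tuple


def fit_linear(pairs: List[Tuple[float, float]],
               lr: float = 0.05,
               steps: int = 2000) -> Tuple[float, float]:
    """Fit f(x) = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(steps):
        # Gradient of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in pairs) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in pairs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical training pairs generated from y = 2x + 1.
training_pairs = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = fit_linear(training_pairs)
print(f"learned f(x) = {w:.3f} * x + {b:.3f}")
```

This is the supervised case: every x comes with its target y. In the unsupervised case the targets are missing, and the loss has to be defined on x alone (for example, a reconstruction error).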

Page 5: Deep Learning and its applications to Speech

NN

(slide credit: Eric Xing, CMU)

Page 6: Deep Learning and its applications to Speech

Can’t we do everything with NNs?

● Universal approximation theorem:
  ● We can approximate any continuous function on a compact set with a neural network that has a single hidden layer (given enough hidden units)
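To make the universal approximation statement concrete, here is a small sketch that builds a network with one hidden layer of ReLU units by hand, so that it interpolates a continuous function at evenly spaced points on an interval. The choice of ReLU units, the target function sin(x) on [0, pi], and the function names are assumptions for illustration. Weights are constructed directly here, not learned, and the theorem itself says nothing about how to find them by training.

```python
import math
from typing import Callable


def relu(z: float) -> float:
    return max(0.0, z)


def build_one_hidden_layer_net(f: Callable[[float], float],
                               a: float, b: float,
                               hidden_units: int) -> Callable[[float], float]:
    """Return net(x) = f(a) + sum_i out_w[i] * relu(x - knots[i]).

    Hidden unit i computes relu(x - knots[i]); the output weights are the
    changes in slope of the piecewise-linear interpolant of f on [a, b].
    """
    knots = [a + (b - a) * i / hidden_units for i in range(hidden_units + 1)]
    slopes = [(f(knots[i + 1]) - f(knots[i])) / (knots[i + 1] - knots[i])
              for i in range(hidden_units)]
    out_w = [slopes[0]] + [slopes[i] - slopes[i - 1]
                           for i in range(1, hidden_units)]
    offset = f(a)

    def net(x: float) -> float:
        return offset + sum(w * relu(x - k) for w, k in zip(out_w, knots))

    return net


def max_error(net: Callable[[float], float], f: Callable[[float], float],
              a: float, b: float, samples: int = 1000) -> float:
    """Largest absolute difference between net and f on a grid over [a, b]."""
    grid = [a + (b - a) * i / samples for i in range(samples + 1)]
    return max(abs(net(x) - f(x)) for x in grid)


for units in (5, 50):
    net = build_one_hidden_layer_net(math.sin, 0.0, math.pi, units)
    print(units, "hidden units, max error:", max_error(net, math.sin, 0.0, math.pi))
```

Adding hidden units shrinks the error, which is the sense in which a single hidden layer is "enough"; the catch is how many units are needed.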

Page 7: Deep Learning and its applications to Speech

Deep Learning

● It has two (possibly more) meanings:
  ● Use many layers in a NN
  ● Train each layer in an unsupervised fashion
● G. Hinton (U. of Toronto) et al. made these two ideas famous in their 2006 Science paper.
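The two meanings can be combined in a greedy, layer-by-layer procedure: train one layer without labels, freeze it, feed its outputs to the next layer, and repeat. The sketch below uses small tied-weight autoencoders as the unsupervised building block, which is a substitution made here for simplicity; the 2006 paper pretrained restricted Boltzmann machines. The class name, layer sizes, learning rate, and toy data are hypothetical.

```python
import math
import random
from typing import List

Vector = List[float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class AutoencoderLayer:
    """h = sigmoid(W x + b); reconstruction x_hat = W^T h + c (tied weights)."""

    def __init__(self, n_in: int, n_hidden: int, rng: random.Random) -> None:
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b = [0.0] * n_hidden
        self.c = [0.0] * n_in

    def encode(self, x: Vector) -> Vector:
        return [sigmoid(sum(w * xj for w, xj in zip(row, x)) + bi)
                for row, bi in zip(self.W, self.b)]

    def decode(self, h: Vector) -> Vector:
        return [sum(self.W[i][j] * h[i] for i in range(len(h))) + self.c[j]
                for j in range(len(self.c))]

    def error(self, data: List[Vector]) -> float:
        """Mean of 0.5 * ||x_hat - x||^2 over the data."""
        total = 0.0
        for x in data:
            x_hat = self.decode(self.encode(x))
            total += 0.5 * sum((p - q) ** 2 for p, q in zip(x_hat, x))
        return total / len(data)

    def train(self, data: List[Vector], lr: float = 0.1, epochs: int = 200) -> None:
        """Stochastic gradient descent on the reconstruction error (no labels)."""
        for _ in range(epochs):
            for x in data:
                h = self.encode(x)
                e = [xh - xj for xh, xj in zip(self.decode(h), x)]
                for i, hi in enumerate(h):
                    # Backpropagate the error through the sigmoid of unit i.
                    dz = sum(self.W[i][j] * e[j] for j in range(len(x))) * hi * (1.0 - hi)
                    for j in range(len(x)):
                        self.W[i][j] -= lr * (dz * x[j] + hi * e[j])
                    self.b[i] -= lr * dz
                for j in range(len(x)):
                    self.c[j] -= lr * e[j]


def pretrain_stack(data: List[Vector], layer_sizes: List[int],
                   seed: int = 0) -> List[AutoencoderLayer]:
    """Greedy layer-wise pretraining: each layer learns to reconstruct the
    codes produced by the layer below it."""
    rng = random.Random(seed)
    stack: List[AutoencoderLayer] = []
    inputs = data
    for n_hidden in layer_sizes:
        layer = AutoencoderLayer(len(inputs[0]), n_hidden, rng)
        layer.train(inputs)
        stack.append(layer)
        inputs = [layer.encode(x) for x in inputs]
    return stack


# Hypothetical unlabeled data: two repeated binary patterns.
patterns = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]] * 2
stack = pretrain_stack(patterns, layer_sizes=[3, 2])
print("first-layer reconstruction error:", stack[0].error(patterns))
```

After pretraining, the stacked encoders would typically be used to initialize a deep network that is then fine-tuned with labels; that supervised step is omitted here.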

Page 8: Deep Learning and its applications to Speech

2006 Science paper (G. Hinton et al)

Page 9: Deep Learning and its applications to Speech

Great results using Deep Learning

Page 10: Deep Learning and its applications to Speech

Deep Learning in Speech

Feature extraction

Phone probabilities

HMM
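One way to read the pipeline on this slide: a network turns each frame of features into phone probabilities, and an HMM then picks the most likely phone sequence given those per-frame scores and its transition probabilities. The sketch below shows only the last step, as Viterbi decoding in the log domain. The two-phone inventory, all probability values, and the direct use of the network outputs as emission scores are assumptions for illustration (hybrid systems commonly rescale the outputs by phone priors first, which is skipped here).

```python
import math
from typing import Dict, List


def viterbi(frame_probs: List[Dict[str, float]],
            trans: Dict[str, Dict[str, float]],
            init: Dict[str, float]) -> List[str]:
    """Most likely phone sequence given per-frame phone probabilities,
    transition probabilities trans[prev][next], and initial probabilities."""
    phones = list(init)
    score = {p: math.log(init[p]) + math.log(frame_probs[0][p]) for p in phones}
    back: List[Dict[str, str]] = []
    for probs in frame_probs[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for p in phones:
            # Best predecessor for phone p at this frame.
            best_prev = max(phones, key=lambda q: score[q] + math.log(trans[q][p]))
            new_score[p] = (score[best_prev] + math.log(trans[best_prev][p])
                            + math.log(probs[p]))
            pointers[p] = best_prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final phone.
    path = [max(phones, key=lambda p: score[p])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical network outputs for three frames; the middle frame is noisy.
frame_probs = [{"a": 0.9, "b": 0.1}, {"a": 0.4, "b": 0.6}, {"a": 0.9, "b": 0.1}]
trans = {"a": {"a": 0.9, "b": 0.1}, "b": {"a": 0.1, "b": 0.9}}
init = {"a": 0.6, "b": 0.4}
print(viterbi(frame_probs, trans, init))  # prints ['a', 'a', 'a']
```

Picking the best phone frame by frame would give a, b, a; the HMM's preference for staying in the same phone smooths over the noisy middle frame.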

Page 11: Deep Learning and its applications to Speech

Some interesting ASR results

● Small scale (TIMIT)
  ● Many papers, most recent: [Deng et al, Interspeech11]
● Small scale (Aurora)
  ● 50% relative improvement [Vinyals et al, ICASSP11/12]
● ~Med/Lg scale (Switchboard)
  ● 30% relative improvement [Seide et al, Interspeech11]
● … more to come
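For readers unfamiliar with the term, a relative improvement is the reduction in error divided by the baseline error. The sketch below assumes the metric is an error rate such as word error rate, which the slide does not state, and the numbers in the example are hypothetical, not taken from the cited papers.

```python
def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Fraction by which the error dropped relative to the baseline."""
    if baseline_error <= 0.0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical error rates: going from 20% to 10% error is a 50% relative
# improvement, even though the absolute improvement is only 10 points.
print(relative_improvement(20.0, 10.0))  # prints 0.5
```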

Page 12: Deep Learning and its applications to Speech

Why is deep better?

● Model strength vs. generalization error
● Deep architectures: more parameters more efficiently… Why?

Page 13: Deep Learning and its applications to Speech

Is this how the brain really works?

● Most relevant work by B. Olshausen (1997!): “Sparse Coding with an Overcomplete Basis Set: A Strategy Employed by V1?”
  ● Take a bunch of random natural images, do unsupervised learning, and you recover filters that look just like V1 receptive fields!
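The core idea of sparse coding is to describe a signal as a combination of a few elements from an overcomplete basis (more basis vectors than signal dimensions). The sketch below covers only the inference half, finding sparse coefficients for a fixed basis, using an L1 penalty and iterative soft-thresholding (ISTA). Both are common choices substituted here for illustration and are not necessarily the sparsity cost or optimizer of the 1997 paper; learning the basis from image patches is omitted, and the toy basis, penalty weight, and step size are assumptions.

```python
from typing import List


def soft_threshold(v: float, t: float) -> float:
    """Shrink v toward zero by t, setting it to zero if |v| <= t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code(x: List[float], basis: List[List[float]],
                lam: float = 0.1, step: float = 0.1,
                iters: int = 500) -> List[float]:
    """Approximately minimize 0.5*||x - sum_k a[k]*basis[k]||^2 + lam*sum_k |a[k]|
    by iterative soft-thresholding. `step` should not exceed 1 / (largest
    eigenvalue of the basis Gram matrix) for the iteration to converge."""
    a = [0.0] * len(basis)
    dims = range(len(x))
    for _ in range(iters):
        recon = [sum(a[k] * basis[k][j] for k in range(len(a))) for j in dims]
        resid = [x[j] - recon[j] for j in dims]
        # Gradient step on the reconstruction term, then shrinkage.
        a = [soft_threshold(a[k] + step * sum(basis[k][j] * resid[j] for j in dims),
                            step * lam)
             for k in range(len(a))]
    return a


# Hypothetical overcomplete basis: three unit vectors in two dimensions.
s = 2.0 ** -0.5
basis = [[1.0, 0.0], [0.0, 1.0], [s, s]]
print(sparse_code([1.0, 1.0], basis))
```

The printed coefficients show how the signal is distributed over the three basis vectors; raising `lam` pushes more of them to exactly zero.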

Page 14: Deep Learning and its applications to Speech

Criticisms/open questions

● People have known about NNs for a very long time; why the hype now?
  ● Computational power?
  ● More data available?
  ● Connection with neuroscience?
● Can we computationally emulate a brain?
  ● ~10^11 neurons, ~10^15 connections
  ● Biggest NN: ~10^4 neurons, ~10^8 connections
  ● Many connections flow backwards
  ● Brain understanding is far from complete

Page 15: Deep Learning and its applications to Speech

Questions?
