lec04, speech ii, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... ·...
Multimedia Systems
Speech II
Course Presentation
Mahdi Amiri
February 2013
Sharif University of Technology
Road Map
Speech Compression
Based on Time Domain analysis
Differential Pulse-Code Modulation (DPCM)
Adaptive DPCM (ADPCM)
Based on Frequency Domain analysis
Linear Predictive Coding (LPC)
Code Excited Linear Prediction (CELP)
Page 1 Multimedia Systems, Speech II
Differential PCM (DPCM): Idea
Take advantage of data redundancy: neighboring samples are close in value, so the differences between them are small.
PCM samples: [… 110 112 111 112 112 114 115 115 114 114 …]
Differences: [… +2 -1 +1 0 +2 +1 0 -1 0 …]
Or histogram of PCM samples in a chunk of digitized audio.
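The sample run on this slide can be checked with a short sketch. The function names `differences` and `reconstruct` are illustrative and not part of the lecture; the code only shows that the differences are small and that the original samples can be rebuilt from the first sample plus the differences.

```python
def differences(samples):
    """Return the difference between each sample and the one before it."""
    return [b - a for a, b in zip(samples, samples[1:])]


def reconstruct(first, diffs):
    """Rebuild the samples from the first value and the list of differences."""
    out = [first]
    for d in diffs:
        out.append(out[-1] + d)
    return out


# The PCM values shown on the slide.
samples = [110, 112, 111, 112, 112, 114, 115, 115, 114, 114]
diffs = differences(samples)
print(diffs)                            # small values clustered around zero
print(reconstruct(samples[0], diffs))   # identical to the original samples
```

The differences span a much smaller range than the samples, so they can be coded with fewer bits per value.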
Differential PCM (DPCM): Basic Scheme
General Predictive Coding
Delta Modulation (DM): $\sum_i a_i x_{n-i} \Rightarrow z^{-1}$
Problem?
Differential PCM (DPCM): Error Propagation
General Predictive Coding
The output of the dequantizer in the decoder is not equal to the input of the quantizer in the encoder ⇒ the input of the predictor in the decoder is not the same as the input of the predictor in the encoder ⇒ this is the source of error propagation.
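A minimal sketch of the effect, assuming the simplest predictor (the previous sample) and a uniform quantizer with a hypothetical step of 4. The encoder predicts from the original samples, while the decoder can only accumulate quantized differences, so the quantization error builds up from sample to sample.

```python
def quantize(value, step):
    """Uniform quantizer: snap value to the nearest multiple of step."""
    return step * round(value / step)


def open_loop_dpcm(samples, step):
    """Encoder predicts from ORIGINAL samples; decoder sums quantized differences."""
    quantized_diffs = []
    previous = samples[0]
    for x in samples[1:]:
        quantized_diffs.append(quantize(x - previous, step))
        previous = x                      # encoder uses the true previous sample
    reconstructed = [samples[0]]
    for q in quantized_diffs:
        reconstructed.append(reconstructed[-1] + q)   # decoder never sees the true samples
    return reconstructed


samples = list(range(9))                  # a slow ramp: 0, 1, 2, ..., 8
step = 4
reconstructed = open_loop_dpcm(samples, step)
errors = [x - r for x, r in zip(samples, reconstructed)]
print(errors)                             # the error grows with every sample
```

Each difference of 1 is quantized to 0, so the decoder output never moves while the input keeps rising.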
Differential PCM (DPCM): Better Structure
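The diagram for this slide is not recoverable from the transcript. The sketch below assumes the usual closed-loop arrangement, in which the encoder predicts from its own reconstructed samples (the same values the decoder will have) rather than from the original samples. The previous-sample predictor and the step of 4 are illustrative choices, not values from the lecture.

```python
def quantize(value, step):
    """Uniform quantizer: snap value to the nearest multiple of step."""
    return step * round(value / step)


def closed_loop_encode(samples, step):
    """Encoder predicts from the RECONSTRUCTED previous sample."""
    quantized_diffs = []
    reconstructed = [samples[0]]
    for x in samples[1:]:
        q = quantize(x - reconstructed[-1], step)   # difference against decoder-side value
        quantized_diffs.append(q)
        reconstructed.append(reconstructed[-1] + q)
    return quantized_diffs, reconstructed


def decode(first, quantized_diffs):
    """Decoder: accumulate the quantized differences."""
    out = [first]
    for q in quantized_diffs:
        out.append(out[-1] + q)
    return out


samples = list(range(9))
step = 4
quantized_diffs, reconstructed = closed_loop_encode(samples, step)
errors = [x - r for x, r in zip(samples, reconstructed)]
print(errors)   # stays within half a quantization step
```

Because encoder and decoder now run the predictor on identical inputs, the error of each sample is just that sample's quantization error and does not accumulate.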
Adaptive DPCM (ADPCM): Idea
Problem?
Adaptive DPCM (ADPCM): Size of Quantization Step
Delta Modulation (DM)
1-bit quantizer: 0 means $+\Delta$ and 1 means $-\Delta$
Adaptive Delta Modulation (ADM)
$\Delta[n] = M \, \Delta[n-1]$
$M = P > 1$ if $c[n] = c[n-1]$
$M = Q < 1$ if $c[n] \neq c[n-1]$
$P = 2, \; Q = \tfrac{1}{2}$
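The step-size rule can be sketched as follows, using P = 2 and Q = 1/2 from the slide. The initial step of 1, the starting estimate of 0, and the absence of minimum and maximum step limits are simplifying assumptions; practical coders usually clamp the step.

```python
def adm_encode(samples, step0=1.0, P=2.0, Q=0.5):
    """1-bit ADM encoder: bit 0 means +step, bit 1 means -step."""
    bits, track = [], []
    estimate, step, prev_bit = 0.0, step0, None
    for x in samples:
        bit = 0 if x >= estimate else 1
        if prev_bit is not None:
            # Same bit twice: signal is running away, grow the step.
            # Bit changed: we overshot, shrink the step.
            step *= P if bit == prev_bit else Q
        estimate += step if bit == 0 else -step
        bits.append(bit)
        track.append(estimate)
        prev_bit = bit
    return bits, track


def adm_decode(bits, step0=1.0, P=2.0, Q=0.5):
    """Decoder repeats the same step adaptation from the bits alone."""
    out = []
    estimate, step, prev_bit = 0.0, step0, None
    for bit in bits:
        if prev_bit is not None:
            step *= P if bit == prev_bit else Q
        estimate += step if bit == 0 else -step
        out.append(estimate)
        prev_bit = bit
    return out


bits, track = adm_encode([0, 10, 20, 30, 30, 30])
print(bits)    # a run of equal bits while the input climbs
print(track)   # the step doubles until the estimate catches up
```

The decoder needs no side information: the step size is derived from the bit stream itself.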
Speech Compression Concepts: FFT, No Time Localization
Speech Signal
FFT
(is only localized in frequency)
Joseph Fourier, 1768-1830
Speech Compression Concepts: FFT, No Time Localization
See Power Spectral Density (PSD) examples in MATLAB
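As a stand-in for the MATLAB examples, here is a small sketch of what "no time localization" means. It uses a direct DFT written with the standard library (an FFT computes the same values faster). The tone frequencies and lengths are arbitrary illustrative choices.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitude of the discrete Fourier transform of x (direct O(N^2) form)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def tone(cycles, length):
    """A sine wave with the given number of cycles over `length` samples."""
    return [math.sin(2 * math.pi * cycles * t / length) for t in range(length)]


low, high = tone(2, 32), tone(8, 32)
low_then_high = low + high      # low tone first, then high tone
high_then_low = high + low      # same tones in the opposite order

mag_a = dft_magnitudes(low_then_high)
mag_b = dft_magnitudes(high_then_low)
max_diff = max(abs(a - b) for a, b in zip(mag_a, mag_b))
print(max_diff)   # effectively zero: the magnitude spectra are the same
```

The two signals sound different, yet their magnitude spectra match, so the spectrum alone says which frequencies occur but not when.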
Speech Compression Concepts: STFT
Speech Signal
STFT
(fixed time and frequency localization)
Dennis Gabor, 1900-1979
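A minimal STFT sketch, assuming a Hann window, a frame length of 32 samples, and no overlap between frames (all illustrative choices). Each frame gets its own DFT, and the strongest frequency bin is reported per frame.

```python
import cmath
import math


def stft_peak_bins(x, frame_len, hop):
    """Return the index of the strongest frequency bin in each windowed frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    peaks = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = [x[start + n] * window[n] for n in range(frame_len)]
        mags = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ]
        peaks.append(max(range(len(mags)), key=mags.__getitem__))
    return peaks


frame_len = 32
low = [math.sin(2 * math.pi * 4 * t / frame_len) for t in range(64)]    # bin 4
high = [math.sin(2 * math.pi * 8 * t / frame_len) for t in range(64)]   # bin 8
peaks = stft_peak_bins(low + high, frame_len, hop=32)
print(peaks)   # the peak bin moves from 4 to 8 halfway through the signal
```

Unlike a single transform over the whole signal, the frame-by-frame result shows when each tone is present. The resolution in time and in frequency is fixed by the chosen frame length.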
Speech Compression Concepts: Spectrogram
3D surface spectrogram of a part
from a music piece.
Speech Compression Concepts: Spectrogram
Spectrogram of a male voice saying ‘nineteenth century’.
Speech Compression Concepts: Spectrogram Display in Audacity
Waveform
Spectrogram
Speech Compression Concepts: Spectrogram Display in Audacity
Audacity | Edit | Preferences | Spectrograms | FFT Window | Window size
FFT Window size: 128
FFT Window size: 1024
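The trade-off between the two window sizes can be put in numbers. The slide does not state the sample rate; 44100 Hz is assumed here purely for illustration.

```python
def stft_resolution(sample_rate_hz, window_size):
    """Return (frequency spacing of bins in Hz, time span of one window in seconds)."""
    freq_spacing = sample_rate_hz / window_size
    time_span = window_size / sample_rate_hz
    return freq_spacing, time_span


SAMPLE_RATE_HZ = 44100   # assumed, not given in the lecture
for size in (128, 1024):
    freq_spacing, time_span = stft_resolution(SAMPLE_RATE_HZ, size)
    print(f"window {size}: {freq_spacing:.1f} Hz per bin, {time_span * 1000:.1f} ms per window")
```

A short window localizes events well in time but gives coarse frequency bins. A long window does the opposite.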
Speech Compression Concepts: Spectrogram, Demonstration
Bat Echolocation Call Flute by Jean Pierre Rampal
Singing Voice Face!
Speech Compression Concepts: Formant
The time and frequency domain presentation of vowels /a/, /i/, and /u/
Panels: /a/, /i/, /u/
Speech Compression Concepts: Sample Application
A computing system to answer
questions posed in natural language
Jeopardy! champions Ken Jennings (left) and Brad Rutter (right) versus the IBM computer Watson
www-943.ibm.com/innovation/us/watson/
Dr. David Ferrucci, Watson Principal Investigator
Linear Predictive Coding (LPC): Modeling
Linear Predictive Coding (LPC): Modeling (Hiss or Buzz)
Buzzer → Filter
Speech = Formants + Residue
Chunks: 30 to 50 frames per second
Predictor for each frame: $\tilde{x}[n] = \sum_{i=1}^{P} a_i \, x[n-i]$
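The slide gives the form of the per-frame predictor but not how the coefficients are found. One common approach, assumed here, is the autocorrelation method solved with the Levinson-Durbin recursion. The function names and the test signal are illustrative.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    n = len(x)
    return [sum(x[t] * x[t - k] for t in range(k, n)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return ([a_1..a_P], residual energy) for x~[n] = sum_i a_i x[n-i]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this order.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:], err


# Test frame: a decaying exponential, which obeys x[n] = 0.9 * x[n-1].
frame = [0.9 ** n for n in range(200)]
coeffs, err = levinson_durbin(autocorr(frame, 2), 2)
print(coeffs)   # close to [0.9, 0.0]
```

For this frame a first-order predictor is already sufficient, so the second coefficient comes out near zero.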
![Page 21: Lec04, Speech II, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... · Adaptive Delta Modulation (ADM) ... Speech Signal Page 8 Multimedia Systems, Speech](https://reader033.vdocuments.site/reader033/viewer/2022053017/5f1c3441e94d5540aa187e13/html5/thumbnails/21.jpg)
Linear Predictive Coding (LPC)Modeling (Hiss or Buzz)Modeling (Hiss or Buzz)
The human vocal tract as an infinite impulse response (IIR) system Vowel /a/
Page 20 Multimedia Systems, Speech II
LPC Block Diagram
![Page 22: Lec04, Speech II, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... · Adaptive Delta Modulation (ADM) ... Speech Signal Page 8 Multimedia Systems, Speech](https://reader033.vdocuments.site/reader033/viewer/2022053017/5f1c3441e94d5540aa187e13/html5/thumbnails/22.jpg)
Linear Predictive Coding (LPC)Original Paper, Original Paper, AtalAtal--HanauerHanauer 19711971
Original
Page 21 Multimedia Systems, Speech II
Comparison of wide-band sound spectrograms for synthetic and original speech signal for the utterance "It's
time we rounded up that herd of Asian cattle," spoken by a male speaker
Original
Synthetic
![Page 23: Lec04, Speech II, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... · Adaptive Delta Modulation (ADM) ... Speech Signal Page 8 Multimedia Systems, Speech](https://reader033.vdocuments.site/reader033/viewer/2022053017/5f1c3441e94d5540aa187e13/html5/thumbnails/23.jpg)
Linear Predictive Coding (LPC)Voiced Frame ExampleVoiced Frame Example
Original
Page 22 Multimedia Systems, Speech II
Synthetic
Time Domain Frequency Domain
180 samples, Pitch period: 75
![Page 24: Lec04, Speech II, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... · Adaptive Delta Modulation (ADM) ... Speech Signal Page 8 Multimedia Systems, Speech](https://reader033.vdocuments.site/reader033/viewer/2022053017/5f1c3441e94d5540aa187e13/html5/thumbnails/24.jpg)
Linear Predictive Coding (LPC)Unvoiced Frame ExampleUnvoiced Frame Example
Original
Page 23 Multimedia Systems, Speech II
Synthetic:
White noise
with uniform
distribution
Time Domain Frequency Domain
180 samples
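The two frame examples correspond to the two excitation types of the LPC model. A minimal synthesis sketch: a voiced frame is driven by an impulse train (buzz) with the pitch period of 75 from the voiced example, and an unvoiced frame by uniformly distributed white noise (hiss). The filter coefficient, the gain, and the random seed are placeholders, not values from the lecture.

```python
import random


def excitation(num_samples, voiced, pitch_period=75, seed=0):
    """Buzz (impulse train) for voiced frames, hiss (uniform noise) for unvoiced."""
    if voiced:
        return [1.0 if t % pitch_period == 0 else 0.0 for t in range(num_samples)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]


def synthesize(coeffs, source, gain=1.0):
    """All-pole (IIR) filter: s[n] = gain * e[n] + sum_i a_i * s[n-i]."""
    out = []
    for n, e in enumerate(source):
        s = gain * e
        for i, a_i in enumerate(coeffs, start=1):
            if n - i >= 0:
                s += a_i * out[n - i]
        out.append(s)
    return out


coeffs = [0.5]                                   # placeholder one-pole filter
voiced_frame = synthesize(coeffs, excitation(180, voiced=True))
unvoiced_frame = synthesize(coeffs, excitation(180, voiced=False))
print(voiced_frame[:4])     # impulse response of the filter after the first pulse
print(unvoiced_frame[:4])   # filtered noise
```

Per frame, the decoder then only needs the filter coefficients, the gain, the voiced/unvoiced flag, and the pitch period.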
![Page 25: Lec04, Speech II, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... · Adaptive Delta Modulation (ADM) ... Speech Signal Page 8 Multimedia Systems, Speech](https://reader033.vdocuments.site/reader033/viewer/2022053017/5f1c3441e94d5540aa187e13/html5/thumbnails/25.jpg)
Code Excited Linear Prediction
Problem of LPCWhere there is both Hiss and Buzz
Solution
CELPCELP
Encoder
Page 24 Multimedia Systems, Speech II
SolutionEncode residue
MethodVector Quantization
(Codebook)Decoder
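A highly simplified sketch of the idea: compute the residue left after linear prediction, then send the index of the closest codebook entry instead of the residue itself. The coefficient, the frame, and the three-entry codebook are made up for illustration. Real CELP coders search the codebook by analysis-by-synthesis with perceptual weighting, which is not shown here.

```python
def residual(frame, coeffs):
    """Prediction error e[n] = x[n] - sum_i a_i * x[n-i]."""
    out = []
    for n in range(len(frame)):
        predicted = sum(
            a_i * frame[n - i] for i, a_i in enumerate(coeffs, start=1) if n - i >= 0
        )
        out.append(frame[n] - predicted)
    return out


def best_codeword(target, codebook):
    """Index of the codebook vector with the smallest squared error to target."""
    def squared_error(candidate):
        return sum((t - c) ** 2 for t, c in zip(target, candidate))
    return min(range(len(codebook)), key=lambda i: squared_error(codebook[i]))


frame = [1.0, 0.5, 0.25, 0.125]          # hypothetical frame
coeffs = [0.5]                           # hypothetical predictor
codebook = [                             # hypothetical excitation codebook
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
res = residual(frame, coeffs)
index = best_codeword(res, codebook)
print(res, index)
```

Only the index and the filter parameters are transmitted. The decoder looks up the same codebook entry and feeds it through the synthesis filter.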
![Page 26: Lec04, Speech II, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... · Adaptive Delta Modulation (ADM) ... Speech Signal Page 8 Multimedia Systems, Speech](https://reader033.vdocuments.site/reader033/viewer/2022053017/5f1c3441e94d5540aa187e13/html5/thumbnails/26.jpg)
Vector QuantizationBlock DiagramBlock Diagram
Page 25 Multimedia Systems, Speech II
![Page 27: Lec04, Speech II, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... · Adaptive Delta Modulation (ADM) ... Speech Signal Page 8 Multimedia Systems, Speech](https://reader033.vdocuments.site/reader033/viewer/2022053017/5f1c3441e94d5540aa187e13/html5/thumbnails/27.jpg)
Vector QuantizationExampleExample
Sample scalar quantizer
We have 3 possible colors for
each square; so we can quantize
each square with 2 bits � (28 *
2 = 56 bits for all 28 (7*4)
squares.
Page 26 Multimedia Systems, Speech II
squares.
Sample vector quantizer
We have 8 forms in the
codebook; so we can quantize
each form with 3 bits � (7 * 3
= 21 bits for all 28 (7*4)
squares.Codebook
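The bit counts in this example follow from the number of bits needed to index each alphabet. A short check, with `bits_per_symbol` as an illustrative helper name:

```python
def bits_per_symbol(num_levels):
    """Smallest whole number of bits that can index num_levels distinct values."""
    return max(1, (num_levels - 1).bit_length())


squares = 7 * 4

# Scalar quantizer: one code per square, 3 possible colors.
scalar_bits = squares * bits_per_symbol(3)

# Vector quantizer: one code per form, 8 forms in the codebook, 7 forms in the grid.
forms_in_grid = 7
vector_bits = forms_in_grid * bits_per_symbol(8)

print(scalar_bits, vector_bits)
```

The saving comes from coding groups of squares jointly: only 8 of the possible 4-square patterns are allowed, so 3 bits replace 4 * 2 = 8 bits per group.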
![Page 28: Lec04, Speech II, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... · Adaptive Delta Modulation (ADM) ... Speech Signal Page 8 Multimedia Systems, Speech](https://reader033.vdocuments.site/reader033/viewer/2022053017/5f1c3441e94d5540aa187e13/html5/thumbnails/28.jpg)
Vector QuantizationCodebook DesignCodebook Design
Page 27 Multimedia Systems, Speech II
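The design procedure shown on this slide is not recoverable from the transcript. As one possible illustration, the sketch below uses Lloyd's iteration (k-means), a common way to fit a codebook to training vectors. The training data and the initial codebook are made up.

```python
def nearest(vector, codebook):
    """Index of the codeword closest to vector (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((v - c) ** 2 for v, c in zip(vector, codebook[i])),
    )


def design_codebook(training, initial_codebook, iterations=10):
    """Alternate between assigning vectors to codewords and re-centering codewords."""
    codebook = [list(c) for c in initial_codebook]
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for vector in training:
            clusters[nearest(vector, codebook)].append(vector)
        for i, members in enumerate(clusters):
            if members:   # leave a codeword unchanged if nothing was assigned to it
                dims = len(members[0])
                codebook[i] = [sum(m[d] for m in members) / len(members) for d in range(dims)]
    return codebook


training = [(0, 0), (0, 2), (10, 10), (10, 12)]
codebook = design_codebook(training, [(0, 0), (10, 10)])
print(codebook)   # each codeword moves to the mean of its cluster
```

Each pass can only lower or keep the total quantization error on the training set, so the codebook settles after a few iterations.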
![Page 29: Lec04, Speech II, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... · Adaptive Delta Modulation (ADM) ... Speech Signal Page 8 Multimedia Systems, Speech](https://reader033.vdocuments.site/reader033/viewer/2022053017/5f1c3441e94d5540aa187e13/html5/thumbnails/29.jpg)
Comparison of Speech CodersSample SpeechSample Speech
A lathe is a big tool. Grab every dish of sugar.
![Page 30: Lec04, Speech II, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... · Adaptive Delta Modulation (ADM) ... Speech Signal Page 8 Multimedia Systems, Speech](https://reader033.vdocuments.site/reader033/viewer/2022053017/5f1c3441e94d5540aa187e13/html5/thumbnails/30.jpg)
Comparison of Speech CodersDemonstrationDemonstration
Page 29 Multimedia Systems, Speech II
Original ADPCM
LPC CELP
![Page 31: Lec04, Speech II, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... · Adaptive Delta Modulation (ADM) ... Speech Signal Page 8 Multimedia Systems, Speech](https://reader033.vdocuments.site/reader033/viewer/2022053017/5f1c3441e94d5540aa187e13/html5/thumbnails/31.jpg)
Speech Coding
G.711
PCM
u-law, a-law
64, 80 and 96 kbps
G.722
ITUITU--T StandardsT Standards
Check out a complete list athttp://en.wikipedia.org/wiki/List_of_codecs#Audio_codecs
A comparison of Internet audio compression formats
http://www.sericyb.com.au/audio.html
Page 30 Multimedia Systems, Speech II
G.722
ADPCM
48, 56 and 64 kbps
G.728
A form of CELP
16 kbps
Vocoders
http://www.sericyb.com.au/audio.html
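For the u-law entry under G.711, a sketch of the continuous companding curve with mu = 255 may help. It assumes samples normalized to [-1, 1]. The standard itself specifies a segmented 8-bit approximation of this curve, which is not reproduced here.

```python
import math

MU = 255


def mu_law_compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1], giving small amplitudes finer resolution."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


# 8000 samples per second at 8 bits per sample gives the 64 kbps rate.
bit_rate = 8000 * 8

for x in (0.01, 0.1, 0.5, 1.0):
    y = mu_law_compress(x)
    print(f"{x:5.2f} -> {y:.3f} -> {mu_law_expand(y):.3f}")
```

Quiet samples are spread over a large part of the output range before uniform quantization, which matches how speech amplitudes are distributed.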
![Page 32: Lec04, Speech II, v1.06.ppt - ce.sharif.educe.sharif.edu/courses/91-92/2/ce342-1/resources... · Adaptive Delta Modulation (ADM) ... Speech Signal Page 8 Multimedia Systems, Speech](https://reader033.vdocuments.site/reader033/viewer/2022053017/5f1c3441e94d5540aa187e13/html5/thumbnails/32.jpg)
Speech Coding
HawkVoice
Free and Open Source CodeFree and Open Source Code
http://hawksoft.com/hawkvoice/
Page 31 Multimedia Systems, Speech II
Check out voice samples of HawkVoice™ codecs at
http://hawksoft.com/hawkvoice/codecs.shtml
Thank You
Next Session: Entropy Coding
FIND OUT MORE AT...
1. http://ce.sharif.edu/~m_amiri/
2. http://www.dml.ir/