Frequency Domain Coding of Speech
Presenter: 虞台文
Content
– Introduction
– The Short-Time Fourier Transform
– The Short-Time Discrete Fourier Transform
– Wide-Band Analysis/Synthesis
– Sub-Band Coding
Introduction
Speech Coders
Waveform Coders
– Attempt to reproduce the original waveform according to some fidelity criterion.
– Performance: successful at producing good-quality, robust speech.
Vocoders
– Based on a speech production model.
– Performance: more fragile and more model dependent.
– Lower bit rate.
Frequency-Domain Coders
– Sub-Band Coder (SBC).
– Adaptive Transform Coding (ATC).
– Multi-Band Excited Vocoder (MBEV).
– Noise shaping in speech coders.
Classification of Speech Coders
The Short-Time Fourier Transform
Definition of STFT
$$X_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} h(n-m)\,x(m)\,e^{-j\omega m}$$
Interpretations:
– Filter Bank Interpretation
– Block Transform Interpretation
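The definition can be evaluated directly for finite sequences. The following is a minimal sketch, assuming `x` is zero outside `0..len(x)-1` and `h` is a causal window stored as a list; the function name `stft_point` and the sample values are illustrative and not from the slides.

```python
import cmath

def stft_point(x, h, n, omega):
    """X_n(e^{j omega}) = sum_m h(n - m) x(m) e^{-j omega m}.

    x[m] is defined for 0 <= m < len(x), h[i] for 0 <= i < len(h);
    both are treated as zero elsewhere.
    """
    total = 0j
    for m in range(len(x)):
        i = n - m                      # index into the window h(n - m)
        if 0 <= i < len(h):
            total += h[i] * x[m] * cmath.exp(-1j * omega * m)
    return total

x = [1.0, 2.0, 3.0, 4.0]
h = [0.5, 0.5]
value_dc = stft_point(x, h, n=2, omega=0.0)
value_pi = stft_point(x, h, n=2, omega=cmath.pi)
```

At `omega = 0` the result is simply the windowed sum of the data around `n`, which is a quick sanity check on the indexing.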
Filter Bank Interpretation
$$X_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} h(n-m)\,x(m)\,e^{-j\omega m}$$
$\omega$ is fixed at $\omega_0$, and $h(n)$ acts as the analysis filter:
$$X_n(e^{j\omega_0}) = h(n) * [x(n)\,e^{-j\omega_0 n}]$$
Modulation:
$$x(n)\,e^{-j\omega_0 n} \;\overset{FT}{\longleftrightarrow}\; X(e^{j(\omega+\omega_0)})$$
Multiplying $x(n)$ by $e^{-j\omega_0 n}$ shifts the part of $X(e^{j\omega})$ around $\omega_0$ down to $\omega = 0$, where the lowpass filter $h(n)$ selects it.
[Figure: filter bank. $x(n)$ is multiplied by $e^{-j\omega_1 n}, e^{-j\omega_2 n}, \ldots, e^{-j\omega_{M-1} n}, e^{-j\omega_M n}$; each product is filtered by $h(n)$, giving the modulated subband signals $X_n(e^{j\omega_1}), X_n(e^{j\omega_2}), \ldots$]
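The filter bank reading (modulate by $e^{-j\omega_0 n}$, then convolve with $h(n)$) can be checked numerically against the defining sum. This is a minimal sketch with illustrative names (`stft_direct`, `stft_filter_bank`) and made-up data; it assumes finite, causal sequences.

```python
import cmath

def stft_direct(x, h, n, w0):
    """Defining sum: sum_m h(n - m) x(m) e^{-j w0 m}."""
    return sum(h[n - m] * x[m] * cmath.exp(-1j * w0 * m)
               for m in range(len(x)) if 0 <= n - m < len(h))

def stft_filter_bank(x, h, w0):
    """Modulate x(n) by e^{-j w0 n}, then convolve with h(n)."""
    modulated = [x[n] * cmath.exp(-1j * w0 * n) for n in range(len(x))]
    out = []
    for n in range(len(x) + len(h) - 1):
        acc = 0j
        for i in range(len(h)):
            if 0 <= n - i < len(modulated):
                acc += h[i] * modulated[n - i]
        out.append(acc)
    return out

x = [1.0, -2.0, 0.5, 3.0, 1.5]
h = [0.25, 0.5, 0.25]
w0 = 0.7
channel = stft_filter_bank(x, h, w0)
max_err = max(abs(channel[n] - stft_direct(x, h, n, w0))
              for n in range(len(channel)))
```

Both routes give the same channel signal up to rounding, which is what lets one channel of the STFT be built from a modulator and a single lowpass filter.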
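Block Transform Interpretation
$$X_{n_0}(e^{j\omega}) = \sum_{m=-\infty}^{\infty} h(n_0-m)\,x(m)\,e^{-j\omega m}$$
$n$ is fixed at $n_0$. Here $h(n_0-m)$ is the analysis window and $h(n_0-m)\,x(m)$ is the windowed data, so the STFT is the FT of the windowed data:
$$X_{n_0}(e^{j\omega}) = FT[h(n_0-n)\,x(n)]$$
[Figure: the window is placed at $n_1, n_2, n_3, \ldots, n_r$, giving the block transforms $X_{n_1}(e^{j\omega}), X_{n_2}(e^{j\omega}), X_{n_3}(e^{j\omega}), \ldots, X_{n_r}(e^{j\omega})$.]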
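The block transform reading (window the data at a fixed $n_0$, then take its Fourier transform) might be sketched as follows. The helper names and the frequency grid size `num_freqs` are assumptions made for illustration.

```python
import cmath

def windowed_data(x, h, n0):
    """w(m) = h(n0 - m) x(m) for a causal window h of finite length."""
    return [(h[n0 - m] if 0 <= n0 - m < len(h) else 0.0) * x[m]
            for m in range(len(x))]

def block_transform(x, h, n0, num_freqs):
    """Samples of X_{n0}(e^{jw}) at w = 2*pi*k/num_freqs, k = 0..num_freqs-1."""
    w = windowed_data(x, h, n0)
    return [sum(w[m] * cmath.exp(-2j * cmath.pi * k * m / num_freqs)
                for m in range(len(w)))
            for k in range(num_freqs)]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
h = [1.0, 0.5, 0.25]
segment = windowed_data(x, h, n0=3)
frame = block_transform(x, h, n0=3, num_freqs=8)
```

Moving `n0` along the signal and repeating the transform yields the sequence of block spectra shown in the figure.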
Analysis/Synthesis Equations
Analysis:
$$X_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} h(n-m)\,x(m)\,e^{-j\omega m}$$
Synthesis:
$$\hat{x}(n) = \sum_{r=-\infty}^{\infty} f(n-r)\,\frac{1}{2\pi}\int_{-\pi}^{\pi} X_r(e^{j\omega})\,e^{j\omega n}\,d\omega$$
Under what condition do we have $\hat{x}(n) = x(n)$?
Since $\frac{1}{2\pi}\int_{-\pi}^{\pi} X_r(e^{j\omega})\,e^{j\omega n}\,d\omega = h(r-n)\,x(n)$,
$$\hat{x}(n) = \sum_{r} f(n-r)\,h(r-n)\,x(n) = x(n)\sum_{r} f(n-r)\,h(r-n)$$
Replace $r$ with $n+r$:
$$\hat{x}(n) = x(n)\sum_{r} f(-r)\,h(r)$$
Therefore, $\hat{x}(n) = x(n)$ if
$$\sum_{n} f(-n)\,h(n) = 1$$
More generally,
$$\sum_{n} f(-n)\,h(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} F(e^{j\omega})\,H(e^{j\omega})\,d\omega = 1$$
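The inverse-transform step used in this derivation can be spelled out. The following is a short derivation sketch that uses only the analysis equation and the orthogonality of complex exponentials over one period:

```latex
\begin{align*}
\frac{1}{2\pi}\int_{-\pi}^{\pi} X_r(e^{j\omega})\,e^{j\omega n}\,d\omega
  &= \sum_{m} h(r-m)\,x(m)\,\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{j\omega (n-m)}\,d\omega \\
  &= \sum_{m} h(r-m)\,x(m)\,\delta(n-m) \\
  &= h(r-n)\,x(n).
\end{align*}
```

Each frame $X_r$ therefore returns the sample $x(n)$ weighted by the window value $h(r-n)$, and the synthesis filter $f$ only has to undo that weighting.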
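Examples
Synthesis condition:
$$\sum_{n} f(-n)\,h(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} F(e^{j\omega})\,H(e^{j\omega})\,d\omega = 1$$
Example 1:
$$f(n) = \frac{\delta(n)}{h(0)},\quad h(0) \neq 0 \quad\Rightarrow\quad \sum_{n} f(-n)\,h(n) = 1$$
$$\hat{x}(n) = \frac{1}{h(0)}\cdot\frac{1}{2\pi}\int_{-\pi}^{\pi} X_n(e^{j\omega})\,e^{j\omega n}\,d\omega = \frac{1}{h(0)}\,h(0)\,x(n) = x(n)$$
Example 2:
$$f(n) = \frac{1}{H(e^{j0})}\ \text{for all } n,\qquad F(e^{j\omega}) = \frac{2\pi\,\delta(\omega)}{H(e^{j0})} \quad\Rightarrow\quad \frac{1}{2\pi}\int_{-\pi}^{\pi} F(e^{j\omega})\,H(e^{j\omega})\,d\omega = 1$$
With $H(e^{j\omega}) = \sum_{n} h(n)\,e^{-j\omega n}$, we have $H(e^{j0}) = \sum_{n} h(n)$, so
$$\hat{x}(n) = \frac{1}{H(e^{j0})}\sum_{r}\frac{1}{2\pi}\int_{-\pi}^{\pi} X_r(e^{j\omega})\,e^{j\omega n}\,d\omega = \frac{1}{\sum_{r} h(r)}\sum_{r} FT^{-1}[X_r(e^{j\omega})]$$
$$= \frac{1}{\sum_{r} h(r)}\sum_{r} h(r-n)\,x(n) = \frac{1}{\sum_{r} h(r)}\sum_{r} h(r)\,x(n) = x(n)$$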
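Both examples can be checked numerically. The sketch below assumes a short hypothetical window (`h_of`) and replaces the integral over $\omega$ by a uniform frequency grid, which is exact here because the test signal has fewer samples than grid points. All names and values are illustrative.

```python
import cmath

def stft(x, h_of, r, w):
    """X_r(e^{jw}) = sum_m h(r - m) x(m) e^{-jwm}; h_of maps an integer to h."""
    return sum(h_of(r - m) * x[m] * cmath.exp(-1j * w * m) for m in range(len(x)))

def inverse_ft_sample(x, h_of, r, n, num_freqs):
    """(1/2pi) * integral of X_r(e^{jw}) e^{jwn} dw, via a uniform frequency grid."""
    acc = 0j
    for k in range(num_freqs):
        w = 2 * cmath.pi * k / num_freqs
        acc += stft(x, h_of, r, w) * cmath.exp(1j * w * n)
    return acc / num_freqs

def h_of(i):
    window = {-1: 0.25, 0: 0.5, 1: 0.25}   # hypothetical short window
    return window.get(i, 0.0)

x = [2.0, -1.0, 4.0, 0.5, 3.0]
K = 8                                       # grid size, larger than len(x)

# Example 1: f(n) = delta(n) / h(0), so only the r = n frame is used.
x_hat_1 = [inverse_ft_sample(x, h_of, n, n, K) / h_of(0) for n in range(len(x))]

# Example 2: f(n) = 1 / H(e^{j0}); sum over every frame whose window overlaps x.
H0 = sum(h_of(i) for i in (-1, 0, 1))
x_hat_2 = [sum(inverse_ft_sample(x, h_of, r, n, K)
               for r in range(-1, len(x) + 1)) / H0
           for n in range(len(x))]
```

Example 1 uses a single frame per output sample, while Example 2 averages over all frames; both recover `x` up to rounding.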
The Short-Time Discrete Fourier Transform
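Definition of STDFT
Analysis:
$$X_n(k) = X_n(e^{j(2\pi/M)k}) = \sum_{m=-\infty}^{\infty} h(n-m)\,x(m)\,W_M^{-km},\qquad W_M = e^{j(2\pi/M)}$$
Synthesis:
$$\hat{x}(n) = \frac{1}{M}\sum_{k=0}^{M-1}\sum_{r=-\infty}^{\infty} f(n-r)\,X_r(k)\,W_M^{kn}$$
These are the frequency-sampled counterparts of the STFT pair
$$X_n(e^{j\omega}) = \sum_{m} h(n-m)\,x(m)\,e^{-j\omega m},\qquad \hat{x}(n) = \sum_{r} f(n-r)\,\frac{1}{2\pi}\int_{-\pi}^{\pi} X_r(e^{j\omega})\,e^{j\omega n}\,d\omega$$
Under what condition do we have $\hat{x}(n) = x(n)$?
Synthesis
$$\hat{x}(n) = \sum_{r} f(n-r)\,\frac{1}{M}\sum_{k=0}^{M-1} X_r(k)\,W_M^{kn} = \sum_{r} f(n-r)\,h(r-n)\,x(n) = x(n)\sum_{r} f(n-r)\,h(r-n)$$
so $\hat{x}(n) = x(n)$ requires
$$\sum_{r} f(n-r)\,h(r-n) = 1$$
Both $\hat{x}(n)$ and $x(n)$ are periodic:
$$\hat{x}(n) = \hat{x}(n+M),\qquad x(n) = x(n+M)$$
We need only one period. Therefore, the condition is respecified as:
$$\sum_{r} f(n-r)\,h[r-n+pM] = \delta(p)$$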
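The respecified condition can be evaluated for a candidate pair of windows. This is a minimal sketch, assuming the condition has the form $\sum_r f(n-r)\,h[r-n+pM] = \delta(p)$ and using a hypothetical five-tap window with $f(n) = \delta(n)/h(0)$; the function name `condition_sum` is illustrative.

```python
def condition_sum(f_of, h_of, n, p, M, r_range):
    """sum_r f(n - r) h(r - n + p*M) over a finite range of r."""
    return sum(f_of(n - r) * h_of(r - n + p * M) for r in r_range)

def h_of(i):
    # Hypothetical analysis window, zero outside -2..2.
    return {-2: 0.1, -1: 0.6, 0: 1.0, 1: 0.6, 2: 0.1}.get(i, 0.0)

def f_of(i):
    # Synthesis window f(n) = delta(n) / h(0).
    return 1.0 / h_of(0) if i == 0 else 0.0

r_range = range(-40, 41)

# M = 8 exceeds the window support, so only p = 0 contributes.
checks_ok = {p: condition_sum(f_of, h_of, n=5, p=p, M=8, r_range=r_range)
             for p in (-2, -1, 0, 1, 2)}

# M = 2 is too small: h(2) is nonzero, so the p = 1 term leaks through.
leak = condition_sum(f_of, h_of, n=5, p=1, M=2, r_range=r_range)
```

With this choice of `f`, the sum reduces to `h(p*M) / h(0)`, so the condition holds exactly when the window vanishes at every nonzero multiple of `M`.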
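Implementation Consideration
[Figure: spectrogram of $X_n(k)$; horizontal axis: time $n$, vertical axis: frequency $k$.]
Sampling
[Figure: the same spectrogram sampled in time every $R$ samples, at $n = 0, R, 2R, 3R, 4R, \ldots$, giving $X_0(k), X_R(k), X_{2R}(k), X_{3R}(k), X_{4R}(k), \ldots$]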
Sampled STDFT
Analysis:
$$X_n(k) = \sum_{m} h(n-m)\,x(m)\,W_M^{-km} \quad\longrightarrow\quad X_{sR}(k) = \sum_{m} h(sR-m)\,x(m)\,W_M^{-km}$$
Synthesis:
$$\hat{x}(n) = \frac{1}{M}\sum_{k=0}^{M-1}\sum_{r} f(n-r)\,X_r(k)\,W_M^{kn} \quad\longrightarrow\quad \hat{x}(n) = \frac{1}{M}\sum_{k=0}^{M-1}\sum_{s} f(n-sR)\,X_{sR}(k)\,W_M^{kn}$$
Under what condition do we have $\hat{x}(n) = x(n)$?
$$\sum_{r} f(n-r)\,h[r-n+pM] = \delta(p) \quad\longrightarrow\quad \sum_{s} f(n-sR)\,h[sR-n+pM] = \delta(p)$$
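A small end-to-end sketch of the sampled STDFT follows. It assumes a hop of $R = M/2$, a periodic Hann analysis window and a rectangular synthesis window; these choices are not from the slides and were picked only because they satisfy the sampled condition. The signal is synthetic.

```python
import cmath
import math

M, R = 8, 4   # DFT size and hop; R = M/2 is an assumption of this sketch

def h_of(i):
    """Periodic Hann analysis window on 0..M-1 (hypothetical choice)."""
    return 0.5 - 0.5 * math.cos(2 * math.pi * i / M) if 0 <= i < M else 0.0

def f_of(i):
    """Synthesis window: 1 on -(M-1)..0, zero elsewhere (hypothetical choice)."""
    return 1.0 if -(M - 1) <= i <= 0 else 0.0

def analyze(x, s):
    """X_{sR}(k) = sum_m h(sR - m) x(m) W_M^{-km}, k = 0..M-1."""
    return [sum(h_of(s * R - m) * x[m] * cmath.exp(-2j * math.pi * k * m / M)
                for m in range(len(x)))
            for k in range(M)]

def synthesize(frames, n):
    """x_hat(n) = (1/M) sum_k sum_s f(n - sR) X_{sR}(k) W_M^{kn}."""
    acc = 0j
    for s, X in frames.items():
        g = f_of(n - s * R)
        if g != 0.0:
            acc += g * sum(X[k] * cmath.exp(2j * math.pi * k * n / M)
                           for k in range(M))
    return acc / M

x = [math.sin(0.3 * n) + 0.1 * n for n in range(16)]
frames = {s: analyze(x, s) for s in range(0, 6)}
x_hat = [synthesize(frames, n) for n in range(len(x))]
max_err = max(abs(x_hat[n] - x[n]) for n in range(len(x)))
```

Reconstruction works here because two shifted Hann windows at hop `M/2` sum to one, so the `p = 0` term of the condition equals 1 and every other term vanishes.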
Wide-Band Analysis/Synthesis
Short-Time Synthesis --- Filter Bank Summation
STFT:
$$X_n(e^{j\omega}) = \sum_{m} h(n-m)\,x(m)\,e^{-j\omega m}$$
At $\omega = \omega_k$:
$$X_n(e^{j\omega_k}) = \sum_{m} h(n-m)\,x(m)\,e^{-j\omega_k m} = h(n) * [x(n)\,e^{-j\omega_k n}]$$
[Figure: $x(n)$ is multiplied by $e^{-j\omega_k n}$ and passed through the lowpass filter $h(n)$, giving $X_n(e^{j\omega_k})$.]
Replacing $m$ with $n-m$:
$$X_n(e^{j\omega_k}) = \sum_{m} x(n-m)\,h(m)\,e^{-j\omega_k (n-m)} = e^{-j\omega_k n}\sum_{m} x(n-m)\,h(m)\,e^{j\omega_k m} = e^{-j\omega_k n}\sum_{m} x(n-m)\,h_k(m)$$
where
$$h_k(n) = h(n)\,e^{j\omega_k n},\qquad H_k(e^{j\omega}) = H(e^{j(\omega-\omega_k)})$$
$|H(e^{j\omega})|$ is a lowpass filter; $|H_k(e^{j\omega})|$ is the corresponding bandpass filter centered at $\omega_k$.
[Figure: two equivalent structures. Bandpass form: $x(n)$ is filtered by the bandpass filter $h_k(n)$ and multiplied by $e^{-j\omega_k n}$, giving $X_n(e^{j\omega_k})$. Lowpass form: $x(n)$ is multiplied by $e^{-j\omega_k n}$ and filtered by the lowpass filter $h(n)$, giving $X_n(e^{j\omega_k})$.]
$X_n(e^{j\omega_k})$ is the lowpass representation of the signal in a band centered at $\omega_k$.
Encoding one band produces $X_n(e^{j\omega_k})$; decoding one band multiplies it by $e^{j\omega_k n}$:
$$y_k(n) = X_n(e^{j\omega_k})\,e^{j\omega_k n} = x(n) * h_k(n)$$
[Figure: analysis/synthesis bank. For $k = 0, 1, \ldots, N-1$: $x(n)$ is filtered by $h_k(n)$ and multiplied by $e^{-j\omega_k n}$ (analysis), giving $X_n(e^{j\omega_k})$; this is multiplied by $e^{j\omega_k n}$ (synthesis), giving $y_k(n)$; the outputs $y_0(n), y_1(n), \ldots, y_{N-1}(n)$ are summed to $y(n)$.]
$$y(n) = \sum_{k=0}^{N-1} y_k(n)$$
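The equivalence between the modulated STFT channel and the bandpass filter output, $X_n(e^{j\omega_k})\,e^{j\omega_k n} = x(n) * h_k(n)$, can be confirmed numerically. This is a minimal sketch with illustrative names and data; the band index and $N = 6$ are arbitrary choices.

```python
import cmath

def stft_point(x, h, n, w):
    """X_n(e^{jw}) = sum_m h(n - m) x(m) e^{-jwm} for causal finite x and h."""
    return sum(h[n - m] * x[m] * cmath.exp(-1j * w * m)
               for m in range(len(x)) if 0 <= n - m < len(h))

def bandpass_output(x, h, n, wk):
    """y_k(n) = sum_m x(n - m) h_k(m), with h_k(m) = h(m) e^{j wk m}."""
    return sum(x[n - m] * h[m] * cmath.exp(1j * wk * m)
               for m in range(len(h)) if 0 <= n - m < len(x))

x = [1.0, 0.5, -2.0, 3.0, 0.0, 1.5]
h = [0.2, 0.3, 0.3, 0.2]
wk = 2 * cmath.pi * 2 / 6      # band k = 2 of N = 6 (hypothetical choice)

max_err = max(abs(stft_point(x, h, n, wk) * cmath.exp(1j * wk * n)
                  - bandpass_output(x, h, n, wk))
              for n in range(len(x)))
```

The agreement shows that remodulating the lowpass channel signal is the same as filtering `x` with the frequency-shifted window `h_k`.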
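Equal Spaced Ideal Filters
$$H_k(e^{j\omega}) = H(e^{j(\omega-\omega_k)}),\qquad \omega_k = \frac{2\pi k}{N}$$
[Figure: $N = 6$ equally spaced ideal filters, each of bandwidth $2\pi/N$, centered at $\omega_0, \omega_1, \ldots, \omega_5$ over $0 \le \omega < 2\pi$.]
[Figure: $x(n)$ feeds $h_0(n), h_1(n), \ldots, h_{N-1}(n)$ in parallel; their outputs $y_0(n), y_1(n), \ldots, y_{N-1}(n)$ are summed to $y(n)$.]
The composite response of the bank is
$$\tilde{H}(e^{j\omega}) = \sum_{k=0}^{N-1} H_k(e^{j\omega})$$
What condition should be satisfied so that $y(n) = x(n)$?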
Equal Spaced Ideal Filters
$$H_k(e^{j\omega}) = H(e^{j(\omega-\omega_k)}),\qquad \omega_k = \frac{2\pi k}{N},\qquad \tilde{H}(e^{j\omega}) = \sum_{k=0}^{N-1} H_k(e^{j\omega})$$
Equally spaced sampling of $H(e^{j\omega})$, followed by the inverse discrete FT, gives a time-aliased version of $h(n)$:
$$\frac{1}{N}\sum_{k=0}^{N-1} H(e^{j\omega_k})\,e^{j\omega_k n} = \sum_{r=-\infty}^{\infty} h(n+rN)$$
Consider FIR, i.e., $h(n)$ is of duration $L$ samples ($0 \le n \le L-1$).
In case that $N \ge L$ (set $n = 0$ above):
$$\frac{1}{N}\sum_{k=0}^{N-1} H(e^{j\omega_k}) = h(0)$$
$$\tilde{H}(e^{j\omega}) = \sum_{k=0}^{N-1} H(e^{j(\omega-\omega_k)}) = \sum_{k=0}^{N-1} H(e^{j\omega_k}) = N\,h(0)$$
$$y(n) = N\,h(0)\,x(n)$$
$x(n)$ can always be reconstructed if $N \ge L$.
Can $x(n)$ still be reconstructed if $N < L$? If so, what condition should be satisfied?
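The time-aliasing relation, $\frac{1}{N}\sum_k H(e^{j\omega_k})\,e^{j\omega_k n} = \sum_r h(n+rN)$, can be checked for a small FIR filter. This sketch assumes a causal $h(n)$ stored as a list; the coefficients and $N = 3$ are made up so that $N < L$ and the aliasing is visible.

```python
import cmath

def dtft(h, w):
    """H(e^{jw}) = sum_n h(n) e^{-jwn} for a causal FIR h."""
    return sum(h[n] * cmath.exp(-1j * w * n) for n in range(len(h)))

def inverse_dft_of_samples(h, N, n):
    """(1/N) sum_k H(e^{j w_k}) e^{j w_k n}, with w_k = 2 pi k / N."""
    return sum(dtft(h, 2 * cmath.pi * k / N) * cmath.exp(2j * cmath.pi * k * n / N)
               for k in range(N)) / N

def time_aliased(h, N, n):
    """sum_r h(n + r N) for a causal FIR h (the result is periodic in n)."""
    return sum(h[i] for i in range(n % N, len(h), N))

h = [0.4, 0.3, 0.2, 0.1, 0.05]   # hypothetical FIR filter, L = 5
N = 3                            # fewer bands than taps, so samples overlap

from_samples = [inverse_dft_of_samples(h, N, n) for n in range(N)]
from_aliasing = [time_aliased(h, N, n) for n in range(N)]
```

For `n = 0` the aliased sum is `h(0) + h(3)` rather than `h(0)` alone, which is why the simple result $N\,h(0)$ needs $N \ge L$.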
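Equal Spaced Ideal Filters
[Figure: $x(n)$ feeds $h_0(n), h_1(n), \ldots, h_{N-1}(n)$ in parallel; their outputs $y_0(n), y_1(n), \ldots, y_{N-1}(n)$ are summed to $y(n)$.]
In the time domain, with $h_k(n) = h(n)\,e^{j\omega_k n}$ and $\omega_k = 2\pi k/N$:
$$\tilde{h}(n) = \sum_{k=0}^{N-1} h_k(n) = \sum_{k=0}^{N-1} h(n)\,e^{j\omega_k n} = h(n)\sum_{k=0}^{N-1} e^{j\omega_k n} = h(n)\,p(n)$$
$$p(n) = \sum_{k=0}^{N-1} e^{j\omega_k n} = N\sum_{r=-\infty}^{\infty} \delta(n-rN)$$
$$\tilde{h}(n) = h(n)\,p(n) = N\sum_{r=-\infty}^{\infty} h(rN)\,\delta(n-rN)$$
The signal can be reconstructed if $\tilde{h}(n)$ equals $\delta(n-m)$, i.e., a pure delay of $m$ samples.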
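The composite impulse response $\tilde{h}(n) = h(n)\,p(n)$ can be computed by summing the $N$ modulated copies of $h(n)$ directly. This is a minimal sketch with a hypothetical four-tap filter, comparing a case with $N \ge L$ against one with $N < L$.

```python
import cmath

def composite_response(h, N):
    """h~(n) = sum_{k=0}^{N-1} h(n) e^{j 2 pi k n / N} for n = 0..len(h)-1."""
    return [sum(h[n] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(len(h))]

h = [0.4, 0.3, 0.2, 0.1]               # hypothetical FIR filter, L = 4
wide = composite_response(h, N=6)      # N >= L: a single impulse N*h(0) at n = 0
narrow = composite_response(h, N=2)    # N < L: extra impulses at multiples of N
```

With `N = 2` the sample `h(2)` survives in the composite response, so the output is no longer a scaled copy of the input unless the filter is zero at the nonzero multiples of `N`.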
Typical Sequences of h(n)
$$\tilde{h}(n) = h(n)\,p(n)$$
In each case $p(n)$ is a train of impulses of height $N$ at $n = \ldots, -2N, -N, 0, N, 2N, 3N, 4N, \ldots$

Ideal lowpass filter with cutoff at $\pi/N$:
$$h(n) = \frac{\sin(\pi n/N)}{\pi n}$$
[Figure: $p(n)$; $h(n)$ with $h(0) = 1/N$ and zeros at the nonzero multiples of $N$.]
$$\tilde{h}(n) = \delta(n)$$

FIR filter of duration $L$ with $N \ge L$:
[Figure: $p(n)$; $h(n)$ with peak $h(0)$, time axis marked in multiples of $L$.]
$$\tilde{h}(n) = N\,h(0)\,\delta(n)$$

A causal FIR lowpass filter:
[Figure: $p(n)$; $h(n)$, with the levels $h(0)$ and $1/N$ marked.]
$$\tilde{h}(n) = \delta(n-2N)$$

A causal IIR lowpass filter:
[Figure: $p(n)$; $h(n)$, with the levels $h(0)$ and $1/N$ marked.]
$$\tilde{h}(n) = \delta(n-N)$$
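For the ideal lowpass case, the samples of $h(n)$ at multiples of $N$ can be evaluated directly. The sketch below assumes $h(n) = \sin(\pi n/N)/(\pi n)$ with $h(0) = 1/N$ and an arbitrary $N = 4$; the dictionary `composite` holds $N\,h(rN)$ for a few values of $r$.

```python
import math

def ideal_lowpass(n, N):
    """h(n) = sin(pi n / N) / (pi n), with h(0) = 1/N."""
    if n == 0:
        return 1.0 / N
    return math.sin(math.pi * n / N) / (math.pi * n)

N = 4
# h~(n) = h(n) p(n) is nonzero only at multiples of N, where p(n) = N.
composite = {r * N: N * ideal_lowpass(r * N, N) for r in range(-3, 4)}
```

Only the entry at `n = 0` is nonzero (up to rounding), which matches $\tilde{h}(n) = \delta(n)$ for this filter.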
Filter Bank Implementation for a Single Channel
[Figure: two equivalent single-channel structures. Bandpass form: $x(n)$ is filtered by $h_k(n)$ and multiplied by $e^{-j\omega_k n}$, giving $X_n(e^{j\omega_k})$ (analysis); this is multiplied by $e^{j\omega_k n}$, giving $y_k(n)$ (synthesis). Lowpass form: $x(n)$ is multiplied by $e^{-j\omega_k n}$ and filtered by $h(n)$, giving $X_n(e^{j\omega_k})$ (analysis); this is multiplied by $e^{j\omega_k n}$, giving $y_k(n)$ (synthesis).]
With decimation and interpolation:
[Figure: the same two structures, with an R:1 decimator at the analysis output and a 1:R interpolator at the synthesis input, so that $X_n(e^{j\omega_k})$ is transmitted at the reduced rate.]
R = ? It depends on the bandwidth of $h(n)$.
Sub-Band Coding
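Filter Bank Implementation (Direct Implementation)
[Figure: Analysis: $x(n)$ is multiplied by $W_N^{0}, W_N^{-n}, \ldots, W_N^{-kn}, \ldots, W_N^{-(N-1)n}$; each product is filtered by $h(n)$ and decimated R:1, giving $X_{sR}(0), X_{sR}(1), \ldots, X_{sR}(k), \ldots, X_{sR}(N-1)$. Synthesis: each channel is interpolated 1:R, filtered by $f(n)$, and multiplied by $W_N^{0}, W_N^{n}, \ldots, W_N^{kn}, \ldots, W_N^{(N-1)n}$; the results are summed to $\hat{x}(n)$.]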
Complex Channels R=2B
Bandwidth B/2
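Filter Bank Implementation (Practical Implementation)
[Figure: spectra of one channel of bandwidth $B$ centered at $\omega_k$. Modulation by $W_N^{-kn}$ (and $W_N^{kn}$ for the mirror band) shifts the band to baseband, $-B/2$ to $B/2$; modulation by $e^{jBn/2}$ and $e^{-jBn/2}$ then shifts the two halves so that together they occupy $-B$ to $B$.]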
Filter Bank Implementation (Practical Implementation)
$$X_n(e^{j\omega_k}) = a_k(n) - j\,b_k(n),\qquad X_n(e^{-j\omega_k}) = a_k(n) + j\,b_k(n)$$
where $a_k(n) = h(n) * [x(n)\cos\omega_k n]$ and $b_k(n) = h(n) * [x(n)\sin\omega_k n]$.
[Figure: $x(n)$ is multiplied by $W_N^{-kn}$ and $W_N^{kn}$; each product is filtered by $h(n)$, multiplied by $e^{jBn/2}$ and $e^{-jBn/2}$ respectively, and the two branches are summed to $y_k(n)$.]
$$y_k(n) = 2a_k(n)\cos(Bn/2) + 2b_k(n)\sin(Bn/2)$$
[Figure: real-valued implementation. $x(n)$ is multiplied by $\cos\omega_k n$ and filtered by $h(n)$, giving $a_k(n)$, which is multiplied by $\cos(Bn/2)$; $x(n)$ is multiplied by $\sin\omega_k n$ and filtered by $h(n)$, giving $b_k(n)$, which is multiplied by $\sin(Bn/2)$; the sum is $\tfrac{1}{2}y_k(n)$.]
Decimating by D:1, with $D = \pi/B$ (why?):
$$\tfrac{1}{2}y_k(sD) = a_k(sD)\cos(BsD/2) + b_k(sD)\sin(BsD/2) = a_k(sD)\cos(\pi s/2) + b_k(sD)\sin(\pi s/2)$$
$$\cos(\pi s/2) = 1, 0, -1, 0, 1, 0, \ldots \qquad \sin(\pi s/2) = 0, 1, 0, -1, 0, 1, \ldots$$
Only every other sample of each branch is needed, with alternating sign $(-1)^s$, so each branch can be decimated by 2D:1:
[Figure: $x(n)$ is multiplied by $\cos\omega_k n$ and $\sin\omega_k n$; each product is filtered by $h(n)$ and decimated 2D:1, giving $a_k(2sD)$ and $b_k(2sD)$; each is multiplied by $(-1)^s$, and the two branches are combined into $\tfrac{1}{2}y_k(sD)$.]
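The effect of choosing $D = \pi/B$ can be illustrated numerically. The sketch below assumes the channel output has the form $\tfrac{1}{2}y_k(n) = a_k(n)\cos(Bn/2) + b_k(n)\sin(Bn/2)$ (the signs are an assumption, since they are not recoverable from the slide text) and uses two made-up slowly varying sequences for $a_k$ and $b_k$.

```python
import math

def half_yk(a, b, B, n):
    """(1/2) y_k(n) = a_k(n) cos(Bn/2) + b_k(n) sin(Bn/2)."""
    return a(n) * math.cos(B * n / 2) + b(n) * math.sin(B * n / 2)

# Hypothetical slowly varying channel signals, for illustration only.
def a(n):
    return 1.0 + 0.01 * n

def b(n):
    return 2.0 - 0.02 * n

D = 4
B = math.pi / D          # so that B * s * D / 2 = pi * s / 2

samples = [half_yk(a, b, B, s * D) for s in range(8)]

# cos(pi s / 2) = 1, 0, -1, 0, ... and sin(pi s / 2) = 0, 1, 0, -1, ...
expected = [a(0), b(D), -a(2 * D), -b(3 * D),
            a(4 * D), b(5 * D), -a(6 * D), -b(7 * D)]
```

The decimated output simply alternates between samples of `a` and `b` with a sign flip every two steps, so no multiplications by sines or cosines are needed at the low rate.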
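Filter Bank Implementation (Practical Implementation)
[Figure: sub-band coder for channel $k$.
Filter bank analysis: $x(n)$ is multiplied by $\cos\omega_k n$ and $\sin\omega_k n$; each product is filtered by $h(n)$, decimated 2D:1, and multiplied by $(-1)^s$, giving $a_k(2sD)$ and $b_k(2sD)$.
Sub-band coder modification: an ADPCM CODEC encodes and decodes the channel signals, giving $\hat{a}_k(2sD)$ and $\hat{b}_k(2sD)$.
Filter bank synthesis: each decoded signal is multiplied by $(-1)^s$, interpolated by 2D, filtered by $f(n)$, and multiplied by $\cos\omega_k n$ and $\sin\omega_k n$ respectively; the two branches are summed to $\hat{x}_k(n)$.]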