presentation 5 : sound propagation in the human...
Sound propagation
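The vocal tract is modeled as a tube of non-uniform, time-varying cross-sectional area
Speech corresponds to variations of air pressure and flow in such a system
A complete description needs a full specification of the area function A(x,t), the cross-sectional area at position x and time t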
Uniform tube model
First, study the uniform lossless tube model, in which A(x,t) is constant in both x and t
Then add
simple models of losses due to soft walls, effects of friction
and thermal conduction
model for radiation at lips
source model at glottis
model for the nasal tract; the system then has two branches, as shown in the figure
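As a rough illustration of the uniform lossless tube, the sketch below computes its resonance frequencies. It assumes a tube closed at the glottis and open at the lips, for which the resonances fall at odd multiples of c/(4L). The tube length (17.5 cm), the speed of sound (350 m/s) and the function name are illustrative assumptions, not values taken from the slides.

```python
def uniform_tube_resonances(length_m, n_resonances=3, c=350.0):
    """Resonances (Hz) of a uniform lossless tube, closed at one end and open at the other.

    Assumed formula: F_n = (2n - 1) * c / (4 * L), n = 1, 2, ...
    c is the speed of sound in m/s (350 m/s is an assumed value).
    """
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, n_resonances + 1)]


# Assumed vocal tract length of 17.5 cm
formants = uniform_tube_resonances(0.175)
print([round(f) for f in formants])
```

With these assumed values the first three resonances come out near 500, 1500 and 2500 Hz, evenly spaced as expected for a uniform tube.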
Lossless tube model
A widely used model based upon the assumption that the vocal
tract can be represented as a concatenation of variable length,
constant cross sectional area lossless acoustic tubes
The real vocal tract deviates from this model because of losses
Friction at the walls
Heat conduction through the walls
Vibration at the walls of the tube
Each loss can be studied separately to obtain a more detailed, but more complicated, model
Lossless tube model
A large number of tubes with short length can reasonably
approximate the vocal tract
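One quantity that usually comes out of a concatenated-tube approximation is the reflection coefficient at each junction between adjacent sections. The sketch below assumes the common convention r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k); the sign convention varies between textbooks, and the area values are hypothetical, not taken from the slides.

```python
def reflection_coefficients(areas):
    """Reflection coefficient at each junction of a concatenated-tube model.

    areas: cross-sectional areas of the sections, ordered from glottis to lips.
    Assumed convention: r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k).
    """
    if any(a <= 0 for a in areas):
        raise ValueError("cross-sectional areas must be positive")
    return [(nxt - cur) / (nxt + cur) for cur, nxt in zip(areas, areas[1:])]


# Hypothetical four-section area function in cm^2
areas_cm2 = [1.0, 3.0, 3.0, 1.0]
r = reflection_coefficients(areas_cm2)
print(r)
```

Equal adjacent areas give a zero coefficient (no reflection), and every coefficient lies strictly between -1 and 1 as long as the areas are positive.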
Digital models for speech production
Vocal Tract Model, 𝑉(𝑧)
It can be approximated with an all-pole model for the majority of sounds
Nasals and fricatives require both poles and zeros
We may include zeros in the transfer function, or
We may introduce more poles, since the effect of a zero can be approximated by including more poles
Roots of the denominator of V(z) (the poles):
For stability, all poles must lie inside the unit circle
$$V(z) = \frac{G}{1 - \sum_{k=1}^{N} \alpha_k z^{-k}}$$
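The all-pole form corresponds to the difference equation s[n] = G u[n] + sum over k of alpha_k s[n-k]. The sketch below applies that recursion and checks the stability condition for a single resonance (N = 2). The helper names, the 500 Hz centre frequency, the 60 Hz bandwidth and the 8 kHz sampling rate are all illustrative assumptions.

```python
import cmath
import math


def resonator_coeffs(freq_hz, bandwidth_hz, fs_hz):
    """alpha_1, alpha_2 for a conjugate pole pair at radius r and angle +/- theta."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)   # pole radius from bandwidth
    theta = 2.0 * math.pi * freq_hz / fs_hz         # pole angle from centre frequency
    return [2.0 * r * math.cos(theta), -r * r]


def all_pole_filter(u, alphas, gain=1.0):
    """Apply s[n] = gain * u[n] + sum_k alphas[k-1] * s[n-k], with zero initial state."""
    s = []
    for n, u_n in enumerate(u):
        acc = gain * u_n
        for k, a_k in enumerate(alphas, start=1):
            if n - k >= 0:
                acc += a_k * s[n - k]
        s.append(acc)
    return s


def second_order_poles(alphas):
    """Roots of z^2 - alpha_1 * z - alpha_2 = 0, i.e. the poles when N = 2."""
    a1, a2 = alphas
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return [(a1 + disc) / 2.0, (a1 - disc) / 2.0]


alphas = resonator_coeffs(500.0, 60.0, 8000.0)
poles = second_order_poles(alphas)
stable = all(abs(p) < 1.0 for p in poles)
h = all_pole_filter([1.0] + [0.0] * 63, alphas)     # first 64 samples of the impulse response
print(stable, [round(abs(p), 4) for p in poles])
```

A full vocal tract model would use a larger N, typically one pole pair per formant; the quadratic root formula used here only covers N = 2.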
Digital models for speech production
Radiation Model, 𝑅(𝑧)
Can be approximated with a zero slightly inside the unit circle, 𝛼 < 1
$$R(z) = 1 - \alpha z^{-1}$$
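In the time domain this radiation model is the first-order difference y[n] = x[n] - alpha x[n-1]. A minimal sketch follows; the function name and the default alpha = 0.98 are assumptions chosen only for illustration.

```python
def radiation_filter(x, alpha=0.98):
    """Apply y[n] = x[n] - alpha * x[n-1], taking x[-1] as 0."""
    y = []
    prev = 0.0
    for sample in x:
        y.append(sample - alpha * prev)
        prev = sample
    return y


# A constant input is strongly attenuated after the first sample,
# which shows the high-pass character of the filter.
y = radiation_filter([1.0, 1.0, 1.0], alpha=0.5)
print(y)
```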
Digital models for speech production
Excitation Model;
For unvoiced sounds, the excitation can be modeled as white noise with a gain
parameter to control the intensity
For voiced sounds, recall the glottal airflow waveform
Glottal Pulse Model, 𝐺(𝑧)
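The two excitation types can be sketched as follows: a periodic impulse train for voiced sounds, which would then be shaped by the glottal pulse model G(z), and gain-scaled white noise for unvoiced sounds. The slides do not give a formula for G(z), so the sketch stops at the impulse train. Function names, the pitch period of 80 samples and the gain of 0.1 are illustrative assumptions.

```python
import random


def impulse_train(n_samples, pitch_period):
    """Unit impulses every pitch_period samples (voiced excitation before G(z))."""
    if pitch_period <= 0:
        raise ValueError("pitch period must be a positive number of samples")
    return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]


def white_noise(n_samples, gain=1.0, seed=0):
    """Gaussian white noise scaled by a gain that controls intensity (unvoiced excitation)."""
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    return [gain * rng.gauss(0.0, 1.0) for _ in range(n_samples)]


voiced = impulse_train(400, 80)        # e.g. 100 Hz pitch at an assumed 8 kHz sampling rate
unvoiced = white_noise(400, gain=0.1)
print(sum(voiced), len(unvoiced))
```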
Digital models for speech production
Frequency response curves for various components of the
speech model and the resulting waveform
Digital models for speech production
Another model for speech production, adapted from 'Discrete-Time
Speech Signal Processing: Principles and Practice', Thomas F.
Quatieri
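[Figure: block diagram with three excitation inputs (impulse train, random noise, impulsive input), gains Av, An and Ai, the glottal pulse model G(z), a linear/non-linear combiner, the vocal tract model V(z) and the radiation model R(z), with speech as the output]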
[Figure: waveform plot; horizontal axis from 0 to 8 x 10^4, vertical axis from -0.4 to 0.5]