

THESIS

TOPIC

Application of Laplace Transformation

in the Vocal Tract Modeling

SUBMITTED TO:

Sir Tahir Mushtaq Qurashi Sb

SUBMITTED BY:

Muhammad Faisal Ejaz

NCBA&E

MULTAN CAMPUS


Abstract

The Laplace transform is a widely used integral transform in mathematics and

electrical engineering named after Pierre-Simon Laplace that transforms a function

of time into a function of complex frequency. The inverse Laplace transform takes a

complex frequency domain function and yields a function defined in the time

domain. The Laplace transform is related to the Fourier transform, but whereas the

Fourier transform expresses a function or signal as a superposition of sinusoids, the

Laplace transform expresses a function, more generally, as a superposition of

moments. Given a simple mathematical or functional description of an input or

output to a system, the Laplace transform provides an alternative functional

description that often simplifies the process of analyzing the behavior of the system,

or in synthesizing a new system based on a set of specifications. So, for example,

Laplace transformation from the time domain to the frequency domain transforms

differential equations into algebraic equations and convolution into multiplication.

This topic considers a generalized acoustic tube model of the vocal tract and relates
it to pole-zero type linear prediction. The generalization is done by adding the nasal
cavity to the existing vocal tract model. The transfer function is obtained from the
generalized model by conglomerating one of the three branches into the branch section
at the junction of the three branches. It is also discussed how to find the coefficients
of the pole-zero type linear prediction from voiced sounds, and how to evaluate the
reflection coefficients by connecting the pole-zero type linear prediction
algorithm to the transfer function of the generalized model.


CHAPTER #1

Introduction:

The Laplace transform is named after mathematician and astronomer Pierre-Simon

Laplace, who used a similar transform (now called z transform) in his work on

probability theory. The current widespread use of the transform came about soon

after World War II although it had been used in the 19th century by Abel, Lerch,

Heaviside, and Bromwich.

What is Laplace Transformation?

The Laplace transform is a widely used integral transform in mathematics and

electrical engineering named after Pierre-Simon Laplace that transforms a function

of time into a function of complex frequency.

What Does the Laplace Transform Do?

The main idea behind the Laplace Transformation is that we can solve an equation

(or system of equations) containing differential and integral terms by transforming

the equation in "t-space" to one in "s-space". This makes the problem much easier to

solve. The kinds of problems where the Laplace Transform is invaluable occur in

electronics. You can take a sneak preview in the Applications of Laplace section.

Definition of the Transform:

The Laplace transform converts a function of a real variable, f(t), into a function of
a complex variable, F(s). The Laplace transform is defined as

\[ F(s) = \mathcal{L}\{f(t)\} = \int_0^\infty e^{-st} f(t)\,dt. \]




The variable s is a complex variable, commonly called the Laplace variable or
complex frequency.

OR

Starting with a given function f(t) of t, we can define a new function F(s) of the
variable s.

This new function will have several properties which will turn out to be convenient
for purposes of solving linear constant coefficient ODEs and PDEs.

The definition of F(s) is as follows:

Definition:

Let f(t) be defined for t ≥ 0 and let the Laplace transform of f(t) be defined by

\[ F(s) = \mathcal{L}\{f(t)\} = \int_0^\infty e^{-st} f(t)\,dt. \]

For example:

The Laplace transform is defined for all functions of exponential type. That is, any
function f(t) which is

(a) Piecewise continuous, that is, has at most finitely many finite jump discontinuities on any
interval of finite length.

(b) Of exponential growth: |f(t)| ≤ M e^{kt} for some positive constants M and k.
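The defining integral can be checked numerically for a simple function of exponential type. The sketch below is only an illustration under stated assumptions: the test function f(t) = e^{-2t}, the truncation point `t_max`, and the helper name `laplace_numeric` are choices made here and do not come from the text. It compares a trapezoidal approximation of the integral with the known closed form 1/(s + 2).

```python
import math


def laplace_numeric(f, s, t_max=40.0, n=40000):
    """Approximate F(s) = integral_0^inf e^{-s t} f(t) dt for real s.

    The infinite upper limit is truncated at t_max and the integral is
    evaluated with the composite trapezoidal rule on n subintervals.
    """
    h = t_max / n
    total = 0.5 * (f(0.0) + math.exp(-s * t_max) * f(t_max))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return total * h


def f(t):
    # Hypothetical test function: decays, so it is of exponential type.
    return math.exp(-2.0 * t)


for s in (0.5, 1.0, 3.0):
    approx = laplace_numeric(f, s)
    exact = 1.0 / (s + 2.0)  # known transform of e^{-2t}
    print(f"s={s}: numeric={approx:.6f}, closed form={exact:.6f}")
```

The truncation is harmless here because the integrand decays like e^{-(s+2)t}; for a function that grows like e^{kt} the same check only makes sense for s > k.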


History:

The Laplace transform is named after mathematician and astronomer Pierre-Simon

Laplace, who used a similar transform (now called z transform) in his work on

probability theory. The current widespread use of the transform came about soon

after World War II although it had been used in the 19th century by Abel, Lerch,
Heaviside, and Bromwich. Leonhard Euler investigated integrals of the form

\[ z = \int X(x)\,e^{ax}\,dx \quad\text{and}\quad z = \int X(x)\,x^{A}\,dx \]

as solutions of differential equations but did not pursue the matter very far. Joseph
Louis Lagrange was an admirer of Euler and, in his work on integrating probability
density functions, investigated expressions of the form

\[ \int X(x)\,e^{-ax}\,a^{x}\,dx, \]

which some modern historians have interpreted within modern Laplace transform
theory.

These types of integrals seem first to have attracted Laplace's attention in 1782

where he was following in the spirit of Euler in using the integrals themselves as

solutions of equations. However, in 1785, Laplace took the critical step forward

when, rather than just looking for a solution in the form of an integral, he started to

apply the transforms in the sense that was later to become popular. He used an

integral of the form:

\[ \int x^{s}\,\varphi(x)\,dx, \]




akin to a Mellin transform, to transform the whole of a difference equation, in order

to look for solutions of the transformed equation. He then went on to apply the

Laplace transform in the same way and started to derive some of its properties,

beginning to appreciate its potential power.

Laplace also recognized that Joseph Fourier's method of Fourier series for solving

the diffusion equation could only apply to a limited region of space as the solutions

were periodic. In 1809, Laplace applied his transform to find solutions that diffused

indefinitely in space.

Formal definition

The Laplace transform is a frequency domain approach for continuous time signals

irrespective of whether the system is stable or unstable. Laplace transform approach

is also known as the s-domain approach. The Laplace transform of a function f(t),
defined for all real numbers t ≥ 0, is the function F(s), which is a unilateral transform
defined by:

\[ F(s) = \int_0^\infty e^{-st} f(t)\,dt. \]

The parameter s is the complex number frequency:

\[ s = \sigma + i\omega, \]

with real numbers σ and ω.


Other notations for the Laplace transform include $\mathcal{L}f$ or alternatively
$\mathcal{L}_t\{f(t)\}$ instead of F.

The meaning of the integral depends on types of functions of interest. A necessary

condition for existence of the integral is that f must be locally integrable on [0, ∞).

For locally integrable functions that decay at infinity or are of exponential type, the

integral can be understood as a (proper) Lebesgue integral. However, for many

applications it is necessary to regard it as a conditionally convergent improper

integral at ∞. Still more generally, the integral can be understood in a weak sense,

and this is dealt with below.

One can define the Laplace transform of a finite Borel measure μ by the Lebesgue
integral

\[ \mathcal{L}\{\mu\}(s) = \int_{[0,\infty)} e^{-st}\,d\mu(t). \]

An important special case is where μ is a probability measure or, even more
specifically, the Dirac delta function. In operational calculus, the Laplace transform
of a measure is often treated as though the measure came from a distribution
function f. In that case, to avoid potential confusion, one often writes

\[ \mathcal{L}\{f\}(s) = \int_{0^-}^{\infty} e^{-st} f(t)\,dt, \]

where the lower limit of 0− is shorthand notation for

\[ \lim_{\varepsilon \to 0^+} \int_{-\varepsilon}^{\infty}. \]


This limit emphasizes that any point mass located at 0 is entirely captured by the

Laplace transform. Although with the Lebesgue integral, it is not necessary to take

such a limit, it does appear more naturally in connection with the Laplace–Stieltjes

transform.

Bilateral Laplace transform (Two-sided Laplace Transform):

When one says "the Laplace transform" without qualification, the unilateral or one-

sided transform is normally intended. The Laplace transform can be alternatively

defined as the bilateral Laplace transform or two-sided Laplace transform by

extending the limits of integration to be the entire real axis. If that is done the

common unilateral transform simply becomes a special case of the bilateral

transform where the definition of the function being transformed is multiplied by the

Heaviside step function.

The bilateral Laplace transform is defined as follows:

\[ F(s) = \int_{-\infty}^{\infty} e^{-st} f(t)\,dt. \]

Properties of the Laplace Transform

The Laplace transform has the following General properties:

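1. Linearity: $\mathcal{L}\{f(t) + g(t)\} = \mathcal{L}\{f(t)\} + \mathcal{L}\{g(t)\}$

2. Homogeneity: $\mathcal{L}\{c\,f(t)\} = c\,\mathcal{L}\{f(t)\}$ for any constant c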



3. Transform of the Derivative: $\mathcal{L}\{f'(t)\} = s\,F(s) - f(0)$

4. Derivative of the Transform: $\mathcal{L}\{t\,f(t)\} = -F'(s)$

5. Some Special Transforms:

There are some transform pairs that are useful in solving problems involving the heat
equation. The derivations are given in an appendix.

The Linear Property:

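Let $f_1(t)$ and $f_2(t)$ be functions whose Laplace transforms exist for $s > \alpha_1$ and $s > \alpha_2$
respectively. Then, for $s > \max\{\alpha_1, \alpha_2\}$ and any constants $c_1$ and $c_2$,

\[ \mathcal{L}\{c_1 f_1(t) + c_2 f_2(t)\} = c_1\,\mathcal{L}\{f_1(t)\} + c_2\,\mathcal{L}\{f_2(t)\}. \]

This means that the Laplace transform is a linear operator.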
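As a short worked illustration of linearity (the function below is chosen here for illustration and is not taken from the text), take the standard pairs $\mathcal{L}\{1\} = 1/s$ for $s > 0$ and $\mathcal{L}\{e^{-t}\} = 1/(s+1)$ for $s > -1$. Both transforms exist for $s > 0$, so

\[ \mathcal{L}\{3 + 2e^{-t}\} = 3\,\mathcal{L}\{1\} + 2\,\mathcal{L}\{e^{-t}\} = \frac{3}{s} + \frac{2}{s+1}, \qquad s > 0. \]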


Example:

1)

2)

=

LAPLACE TRANSFORMATION PROPERTIES

By building up some basic properties of the Laplace transform, we can expand the
list of functions whose transform we know, thus increasing the number of IVPs we
can solve by this method.

Property 1:

\[ \mathcal{L}\{-x\,f(x)\}(s) = \frac{d}{ds}\,\mathcal{L}\{f(x)\}(s) \]

In words, multiplying by -x in our usual function space is the same as differentiation
in transform space.

Example 1:

Find a function whose Laplace transform is

Solution:

We have that

Furthermore, we know that

So by Property 1 we have


Property 2:

\[ \mathcal{L}\{e^{ax} f(x)\}(s) = \mathcal{L}\{f(x)\}(s - a) \]

In words, multiplying by $e^{ax}$ in our usual function space is the same as translation to
the right by a in transform space.

Example 3:

Find a function whose Laplace transform is

Solution:

From an example in the text, we have

To make the given function look more like this one (and avoid using partial
fractions) we can complete the square to get

=

Now we’ve completed the square.

=

To get in the Numerator.

By property 2.


By the linearity of L.

Thus the given function

Is the Laplace transform of
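The same completing-the-square technique can be illustrated on a function chosen here purely as an assumption (it is not the function of the example above). It uses the standard pairs $\mathcal{L}\{\cos 2t\} = s/(s^2+4)$ and $\mathcal{L}\{\sin 2t\} = 2/(s^2+4)$, together with the shifting rule $\mathcal{L}\{e^{at} f(t)\}(s) = F(s-a)$ with $a = -1$:

\[ \frac{s+3}{s^2+2s+5} = \frac{(s+1) + 2}{(s+1)^2 + 4} = \frac{s+1}{(s+1)^2+4} + \frac{2}{(s+1)^2+4}, \]

\[ \mathcal{L}^{-1}\left\{\frac{s+3}{s^2+2s+5}\right\} = e^{-t}\cos 2t + e^{-t}\sin 2t. \]

The numerator is split so that one term carries the shifted variable $s+1$ and the other carries the constant 2 required by the sine pair; linearity then lets each term be inverted separately.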

Some Additional Examples

In addition to the Fourier transform and eigenfunction expansions, it is sometimes

convenient to have the use of the Laplace transform for solving certain problems in

partial differential equations. We will quickly develop a few properties of the

Laplace transform and use them in solving some example problems.

Additional Properties of the Transform:

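Let f(t) be a function of exponential type and suppose that for some b > 0,

\[ g(t) = \begin{cases} 0, & t < b, \\ f(t-b), & t \ge b. \end{cases} \]

Then g(t) is just the function f(t), delayed by the amount b. Then

\[ \mathcal{L}\{g(t)\} = \int_b^\infty e^{-st} f(t-b)\,dt. \]

Let z = t - b so that

\[ \mathcal{L}\{g(t)\} = \int_0^\infty e^{-s(z+b)} f(z)\,dz = e^{-bs} F(s). \]

If we define

\[ H(t) = \begin{cases} 0, & t < 0, \\ 1, & t \ge 0, \end{cases} \]

then

\[ g(t) = H(t-b)\,f(t-b), \]

and we find

Transform of a Delay:

\[ \mathcal{L}\{H(t-b)\,f(t-b)\} = e^{-bs} F(s). \]

A related result is the following

Delay of a Transform:

\[ \mathcal{L}\{e^{bt} f(t)\} = F(s-b). \]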

These results (Transform of a Delay) and (Delay of a Transform) assert that a

delay in the function induces an exponential multiplier in the transform and,

conversely, a delay in the transform is associated with an exponential multiplier for

the function.

A final property of the Laplace transform asserts that

Inverse of a Product:

\[ \mathcal{L}^{-1}\{F(s)\,G(s)\} = (f * g)(t), \]

where

\[ (f * g)(t) = \int_0^t f(t-\tau)\,g(\tau)\,d\tau. \]

The product f * g is called the convolution product of f and g. Life would be
simpler if the inverse Laplace transform of F(s)G(s) was the pointwise
product f(t)g(t), but it isn't; it is the convolution product. The convolution product
has some of the same properties as the pointwise product, namely

\[ f * g = g * f \]

and

\[ f * (g * h) = (f * g) * h. \]

We will not give the proof of the Inverse of a Product result but will make use of it nevertheless.
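The statement that a product of transforms corresponds to a convolution in time can be checked numerically on a small case. The sketch below is an illustration only: the functions f(t) = e^{-t} and g(t) = e^{-2t} and the helper name `convolve_at` are assumptions made here. Their transforms are 1/(s+1) and 1/(s+2), and partial fractions give the inverse of the product 1/((s+1)(s+2)) as e^{-t} - e^{-2t}, which the numerical convolution should reproduce.

```python
import math


def convolve_at(f, g, t, n=2000):
    """Approximate (f*g)(t) = integral_0^t f(t - tau) g(tau) d tau
    with the composite trapezoidal rule on n subintervals."""
    if t == 0.0:
        return 0.0
    h = t / n
    total = 0.5 * (f(t) * g(0.0) + f(0.0) * g(t))
    for k in range(1, n):
        tau = k * h
        total += f(t - tau) * g(tau)
    return total * h


def f(t):
    return math.exp(-t)          # transform: 1/(s+1)


def g(t):
    return math.exp(-2.0 * t)    # transform: 1/(s+2)


def closed_form(t):
    # Inverse transform of 1/((s+1)(s+2)) obtained by partial fractions.
    return math.exp(-t) - math.exp(-2.0 * t)


for t in (0.5, 1.0, 2.0):
    print(t, convolve_at(f, g, t), closed_form(t))
```

Swapping the arguments, `convolve_at(g, f, t)`, gives the same value up to rounding, in line with the commutativity property stated above.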

Chapter #2

Applications in Electronics

(Circuit Equations)

There are two (related) approaches:

1. Derive the circuit (differential) equations in the time domain, then transform these

ODEs to the s-domain;

2. Transform the circuit to the s-domain, and then derive the circuit equations in the s-

domain (using the concept of "impedance").

We will use the first approach. We will derive the system equation(s) in the t-plane,

and then transform the equations to the s-plane. We will usually then transform back

to the t-plane.

Example 1:

Consider the circuit when the switch is closed at t=0, VC(0) =1.0 V. Solve for the

current i (t) in the circuit.


Answer:

Multiplying throughout by 10^-6 gives:

Now in this example, we are told

So

That is:


Therefore:

Collecting I terms and subtracting

from both sides:

Multiply throughout by s:

Solve for I:

Finding the inverse Laplace transform gives us the current at time t:
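The steps of this kind of example can be sketched in code for a generic series RC circuit. Everything numerical below is an assumption: the source voltage `V`, the resistance `R` and the capacitance `C` are hypothetical values chosen here, and only the initial capacitor voltage of 1.0 V is taken from the problem statement. For the loop equation R i(t) + (1/C) ∫ i dt + v_C(0) = V, the transform gives I(s) = ((V - v_C(0))/R) / (s + 1/(RC)), so i(t) = ((V - v_C(0))/R) e^{-t/(RC)}. The sketch cross-checks that closed form against a direct time-stepping of the circuit equation.

```python
import math


def rc_current_closed_form(t, V, R, C, v0):
    """Current from inverting I(s) = ((V - v0)/R) / (s + 1/(R*C))."""
    return (V - v0) / R * math.exp(-t / (R * C))


def rc_current_euler(t_end, V, R, C, v0, steps=100000):
    """Cross-check: integrate dv_C/dt = (V - v_C)/(R*C) with forward Euler
    and return the loop current i = (V - v_C)/R at time t_end."""
    dt = t_end / steps
    vc = v0
    for _ in range(steps):
        vc += dt * (V - vc) / (R * C)
    return (V - vc) / R


# Hypothetical component values; only v_C(0) = 1.0 V comes from the example.
V, R, C, v0 = 5.0, 1.0e3, 1.0e-6, 1.0
t = 2.0e-3  # two time constants, since R*C = 1 ms

print("closed form:", rc_current_closed_form(t, V, R, C, v0))
print("time stepping:", rc_current_euler(t, V, R, C, v0))
```

The two numbers agree closely, which is a useful sanity check on the algebra in the s-domain before trusting an inverse transform.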


Example 2

In the circuit shown below, the capacitor is uncharged at time t = 0. If the switch is

then closed, find the currents i1 and i2, and the charge on C at time t greater than

zero.

Answer

We could either:

Set up the equations, take Laplace of each, then solve simultaneously


Set up the equations, solve simultaneously, and then take Laplace.

For the first loop, we have:

Divide by 5 on both sides

For the second loop, we have:

Dividing by 5 on both sides

Substituting (2) into (1) gives:

Simplifying:


Multiply throughout by 5:

Next we take the Laplace Transform of both sides.

Note:

In this example,

So,

Now taking Inverse Laplace:


And using result (2) from above, we have:

For charge on the capacitor, we first need voltage across the capacitor:

So, since

, we have:

Graph of


Example 3

A rectangular pulse vR(t) is applied to the RC circuit shown. Find the response, v (t).

Graph of vR(t):

Note: v(t) = 0 V for all t < 0 s implies v(0–) = 0 V. (We'll use this in the solution.
It means we take v(0–), the voltage right up until the current is turned on, to be zero.)

Answer:

Now


To solve this, we need to work in voltages, not current.

We start with

The voltage across a capacitor is given by

It follows that

So for this example we have:

Substituting known values:

Then

Taking Laplace Transform of both sides:

Since v(0) = 0, we have:

So, taking inverse Laplace

NOTE: For the part:

We use:

So we have:


Solution Using Scientific Notebook

1. To find the Inverse Laplace:

2. To solve the original DE:

Exact solution for v (t):

To see what this means, we could write it as follows:

To get an even better idea what our expression for v(t) means, we graph it as

follows:
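A minimal sketch of how such a pulse response can be tabulated is given below. All numerical values are hypothetical (pulse height `Vp`, pulse start `a`, pulse end `b`, `R` and `C` are assumptions made here, not the values of the example). It relies on the Transform of a Delay result: a pulse of height Vp between t = a and t = b has transform Vp (e^{-as} - e^{-bs})/s, and for RC dv/dt + v = vR(t) with v(0–) = 0 the response is the difference of two delayed charging curves.

```python
import math


def charging_curve(x, tau):
    """Unit-step response 1 - e^{-x/tau} of the RC circuit, zero for x < 0."""
    return 1.0 - math.exp(-x / tau) if x >= 0.0 else 0.0


def pulse_response(t, Vp, a, b, R, C):
    """Capacitor voltage for a rectangular input pulse of height Vp on [a, b).

    Obtained by inverting V(s) = Vp (e^{-a s} - e^{-b s}) / (s (R C s + 1)):
    each exponential factor delays the step response by a or b.
    """
    tau = R * C
    return Vp * (charging_curve(t - a, tau) - charging_curve(t - b, tau))


# Hypothetical values chosen only to make the table readable.
Vp, a, b, R, C = 3.0, 1.0, 2.0, 1.0, 1.0

for t in (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 5.0):
    print(f"t={t:4.1f} s  v={pulse_response(t, Vp, a, b, R, C):.4f} V")
```

The table shows the expected shape: zero before the pulse, an exponential rise while the pulse is on, and an exponential decay after it ends.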


Chapter #3: Application in Acoustics

INTRODUCTION:

Linear prediction, which has been widely used as a tool for speech signal
recognition, speaker recognition and speech synthesis, is closely related to the
acoustic modeling of the vocal tract. In fact, it is possible to derive the all-pole type
linear prediction algorithm directly from the acoustic tube modeling of the oral
cavity and the main vocal tract [H. Wakita, "Direct estimation of the vocal tract shape
by inverse filtering of acoustic speech waveforms," IEEE Trans. AU, vol. 21, pp.
417-427, 1973]. The mismatched spectral shaping or the marginal performance of
the all-pole linear prediction may therefore be regarded, from the vocal tract model's
point of view, as stemming from the imperfection of the corresponding tube model. The
most significant imperfection of the existing vocal tract modeling is that it leaves
out the nasal cavity, thus sacrificing the effects of the nasal sounds (refer to [3] for
schematic diagrams of the vocal tract). It is therefore important to generalize the
existing vocal tract model to include the nasal cavity as well as the oral cavity. The
resulting linear prediction counterpart will then become a pole-zero type.

In this topic, we present a generalized acoustic tube model of the vocal tract which
consists of the oral cavity, the nasal cavity and the main vocal tract. The generalized
new model will thus consist of three branches. The main difference between the existing
two-branch model and the new model lies in the branch section at which the three
branches meet.

The transfer function obtained from the proposed model will then be used for the
formulation of a pole-zero type linear prediction algorithm. The prediction
coefficients in both the denominator and the numerator as well as the reflection
coefficients of the generalized model will finally be evaluated by analyzing the
voiced and the nasal sounds.

Generalized tube modeling of vocal tract:

The generalized model we consider in this topic consists of three branches
corresponding to the main vocal tract, the oral cavity and the nasal cavity, as shown
in Figure 3.27a.


(Rabiner & Schafer, Fig. 3.27a, p. 78)

When sectionalizing the branches, we obtain four different types of sections: the
glottis section, the radiation section, the mid section and the branch section. The first
three sections are essentially the same as for the existing two-branch model, and are
well established in the literature (see for example [J. D. Markel and A. H. Gray, Linear
Prediction of Speech, Springer-Verlag, New York, 1976]).

The fourth section, the branch section, represents the junction where the three
branches meet, and is thus unique to the generalized vocal tract model.

Modeling of the Branch Section:

We assume, as is indicated in the diagram, that the main vocal tract branch consists of L
sections, section M through section M+L-1; the oral cavity branch consists of M
sections, section 0 through section M-1; and the nasal cavity branch consists of N
sections, section 0 through section N-1. For convenience, we differentiate the oral
and the nasal cavity branches by superscripting "n" on the notation for the nasal
cavity branch whenever necessary.


We assume that the cross-sectional area is constant over each section, indicating the
area of the m-th section by $A_m$. So, at the branch section, three sections meet, one
from each branch and each with its own area.

We denote by $u_m(x,t)$ and $p_m(x,t)$ respectively the volume velocity and the
pressure at time t at a point x in the m-th section. Solving the momentum equation and
the continuity of mass equation [L. E. Kinsler and A. R. Frey, Fundamentals of
Acoustics, John Wiley & Sons, New York, 1982] for the m-th section, we obtain the
standard travelling-wave form

\[ u_m(x,t) = u_m^{+}(t - x/c) - u_m^{-}(t + x/c), \]

\[ p_m(x,t) = \frac{\rho c}{A_m}\left[ u_m^{+}(t - x/c) + u_m^{-}(t + x/c) \right], \]

where c denotes the speed of sound in air, ρ the air density, and the + and – signs
denote the forward and the backward travelling components respectively. Assuming
that the length ℓ of each section is the same so that the propagation time through each
section is τ = ℓ/c, we have the following boundary conditions at the junction of the
three branches:

Applying the boundary conditions to the above solutions, we obtain


We define the reflection coefficient to be

(4)

Then the equations can be rewritten as

(5a)

(5b)

Or as

(6a)

(6b)

(6c)

Therefore the model for the branch section takes the form as shown in figure.

Modeling of the other sections:

The junction of the mid sections is a simplified version of the junction for the branch

section. That is, the model of the mid section “m” can be obtained by removing the

branch for “ (or by setting

to zero) with M replaced by m on equations

(6a), (6b), and (6c). Thus we obtain

(7a)


(7b)

Where the reflection coefficient “ ” can be expressed as

(8)

Note that equations (7a) and (7b) satisfy the form of Kelly-Lochbaum structures [J. D.
Markel and A. H. Gray, Linear Prediction of Speech, Springer-Verlag, New York,
1976].
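For a junction between two ordinary sections, the reflection coefficient depends only on the two adjacent cross-sectional areas. The sketch below is a hedged illustration: it assumes the common convention k_m = (A_{m+1} - A_m)/(A_{m+1} + A_m) (sign and indexing conventions differ between authors, and the convention of this text is not recoverable here), and the area values and the function name `reflection_coefficients` are hypothetical.

```python
def reflection_coefficients(areas):
    """Reflection coefficient at each junction of a concatenated tube model.

    Assumed convention: k_m = (A[m+1] - A[m]) / (A[m+1] + A[m]).
    Equal areas give k = 0 (no reflection); every |k| is below 1.
    """
    if any(a <= 0 for a in areas):
        raise ValueError("cross-sectional areas must be positive")
    return [(nxt - cur) / (nxt + cur) for cur, nxt in zip(areas, areas[1:])]


# Hypothetical cross-sectional areas of four sections, in cm^2.
areas = [2.0, 2.0, 1.0, 3.0]
print(reflection_coefficients(areas))
```

A constriction (area decreasing along the tube) gives a negative coefficient under this convention and a widening gives a positive one; the opposite sign convention simply flips both.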

For the glottis section we put an artificial matching section M+L along with the
corresponding reflection coefficient, as is usually done in the literature (see [J. D.
Markel and A. H. Gray, Linear Prediction of Speech, Springer-Verlag, New York,
1976]). Then we obtain the relations

(9a)

(9b)

(9c)

The mathematical model for the glottis section is shown in diagram 3(b), where G

indicates the glottis. For the radiation section, we denote the radiation impedance by

(or for the nasal cavity ), That is

(10)

Where indicate the radiation point. Then we obtain

(11a)

(11b)

The mathematical model for the radiation is thus as shown in Diagram 3(c).


The overall pictorial representation of the generalized vocal tract model can be
obtained by combining the four types of sections in Diagrams 2 and 3 back into Diagram
1(b), as is done in Diagram 3.

Transfer Function Of The New Model:

We now consider the z-domain representation of the generalized model to find the
transfer function for the vocal tract. While it is possible to get the z-domain
representation directly from the overall model, we rather derive the z-domain
expression from the equations of each section so that we obtain an expression that is
more convenient to handle. We denote the z-domain variables with capital letters.

For the branch section, we obtain

(12a)

(12b)

By taking the z-transform on equations (5a) and (5b) under the assumption that the

sampling period is .

In a similar manner, we obtain

(13)

For the mid section m; and

(14a)

(14b)


(14c)

For the glottis section; and

(15a)

(15b)

For the radiation section

In order to evaluate the transfer function H (z) { } connecting the

glottis section to one of the radiation sections, we conglomerate the other radiation

section down to the branch section. For convenience, we assume that the sections

in the nasal cavity branch are conglomerated. Cascading all the mid sections in the

nasal cavity branch, we obtain the relation

(16)

We define and G (z) respectively by

(17a)

(17b)

Then G (z) can be evaluated by combining equations (15a), and (16) and (17). From

equations (12b) and (17b), we obtain

(18)


Where q=

Putting equation (18) back to (12a), we finally obtain

(19)

Where

(20a)

(20b)

(20c)

(20d)

Hence, the two branched version of the generalized vocal model takes the shape

shown in Diagram 4 in the z-domain. Notice that the existing vocal tract model can

be deduced from the model by Setting , which implies that , that is,

the nasal cavity section is not considered.

Therefore, the transfer function can be evaluated from the two-section model,

or by combining equation (13), (14), (15) and (19). The transfer function thus

obtained takes the form

(21)

Pole-Zero Type Linear Prediction:


As the transfer function for the generalized vocal tract model has both poles and
zeros, it is necessary to consider the formulation of the pole-zero type linear
prediction method. A considerable amount of work has been reported in the literature on
pole-zero type linear prediction, but its major interest is in improved spectral
shaping (see [5]-[7]). Our main concern, however, is in considering the pole-zero
type linear prediction that can be related to the generalized vocal tract model.
Recalling that the generalized model is meaningful for the nasal sounds and consonants, it is
necessary to make the pole-zero type algorithm compatible with those sounds. Since
the excitation of the sounds is assumed to consist of a pitch pulse train and white
Gaussian noise, we must remove the effects of the pitch component from the sounds
to obtain a smoothed transfer response. This can be done by applying homomorphic
signal processing to the sounds. The processed signal corresponds to the white
Gaussian noise response of the generalized vocal tract model, and its frequency
response corresponds to the excitation-to-sound transfer function H(z). Let

(22a)

(22b)

(22c)

Then, we have the relation

(23)

Where denotes the white Gaussian noise input; the autocorrelation of ;

and the cross-correlation of and . Since is white Gaussian,

, for , and therefore


(24)

Taking p terms, through , of equation (24),

We obtain

(25)

Thus is obtained by solving equation (25). Knowing the denominator

term , we pass the signal through a filter whose system function is .

Then the resulting signal corresponds to the output of the system whose transfer

function is . If we set , then can be obtained in a similar

fashion, and B (z) can be evaluated from C (z). Given any desired orders and ,

we can therefore come up with the pole-zero type linear prediction from the excited

sounds.
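The link between prediction coefficients and reflection coefficients can be illustrated on the simpler all-pole case. The sketch below is not the pole-zero procedure described above; it is the classical autocorrelation method solved by the Levinson-Durbin recursion, swapped in here as a familiar stand-in. The function names, the predictor convention x[n] ≈ Σ a[k] x[n-k], and the sign convention of the reflection coefficients are assumptions, and the autocorrelation values in the demo are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Biased (unnormalised) autocorrelation r[0..max_lag] of a finite signal."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the all-pole normal equations by the Levinson-Durbin recursion.

    Returns (a, ks, err): predictor coefficients a[0..order-1] for
    x[n] ~ sum_k a[k-1] * x[n-k], the reflection coefficients found at
    each step, and the final prediction error energy.
    """
    a = [0.0] * (order + 1)  # a[0] is unused padding so indices match the maths
    ks = []
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
        ks.append(k)
    return a[1:], ks, err


# Hypothetical autocorrelation of a first-order process: r[k] = 0.5 ** k.
r = [1.0, 0.5, 0.25]
coeffs, ks, err = levinson_durbin(r, order=2)
print("predictor:", coeffs)
print("reflection coefficients:", ks)
print("error energy:", err)
```

For this first-order example the second predictor coefficient and the second reflection coefficient come out as zero, which is how the recursion signals that a higher model order adds nothing.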

References

[1] H. Wakita, "Direct estimation of the vocal tract shape by inverse
filtering of acoustic speech waveforms," IEEE Trans. AU,
vol. 21, pp. 417-427, 1973.

[2] J. D. Markel and A. H. Gray, Linear Prediction of Speech, Springer-
Verlag, New York, 1976.

[3] J. L. Flanagan, Speech Analysis, Synthesis and Perception, Springer-
Verlag, New York, 1972.

[4] L. E. Kinsler and A. R. Frey, Fundamentals of Acoustics, John Wiley
& Sons, New York, 1982.

[5] K. H. Song and C. K. Un, "Pole-zero modeling of speech based on
high-order pole model fitting and decomposition method," IEEE
Trans. ASSP, vol. 31, pp. 1556-1565, 1983.

[6] S. Marple, Jr., Digital Spectral Analysis with Applications, Prentice
Hall, Englewood Cliffs, New Jersey, 1987.

[7] J. Cadzow, "Overdetermined rational model equation approach,"
Proc. IEEE, vol. 70, pp. 907-938, 1982.

Two Dimensional Featured One Dimensional Digital Waveguide

Model for the Vocal Tract

Introduction:

A vocal tract model based on a digital waveguide is presented in which the vocal

tract has been decomposed into a number of convergent and divergent ducts. The

divergent duct is modeled by a 2D-featured 1D digital waveguide and the convergent

duct by a one dimensional waveguide. The modeling of the divergent duct is based

on splitting the volume velocity into axial and radial components. The combination

of separate modeling of the divergent and convergent ducts forms the foundation of

the current approach. The advantage of this approach is the ability to get a transfer

function in zero-pole form that eliminates the need to perform numerical calculations

on a discrete 2D mesh. In this way the present model named as a 2D-featured 1D

digital waveguide model has been found to be more efficient than the standard 2D


waveguide model and in very good comparison with it in the formant frequency

patterns of the vowels /a/, /e/, /i/, /o/ and /u/. The model has two control parameters,

the wall and glottal reflection coefficients that can be effectively employed for

bandwidth tuning. The model also shows its ability to generate smooth dynamic

changes in the vocal tract during the transition of vowels.

The human speech production system consists of three main components: the lungs,
the vocal folds and the vocal tract. The coordination of these three components results in
voiced sound, unvoiced sound or a combination of the two. For voiced sound

production like that of vowel, the air is pushed out from the lungs into the larynx. In

the larynx, there are two identical vocal folds which are initially closed. The closure

of the vocal folds causes a sub-glottal pressure. When this pressure rises above the

resistance of the vocal folds, the vocal folds open themselves and air is passed

through it. As the pressure decreases with the release of airflow, the vocal folds then

close themselves quickly. The quasi-periodic opening and closing of the vocal folds

continues due to constant supply of the air pressure from the lungs. Thus the

vibration of the vocal folds forms a train of periodic pulses that acts as an excitation

signal for the vocal tract. A non-uniform acoustic tube which extends from the

glottis to the lips is called a vocal tract. The position of the vocal articulators like

larynx, velum, jaw, tongue, and lips, forms a particular shape of the vocal tract. The

shape of the vocal tract modifies spectral characteristics of the quasi-periodic air

flow passing through it, which leads to the generation of voiced speech. In this way

different shapes of the vocal tract generate different voiced speeches. Several

approaches have been employed to model the voiced speech system on the basis of

physical models such as cylindrical segments (Kelly and Lochbaum, 1962; Mullen et

al., 2003) and conical segments (Välimäki and Karjalainen, 1994; Strube,

2003; Makarov, 2009) for the vocal tract modeling. In the cylindrical approach, each

tube segment of the vocal tract is modeled by the forward- and backward-traveling

wave components of the solution of the wave equation (Morse, 1981; Smith, 1998)

known as the one-dimensional waveguide model. It was first used in the Kelly–Lochbaum

model of the human vocal tract for speech synthesis (Kelly and Lochbaum, 1962).

However, the digital waveguide modeling (DWM), which is an extension of a one-

dimensional waveguide, is recently being used in the modeling of the vocal tract

(Van Duyne and Smith, 1993a, b; Cooper et al., 2006; Mullen et al., 2006, 2007;

Speed et al., 2013). Digital waveguides are very popular for realistic and high quality

sound generation in real time, and are successfully employed in physical modeling

of sound synthesis.

The greatest advantage of a 1-D digital waveguide model is that it has complete

solution to the wave equation which is also computationally efficient for sound

synthesis applications. Moving to higher dimensions leads to a number of limitations

imposed on DWM models for an optimal solution to all sound synthesis systems.

The most important one is the dispersion error, where the velocity of a propagating

wave depends upon both its frequency and direction of traveling, leading to wave

propagation errors and mistuning of the expected resonant modes. The dispersion

error is highly dependent upon mesh topology and has been investigated in (Van

Duyne and Smith, 1996; Fontana and Rocchesso, 2001; Campos and Howard, 2005).

Another limitation is the restriction on sampling frequency. High sampling rates

require high mesh density which corresponds to high computational cost.

A 1D waveguide model is computationally efficient while the standard 2D and 3D

waveguide models have better accuracy but heavy computational cost (Murphy and

Howard, 2000; Campos and Howard, 2000; Beeson and Murphy, 2004; Murphy et


al., 2007). In the present work we propose an efficient two-dimensional waveguide

model of the vocal tract that has comparable formant frequencies with the standard

2D waveguide but has efficiency comparable to that of a 1D waveguide model. In

the present model we approximate only the divergent part of the vocal tract by

divergent ducts and consider two-dimensional volume velocity in it while in the

convergent duct that represents convergent part of the vocal tract, we employ

conventional one-dimensional approximation of the volume velocity. In this way the

accuracy of the current model can never be better than the standard 2D waveguide

model which considers two-dimensional volume velocity in the whole of the vocal

tract. Therefore, we make it as a reference model for the comparison.

The formant frequencies obtained from the numerical simulation, using area functions for specific vowels (Juszkiewicz, 2014), compare well with those of the standard 2D waveguide model. The computational cost of the standard 2D waveguide is very high, while the current approach is much more efficient. The present section is followed by five more sections. Section 2 describes the proposed vocal tract model and develops its mathematical formulation. Section 3 describes how to find the transfer function of the vocal tract. Section 4 is reserved for the numerical simulation of the model. Section 5 is dedicated to the results and discussion, and Section 6 to the conclusions.

Vocal tract model:

We derive a new model of the vocal tract with a new transfer function, relating it to pole-zero type linear prediction, developed on the basis of the procedure given in (Kang and Lee, 1988). The current approach is to propose an efficient two-dimensional waveguide whose formant frequencies are comparable with those of the standard 2D waveguide. We consider the vocal tract as consisting of concatenated cylindrical acoustic tubes of the same length but different cross-sectional areas. We define a convergent duct as the concatenation of two cylinders in which a cylinder with a larger radius is followed by one with a smaller radius. The connection of two cylinders in which a narrow cylinder is followed by a wider cylinder in the direction of flow is called a divergent duct. A serial combination of these two types of ducts constitutes the vocal tract.

Fig. 1. Vocal tract decomposition into cylindrical tubes of different diameters.

Fig. 2. Model of the divergent duct with the imaginary tube and the splitting of the volume velocity.

In Fig. 1, five pairs of adjacent cylinders are labeled as divergent ducts and four pairs as convergent ducts. In the divergent duct, we

assume that the volume velocity splits into its axial and radial components, as shown in Fig. 2. Modeling such ducts with axial and radial components may improve the formant patterns of a 1D digital waveguide so that they become comparable with those of a 2D digital waveguide. The convergent duct may be represented by the usual 1D waveguide model, as there is no 2D splitting of the volume velocity at the entrance from a wider cylinder to a narrower one. The vocal tract is divided into cylindrical segments of the same length so that the propagation time of the sound wave through each segment in the axial direction is the same, say τ. However, each of the uniform cylindrical segments may have a different cross-sectional area or diameter, so the time taken for the sound wave to propagate through a segment in the radial direction may not be an integer multiple of τ. In such a case, the delay in the radial direction is a fractional delay (Laakso et al., 1996; Välimäki, 1995; Samadi et al., 2004). In the current model, reflection of the wave in the divergent duct occurs at two different places: where the impedance changes, and at the wall of the cylindrical tube. This leads to two different types of delay in the modeling of the divergent duct. The delay in the transverse direction is formulated from the absolute difference of the radii of the two concatenated cylindrical tubes; it is in general a fractional delay and has been approximated by the Lagrange interpolator (Laakso et al., 1996; Välimäki, 1995; Samadi et al., 2004).

For the formulation of the model, we consider

a divergent duct consisting of two cylindrical tubes of different cross-sectional areas, as shown in Fig. 2. When the volume velocity enters from the narrower tube into the wider tube, it splits into an axial component along the vocal tract and a radial component in the transverse direction. We use a local coordinate system in the divergent duct, so the origin for the splitting of the volume velocity into the axial and radial directions lies at the junction of the two cylinders, as shown in Fig. 2. When the volume flow is along the direction of the vocal tract, the acoustic impedance depends on the cross-sectional area of the cylinder, that is, the area to which the volume flow is normal. If we consider the volume flow in the radial direction, the flow is normal to the surface area, which leads to the assumption that the impedance of the volume flow in the radial direction may depend on the surface area of the cylinder. For this purpose, we assume an imaginary cylinder of appropriate cross-sectional area intruding into the cylinder, shown by the dotted line in Fig. 2, which carries the axial component of the volume velocity. The transverse component may be regarded as the volume velocity coming out of the surface of this imaginary cylinder in the transverse direction, so that it may be considered proportional to its surface area. In this way we can control the axial and transverse volume velocity components by changing the radius of the imaginary cylinder. The radius of the imaginary cylinder is necessarily a fraction of the radius of the cylinder, because otherwise there can be no transverse component in the cylinder, and may be expressed as k times that radius, where k lies between 0 and 1. The surface area of this imaginary cylinder, whose length is equal to that of the cylinder, may be written as

(1)
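A plausible form of Eq. (1) can be sketched under the assumption that the imaginary cylinder has radius kr and the common tube length l, so that its surface area is the lateral area of a cylinder. The symbols S (surface area) and r (radius of the cylinder against which the imaginary cylinder is measured) are introduced here only for illustration; k and l are the parameter and tube length used in the text.

```latex
% Assumed form: lateral surface area of a cylinder of radius k r and length l
S = 2\pi \,(k\,r)\, l , \qquad 0 < k \le 1
```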

Consider the volume velocity and the acoustic pressure at position x and time t within the cylindrical tube. By solving the well-known momentum equation and mass continuity equation (Markel and Gray, 1976; Rabiner and Schafer, 1978), we obtain

(2)

(3)
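For reference, the standard traveling-wave solution for a lossless uniform tube, as found in the cited textbooks, has the form sketched below. The symbols u (volume velocity), p (acoustic pressure) and A (cross-sectional area of the tube) are assumptions made for this sketch, and the tube index is omitted.

```latex
\begin{align}
u(x,t) &= u^{+}\!\left(t - \frac{x}{c}\right) - u^{-}\!\left(t + \frac{x}{c}\right) \\
p(x,t) &= \frac{\rho c}{A}\left[\, u^{+}\!\left(t - \frac{x}{c}\right) + u^{-}\!\left(t + \frac{x}{c}\right) \right]
\end{align}
```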

where c is the velocity of sound in air, ρ is the density of air, and the + and − signs denote the forward and backward traveling components, respectively. Let l be the length of each cylindrical tube, as all tubes have the same length. Under the above assumptions, the acoustic pressure at the junction of the two cylinders forming a divergent duct is identical in either direction and the total volume velocity is preserved. We then have the following boundary conditions at the junction of the two cylinders:

(4)

(5)

Here the boundary conditions include the pressure in the transverse direction, and the other quantities are as defined earlier. We have used a local coordinate system in which x = 0 is the entrance location of the cylinder and x = l is its exit location. Substituting (2) and (3) into (4) and (5), we get

(6)

(7)

where τ = l/c is the time required to traverse the cylindrical tube.

From Eq. (6), we have

(8)

(9)

Using Equations (8) and (9) in Eq. (7) we have

If we let

, then the above equation becomes


(10)

which can be rearranged to give

(11)

The coefficient appearing in these equations is known as the reflection coefficient.
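As a point of comparison, the conventional 1D (Kelly–Lochbaum) junction reflection coefficient depends only on the areas of the two adjacent tubes. The sketch below uses that conventional definition, not the modified coefficient of the present model, and the function name `reflection_coefficients` is hypothetical. The example areas are the vowel /a/ values from the table of cross-sectional areas.

```python
def reflection_coefficients(areas):
    """Conventional 1D junction reflection coefficients.

    For adjacent tube areas A_k and A_{k+1}, returns
    r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k).
    This is the textbook lossless-tube definition, assumed here for
    illustration; it is not the modified coefficient of the present model.
    """
    if len(areas) < 2:
        raise ValueError("need at least two tube sections")
    if any(a <= 0 for a in areas):
        raise ValueError("areas must be positive")
    return [(a_next - a) / (a_next + a) for a, a_next in zip(areas, areas[1:])]


# Vowel /a/ areas in cm^2, sections 1 (glottis) to 10 (lips).
areas_a = [2.6, 1.5855, 1.0995, 1.8246, 3.8876,
           1.9417, 1.2451, 0.7165, 0.6466, 0.6331]
r_a = reflection_coefficients(areas_a)
for k, r in enumerate(r_a, start=1):
    print(f"junction {k}->{k + 1}: r = {r:+.4f}")
```

With this sign convention a positive coefficient marks a widening junction and a negative one a narrowing junction, and every coefficient lies strictly between −1 and 1 for positive areas.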

Using Equations (10) and (11) in Eq. (7), the following matrix form is obtained:

(12)

Now we consider the boundary conditions at the lips and the glottis. For these cases, we use the standard approach of the 1D digital waveguide model (Kelly and Lochbaum, 1962).

A mathematical relation for the lip radiation is given as (Markel and Gray, 1976; Rabiner and Schafer, 1978)

(13)

where the coefficient in Eq. (13) is the reflection coefficient at the lips. Using Eq. (13), the output volume velocity at the lips can be written as (Markel and Gray, 1976; Rabiner and Schafer, 1978)

(14)

Similarly, for the glottis section, we have the following mathematical relation (Markel and Gray, 1976; Rabiner and Schafer, 1978)

(15)

where the coefficient in Eq. (15) is the reflection coefficient at the glottis. The time-domain representation of the present vocal tract model consists of Equations (12) to (15). However, this representation is not computationally convenient for the study of vocal tract formant frequencies. In the next section, we derive another representation of this model in the z-domain using the z-transformation.

Transfer function of the Model:

In this section we derive the transfer function of the above vocal tract model in pole-zero form by transforming the model from the time domain to the z-domain using the z-transformation. This representation provides a convenient means of studying the model characteristics.

First, we assume a vocal tract model with cylindrical tubes of equal length, and the delay in each tube is taken as a half-sample delay, i.e., the sampling period is 2τ, where τ is the time required to traverse each tube. As per convention, we work with the z-transformed representations of the volume velocity components. Applying the z-transformation to Eq. (12) leads to the following matrix form:

(16)

By applying , Eq. (6) takes the form

Where

(17)


So far we have derived a simple expression for the current model in the form of three equations, represented by Equations (16) and (17). These three equations are not suitable for the derivation of the transfer function and need to be reduced to two equations. For this, we define a new quantity as

(18)

From Equations (17) and (18), we obtain

(19)

Using Eq. (19) in Eq. (16), we have

(20)

Where

,

,

,

Where is defined as earlier.

Eq. (20) now gives the desired system of two equations for the derivation of the transfer function. For the boundary conditions at the lips, we add a fictitious cylindrical tube of infinite length so that there is no negative-going wave component. We then have (Markel and Gray, 1976; Rabiner and Schafer, 1978)

(21)

So Eq. (20) can be written for the lips as

(22)

Similarly, by taking the z-transformation of Eq. (15), the boundary conditions at the glottis can be written as (Markel and Gray, 1976; Rabiner and Schafer, 1978)

(23)

The transfer function is evaluated by the relation

By combining Equations (20), (22) and (23), the transfer function is thus obtained as

(24)

Eq. (24) gives the transfer function of the current model in z-domain.
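To illustrate how poles are obtained from a z-domain transfer function of this kind, the sketch below builds the denominator of the conventional 1D lossless-tube model (glottal reflection coefficient taken as 1) with the textbook step-up recursion and reads off the pole angles. It is a stand-in for the reference 1D model, not an implementation of Eq. (24). The function names are hypothetical, and the last entry of `refl` is assumed to be the lip reflection coefficient.

```python
import numpy as np


def tube_denominator(refl):
    """Denominator D(z) of the conventional 1D lossless-tube model.

    Uses the step-up recursion D_k(z) = D_{k-1}(z) + r_k z^{-k} D_{k-1}(1/z),
    starting from D_0(z) = 1. Returns the coefficients of D(z) in increasing
    powers of z^{-1}. Assumes a glottal reflection coefficient of 1.
    """
    d = np.array([1.0])
    for r in refl:
        # z^{-k} D_{k-1}(1/z) is the reversed coefficient list shifted by one.
        d = np.append(d, 0.0) + r * np.append(0.0, d[::-1])
    return d


def pole_angles(refl):
    """Angles (radians per sample) of the poles in the upper half-plane."""
    d = tube_denominator(refl)
    poles = np.roots(d)  # roots of z^N D(z), coefficients highest power first
    angles = np.angle(poles)
    return np.sort(angles[angles > 1e-9])


# Uniform tube of N = 10 sections: no interior reflections, lip coefficient 0.9.
uniform = [0.0] * 9 + [0.9]
print(pole_angles(uniform))
```

For the uniform tube the pole angles fall at odd multiples of π/N, the expected resonance pattern of a tube closed at one end. An angle θ converts to a frequency as θ·fs/(2π) once a sampling frequency fs consistent with the section delay has been fixed.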


In our model of the vocal tract, the divergent duct includes a delay in the transverse direction. To evaluate the transfer function, we develop an expression for this delay in terms of the z-variable. For this, we consider the radii of the first and second cylindrical tubes. The transverse delay time in the cylindrical tube can then be written as:

Where (25)

Let

Where l is the length of the ith cylindrical tube. (26)

Where is a real number? (27)

When

(28)

Eq. (27) represents a transverse delay in terms of the delay
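The Lagrange interpolator mentioned earlier for approximating a fractional delay can be sketched as a short FIR design. The function name `lagrange_fractional_delay` and the example delay of 1.3 samples are assumptions for illustration only; the delay is expressed in samples and the filter order is a free choice.

```python
def lagrange_fractional_delay(delay, order):
    """FIR coefficients of a Lagrange fractional-delay filter.

    The coefficients h[n], n = 0..order, approximate a delay of `delay`
    samples via h[n] = prod_{k != n} (delay - k) / (n - k).
    The approximation is most accurate when `delay` is near order / 2.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    coeffs = []
    for n in range(order + 1):
        h = 1.0
        for k in range(order + 1):
            if k != n:
                h *= (delay - k) / (n - k)
        coeffs.append(h)
    return coeffs


# Hypothetical transverse delay of 1.3 samples, third-order interpolator.
h = lagrange_fractional_delay(1.3, 3)
print(h)
```

The coefficients always sum to one, so the filter has unit gain at zero frequency, and an integer delay reduces the filter to a pure shift.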

We introduce a reflection coefficient at the wall. Then, on the wall of the cylindrical tube (as shown in Fig. 2),

(29)

Eq. (29) can be rewritten as

(30)

Taking z-transformation of Eq. (30), we have


(31)

which, in view of Eq. (18), gives the following representation in terms of the z-variable:

(32)

This completes all the requirements for the evaluation of the transfer function. The block diagram is shown in Fig. 3.

Fig. 3. Block diagram of the divergent duct.

Numerical Simulation:

Here we give the numerical solution procedure adopted for solving the waveguide model.

A 2D waveguide model is found to give more accurate formant synthesis, producing vowels that closely match real-world targets. The advantage of the current approach is that its formant frequencies are better than those of a 1D digital waveguide and comparable with those of a 2D waveguide, while its computational efficiency remains comparable to that of a 1D digital waveguide.

In this work, the length of the vocal tract has been chosen as 17.5 cm. The vocal tract model has been divided into 10 equal cylindrical segments starting from the glottal end in order to obtain a sampling frequency of approximately 32 kHz for the speech. In all simulations, boundary reflection, at lips and is chosen as 0.90 respectively. Using MATLAB 7.0, we produce the graphs and table for the vowels /a/, /e/, /i/, /o/ and /u/.

Table 1. Cross-sectional areas of the five vowels, in cm². The glottal end of each vowel's area function is at section 1 and the lip end at section 10.

Section   /a/      /e/      /i/       /o/      /u/
1         2.6      2.6      3.2       2.6      2.6
2         1.5855   2.001    2.5871    1.5616   2.6209
3         1.0995   1.4108   1.8044    0.9763   1.0589
4         1.8246   2.1091   2.991     3.4816   8.9055
5         3.8876   7.0104   8.4481    5.257    10.4718
6         1.9417   6.3825   8.5665    4.0256   9.8715
7         1.2451   6.4014   10.8156   3.0124   8.026
8         0.7165   7.2167   10.4176   1.7726   5.6223
9         0.6466   7.8945   10.5203   1.4555   3.0957
10        0.6331   9.4523   10.4964   1.0436   1.3908
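The duct definitions given earlier (a narrower cylinder followed by a wider one is divergent, the reverse is convergent) can be applied directly to the tabulated areas. The sketch below labels each junction for vowel /a/. The function name `classify_junctions` and the "uniform" label for equal areas are assumptions, since the text does not treat that case.

```python
def classify_junctions(areas):
    """Label each junction between adjacent tube sections.

    'divergent'  : the next section is wider (area increases along the flow)
    'convergent' : the next section is narrower
    'uniform'    : equal areas (label assumed here; not defined in the text)
    """
    labels = []
    for a, a_next in zip(areas, areas[1:]):
        if a_next > a:
            labels.append("divergent")
        elif a_next < a:
            labels.append("convergent")
        else:
            labels.append("uniform")
    return labels


# Vowel /a/ areas in cm^2, sections 1 (glottis) to 10 (lips).
areas_a = [2.6, 1.5855, 1.0995, 1.8246, 3.8876,
           1.9417, 1.2451, 0.7165, 0.6466, 0.6331]
for k, label in enumerate(classify_junctions(areas_a), start=1):
    print(f"sections {k}->{k + 1}: {label}")
```

For /a/ this gives two divergent junctions (sections 3 to 4 and 4 to 5) and seven convergent ones, so under the proposed scheme only those two junctions would need the two-dimensional treatment.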


Vocal Tract Response:

In this section, we present the accuracy and efficiency of our current waveguide model in the simulation of the vowels /a/, /e/, /i/, /o/ and /u/. Its formant frequencies and efficiency have been compared with those of the standard 2D waveguide model.

The parameter k appearing in Eq. (1) determines the size of the imaginary cylinder relative to the cylinder that carries an axial velocity component within the cylinder. Its value varies with the vocal tract length and the number of segments constituting the vocal tract. We have tested our model for different values of k in the range 0 to 1. As we increase the value of k from 0, the formant frequencies of the proposed model start to match those of the standard model. The best match between the formant frequencies of the present model and those of the standard model is achieved at k = 1. Therefore, the radius of the imaginary cylinder has been taken to be the same as that of the cylinder in all the present simulations, which corresponds to the choice of k = 1.

Table 1 gives the cross-sectional areas of the 10 tubes that constitute the vocal tract for each vowel. These cross-sectional areas have been obtained by spline interpolation of the cross-sectional areas given in this topic.

References:

1. Beeson, M.J., Murphy, D.T., 2004. Room Weaver: a digital waveguide mesh based room acoustics research tool. In: Proceedings of the Seventh International Conference on Digital Audio Effects (DAFX-04), Naples, Italy, pp. 268–273.

2. Campos, G.R., Howard, D.M., 2005. On the computational efficiency of different waveguide mesh topologies for room acoustic simulation. IEEE Trans. Speech Audio Process. 13, 1063–1072.

3. Cooper, C., Murphy, D., Howard, D., Tyrrell, A., 2006. Singing synthesis with an evolved physical model. IEEE Trans. Audio Speech Lang. Process. 14, 1454–1461.

4. Kang, M.G., Lee, B.G., 1988. A generalized vocal tract model for pole-zero type linear prediction. In: Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 1, pp. 687–690.

5. Kelly, J.L., Lochbaum, C.C., 1962. Speech synthesis. In: Proceedings of Fourth International Congress on Acoustics, Copenhagen, Denmark, pp. 1–4.

6. Morse, P.M., 1981. Vibration and Sound. American Institute of Physics, for the Acoustical Society of America, pp. 1–468 (1948 1st edition 1936, last author’s edition 1948, ASA edition 1981).
