

Mathematics and Information Theory for Engineers

Lecture

Sándor Baran

Academic year 2018/19, 2. semester


Literature

Brian Davies: Integral Transforms and Their Applications. Springer, 2002.

John J. D'Azzo, Constantine H. Houpis, Stuart N. Sheldon: Linear Control System Analysis and Design with Matlab. Marcel Dekker, New York, 2003.

Thomas M. Cover and Joy A. Thomas: Elements of Information Theory. Wiley, 2006.

Roberto Togneri and Christopher J. S. de Silva: Fundamentals of Information Theory and Coding Design. Chapman & Hall/CRC, 2006.

Results, information: arato.inf.unideb.hu/baran.sandor/miscen.html


Contents

1 Matrix calculus
2 Differential calculus of multivariable functions
3 Numerical solution of optimization problems
4 Integral calculus of multivariable functions
5 Laplace transform and its applications
6 Fourier transform and its properties
7 Digital signals, z-transform
8 Fundamentals of source coding, uniquely decodable and prefix codes
9 Entropy and its properties. Block codes
10 Universal source coding. Lempel-Ziv algorithms
11 Quantization, sampling
12 Transform coding. DPCM, Jayant quantizer, delta modulation, predictors
13 Audio and speech compression
14 Image and video compression


Matrices

Definition. A rectangular array

A = [a11 a12 ··· a1n; a21 a22 ··· a2n; … ; am1 am2 ··· amn]

of real or complex numbers arranged in m rows and n columns is called an m × n real or complex matrix, respectively. If n = m, then A is an n × n square matrix.

Basic matrix operations:
addition and scalar (element of R or C) multiplication;
matrix multiplication;
transposition.


Special matrices

Definition. A matrix A is symmetric if A⊤ = A.

Definition. An m × n matrix A = (aij) is called diagonal if all entries outside the main diagonal are zero, that is aij = 0 if i ≠ j. Notation for n × n square matrices: A = diag(a11, a22, . . . , ann).
The identity matrix of size n is the n × n diagonal square matrix in which all entries of the main diagonal are equal to 1. Notation: In.

Definition. An m × n real matrix A is orthogonal if A⊤A = In.
Remark. The orthogonality of a matrix A means that its column vectors a1, a2, . . . , an are orthonormal, that is

ai⊤aj = 1 if i = j and ai⊤aj = 0 if i ≠ j,   i, j = 1, 2, . . . , n.

Definition. The powers of an n × n square matrix A are defined as: A^0 := In, A^1 := A, A^n := A^(n−1)A, n ∈ N.


Eigenvalues

Definitions. Let A be an n × n square matrix. A scalar λ and a vector x ≠ 0 satisfying

Ax = λx

are called an eigenvalue and a corresponding eigenvector of A, respectively.

Equivalent formulation: Ax = λx ⇐⇒ (A − λIn)x = 0.

Remark. The system of homogeneous linear equations (A − λIn)x = 0 has a non-trivial solution (x ≠ 0) if and only if

det(A − λIn) = 0.

Definition. The polynomial of degree n defined by

p(λ) := det(A − λIn)

is called the characteristic polynomial of A.


Eigenvectors

An n × n square matrix possesses n (not necessarily distinct) eigenvalues.

The eigenvector x = (x1, x2, . . . , xn)⊤ corresponding to the eigenvalue λ of a matrix A = (aij) is the solution of the homogeneous system of linear equations

[a11 − λ  a12  ···  a1n; a21  a22 − λ  ···  a2n; … ; an1  an2  ···  ann − λ] [x1; x2; … ; xn] = [0; 0; … ; 0].

Remark. If x solves the above system, so does any non-zero multiple cx, that is, the eigenvectors are not unique.

Unit eigenvector: ∥x∥2 := √(x⊤x) = √(|x1|^2 + |x2|^2 + ··· + |xn|^2) = 1.

If x ∈ R^n then ∥x∥2 := √(x⊤x) = √(x1^2 + x2^2 + ··· + xn^2) = 1.


Example

A = [−1 −1 1; −4 2 4; −1 1 5],   A − λI3 = [−1−λ −1 1; −4 2−λ 4; −1 1 5−λ].

Characteristic polynomial:

p(λ) = det(A − λI3) = −λ^3 + 6λ^2 + 4λ − 24.

Eigenvalues (roots of p(λ)): λ1 = 6, λ2 = −2, λ3 = 2.

The eigenvector corresponding to the eigenvalue λ1 = 6 solves (A − 6I3)x = 0.

(A − 6I3)x = [−7 −1 1; −4 −4 4; −1 1 −1][x1; x2; x3] = [0; 0; 0]  ⇐⇒  [7 1 −1; 0 1 −1; 0 0 0][x1; x2; x3] = [0; 0; 0].

The solution is parametric: x1 = 0, x2 = t, x3 = t, with 0 ≠ t ∈ R arbitrary.

The unit eigenvector corresponding to λ1 = 6: x1 = (0, 1/√2, 1/√2)⊤.


MATLAB solution

>> A=[-1 -1 1;-4 2 4;-1 1 5];
>> [V L]=eig(A)

V =
    0.0000    0.7071    0.4082
    0.7071    0.7071   -0.8165
    0.7071    0.0000    0.4082

L =
    6.0000         0         0
         0   -2.0000         0
         0         0    2.0000

Eigenvalues: λ1 = 6, λ2 = −2, λ3 = 2.

Unit eigenvectors, respectively:

x1 = (0, 1/√2, 1/√2)⊤ = (0.0000, 0.7071, 0.7071)⊤,
x2 = (1/√2, 1/√2, 0)⊤ = (0.7071, 0.7071, 0.0000)⊤,
x3 = (1/√6, −2/√6, 1/√6)⊤ = (0.4082, −0.8165, 0.4082)⊤.

Further examples

B = [3 −1 2; 3 −1 6; −2 2 −2],   C = [−1 −1 1; 4 2 4; −1 1 5].


Example, repeated eigenvalues

>> B=[3 -1 2;3 -1 6;-2 2 -2];
>> [V L]=eig(B)

V =
    0.8890   -0.2673    0.0730
    0.3810   -0.8018    0.9062
   -0.2540    0.5345    0.4166

L =
     2     0     0
     0    -4     0
     0     0     2

Eigenvalues: λ1 = λ3 = 2, λ2 = −4.

General form of eigenvectors:

x1 = x3 = (s − 2t, s, t)⊤,   x2 = (u, 3u, −2u)⊤,

where s, t, u ∈ R, u ≠ 0, s^2 + t^2 ≠ 0.

x1 and x3 span a two-dimensional subspace. One can take any basis.

Unit eigenvectors corresponding to the cases s = 0 and t = 0, respectively:

x1 = (1/√5)(−2, 0, 1)⊤,   x2 = (1/√14)(1, 3, −2)⊤,   x3 = (1/√2)(1, 1, 0)⊤.


Example, complex eigenvalues

>> C=[-1 -1 1;4 2 4;-1 1 5];
>> [V L]=eig(C)

V =
   0.2132 - 0.4264i   0.2132 + 0.4264i   0.0000 + 0.0000i
  -0.8528 + 0.0000i  -0.8528 + 0.0000i   0.7071 + 0.0000i
   0.2132 - 0.0000i   0.2132 + 0.0000i   0.7071 + 0.0000i

L =
   0.0000 + 2.0000i   0.0000 + 0.0000i   0.0000 + 0.0000i
   0.0000 + 0.0000i   0.0000 - 2.0000i   0.0000 + 0.0000i
   0.0000 + 0.0000i   0.0000 + 0.0000i   6.0000 + 0.0000i

Eigenvalues: λ1 = 2i, λ2 = −2i, λ3 = 6.

Unit eigenvectors:

x1 = (1/√22)(1 − 2i, −4, 1)⊤,   x2 = (1/√22)(1 + 2i, −4, 1)⊤,   x3 = (1/√2)(0, 1, 1)⊤.


Properties of eigenvalues

Theorem. A symmetric n × n matrix possesses n real eigenvalues.

Definition. The trace of an n × n square matrix A = (aij) is defined as

tr(A) = a11 + a22 + ··· + ann.

Theorem. Let λ1, λ2, . . . , λn be the eigenvalues of an n × n square matrix A. Then

tr(A) = λ1 + λ2 + ··· + λn = Σ_{k=1}^n λk;
det(A) = λ1 · λ2 · ··· · λn = Π_{k=1}^n λk.

Corollary. A square matrix A is singular, that is det(A) = 0, if and only if 0 is an eigenvalue of A.

Theorem. Let λ1, λ2, . . . , λn be the eigenvalues of an n × n square matrix A. Then

the eigenvalues of A^k are λ1^k, λ2^k, . . . , λn^k, k ∈ N;
if A is regular, then the eigenvalues of A^(−1) are 1/λ1, 1/λ2, . . . , 1/λn.
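These identities are easy to verify numerically; a minimal MATLAB sketch using the matrix A of the earlier eigenvalue example (any square matrix would do):

>> A=[-1 -1 1;-4 2 4;-1 1 5];
>> lam=eig(A);              % eigenvalues 6, -2, 2
>> [trace(A) sum(lam)]      % both equal 6
>> [det(A) prod(lam)]       % both equal -24
>> eig(A^2)'                % 36, 4, 4: the squared eigenvalues (in some order)
>> 1./lam'                  % eigenvalues of inv(A), since A is regular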


Positive definite and positive semidefinite matrices

Definition. An n × n real matrix A is positive semidefinite if for any vector x ∈ R^n

x⊤Ax ≥ 0.

A matrix A is positive definite if for any non-zero vector 0 ≠ x ∈ R^n

x⊤Ax > 0.

Theorem. For an n × n symmetric real matrix A the following statements are equivalent:

A is positive definite;
all principal minors of A are positive, that is ∆k := det(Ak) > 0, k = 1, 2, . . . , n, where Ak is the submatrix of A obtained by taking the upper left-hand corner k × k submatrix of A;
the eigenvalues of A are positive.

Remark. The eigenvalues of a symmetric positive semidefinite matrix are non-negative.


Negative definite and negative semidefinite matrices

Definition. An n × n real symmetric matrix A is negative definite or negative semidefinite if all of its eigenvalues are negative or non-positive, respectively.

Theorem. Let A be an n × n symmetric real matrix and denote by ∆k the kth principal minor of A, k = 1, 2, . . . , n.

A is negative definite if and only if (−1)^k ∆k > 0, k = 1, 2, . . . , n.
If (−1)^k ∆k > 0, k = 1, 2, . . . , n − 1, and ∆n = 0, then A is negative semidefinite.
If ∆k > 0, k = 1, 2, . . . , n − 1, and ∆n = 0, then A is positive semidefinite.

Definition. A symmetric real matrix is indefinite if it is neither positive, nor negative semidefinite (so it cannot be positive or negative definite either).

Theorem. For any real matrix A, the matrix A⊤A is positive semidefinite. A⊤A is positive definite if and only if the columns of A are linearly independent.


Matrix polynomials

Definition. Let

p(x) = α0 + α1x + ··· + αk x^k

be a real or complex polynomial and A be an n × n matrix. Then the value of p(x) at A is defined as

p(A) := α0 In + α1 A + ··· + αk A^k.

Cayley-Hamilton theorem. Let A be an n × n matrix and p(λ) be the characteristic polynomial of A. Then

p(A) = 0n,

where 0n is the n × n matrix with zero entries.


Matrix power series

Definition. Let

f(x) = Σ_{k=0}^∞ αk x^k

be a real or complex power series and A be an n × n matrix. Then

f(A) := Σ_{k=0}^∞ αk A^k,

given it converges.

Examples. Let A be an n × n matrix.

1. exp(x) = e^x = Σ_{k=0}^∞ x^k / k!,   that is exp(A) = Σ_{k=0}^∞ A^k / k!.

2. cos(x) = Σ_{k=0}^∞ (−1)^k x^(2k) / (2k)!,   that is cos(A) = Σ_{k=0}^∞ (−1)^k A^(2k) / (2k)!.


Evaluation of matrix power series

A: an n × n matrix with characteristic polynomial p(x).
f(x): an arbitrary polynomial or power series.

Aim: give the value of f(A) in a closed form.

Division with remainder:

f(x) = g(x) · p(x) + r(x),   where deg[r(x)] ≤ n − 1.

Cayley-Hamilton theorem: p(A) = 0n, that is f(A) = r(A).

It suffices to determine the coefficients of

r(x) = β0 + β1x + ··· + βn−1 x^(n−1).


Single eigenvalues

Assume that all roots of p(x) (eigenvalues of A) are single, that is

p(x) = (−1)^n (x − λ1)(x − λ2) ··· (x − λn).

p(λi) = 0, so f(λi) = g(λi)p(λi) + r(λi) = r(λi), i = 1, 2, . . . , n.

To determine the coefficients of

r(x) = β0 + β1x + ··· + βn−1 x^(n−1)

one has to solve the system of linear equations

f(λi) = β0 + β1λi + ··· + βn−1 λi^(n−1),   i = 1, 2, . . . , n.
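In matrix form these equations form a Vandermonde system in the coefficients βj. A minimal MATLAB sketch for f = exp and the 3 × 3 matrix A of the next example (the variable names are only illustrative):

>> A=[-1 -1 1;-4 2 4;-1 1 5];
>> lam=[6; -2; 2];                    % the (distinct) eigenvalues of A
>> V=[ones(3,1) lam lam.^2];          % rows: 1, lambda_i, lambda_i^2
>> beta=V\exp(lam);                   % solve f(lambda_i) = r(lambda_i)
>> fA=beta(1)*eye(3)+beta(2)*A+beta(3)*A^2;
>> norm(fA-expm(A))                   % agrees with expm(A) up to rounding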


Example

Find f(A) := exp(A) for

A = [−1 −1 1; −4 2 4; −1 1 5],   that is A^2 = [4 0 0; −8 12 24; −8 8 28].

Characteristic polynomial: p(x) = −x^3 + 6x^2 + 4x − 24 = −(x − 6)(x + 2)(x − 2).

Eigenvalues: λ1 = 6, λ2 = −2, λ3 = 2.

The form of the remainder polynomial r(x): r(x) := β0 + β1x + β2x^2.

System of equations to be solved:

e^6 = β0 + 6β1 + 36β2;   e^(−2) = β0 − 2β1 + 4β2;   e^2 = β0 + 2β1 + 4β2.

Solution:

β0 = (−e^6 + 6e^2 + 3e^(−2))/8,   β1 = (e^2 − e^(−2))/4,   β2 = (e^6 − 2e^2 + e^(−2))/32.

exp(A) = (1/4) [ e^2+3e^(−2)   −e^2+e^(−2)   e^2−e^(−2);
                 −e^6−2e^2+3e^(−2)   e^6+2e^2+e^(−2)   3e^6−2e^2−e^(−2);
                 −e^6+e^2   e^6−e^2   3e^6+e^2 ].


Repeated eigenvalues

Assume that the roots of p(x) (eigenvalues of A) are:

λ1: ℓ1-fold, λ2: ℓ2-fold, . . . , λk: ℓk-fold,   ℓ1 + ℓ2 + ··· + ℓk = n.

Characteristic polynomial:

p(x) = (−1)^n (x − λ1)^ℓ1 (x − λ2)^ℓ2 ··· (x − λk)^ℓk.

p(λi) = p′(λi) = p′′(λi) = ··· = p^(ℓi−1)(λi) = 0, i = 1, 2, . . . , k, so

f(λi) = g(λi)p(λi) + r(λi) = r(λi);
f′(λi) = g′(λi)p(λi) + g(λi)p′(λi) + r′(λi) = r′(λi);
f′′(λi) = g′′(λi)p(λi) + 2g′(λi)p′(λi) + g(λi)p′′(λi) + r′′(λi) = r′′(λi);
...
f^(ℓi−1)(λi) = ··· = r^(ℓi−1)(λi),   i = 1, 2, . . . , k.

The system of n equations in n variables to be solved:

f^(j)(λi) = r^(j)(λi),   i = 1, 2, . . . , k,   j = 0, 1, . . . , ℓi − 1.


Example

Find f(B) := exp(B) for

B = [3 −1 2; 3 −1 6; −2 2 −2],   that is B^2 = [2 2 −4; −6 10 −12; 4 −4 12].

Characteristic polynomial: p(x) = −x^3 + 12x − 16 = −(x − 2)^2 (x + 4).

Eigenvalues: λ1 = λ2 = 2, λ3 = −4.

The form of the remainder polynomial r(x): r(x) := β0 + β1x + β2x^2;   r′(x) = β1 + 2β2x.

System of equations to be solved:

e^2 = β0 + 2β1 + 4β2;   e^2 = β1 + 4β2;   e^(−4) = β0 − 4β1 + 16β2.

Solution:

β0 = (−4e^2 + e^(−4))/9,   β1 = (4e^2 − e^(−4))/9,   β2 = (5e^2 + e^(−4))/36.

exp(B) = [ (7e^2−e^(−4))/6   (−e^2+e^(−4))/6   (e^2−e^(−4))/3;
           (e^2−e^(−4))/2    (e^2+e^(−4))/2    e^2−e^(−4);
           (−e^2+e^(−4))/3   (e^2−e^(−4))/3    (e^2+2e^(−4))/3 ].


MATLAB solution

A = [−1 −1 1; −4 2 4; −1 1 5],   B = [3 −1 2; 3 −1 6; −2 2 −2].

e^A = (1/4) [ e^2+3e^(−2)   −e^2+e^(−2)   e^2−e^(−2);
              −e^6−2e^2+3e^(−2)   e^6+2e^2+e^(−2)   3e^6−2e^2−e^(−2);
              −e^6+e^2   e^6−e^2   3e^6+e^2 ]
    = [ 1.9488   −1.8134   1.8134;
        −104.4502   104.5856   298.8432;
        −99.0099   99.0099   304.4189 ].

e^B = [ (7e^2−e^(−4))/6   (−e^2+e^(−4))/6   (e^2−e^(−4))/3;
        (e^2−e^(−4))/2    (e^2+e^(−4))/2    e^2−e^(−4);
        (−e^2+e^(−4))/3   (e^2−e^(−4))/3    (e^2+2e^(−4))/3 ]
    = [ 8.6175   −1.2285   2.4569;
        3.6854   3.7037   7.3707;
        −2.4569   2.4569   2.4752 ].

>> A=[-1 -1 1;-4 2 4;-1 1 5];
>> expA=expm(A)

expA =
    1.9488   -1.8134    1.8134
 -104.4502  104.5856  298.8432
  -99.0099   99.0099  304.4189

>> B=[3 -1 2;3 -1 6;-2 2 -2];
>> expB=expm(B)

expB =
    8.6175   -1.2285    2.4569
    3.6854    3.7037    7.3707
   -2.4569    2.4569    2.4752


Singular-value decomposition of matrices

Theorem. All m × n real matrices A can be decomposed as

A = UΣV⊤,

called the singular-value decomposition (SVD), where U and V are m × m and n × n orthogonal matrices, respectively, and Σ is an m × n diagonal matrix with real diagonal elements σ1 ≥ σ2 ≥ ··· ≥ σ_min{m,n} ≥ 0 called the singular values of A.

Remark. The number of positive singular values coincides with the rank r of A. In this case the singular-value decomposition equals

A = Σ_{i=1}^r σi ui vi⊤,

where ui and vi are the ith columns of the matrices U and V, respectively.

Application:
image processing: compression, noise reduction (deblurring);
digital signal processing: noise reduction.
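The compression idea can be tried directly on the MATLAB svd output: keeping only the k largest singular values in the sum above gives the best rank-k approximation. A sketch (the test matrix X and the rank k are arbitrary placeholders):

>> X=magic(6); k=2;
>> [U,S,V]=svd(X);
>> Xk=U(:,1:k)*S(1:k,1:k)*V(:,1:k)';  % rank-k truncation of the SVD
>> norm(X-Xk)                         % 2-norm error = (k+1)-st singular value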


Spectral decomposition of symmetric matrices

Theorem. Let A be an n × n symmetric real matrix with eigenvalues λ1 ≥ λ2 ≥ ··· ≥ λn and corresponding orthogonal unit eigenvectors q1, q2, . . . , qn. The spectral decomposition of A can be written as

A = Σ_{i=1}^n λi qi qi⊤.

The decomposition can be restated in matrix form

A = QΛQ⊤,

where Q is an orthogonal matrix having columns q1, q2, . . . , qn, whereas Λ = diag(λ1, λ2, . . . , λn).

Remark. For symmetric positive definite matrices the spectral decomposition is identical to the singular-value decomposition.


MATLAB function svd

>> A=[1 2 3 4;5 6 7 8];

Full decomposition

>> [U,S,V]=svd(A)

U =
   -0.3762   -0.9266
   -0.9266    0.3762

S =
   14.2274         0         0         0
         0    1.2573         0         0

V =
   -0.3521    0.7590   -0.4001   -0.3741
   -0.4436    0.3212    0.2546    0.7970
   -0.5352   -0.1165    0.6910   -0.4717
   -0.6268   -0.5542   -0.5455    0.0488

Parsimonious decomposition

>> [U,S,V]=svd(A,'econ')

U =
   -0.3762   -0.9266
   -0.9266    0.3762

S =
   14.2274         0
         0    1.2573

V =
   -0.3521    0.7590
   -0.4436    0.3212
   -0.5352   -0.1165
   -0.6268   -0.5542


Moore-Penrose pseudoinverse

Definition. Let A be an m × n real matrix with singular-value decomposition

A = UΣV⊤.

The Moore-Penrose pseudoinverse of A is the n × m matrix

A+ = VΣ+U⊤,

where the matrix Σ+ is obtained by taking the reciprocal of each non-zero element of the diagonal of Σ and then transposing the matrix.

Theorem. Any matrix has a unique pseudoinverse.

Theorem. Properties of the pseudoinverse:

AA+A = A and A+AA+ = A+;
(AA+)⊤ = AA+ and (A+A)⊤ = A+A;
if A is regular then A+ = A^(−1).


Solution of systems of linear equations

Consider a system of linear equations of the form Ax = b, where A is an m × n real matrix, b ∈ R^m, x ∈ R^n.

General solution: x∗ = A+b.
If the system of equations has a unique solution, x∗ is exactly this solution.
If the system of equations has several solutions, x∗ is a solution with the smallest norm.
If the system of equations is contradictory, that is it does not have a solution, then x∗ is a minimum point of ∥Ax − b∥2 with the smallest norm.
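A small sketch of the last case with MATLAB's pinv (the data are made up for illustration):

>> A=[1 1;1 2;1 3]; b=[1;2;2];        % overdetermined, contradictory system
>> x=pinv(A)*b                        % least-squares solution x* = A+ b
>> norm(A*x-b)                        % minimal residual norm
>> A\b                                % backslash returns the same minimizer here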


MATLAB function pinv

>> A=[1 2 3 4;5 6 7 8];
>> A_inv=pinv(A)

A_inv =
   -0.5500    0.2500
   -0.2250    0.1250
    0.1000   -0.0000
    0.4250   -0.1250

>> B=[-1 -1 1;4 2 4;3 1 5];
>> det(B)

ans =
     0

>> B_inv=pinv(B)

B_inv =
   -0.2115    0.1538   -0.0577
   -0.2179    0.1282   -0.0897
    0.2372   -0.0513    0.1859

>> C=[-1 -1 1;4 2 4;-1 1 5];
>> det(C)

ans =
    24

>> C_inv=pinv(C)

C_inv =
    0.2500    0.2500   -0.2500
   -1.0000   -0.1667    0.3333
    0.2500    0.0833    0.0833

>> inv(C)

ans =
    0.2500    0.2500   -0.2500
   -1.0000   -0.1667    0.3333
    0.2500    0.0833    0.0833


Differentiability of univariable functions

Definition. A function f : R → R is said to be differentiable at a point x if the limit

lim_{h→0} (f(x + h) − f(x))/h =: f′(x) = df/dx(x)

exists. f′(x) is called the derivative of f at the point x.

Remark. f is differentiable at x if and only if there exists a number a ∈ R such that

lim_{h→0} (f(x + h) − f(x) − a·h)/h = 0.

In this case a = f′(x).

Remark. Best linear approximation:

f(x + h) ≈ f(x) + f′(x)·h.


Differentiability of multivariable functions

Definition. We say that f : R^n → R is differentiable at a point x ∈ R^n if there exists a vector a ∈ R^n such that

lim_{h→0} (f(x + h) − f(x) − a⊤h)/∥h∥2 = 0.

The vector a is called the gradient of f at x and denoted by ∇f(x).

Definition. We say that a function f : R^n → R is partially differentiable at x = (x1, x2, . . . , xn) ∈ R^n with respect to xi if the limit

∂f/∂xi(x) := lim_{h→0} (f(x + h·ei) − f(x))/h
           = lim_{h→0} (f(x1, . . . , xi−1, xi + h, xi+1, . . . , xn) − f(x1, . . . , xi−1, xi, xi+1, . . . , xn))/h

exists. ∂f/∂xi is the ith partial derivative of f, i = 1, 2, . . . , n.


Second partial derivatives

Theorem. If a function f : R^n → R is differentiable at a point x ∈ R^n then all partial derivatives of f exist and the gradient of f is

∇f(x) = (∂f/∂x1(x), ∂f/∂x2(x), . . . , ∂f/∂xn(x))⊤.

Remark. Taking the partial derivatives of the partial derivatives ∂f/∂xi : R^n → R, i = 1, 2, . . . , n, results in the second partial derivatives

∂²f/∂xi∂xj(x),   i, j = 1, 2, . . . , n.

If the second partial derivatives are continuous at all x ∈ R^n then

∂²f/∂xi∂xj(x) = ∂²f/∂xj∂xi(x),   i, j = 1, 2, . . . , n.


Hessian

Definition. If the second partial derivatives of f : R^n → R exist then the matrix

∇²f(x) := [ ∂²f/∂x1²(x)    ∂²f/∂x1∂x2(x)  ···  ∂²f/∂x1∂xn(x);
            ∂²f/∂x2∂x1(x)  ∂²f/∂x2²(x)    ···  ∂²f/∂x2∂xn(x);
            … ;
            ∂²f/∂xn∂x1(x)  ∂²f/∂xn∂x2(x)  ···  ∂²f/∂xn²(x) ]

is called the Hessian of f.

Remark. If the second partial derivatives are continuous then the Hessian is symmetric.


Example

Find the gradient and the Hessian of

f(x1, x2) := x1^3 + x2^3 − 3x1 − 3x2.

[Figure: surface plot of f(x1, x2) over [−2, 2] × [−2, 2].]

Solution.

Partial derivatives:

∂f/∂x1(x1, x2) = 3x1^2 − 3,   ∂f/∂x2(x1, x2) = 3x2^2 − 3.

Gradient:

∇f(x1, x2) = (3x1^2 − 3, 3x2^2 − 3)⊤.

Second partial derivatives:

∂²f/∂x1²(x1, x2) = 6x1,   ∂²f/∂x1∂x2(x1, x2) = 0,
∂²f/∂x2²(x1, x2) = 6x2,   ∂²f/∂x2∂x1(x1, x2) = 0.

Hessian:

∇²f(x1, x2) = [6x1 0; 0 6x2].
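The symbolic results can be cross-checked with central differences; a rough numerical sketch (the test point and the step size h are arbitrary):

>> f=@(x) x(1)^3+x(2)^3-3*x(1)-3*x(2);
>> x0=[0.5; -1.2]; h=1e-6; g=zeros(2,1);
>> for i=1:2, e=zeros(2,1); e(i)=h; g(i)=(f(x0+e)-f(x0-e))/(2*h); end
>> [g [3*x0(1)^2-3; 3*x0(2)^2-3]]     % numerical vs analytic gradient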


Example

Find the gradient and the Hessian of

f(x1, x2, x3) := x1x2x3 + x3e^(x1) + x2^4 x3^2.

MATLAB solution.

>> syms x1 x2 x3
>> F=x1*x2*x3+x3*exp(x1)+x2^4*x3^2;
>> GradF=jacobian(F)

GradF =
[ x2*x3 + x3*exp(x1), 4*x2^3*x3^2 + x1*x3, 2*x3*x2^4 + x1*x2 + exp(x1)]

>> HesseF=jacobian(GradF)

HesseF =
[   x3*exp(x1),             x3,   x2 + exp(x1)]
[           x3,   12*x2^2*x3^2, 8*x3*x2^3 + x1]
[ x2 + exp(x1), 8*x3*x2^3 + x1,         2*x2^4]


Jacobian

Definition. A vector valued function f = (f1, f2, . . . , fm)⊤ : R^n → R^m is differentiable at a point x = (x1, x2, . . . , xn)⊤ ∈ R^n if the component functions fi, i = 1, 2, . . . , m, are differentiable at x. In this case the m × n matrix

f′(x) := [ ∂f1/∂x1(x)  ∂f1/∂x2(x)  ···  ∂f1/∂xn(x);
           ∂f2/∂x1(x)  ∂f2/∂x2(x)  ···  ∂f2/∂xn(x);
           … ;
           ∂fm/∂x1(x)  ∂fm/∂x2(x)  ···  ∂fm/∂xn(x) ]

is called the Jacobian of f.

Remark. The ith row of the Jacobian equals ∇fi(x)⊤.


Example

Find the Jacobian of the functions

f(x1, x2) := (e^(2x1+x2), x2 − x1, x1^2 + x2)⊤   and   g(y1, y2, y3) := (y1 + 2y2 + y3^2, y1^2 + sin(y2 + y3))⊤.

MATLAB solution.

>> syms x1 x2
>> f=[exp(2*x1+x2);x2-x1;x1^2+x2]

f =
 exp(2*x1 + x2)
        x2 - x1
      x1^2 + x2

>> Jf=jacobian(f)

Jf =
[ 2*exp(2*x1 + x2), exp(2*x1 + x2)]
[               -1,              1]
[             2*x1,              1]

>> syms y1 y2 y3
>> g=[y1+2*y2+y3^2;y1^2+sin(y2+y3)]

g =
    y3^2 + y1 + 2*y2
 y1^2 + sin(y2 + y3)

>> Jg=jacobian(g)

Jg =
[    1,            2,         2*y3]
[ 2*y1, cos(y2 + y3), cos(y2 + y3)]


Taylor series expansion

Taylor's theorem. Let f : R^n → R be a continuously differentiable (differentiable with continuous partial derivatives) function and let p ∈ R^n. Then we have

f(x + p) = f(x) + ∇f(x + tp)⊤p

for some t ∈ (0, 1). Moreover, if f is twice continuously differentiable then

f(x + p) = f(x) + ∇f(x)⊤p + (1/2) p⊤∇²f(x + tp)p

for some t ∈ (0, 1).

First-order Taylor approximation around x:

f(x + p) ≈ f(x) + ∇f(x)⊤p.

Second-order Taylor approximation around x:

f(x + p) ≈ f(x) + ∇f(x)⊤p + (1/2) p⊤∇²f(x)p.


Example

Find the first- and second-order Taylor approximation of the function

f(x1, x2) := x1^x2,   x1 ∈ R+, x2 ∈ R,

around (1, 1).

Solution. Partial derivatives:

∂f/∂x1(x1, x2) = x2·x1^(x2−1),   ∂f/∂x2(x1, x2) = x1^x2 ln(x1);

∂²f/∂x1²(x1, x2) = x2(x2 − 1)x1^(x2−2),   ∂²f/∂x2²(x1, x2) = x1^x2 ln²(x1),

∂²f/∂x1∂x2(x1, x2) = x1^(x2−1)(1 + x2 ln(x1)) = ∂²f/∂x2∂x1(x1, x2).

Gradient: ∇f(1, 1) = (1, 0)⊤;   Hessian: ∇²f(1, 1) = [0 1; 1 0].

First-order approximation:

f(1 + p1, 1 + p2) ≈ f(1, 1) + ∇f(1, 1)⊤(p1, p2)⊤ = 1 + p1.

Second-order approximation:

f(1 + p1, 1 + p2) ≈ f(1, 1) + ∇f(1, 1)⊤(p1, p2)⊤ + (1/2)(p1, p2)∇²f(1, 1)(p1, p2)⊤ = 1 + p1 + p1p2.


Example

Approximate the value of the expression 1.01^1.005 using Taylor expansion.

Solution.

Function: f(x1, x2) := x1^x2.

First-order expansion around (1, 1): f(1 + p1, 1 + p2) ≈ 1 + p1.

Second-order expansion around (1, 1): f(1 + p1, 1 + p2) ≈ 1 + p1 + p1p2.

With p1 = 0.01, p2 = 0.005:
the first-order approximation: 1.01^1.005 ≈ 1 + 0.01 = 1.01;
the second-order approximation: 1.01^1.005 ≈ 1 + 0.01 + 0.01 · 0.005 = 1.01005;
true value (rounded to 15 decimals): 1.01^1.005 = 1.010050250420819.
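The two approximations can also be reproduced with the symbolic toolbox, reusing the gradient and Hessian of the previous example (a sketch; the point (1, 1) and the increments are those of the example):

>> syms x1 x2
>> f=x1^x2; p=[0.01; 0.005];
>> G=subs(jacobian(f,[x1 x2]),[x1 x2],[1 1]);    % gradient at (1,1): (1, 0)
>> H=subs(hessian(f,[x1 x2]),[x1 x2],[1 1]);     % Hessian at (1,1): [0 1; 1 0]
>> double(1+G*p)                                 % first-order:  1.0100
>> double(1+G*p+p.'*H*p/2)                       % second-order: 1.010050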


Extremal points of multivariable functions

Let f : R^n → R be a continuously differentiable function.

x∗ is a global minimizer [maximizer] of the function f if f(x∗) ≤ f(x) [f(x∗) ≥ f(x)] for all x ∈ R^n.

x∗ is a local minimizer [maximizer] of f if there is a neighborhood N ⊂ R^n of x∗ such that f(x∗) ≤ f(x) [f(x∗) ≥ f(x)] for all x ∈ N.

x∗ is a strict local minimizer [maximizer] of f if there is a neighborhood N ⊂ R^n of x∗ such that f(x∗) < f(x) [f(x∗) > f(x)] for all x∗ ≠ x ∈ N.

x∗ is an isolated local minimizer [maximizer] of f if there is a neighborhood N ⊂ R^n of x∗ such that x∗ is the only local minimizer [maximizer] in N.


Stationary points

Theorem. (First-order necessary conditions) If x∗ is a local extremal point (minimizer or maximizer) of f : R^n → R and f is continuously differentiable in an open neighborhood of x∗, then ∇f(x∗) = 0.

Definition. We call x∗ a stationary point of f : R^n → R if ∇f(x∗) = 0.

Definition. If x∗ is a stationary point of f that is neither a maximizer, nor a minimizer, then x∗ is a saddle point.

Example.

f(x1, x2) := x1^2 − x2^2,   ∇f(x1, x2) = (2x1, −2x2)⊤.

(0, 0) is the only stationary point, which is a saddle point.

[Figure: surface plot of the saddle f(x1, x2) = x1^2 − x2^2.]


Second-order conditions of extrema

Theorem. (Second-order necessary conditions) If x∗ is a local minimizer [maximizer] of f : R^n → R and ∇²f exists and is continuous in an open neighborhood of x∗, then

a) ∇f(x∗) = 0,
b) ∇²f(x∗) is positive [negative] semidefinite.

Theorem. (Second-order sufficient conditions) Assume that ∇²f exists and is continuous in an open neighborhood of x∗ and that

a) ∇f(x∗) = 0,
b) ∇²f(x∗) is positive [negative] definite.

Then x∗ is a strict local minimizer [maximizer] of f.


Example

Find the extremal points of the function

f(x1, x2) := x1^3 + x2^3 − 3x1 − 3x2.

[Figure: surface plot of f(x1, x2).]

Solution.

Gradient:

∇f(x1, x2) = (3x1^2 − 3, 3x2^2 − 3)⊤.

Stationary points:

(1, 1), (−1, 1), (1, −1), (−1, −1).

Hessian:

∇²f(x1, x2) = [6x1 0; 0 6x2].

∇²f(1, 1) is positive definite, so (1, 1) is a minimizer.
∇²f(−1, −1) is negative definite, so (−1, −1) is a maximizer.
∇²f(−1, 1) and ∇²f(1, −1) are indefinite, so (−1, 1) and (1, −1) are saddle points.
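The same classification can be automated with the symbolic toolbox, in the spirit of the earlier jacobian examples (a sketch; solve and the sign of the Hessian eigenvalues do the work):

>> syms x1 x2
>> f=x1^3+x2^3-3*x1-3*x2;
>> G=jacobian(f); H=jacobian(G);                % gradient (row vector) and Hessian
>> S=solve(G==0,[x1 x2]); [S.x1 S.x2]           % the four stationary points
>> eig(double(subs(H,[x1 x2],[1 1])))           % 6, 6: positive definite, minimizer
>> eig(double(subs(H,[x1 x2],[-1 1])))          % -6, 6: indefinite, saddle point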


Optimization problem

Given a continuously differentiable function f : R^n → R, find the value of

min_x f(x).

Examples.

f(x1, x2) := x1^3 + x2^3 − 3x1 − 3x2.

[Figure: surface plot of f.]

Rosenbrock function:

g(x1, x2) := 100(x2 − x1^2)^2 + (1 − x1)^2.


Optimization algorithms

General form of an optimization algorithm:

1. specification of a starting point x0;
2. determination of the strategy xk −→ xk+1;
3. stopping criterion.

Line search: chooses a descent direction pk and searches along this direction by solving the one-dimensional optimization problem

min_α f(xk + αpk).

Definition. An optimization algorithm with strategy xk −→ xk+1 converges in order q to the optimal point x∗ if there exists a constant C > 0 such that for some norm ∥·∥ the inequality

∥xk+1 − x∗∥ ≤ C∥xk − x∗∥^q

holds.


Line search algorithms

One should choose a search direction p such that the function f decreases along p.

First-order Taylor approximation for some small α:

f(xk + αp) ≈ f(xk) + αp⊤∇f(xk).

The problem to be solved:

min_p p⊤∇f(xk),   ∥p∥ = 1.

As

p⊤∇f(xk) = ∥p∥ · ∥∇f(xk)∥ cos Θ = ∥∇f(xk)∥ cos Θ,

the optimal direction corresponds to cos Θ = −1:

p = −∇f(xk)/∥∇f(xk)∥.

Steepest descent method (gradient method).


Example

f(x1, x2) := x1^3 + x2^3 − 3x1 − 3x2.

Minimizer: (1, 1).
Maximizer: (−1, −1).
Saddle points: (1, −1) and (−1, 1).

[Figures: surface plot of f; contour lines, stationary points and the negative gradient field.]


Descent directions

Definition. A direction p ∈ R^n is a descent direction at x if

p⊤∇f(x) < 0.

If α is sufficiently small,

f(xk + αp) ≈ f(xk) + αp⊤∇f(xk).

As

p⊤∇f(xk) = ∥p∥ · ∥∇f(xk)∥ cos Θ,

the inequality cos Θ < 0 implies p⊤∇f(xk) < 0.

Special case: steepest descent (gradient method):

p = −∇f(xk),   that is cos Θ = −1.

Problem: the gradient method can be very slow if the optimum point lies in a long, narrow (prolate) valley.


Effect of prolate valleys

[Figure: contour lines and the negative gradient field of f(x1, x2) := x1^3 + x2^3 − 3x1 − 3x2.]

[Figure: contour lines and the negative gradient field of g(x1, x2) := 10x1^3 + x2^3 − 30x1 − 3x2.]


Gradient method - step length

Given a search direction pk, find the minimizer of the univariate function

α → f(xk + αpk),   (α > 0).

Finding the exact minimizer might be too expensive (too many function evaluations, too large computation costs), hence it suffices to find an α which is "good enough".

Two steps:
Determine the maximal step length.
In the given interval find a "good" α such that f(xk + αpk) is sufficiently smaller than f(xk).


Ideal case: quadratic function

Let f : R^n → R be quadratic, that is

f(x) := (1/2) x⊤Qx − b⊤x.

Q: n × n-dimensional, symmetric, positive definite matrix; b ∈ R^n.

In this case ∇f(x) = Qx − b, that is, for a stationary point x∗ we have Qx∗ = b.

The optimal step length at the point xk:

αk = (∇f⊤(xk)∇f(xk)) / (∇f⊤(xk)Q∇f(xk)),

that is

xk+1 = xk − (∇f⊤(xk)∇f(xk)) / (∇f⊤(xk)Q∇f(xk)) · ∇f(xk).
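A compact sketch of this exact line-search iteration for the quadratic f(x1, x2) = x1^2 + 10x2^2 used on the next slide, i.e. Q = diag(2, 20), b = 0 (the starting point, tolerance and iteration cap are arbitrary):

>> Q=diag([2 20]); b=[0;0]; x=[10;1];
>> for k=1:100
       g=Q*x-b; if norm(g)<1e-8, break, end
       alpha=(g'*g)/(g'*Q*g);        % optimal step length for the quadratic
       x=x-alpha*g;
   end
>> x, k                              % converges to the minimizer [0; 0]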


Example: gradient method

[Figure: gradient method for f(x1, x2) := x1^2 + 10x2^2.]


Gradient method with backtracking

Backtracking method for choosing αk:

1. Let c1 ∈ (0, 1), ϱ ∈ (0, 1) be fixed, α := α0;
2. while f(xk + αpk) > f(xk) + αc1∇f(xk)⊤pk
       α := ϱα
   end
3. αk := α.

Algorithm of the gradient method with backtracking:

1. Let x0 be given.
2. Given xk, let pk = −∇f(xk).
3. Choose αk using the backtracking algorithm.
4. Let xk+1 = xk + αkpk.
5. Stopping criterion: ∇f(xk) = 0 (∥∇f(xk)∥ < ε).
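A direct MATLAB transcription of the two algorithms, applied to the Rosenbrock function (a sketch; c1, ϱ, the initial α and the tolerance are tuning choices, not prescribed by the slides):

>> f=@(x) 100*(x(2)-x(1)^2)^2+(1-x(1))^2;
>> df=@(x) [-400*x(1)*(x(2)-x(1)^2)-2*(1-x(1)); 200*(x(2)-x(1)^2)];
>> x=[-1.2;1]; c1=1e-4; rho=0.5; tol=1e-3;
>> while norm(df(x))>=tol
       p=-df(x); alpha=1;                          % steepest descent direction
       while f(x+alpha*p)>f(x)+alpha*c1*df(x)'*p   % backtracking
           alpha=rho*alpha;
       end
       x=x+alpha*p;
   end
>> x                                               % slowly approaches (1, 1)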


Example: gradient method with backtracking

[Figure: gradient method with backtracking for the Rosenbrock function, the last 130 iteration steps.]

x0 = (−1.2, 1), ε = 10^(−3), ϱ = 0.5. Total number of iteration steps: 5231.


Newton's method for nonlinear equations

Nonlinear equation:

f(x) = 0,   where f : R → R.

Newton's iteration with starting point x0 for approximating a root of the nonlinear equation f(x) = 0:

xk+1 = xk − f(xk)/f′(xk),   k = 0, 1, 2, . . . .

Nonlinear system of equations:

F(x) = 0,   where F : R^n → R^n.

Newton's iteration with starting point x0 for approximating a root of the nonlinear system of equations F(x) = 0:

F′(xk)(xk+1 − xk) = −F(xk),   k = 0, 1, 2, . . . .


Newton's method for optimization

The minimizer of f is a solution of the equation ∇f(x) = 0.

∇f : R^n → R^n, so ∇f(x) = 0 is a nonlinear system of equations.

If f is twice continuously differentiable, then Newton's method with starting point x0 for the system of equations ∇f(x) = 0 reads:

∇²f(xk)(xk+1 − xk) = −∇f(xk),   k = 0, 1, 2, . . . .

The algorithm:
let x0 be given;
∇²f(xk)pk = −∇f(xk), that is pk = −(∇²f(xk))^(−1)∇f(xk);
xk+1 = xk + pk.

Remark. The step length of Newton's method equals 1.
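A sketch of the resulting iteration for the Rosenbrock function, with the gradient and Hessian written out by hand (the starting point, tolerance and iteration cap are illustrative):

>> df=@(x) [-400*x(1)*(x(2)-x(1)^2)-2*(1-x(1)); 200*(x(2)-x(1)^2)];
>> d2f=@(x) [1200*x(1)^2-400*x(2)+2, -400*x(1); -400*x(1), 200];
>> x=[-1.2;1];
>> for k=1:50
       g=df(x); if norm(g)<1e-10, break, end
       x=x-d2f(x)\g;                 % Newton step with unit step length
   end
>> x, k                              % typically reaches (1, 1) in a few steps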


Newton directions

Definition. The direction

pk = −(∇²f(xk))^(−1)∇f(xk)

is called the Newton direction.

Remark. If ∇²f(xk) is positive definite, then the Newton direction is a descent direction:

pk⊤∇f(xk) = −∇f⊤(xk)(∇²f(xk))^(−1)∇f(xk) < 0.

If ∇²f(xk) is not positive definite, then the Newton direction might not be defined, or might not be a descent direction.

Advantage: in a neighborhood of the minimizer the rate of convergence of the optimization method is quadratic.
Disadvantage: it requires the knowledge of the Hessian.


Quasi-Newton direction

If the Hessian ∇²f(xk) is not known, or it is too expensive to determine, one can use an approximation Bk ≈ ∇²f(xk). This results in a quasi-Newton direction.

The matrices Bk should
satisfy the equation Bk+1(xk+1 − xk) = ∇f(xk+1) − ∇f(xk);
be symmetric;
Bk and Bk+1 should have a low rank difference.

Most popular: the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm. The BFGS formula is

Bk+1 = Bk − (Bk sk sk⊤ Bk)/(sk⊤ Bk sk) + (yk yk⊤)/(yk⊤ sk),

where sk = xk+1 − xk and yk = ∇f(xk+1) − ∇f(xk).
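A bare quasi-Newton sketch combining the BFGS update with backtracking step lengths (B0 = I; in practice a curvature safeguard such as the Wolfe conditions is added, which this minimal version omits):

>> f=@(x) 100*(x(2)-x(1)^2)^2+(1-x(1))^2;
>> df=@(x) [-400*x(1)*(x(2)-x(1)^2)-2*(1-x(1)); 200*(x(2)-x(1)^2)];
>> x=[-1.2;1]; B=eye(2);
>> while norm(df(x))>1e-4
       p=-B\df(x); alpha=1;
       while f(x+alpha*p)>f(x)+1e-4*alpha*df(x)'*p, alpha=alpha/2; end
       s=alpha*p; y=df(x+s)-df(x);
       B=B-(B*(s*s')*B)/(s'*B*s)+(y*y')/(y'*s);    % BFGS formula
       x=x+s;
   end
>> x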


Example: BFGS

[Figure: BFGS iterates for the Rosenbrock function in the (x1, x2)-plane.]

x0 = (−1.2, 1), ε = 10^(−4). Total number of iteration steps: 36.


Partitions

Let D := [a, b] × [c, d] ⊂ R^2 be a rectangular domain and f : D → R be a bounded function.

Consider a partition of the intervals by points

a = x0 < x1 < x2 < ··· < xm−1 < xm = b,
c = y0 < y1 < y2 < ··· < yn−1 < yn = d.

The partition P of D consists of the mn rectangles

Rij := {(x, y) ∈ R^2 | xi−1 ≤ x ≤ xi, yj−1 ≤ y ≤ yj},   i = 1, 2, . . . , m, j = 1, 2, . . . , n.

Area of Rij: ∆Aij := ∆xi∆yj = (xi − xi−1)(yj − yj−1).

Diameter of Rij: diam(Rij) := √((xi − xi−1)^2 + (yj − yj−1)^2).

The norm of the partition P: ∥P∥ := max_{1≤i≤m, 1≤j≤n} diam(Rij).


Riemann sums

Definition. The Riemann sum corresponding to a partition P of the rectangle D and a set of arbitrary points (x∗ij, y∗ij) ∈ Rij is defined as

R(f, P) := Σ_{i=1}^m Σ_{j=1}^n f(x∗ij, y∗ij) ∆Aij.

[Figure: boxes over the subrectangles approximating the volume under the graph of f. Source: Robert Adams, Christopher Essex: Calculus: A Complete Course, 7th Edition. Pearson, Toronto, 2010.]

Assume f ≥ 0.

f(x∗ij, y∗ij)∆Aij: volume of the box with base Rij and height f(x∗ij, y∗ij).

R(f, P): approximation of the volume above D under the graph of the function f.

∫∫_D f(x, y)dxdy: the limit of R(f, P) as ∥P∥ → 0, if the limit exists independently of the choice of (x∗ij, y∗ij).


Double integral over a rectangle

Definition. Let f : D ⊂ R^2 → R be a bounded function. We say that f is integrable over the rectangle D and has double integral

I := ∫∫_D f(x, y)dxdy

if for every refining sequence Pk of partitions of D with lim_{k→∞} ∥Pk∥ = 0 and choices of points of the subrectangles of Pk the corresponding Riemann sums R(f, Pk) converge to I.

Example. Let D = [0, 1]^2 and f(x, y) := 2x^2 + xy.

Consider the partition defined by the lines x = 1/2 and y = 1/2 and choose the centres of the obtained squares, that is (1/4, 1/4), (1/4, 3/4), (3/4, 1/4) and (3/4, 3/4).

∫∫_D (2x^2 + xy)dxdy ≈ (2 · 1/16 + 1/4 · 1/4)·1/4 + (2 · 1/16 + 1/4 · 3/4)·1/4
   + (2 · 9/16 + 3/4 · 1/4)·1/4 + (2 · 9/16 + 3/4 · 3/4)·1/4 = 7/8 = 0.875.

∫∫_D (2x^2 + xy)dxdy = 11/12 = 0.9167.
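Both numbers are easy to reproduce numerically; a MATLAB sketch of the 2 × 2 midpoint Riemann sum and of the exact value via integral2:

>> f=@(x,y) 2*x.^2+x.*y;
>> [X,Y]=meshgrid([0.25 0.75]);      % centres of the four subsquares
>> sum(f(X(:),Y(:)))*0.25            % Riemann sum: 0.8750
>> integral2(f,0,1,0,1)              % 0.9167 = 11/12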


General bounded region

Definition. Let S ⊂ R^2 be a bounded region, f : S → R be a bounded function and fS : R^2 → R be the function defined as

fS(x, y) := f(x, y) if (x, y) ∈ S, and fS(x, y) := 0 if (x, y) ∉ S.

Further, let D ⊂ R^2 be a rectangle such that S ⊆ D. We say that f is integrable over S if fS is integrable over D, and the double integral of f over S is defined as

∫∫_S f(x, y)dxdy := ∫∫_D fS(x, y)dxdy.


Properties of the double integral

Theorem. Let D ⊂ R^2 be a bounded domain, f, g : D → R be bounded functions and denote by λ(D) the area of D.

a) If λ(D) = 0 then ∫∫_D f(x, y)dxdy = 0.

b) ∫∫_D 1 dxdy = λ(D).

c) If f and g are integrable over D then αf + βg is also integrable and

∫∫_D (αf(x, y) + βg(x, y))dxdy = α ∫∫_D f(x, y)dxdy + β ∫∫_D g(x, y)dxdy,   α, β ∈ R.

d) If f and g are integrable over D and f(x, y) ≤ g(x, y) on D then

∫∫_D f(x, y)dxdy ≤ ∫∫_D g(x, y)dxdy.


Properties of the double integral

e) If f is integrable over D then |f| is also integrable and

|∫∫_D f(x, y)dxdy| ≤ ∫∫_D |f(x, y)|dxdy.

f) Let S ⊆ D. If f ≥ 0 is integrable both over S and D then

∫∫_S f(x, y)dxdy ≤ ∫∫_D f(x, y)dxdy.

g) If D1, D2, . . . , Dk are nonoverlapping domains on each of which f is integrable, then f is integrable over the union

D := D1 ∪ D2 ∪ ··· ∪ Dk

and

∫∫_D f(x, y)dxdy = Σ_{i=1}^k ∫∫_{Di} f(x, y)dxdy.


Simple domains

Definition. We say that the domain D ⊂ R^2 is y-simple if it is bounded by two vertical lines x = a and x = b and two continuous graphs y = c(x) and y = d(x) between these lines. Similarly, D is x-simple if it is bounded by horizontal lines y = c and y = d and continuous graphs x = a(y) and x = b(y).

[Figures: a y-simple domain and an x-simple domain.]


Iteration of double integrals

Theorem. If f(x, y) is continuous on the bounded y-simple domain D defined by a ≤ x ≤ b and c(x) ≤ y ≤ d(x), then

∫∫_D f(x, y)dxdy = ∫_a^b [ ∫_{c(x)}^{d(x)} f(x, y)dy ] dx.

Similarly, if f(x, y) is continuous on the bounded x-simple domain D defined by c ≤ y ≤ d and a(y) ≤ x ≤ b(y), then

∫∫_D f(x, y)dxdy = ∫_c^d [ ∫_{a(y)}^{b(y)} f(x, y)dx ] dy.

Remark. Instead of ∫∫_D f(x, y)dxdy one can write ∫∫_D f(x, y)dydx, both expressions stand for the double integral of f over D. The order of dx and dy is important when the double integral is iterated.


Example

Find ∫∫_D (x^2 + y)dxdy, where D := {(x, y) | 0 ≤ x ≤ 1, x^2 ≤ y ≤ √x}.

Solution.

∫∫_D (x^2 + y)dxdy = ∫_0^1 [ ∫_{x^2}^{√x} (x^2 + y)dy ] dx = ∫_0^1 [ x^2 y + y^2/2 ]_{y=x^2}^{y=√x} dx
 = ∫_0^1 ( x^(5/2) + x/2 − (3/2)x^4 ) dx = [ (2/7)x^(7/2) + x^2/4 − (3/2)·x^5/5 ]_0^1 = 33/140.

MATLAB solution.

>> syms x y
>> int(int((x^2+y),y,x^2,sqrt(x)),x,0,1)

ans =
33/140


Improper integrals

We talk about improper double integrals if either the domain D ⊆ R^2 of integration is unbounded, or the integrand f is unbounded near a point of the domain of integration or of its boundary.

For f ≥ 0 or f ≤ 0 the integral either exists (is finite) or is infinite.

Example. Evaluate ∫∫_T (1/x^4) e^(−y/x) dxdy, where T := {(x, y) ∈ R^2 | x ≥ 1, 0 ≤ y ≤ x}.

Solution.

∫∫_T (1/x^4) e^(−y/x) dxdy = ∫_1^∞ ∫_0^x (1/x^4) e^(−y/x) dydx = ∫_1^∞ (1/x^4) [ ∫_0^x e^(−y/x) dy ] dx
 = ∫_1^∞ (1/x^4) [ −x e^(−y/x) ]_{y=0}^{y=x} dx = (1 − 1/e) lim_{ϱ→∞} ∫_1^ϱ dx/x^3
 = (1 − 1/e) lim_{ϱ→∞} [ −1/(2x^2) ]_1^ϱ = (1 − 1/e) lim_{ϱ→∞} ( 1/2 − 1/(2ϱ^2) ) = (1/2)(1 − 1/e).

MATLAB solution.

>> syms x y
>> f=exp(-y/x)/x^4;
>> int(int(f,y,0,x),x,1,Inf)

ans =
1/2 - exp(-1)/2


Change of variables

Let f : D ⊆ R^2 → R be an integrable function and suppose x and y are expressed as functions of two other variables u and v by the equations

x = x(u, v),   y = y(u, v).

These equations define a transformation from points (u, v) of the uv-plane to points (x, y) in the xy-plane.

Definition. The Jacobian determinant of the transformation is defined as

J(u, v) := det [ ∂x/∂u(u, v)  ∂x/∂v(u, v);  ∂y/∂u(u, v)  ∂y/∂v(u, v) ].

Theorem. Let x = x(u, v), y = y(u, v) be a one-to-one transformation from a domain S in the uv-plane onto a domain D in the xy-plane and assume that x and y are continuously differentiable on S. If f(x, y) is integrable over D then g(u, v) := f(x(u, v), y(u, v)) |J(u, v)| is integrable over S and

∫∫_D f(x, y)dxdy = ∫∫_S f(x(u, v), y(u, v)) |J(u, v)| dudv.


Change to polar coordinates

Each point P with Cartesian coordinates (x, y) can be located by its polar coordinates [r, θ].

r: distance from the origin.
θ: angle with the positive direction of the x axis.

x = r cos θ,   r = √(x^2 + y^2),
y = r sin θ,   tan θ = y/x.

[Figures: a point (x, y) with polar coordinates [r, θ]; the area elements in Cartesian and polar coordinates.]

dA = dx dy: area of a small rectangle in Cartesian coordinates.
dA ≈ r dr dθ: corresponding area in polar coordinates.


Integral transformation to polar coordinates

Transformation from polar coordinates [r, θ] to Cartesian coordinates (x, y):

x = r cos θ,   y = r sin θ.

Jacobian determinant:

J(r, θ) = det [ cos θ  −r sin θ;  sin θ  r cos θ ] = r.

Integral transformation:

∫∫_D f(x, y)dxdy = ∫∫_S f(r cos θ, r sin θ) r drdθ.

Examples for the transformation of domains:

D := {(x, y) | x^2 + y^2 ≤ a^2}  ⇐⇒  S := {[r, θ] | 0 ≤ r ≤ a, 0 ≤ θ ≤ 2π};
D := {(x, y) | b^2 ≤ x^2 + y^2 ≤ a^2, x, y ≥ 0}  ⇐⇒  S := {[r, θ] | b ≤ r ≤ a, 0 ≤ θ ≤ π/2}.


Example

Find ∫∫_S xy dxdy, where S is the region in the first quadrant lying inside the disk with radius a and under the line y = √3·x.

Solution.

∫∫_S xy dxdy = ∫_0^(π/3) [ ∫_0^a r cos θ · r sin θ · r dr ] dθ
 = ∫_0^(π/3) cos θ sin θ dθ · ∫_0^a r^3 dr
 = (1/2) ∫_0^(π/3) sin(2θ) dθ · [ r^4/4 ]_0^a
 = (a^4/8) [ −(1/2) cos(2θ) ]_0^(π/3)
 = (a^4/16) (1 − cos(2π/3)) = (3/32) a^4.

[Figure: the sector S bounded by the circle x^2 + y^2 = a^2 and the line y = √3·x, with central angle π/3.]
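A symbolic check in the style of the earlier MATLAB examples, integrating directly in polar coordinates (the integrand already contains the Jacobian factor r; a is declared positive so it can play the role of the radius):

>> syms r theta a positive
>> int(int(r*cos(theta)*r*sin(theta)*r,r,0,a),theta,0,sym(pi)/3)   % gives 3*a^4/32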


Triple integrals

Let B := [a, b] × [c, d] × [p, q] ⊂ R^3 be a rectangular box and f : B → R be a bounded function. The triple integral

∫∫∫_B f(x, y, z)dxdydz

can be defined as the limit of Riemann sums corresponding to partitions of B into subboxes. Its properties are analogous to those of double integrals.

Let D ⊂ R^3 be a bounded domain. Then

∫∫∫_D 1 dxdydz = λ(D),

where λ(D) is the volume of D.


ExampleThree points are chosen randomly from the interval [a, b], a < b. Find the probabilitythat the third point lies between the first two chosen points.

Solution. Geometric probability. Denote by x, y and z the coordinates of the chosenpoints, respectively. These three values represent a single point (x, y, z) of the cubeC := [a, b]3 having volume λ(C) = (b − a)3. Points satisfying the conditions belong tothe set

S :={(x, y, z) ∈ [a, b]3

∣∣ x < z < y or y < z < x}.

The required probability: P = λ(S)/λ(C) = λ(S)/(b − a)3.

By symmetry:

λ(S) =∫∫∫

S1dxdydz = 2

∫ b

a

∫ y

a

∫ z

a1dxdzdy = 2

∫ b

a

∫ y

a[x]x=z

x=a dzdy

= 2∫ b

a

∫ y

a(z − a)dzdy = 2

∫ b

a

[(z − a)2

2

]z=y

z=ady =

∫ b

a(y − a)2dy

=

[(y − a)3

3

]y=b

y=a=

(b − a)33 .

Hence, P = 1/3.Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 79 / 239
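
A quick Monte Carlo sanity check of P = 1/3 in MATLAB (a = 0, b = 1 is an arbitrary choice, since the answer does not depend on the interval):

>> rng(1); N = 1e6;
>> x = rand(N,1); y = rand(N,1); z = rand(N,1);    % three independent uniform points on [0,1]
>> P = mean((x < z & z < y) | (y < z & z < x))     % relative frequency, should be close to 1/3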

Change of variables for triple integrals

Let f : D ⊂ R³ → R be an integrable function and suppose x, y and z are expressed as

x = x(u, v, w),   y = y(u, v, w),   z = z(u, v, w).

These equations define a transformation from points (u, v, w) of the uvw-space to points (x, y, z) in the xyz-space. The Jacobian determinant:

J(u, v, w) := det( ∂x/∂u   ∂x/∂v   ∂x/∂w
                   ∂y/∂u   ∂y/∂v   ∂y/∂w
                   ∂z/∂u   ∂z/∂v   ∂z/∂w ),

where each partial derivative is evaluated at (u, v, w).

If the transformation is one-to-one from the domain S in the uvw-space onto a domain D in the xyz-space, the defining functions are continuously differentiable and f : D ⊆ R³ → R is integrable over D, then

∫∫∫_D f(x, y, z) dx dy dz = ∫∫∫_S g(u, v, w) |J(u, v, w)| du dv dw,

where g(u, v, w) := f(x(u, v, w), y(u, v, w), z(u, v, w)).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 80 / 239

Cylindrical coordinates

Each point P with Cartesian coordinates (x, y, z) can be located by its cylindrical coordinates [r, θ, z].

r: distance from the origin of the projection in the xy-plane.
θ: angle with the positive direction of the x axis in the xy-plane.

x = r cos θ,   y = r sin θ,   z = z.

Jacobian determinant:

J(r, θ, z) = det( cos θ   −r sin θ   0
                  sin θ    r cos θ   0
                    0         0      1 ) = r.

(Figure: a point (x, y, z) and its cylindrical coordinates [r, θ, z].)

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 81 / 239

Spherical coordinates

Each point P with Cartesian coordinates (x, y, z) can also be located by its spherical coordinates [ρ, ϕ, θ].

ρ: distance from the origin.
ϕ: angle with the positive direction of the z axis.
θ: angle with the positive direction of the x axis in the xy-plane.

x = ρ sin ϕ cos θ,   y = ρ sin ϕ sin θ,   z = ρ cos ϕ.

Jacobian determinant:

J(ρ, ϕ, θ) = det( sin ϕ cos θ   ρ cos ϕ cos θ   −ρ sin ϕ sin θ
                  sin ϕ sin θ   ρ cos ϕ sin θ    ρ sin ϕ cos θ
                     cos ϕ        −ρ sin ϕ             0       ) = ρ² sin ϕ.

(Figure: a point (x, y, z) and its spherical coordinates [ρ, ϕ, θ].)
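
As with the polar case, the spherical Jacobian can be checked symbolically; a short MATLAB sketch (Symbolic Math Toolbox assumed):

>> syms rho phi theta real
>> x = rho*sin(phi)*cos(theta); y = rho*sin(phi)*sin(theta); z = rho*cos(phi);
>> J = simplify(det(jacobian([x; y; z], [rho phi theta])))   % expected result: rho^2*sin(phi)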

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 82 / 239

Example

Find the integral ∫∫∫_S x²z² dx dy dz, where S := {(x, y, z) | x, y > 0, x² + y² + z² ≤ a²}, a > 0.

Solution. In the spherical coordinate system S can be expressed as

D := {[ρ, ϕ, θ] | 0 ≤ ρ ≤ a, 0 ≤ ϕ ≤ π, 0 ≤ θ ≤ π/2}.

Thus

∫∫∫_S x²z² dx dy dz = ∫∫∫_D (ρ sin ϕ cos θ)² (ρ cos ϕ)² (ρ² sin ϕ) dρ dϕ dθ

= ∫_0^{π/2} ∫_0^π ∫_0^a ρ⁶ sin³ϕ cos²θ cos²ϕ dρ dϕ dθ

= ( ∫_0^a ρ⁶ dρ ) ( ∫_0^π sin³ϕ cos²ϕ dϕ ) ( ∫_0^{π/2} cos²θ dθ )

= (a⁷/7) [ (1/15) cos³ϕ (3 cos²ϕ − 5) ]_{ϕ=0}^{ϕ=π} [ θ/2 + (1/4) sin(2θ) ]_{θ=0}^{θ=π/2}

= (a⁷/7) · (4/15) · (π/4) = a⁷π/105.
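
The result a⁷π/105 can be checked numerically with integral3 in the spherical variables; below a = 1 is an illustrative choice:

>> a = 1;
>> f = @(rho,phi,theta) rho.^6 .* sin(phi).^3 .* cos(theta).^2 .* cos(phi).^2;  % integrand already contains the Jacobian rho^2*sin(phi)
>> I = integral3(f, 0, a, 0, pi, 0, pi/2);
>> [I, pi*a^7/105]                   % the two values should agree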

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 83 / 239

The Laplace integral

Definition. Let f : [0, ∞[ → R be an arbitrary function. Then the Laplace integral

F(s) := ∫_0^∞ f(t) e^{−st} dt

is called the Laplace transform of f(t), provided the integral exists.

Remarks.
a) s is a complex number of the form s = σ + iω, where i² = −1.
b) In practical problems t usually represents time, so f(t) is a time-dependent quantity.
c) The usual shorthand notation for the Laplace integral of f is L[f] := F(s).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 85 / 239

Examples

Unit step function u_{−1}(t) (Heaviside step function)

u_{−1}(t) := { 1, t > 0;  0, t < 0. }

L[u_{−1}(t)] = ∫_0^∞ u_{−1}(t) e^{−st} dt =: U_{−1}(s).

U_{−1}(s) = ∫_0^∞ e^{−st} dt = [ −e^{−st}/s ]_{t=0}^{t=∞} = 1/s   if σ > 0.

Decaying exponential e^{−αt} (α > 0)

L[e^{−αt}] = ∫_0^∞ e^{−αt} e^{−st} dt = ∫_0^∞ e^{−(s+α)t} dt = [ −e^{−(s+α)t}/(s + α) ]_{t=0}^{t=∞} = 1/(s + α)   if σ > −α.

Simple periodic function e^{iωt} (ω ∈ R)

L[e^{iωt}] = ∫_0^∞ e^{iωt} e^{−st} dt = ∫_0^∞ e^{−(s−iω)t} dt = [ −e^{−(s−iω)t}/(s − iω) ]_{t=0}^{t=∞} = 1/(s − iω)   if σ > 0.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 86 / 239

Examples

Sinusoid cos(ωt) (0 < ω ∈ R)

L[cos(ωt)] = ∫_0^∞ cos(ωt) e^{−st} dt,   where cos(ωt) = (e^{iωt} + e^{−iωt})/2.

L[cos(ωt)] = (1/2) ∫_0^∞ (e^{iωt} + e^{−iωt}) e^{−st} dt = (1/2) [ ∫_0^∞ e^{iωt} e^{−st} dt + ∫_0^∞ e^{−iωt} e^{−st} dt ]

= (1/2) [ 1/(s − iω) + 1/(s + iω) ] = s/(s² + ω²)   if σ > 0.

Ramp function u_{−2}(t) := t u_{−1}(t)

L[u_{−2}(t)] = ∫_0^∞ u_{−2}(t) e^{−st} dt = ∫_0^∞ t e^{−st} dt =: U_{−2}(s).

Integration by parts:

U_{−2}(s) = ∫_0^∞ t e^{−st} dt = [ −t e^{−st}/s ]_{t=0}^{t=∞} − ∫_0^∞ ( −e^{−st}/s ) dt

= 0 − [ e^{−st}/s² ]_{t=0}^{t=∞} = 1/s²   if σ > 0.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 87 / 239

Properties of the Laplace transform I

Theorem (Linearity). If α and β are constants or are independent of s and t, and f(t) and g(t) are transformable with Laplace transforms F(s) and G(s), respectively, then

L[αf(t) + βg(t)] = αL[f(t)] + βL[g(t)] = αF(s) + βG(s).

Theorem (Translation in time). If the Laplace transform of f(t) is F(s) and a is a positive real number, then the Laplace transform of the translated function f(t − a) u_{−1}(t − a) is

L[f(t − a) u_{−1}(t − a)] = e^{−as} F(s).

Theorem (Complex differentiation). If the Laplace transform of f(t) is F(s), then

L[t f(t)] = − d/ds F(s).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 88 / 239

Examples

1. L[cos(ωt)] = s/(s² + ω²),  so  L[t cos(ωt)] = − d/ds ( s/(s² + ω²) ) = (s² − ω²)/(s² + ω²)².

2. L[e^{−αt}] = 1/(s + α),  so  L[t e^{−αt}] = − d/ds ( 1/(s + α) ) = 1/(s + α)².

3. Using L[u_{−1}(t)] = 1/s one has

L[u_{−2}(t)] = L[t u_{−1}(t)] = − d/ds (1/s) = 1/s²;

L[u_{−3}(t)] = L[t u_{−2}(t)] = L[t² u_{−1}(t)] = − d/ds (1/s²) = 2/s³ = 2!/s³;

L[u_{−4}(t)] = L[t u_{−3}(t)] = L[t³ u_{−1}(t)] = − d/ds (2/s³) = 6/s⁴ = 3!/s⁴.

In general: L[u_{−(n+1)}(t)] = L[tⁿ u_{−1}(t)] = n!/s^{n+1}.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 89 / 239

Properties of the Laplace transform II

Theorem (Translation in the s domain). If the Laplace transform of f(t) is F(s) and a is a complex number, then

L[e^{at} f(t)] = F(s − a).

Theorem (Real differentiation). If the Laplace transform of f(t) is F(s) and f′(t) = d/dt f(t) = Df(t) is transformable, then

L[f′(t)] = sF(s) − f(0+),

where f(0+) is the right limit of f at 0, that is f(0+) := lim_{t→0, t>0} f(t).

The transform of the second derivative f′′(t) = d²/dt² f(t) = D²f(t) is

L[f′′(t)] = s²F(s) − sf(0+) − f′(0+).

In general, the transform of the nth derivative f⁽ⁿ⁾(t) = dⁿ/dtⁿ f(t) = Dⁿf(t) is

L[f⁽ⁿ⁾(t)] = sⁿF(s) − s^{n−1}f(0+) − s^{n−2}f′(0+) − · · · − s f⁽ⁿ⁻²⁾(0+) − f⁽ⁿ⁻¹⁾(0+).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 90 / 239

Properties of the Laplace transform III

Theorem (Real integration). If the Laplace transform of f(t) is F(s), its integral

D⁻¹f(t) := ∫_0^t f(τ) dτ + D⁻¹f(0+),   where D⁻¹f(0+) := lim_{t→0, t>0} ∫_0^t f(τ) dτ,

is transformable and its Laplace transform is

L[D⁻¹f(t)] = F(s)/s + D⁻¹f(0+)/s.

The transform of the second integral is

L[D⁻²f(t)] = F(s)/s² + D⁻¹f(0+)/s² + D⁻²f(0+)/s.

In general,

L[D⁻ⁿf(t)] = F(s)/sⁿ + D⁻¹f(0+)/sⁿ + · · · + D⁻ⁿf(0+)/s.

In what follows, instead of f(0+) simply f(0) will be used.
Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 91 / 239

Examples

1. L[cos(ωt)] = s/(s² + ω²),  and  sin(ωt) = −(1/ω) cos′(ωt).

Thus

L[sin(ωt)] = −(1/ω) L[cos′(ωt)] = −(1/ω) ( sL[cos(ωt)] − cos(0) ) = −(1/ω) ( s²/(s² + ω²) − 1 ) = ω/(s² + ω²).

2. L[e^{−αt} cos(ωt)] = (s + α)/((s + α)² + ω²).

3. Laplace transformation by MATLAB

>> syms A W t
>> f=sin(W*t);
>> F=laplace(f,t)

F =
W/(W^2 + t^2)

>> g=exp(-A*t)*cos(W*t);
>> G=laplace(g,t)

G =
(A + t)/((A + t)^2 + W^2)

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 92 / 239

Properties of the Laplace transform IV

Theorem (Final value). If f(t) and f′(t) are Laplace transformable, the Laplace transform of f(t) is F(s) and the limit of f(t) as t → ∞ exists, then

lim_{s→0} sF(s) = lim_{t→∞} f(t).

Theorem (Initial value). If f(t) and f′(t) are Laplace transformable, the Laplace transform of f(t) is F(s) and the limit of sF(s) as s → ∞ exists, then

lim_{s→∞} sF(s) = lim_{t→0} f(t).

Theorem (Complex integration). If the Laplace transform of f(t) is F(s) and f(t)/t has a limit as t → 0, t > 0, then

L[ f(t)/t ] = ∫_s^∞ F(u) du.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 93 / 239

Example: series resistor-inductor-capacitor circuit

v(t) or u(t): voltage at time t (V: volt);  I(t): current at time t (A: ampere).

Kirchhoff’s laws
1. In traversing any closed loop, the sum of the voltage rises equals the sum of the voltage drops.
2. The sum of currents entering a junction equals the sum of currents leaving it.

Element         Quantity              Voltage drop
Resistor (R)    Resistance (ohm)      vR = RI
Inductor (L)    Inductance (henry)    vL = L dI/dt =: LDI
Capacitor (C)   Capacitance (farad)   vC = (1/C) ∫_0^t I(τ) dτ + Q0/C =: I/(CD)

Q0: initial value of the charge of the capacitor.
u(t): input voltage.

Kirchhoff’s law: vR(t) + vL(t) + vC(t) = u(t).

Corresponding equation: RI + LDI + I/(CD) = u.

Output y(t): voltage drop at the capacitor, vC(t).
Equation for the output:

RCDy(t) + LCD²y(t) + y(t) = u(t)  ⟺  LC y′′(t) + RC y′(t) + y(t) = u(t).
Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 94 / 239

Application of the Laplace transform

Second-order linear differential equation for the output voltage:

LC y′′(t) + RC y′(t) + y(t) = u(t).

Laplace transform of the equation:

L[LC y′′(t) + RC y′(t) + y(t)] = L[u(t)].

Laplace transforms of the terms:

L[y(t)] = Y(s),   L[RC y′(t)] = RC (sY(s) − y(0)),
L[u(t)] = U(s),   L[LC y′′(t)] = LC (s²Y(s) − sy(0) − y′(0)).

Substitute these values into the equation:

(LCs² + RCs + 1) Y(s) − (LC sy(0) + LC y′(0) + RC y(0)) = U(s).

Hence

Y(s) = U(s)/(LCs² + RCs + 1)  +  (LC sy(0) + LC y′(0) + RC y(0))/(LCs² + RCs + 1),

where the first term contains the system transfer function 1/(LCs² + RCs + 1) and the second term is the initial condition component.

Laplace transform of the solution:

Y(s) = ( U(s) + LC sy(0) + LC y′(0) + RC y(0) ) / ( LCs² + RCs + 1 ).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 95 / 239

Solution of the equation

Original equation: LC y′′(t) + RC y′(t) + y(t) = u(t).
Laplace transform of the solution:

Y(s) = ( U(s) + LC sy(0) + LC y′(0) + RC y(0) ) / ( LCs² + RCs + 1 ).

The solution y(t) can be found by applying the inverse transform L⁻¹:

y(t) = L⁻¹[Y(s)] = L⁻¹[ ( U(s) + LC sy(0) + LC y′(0) + RC y(0) ) / ( LCs² + RCs + 1 ) ].

After inserting numerical values one can use a table of Laplace transform pairs.

Assume that at t = 0 a direct current source of 1 V is turned on, that is y(0) = y′(0) = 0 and u(t) is the unit step function, u(t) = u_{−1}(t). Then

y(t) = L⁻¹[Y(s)] = L⁻¹[ (1/(LC)) / ( s (s² + (R/L)s + 1/(LC)) ) ] = L⁻¹[ ω² / ( s (s² + 2ζωs + ω²) ) ],

where ω := 1/√(LC) and ζ := (R/2)√(C/L). From the table of Laplace transform pairs:

y(t) = 1 − ( e^{−ζωt} / √(1 − ζ²) ) sin( ω√(1 − ζ²) t + arccos ζ ).
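
The damped-sine step response can be reproduced symbolically with ilaplace; a minimal MATLAB sketch with illustrative (assumed) component values giving an underdamped circuit (ζ < 1):

>> syms s t
>> R = 1; L = 0.5; C = 0.2;                       % illustrative values, not from the lecture
>> Y = (1/(L*C)) / (s*(s^2 + (R/L)*s + 1/(L*C)));
>> y = simplify(ilaplace(Y,s,t));                  % step response y(t)
>> w = 1/sqrt(L*C); z = (R/2)*sqrt(C/L);
>> yref = 1 - exp(-z*w*t)/sqrt(1-z^2)*sin(w*sqrt(1-z^2)*t + acos(z));
>> double(subs(y - yref, t, 1))                    % should be ~0 up to rounding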

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 96 / 239

Convolution integrals

Definition. Let f1, f2 : [0, ∞[ → R be arbitrary integrable functions. Then the convolution of f1 and f2 is defined as

f(t) := f1(t) ∗ f2(t) := ∫_0^∞ f1(τ) f2(t − τ) dτ

(for functions vanishing on ]−∞, 0[ this equals ∫_0^t f1(τ) f2(t − τ) dτ).

Theorem. If f1(t) and f2(t) are transformable with Laplace transforms F1(s) and F2(s), respectively, then

L[f1(t) ∗ f2(t)] = F1(s) · F2(s).
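
A quick symbolic sanity check of the convolution theorem in MATLAB, using f1(t) = e^{−t} and f2(t) = t as an illustrative pair:

>> syms t tau s
>> f1 = exp(-t); f2 = t;
>> conv_tf = int(subs(f1,t,tau)*subs(f2,t,t-tau), tau, 0, t);   % (f1 * f2)(t)
>> lhs = simplify(laplace(conv_tf, t, s));
>> rhs = laplace(f1, t, s) * laplace(f2, t, s);
>> simplify(lhs - rhs)                                          % expected: 0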

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 97 / 239

Inverse transform

In many cases the Laplace transform F(s) of f(t) can be expressed as

F(s) = P(s)/Q(s) = ( a_k s^k + a_{k−1} s^{k−1} + · · · + a_1 s + a_0 ) / ( s^n + b_{n−1} s^{n−1} + · · · + b_1 s + b_0 ),   k < n.

Definition. The poles of F(s) are the roots s1, s2, . . . , sn of Q(s), provided P(si) ≠ 0, i = 1, 2, . . . , n (no common roots of P(s) and Q(s)).

One has to use the partial-fraction expansion of F(s). Four cases:

1. F(s) has first-order real poles.
2. F(s) has multiple-order real poles.
3. F(s) has a pair of complex-conjugate poles.
4. F(s) has repeated pairs of complex-conjugate poles.

Required Laplace transform pair:

L⁻¹[ 1/(s − α)ⁿ ] = t^{n−1} e^{αt} / (n − 1)!.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 98 / 239

First-order real poles

F(s) = P(s)/Q(s) = P(s) / ( (s − s1)(s − s2) · · · (s − sn) ) = A1/(s − s1) + A2/(s − s2) + · · · + An/(s − sn).

f(t) = L⁻¹[F(s)] = A1 e^{s1 t} + A2 e^{s2 t} + · · · + An e^{sn t},   t ≥ 0.

Example. Find the inverse Laplace transform of

F(s) = 10 / ( s³ + 7s² + 10s ).

Solution.

F(s) = 10 / ( s(s + 2)(s + 5) ) = 1/s + (2/3) · 1/(s + 5) − (5/3) · 1/(s + 2),
f(t) = 1 + (2/3) e^{−5t} − (5/3) e^{−2t}.

MATLAB solution

>> syms s
>> F=10/(s^3+7*s^2+10*s)

F =
10/(s^3 + 7*s^2 + 10*s)

>> ilaplace(F,s)

ans =
(2*exp(-5*s))/3 - (5*exp(-2*s))/3 + 1

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 99 / 239

Multiple-order real poles

F(s) = P(s) / ( (s − s1)^{q1} (s − s2)^{q2} · · · (s − sr)^{qr} )
     = A11/(s − s1) + · · · + A1q1/(s − s1)^{q1} + A21/(s − s2) + · · · + A2q2/(s − s2)^{q2} + · · · + Ar1/(s − sr) + · · · + Arqr/(s − sr)^{qr}.

f(t) = L⁻¹[F(s)] = A11 e^{s1 t} + A12 t e^{s1 t} + · · · + A1q1 t^{q1−1} e^{s1 t}/(q1 − 1)!
       + · · · + Ar1 e^{sr t} + Ar2 t e^{sr t} + · · · + Arqr t^{qr−1} e^{sr t}/(qr − 1)!,   t ≥ 0.

Example. Find the inverse Laplace transform of

F(s) = ( s² + s + 1 ) / ( s⁴ + 5s³ + 9s² + 7s + 2 ).

Solution.

F(s) = ( s² + s + 1 ) / ( (s + 1)³(s + 2) ) = 3/(s + 1) − 2/(s + 1)² + 1/(s + 1)³ − 3/(s + 2).

f(t) = 3e^{−t} − 2te^{−t} + (t²/2) e^{−t} − 3e^{−2t},   t ≥ 0.
Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 100 / 239
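
The partial-fraction expansion can also be obtained numerically with MATLAB's residue function; a short sketch for the example above:

>> b = [1 1 1];                 % numerator  s^2 + s + 1
>> a = [1 5 9 7 2];             % denominator s^4 + 5s^3 + 9s^2 + 7s + 2
>> [r,p,k] = residue(b,a)       % residues r and poles p; the repeated pole -1 appears three times,
>>                              % its residues corresponding to 1/(s+1), 1/(s+1)^2, 1/(s+1)^3 in turn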

Pairs of complex-conjugate poles

F(s) = P(s) / ( (s² + 2ζωs + ω²)(s − s3) · · · (s − sn) ) = A1/(s − s1) + A2/(s − s2) + · · · + An/(s − sn),

s1 = −ζω + iω√(1 − ζ²),  s2 = −ζω − iω√(1 − ζ²) ∈ C,   s3, . . . , sn ∈ R.

F(s) = A1/( s + ζω − iω√(1 − ζ²) ) + A2/( s + ζω + iω√(1 − ζ²) ) + A3/(s − s3) + · · · + An/(s − sn).

f(t) = A1 e^{(−ζω + iω√(1 − ζ²))t} + A2 e^{(−ζω − iω√(1 − ζ²))t} + A3 e^{s3 t} + · · · + An e^{sn t}.

A1, A2 ∈ C are complex conjugates;  A3, . . . , An ∈ R.

f(t) = 2|A1| e^{−ζωt} sin( ω√(1 − ζ²) t + ϕ ) + A3 e^{s3 t} + · · · + An e^{sn t}
     = 2|A1| e^{σt} sin( ωd t + ϕ ) + A3 e^{s3 t} + · · · + An e^{sn t},

where σ := −ζω, ωd := ω√(1 − ζ²) and ϕ − π/2 is the angle of A1.
Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 101 / 239

Example

Find the inverse Laplace transform of

F(s) = 36s / ( s³ + 6s² + 21s + 26 ).

Solution.

F(s) = 36s / ( (s² + 4s + 13)(s + 2) ) = 36s / ( (s + 2 − 3i)(s + 2 + 3i)(s + 2) )
     = (4 − 6i)/(s + 2 − 3i) + (4 + 6i)/(s + 2 + 3i) − 8/(s + 2).

f(t) = (4 − 6i) e^{(−2+3i)t} + (4 + 6i) e^{(−2−3i)t} − 8e^{−2t}

= 4e^{−2t}(e^{3it} + e^{−3it}) − 6i e^{−2t}(e^{3it} − e^{−3it}) − 8e^{−2t}   [here e^{3it} + e^{−3it} = 2 cos(3t) and e^{3it} − e^{−3it} = 2i sin(3t)]

= 4e^{−2t}( 2 cos(3t) + 3 sin(3t) − 2 )

= 4√13 e^{−2t}( (2/√13) cos(3t) + (3/√13) sin(3t) ) − 8e^{−2t}

= 4√13 e^{−2t}( sin ϕ cos(3t) + cos ϕ sin(3t) ) − 8e^{−2t} = 4√13 e^{−2t} sin(3t + ϕ) − 8e^{−2t},

where tan ϕ = 2/3, so ϕ ≈ 33.69°.
Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 102 / 239

Imaginary poles

F(s) = P(s) / ( (s² + ω²)(s − s3) · · · (s − sn) ) = A1/(s − s1) + A2/(s − s2) + · · · + An/(s − sn)
     = A1/(s − iω) + A2/(s + iω) + A3/(s − s3) + · · · + An/(s − sn),   s3, . . . , sn ∈ R.

f(t) = A1 e^{iωt} + A2 e^{−iωt} + A3 e^{s3 t} + · · · + An e^{sn t}
     = 2|A1| sin(ωt + ϕ) + A3 e^{s3 t} + · · · + An e^{sn t},   t ≥ 0.

Example. Find the inverse Laplace transform of

F(s) = 20 / ( (s² + 16)(s + 2) ).

MATLAB solution

>> syms s
>> F=20/(s^2+16)/(s+2)

F =
20/((s^2 + 16)*(s + 2))

>> ilaplace(F,s)

ans =
exp(-2*s) - cos(4*s) + sin(4*s)/2

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 103 / 239

Repeated pairs of complex-conjugate poles

Repeated pairs of complex-conjugate poles are treated similarly to multiple-order real poles.

Example. Find the inverse Laplace transform of

F(s) = 324s / ( (s² + 4s + 13)²(s + 2) ) = 324s / ( (s + 2 − 3i)²(s + 2 + 3i)²(s + 2) ).

Solution.

F(s) = (4 − 3i)/(s + 2 − 3i) + (4 + 3i)/(s + 2 + 3i) − (9 + 6i)/(s + 2 − 3i)² − (9 − 6i)/(s + 2 + 3i)² − 8/(s + 2).

f(t) = (4 − 3i)e^{(−2+3i)t} + (4 + 3i)e^{(−2−3i)t} − (9 + 6i)te^{(−2+3i)t} − (9 − 6i)te^{(−2−3i)t} − 8e^{−2t}

= e^{−2t}( (4 − 9t)(e^{3it} + e^{−3it}) − (3 + 6t)i(e^{3it} − e^{−3it}) − 8 )

= 2e^{−2t}( (4 − 9t) cos(3t) + (3 + 6t) sin(3t) − 4 )

= 10e^{−2t}( (4/5) cos(3t) + (3/5) sin(3t) ) − 6√13 te^{−2t}( (3/√13) cos(3t) − (2/√13) sin(3t) ) − 8e^{−2t}

= 10e^{−2t} sin(3t + ϕ) + 6√13 te^{−2t} sin(3t − ψ) − 8e^{−2t},   tan ϕ = 4/3,  tan ψ = 3/2.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 104 / 239

The Fourier transform

Definition. Let f : R → R be an arbitrary function. Then the (exponential) Fourier transform of f(t) is defined by the integral

F[f(t)] := f̂(ω) := ∫_{−∞}^{∞} f(t) e^{iωt} dt

for those values of ω where the integral exists.

Remark. Let f be absolutely integrable, that is

∫_{−∞}^{∞} |f(t)| dt < ∞.

Then

| ∫_{−∞}^{∞} f(t) e^{iωt} dt | ≤ ∫_{−∞}^{∞} |f(t) e^{iωt}| dt = ∫_{−∞}^{∞} |f(t)| dt < ∞,

so the Fourier transform of f exists.
Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 106 / 239

Connection to the Laplace transform

Let F±(s) denote the following Laplace transforms:

F±(s) := ∫_0^∞ f(±t) e^{−st} dt.

Then for the Fourier transform f̂(ω) of f(t) we have

f̂(ω) = F+(−iω) + F−(iω).

Example. Let f(t) := e^{−α|t|}, t ∈ R, α > 0.

f̂(ω) = ∫_{−∞}^{∞} e^{−α|t|} e^{iωt} dt = ∫_{−∞}^{0} e^{(α+iω)t} dt + ∫_0^{∞} e^{(−α+iω)t} dt

= [ e^{(α+iω)t}/(α + iω) ]_{t=−∞}^{t=0} + [ e^{(−α+iω)t}/(−α + iω) ]_{t=0}^{t=∞} = 1/(α + iω) − 1/(−α + iω) = 2α/(α² + ω²).

Corresponding Laplace transforms:

F+(s) = L[e^{−αt}] = 1/(s + α),   F−(s) = L[e^{−αt}] = 1/(s + α).

F+(−iω) + F−(iω) = 1/(−iω + α) + 1/(iω + α) = 2α/(α² + ω²) = f̂(ω).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 107 / 239

Connection to probability theory

Definition. Let f(t) be the probability density function (PDF) of an absolutely continuous random variable X, that is

P(X ≤ x) = ∫_{−∞}^{x} f(t) dt.

Then the Fourier transform f̂(ω) of f is called the characteristic function of X.

Remark. The characteristic function can be considered as the mean E[e^{iωX}] of e^{iωX}, since

f̂(ω) = ∫_{−∞}^{∞} f(t) e^{iωt} dt = E[e^{iωX}].

Example. φ(t): PDF of the standard normal distribution.

φ(t) := (1/√(2π)) e^{−t²/2},   φ̂(ω) = e^{−ω²/2}.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 108 / 239

Inverse Fourier transform

Theorem. Assume that f is absolutely integrable and the same holds for f̂. Then f is continuous and

f(t) = (1/(2π)) ∫_{−∞}^{∞} f̂(ω) e^{−iωt} dω

for all t, that is, the function is uniquely defined by its Fourier transform.

Remark. The Fourier transformation gives a link between a function f(t) in the time domain and its representation f̂(ω) in the frequency domain.

Theorem (Parseval equality). Assume that both f and f̂ are absolutely integrable. Then

∫_{−∞}^{∞} |f(t)|² dt = (1/(2π)) ∫_{−∞}^{∞} |f̂(ω)|² dω.

Further, if f and g are absolutely integrable functions such that the same holds for the corresponding Fourier transforms f̂ and ĝ, then

∫_{−∞}^{∞} f(t) g(t) dt = (1/(2π)) ∫_{−∞}^{∞} f̂(ω) ĝ(ω)* dω,

where * denotes complex conjugation.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 109 / 239

Physical meaning

f(t): signal in the time domain (e.g. voltage across a resistor).

Total energy contained in f(t), summed across all of time t:

E_f := ∫_{−∞}^{∞} |f(t)|² dt.

Total energy of the Fourier transform f̂(ω), summed across all of its frequency components ω:

(1/(2π)) ∫_{−∞}^{∞} |f̂(ω)|² dω = ∫_{−∞}^{∞} |f̂(2πν)|² dν.

|f̂(ω)|²: energy spectral density of f.

Parseval equality: the total energy contained in a signal f(t) summed across all of time t is equal to the total energy of its Fourier transform f̂(ω) summed across all of its frequency components ω.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 110 / 239

Example

Rectangular function (pulse function) of duration 2T:

f(t) := { 1, |t| ≤ T;  0, |t| > T. }

f̂(ω) = ∫_{−T}^{T} e^{iωt} dt = [ e^{iωt}/(iω) ]_{t=−T}^{t=T} = 2 sin(ωT)/ω.

f̂(ω) = 2 sin(ωT)/ω = 2T sinc(ωT/π),   where sinc(x) := sin(πx)/(πx), x ∈ R.

(Figure: pulse function in the time domain and its Fourier transform in the frequency domain for T = 5.)
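
The formula 2 sin(ωT)/ω can be checked numerically for a fixed frequency; T = 5 and ω = 1.3 below are illustrative choices:

>> T = 5; w = 1.3;
>> Fnum = integral(@(t) exp(1i*w*t), -T, T);    % direct evaluation of the defining integral
>> [Fnum, 2*sin(w*T)/w]                         % imaginary part of Fnum should be ~0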

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 111 / 239

Example

Bandpass filter. Let 0 < w < W and

f̂(ω) = { 1, |ω| ∈ [w, W];  0, otherwise. }

f(t) = sin(Wt)/(πt) − sin(wt)/(πt).

(Figure: bandpass filter in the frequency domain and its inverse Fourier transform in the time domain for w = 1, W = 5.)

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 112 / 239

Properties of the Fourier transform, I

Theorem (Linearity). If α and β are constants or are independent of ω and t, and the Fourier transforms F[f(t)] = f̂(ω) and F[g(t)] = ĝ(ω) of f(t) and g(t), respectively, exist, then

F[αf(t) + βg(t)] = αF[f(t)] + βF[g(t)] = αf̂(ω) + βĝ(ω).

Theorem (Derivative). If f(t) is continuous and piecewise differentiable and f′(t) is absolutely integrable, then

F[f′(t)] = −iω f̂(ω).

Theorem (Translation). If τ, a ∈ R, then

F[f(t − τ)] = e^{iωτ} F[f(t)] = e^{iωτ} f̂(ω),
F[e^{iat} f(t)] = f̂(a + ω).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 113 / 239

Properties of the Fourier transform, II

Theorem (Multiplication by t). Assume that t f(t) is absolutely integrable. Then

F[t f(t)] = −i d/dω f̂(ω).

Theorem (Similarity). Let 0 ≠ a ∈ R. Then

F[f(t/a)] = |a| f̂(aω).

Theorem (Convolution). Let f(t) be the convolution integral of two absolutely integrable functions f1(t) and f2(t), that is

f(t) := f1(t) ∗ f2(t) := ∫_{−∞}^{∞} f1(τ) f2(t − τ) dτ.

Then

f̂(ω) = (f1 ∗ f2)ˆ(ω) = f̂1(ω) · f̂2(ω).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 114 / 239

Discrete-time Fourier transform

f[n], n ∈ Z: a discrete-time (digital) signal.

f[n] can be obtained e.g. by sampling a continuous function g(t), t ∈ R, with sample time T, that is f[n] = g(nT).

Definition. The discrete-time Fourier transform of a digital signal f : Z → R is defined as

F(ω) := F{f[n]} := ∑_{n=−∞}^{∞} f[n] e^{iωn},   −π ≤ ω ≤ π.

The corresponding discrete-time inverse Fourier transform is

f[n] = (1/(2π)) ∫_{−π}^{π} F(ω) e^{−iωn} dω.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 115 / 239

Discrete Fourier transform

f[n], n = 0, 1, . . . , N − 1: a finite duration discrete-time signal.

Definition. The discrete Fourier transform of a finite duration discrete-time signal f[n], n = 0, 1, . . . , N − 1, is defined as

F[k] := ∑_{n=0}^{N−1} f[n] e^{ikω₀n},   ω₀ := 2π/N,   k = 0, 1, . . . , N − 1.

The corresponding inverse discrete Fourier transform is

f[n] := (1/N) ∑_{k=0}^{N−1} F[k] e^{−ikω₀n},   n = 0, 1, . . . , N − 1.

Remark. ω₀ = 2π/N is the fundamental frequency (one cycle per sequence, 1/N Hz, 2π/N rad/s). We also consider the harmonics 2ω₀, 3ω₀, . . . , (N−1)ω₀, and the DC component 0 = 0 · ω₀.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 116 / 239

Matrix representation

Discrete Fourier transform of f[n], n = 0, 1, . . . , N − 1:

F[k] := ∑_{n=0}^{N−1} f[n] e^{i(2π/N)kn},   k = 0, 1, . . . , N − 1.

Matrix representation:

( F[0]     )   ( 1   1        1        1        · · ·  1        ) ( f[0]     )
( F[1]     )   ( 1   W        W²       W³       · · ·  W^{N−1}  ) ( f[1]     )
( F[2]     ) = ( 1   W²       W⁴       W⁶       · · ·  W^{N−2}  ) ( f[2]     )
( F[3]     )   ( 1   W³       W⁶       W⁹       · · ·  W^{N−3}  ) ( f[3]     )
(  ⋮       )   ( ⋮    ⋮        ⋮        ⋮               ⋮        ) (  ⋮       )
( F[N − 1] )   ( 1   W^{N−1}  W^{N−2}  W^{N−3}  · · ·  W        ) ( f[N − 1] )

where W := e^{i2π/N} is the Nth complex unit root (powers of W can be reduced modulo N, since W^N = 1).

Fast Fourier transform (FFT) algorithms are based on the special structure of the multiplier matrix.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 117 / 239

Example

Consider the continuous signal:

f(t) = 6 (DC) + 2 cos(2πt − π/2) (1 Hz) + 3 cos(4πt) (2 Hz),   t ≥ 0.

Sample f(t) with frequency 4 Hz (i.e. 4 times a second, sampling time T = 1/4) from t = 0 to t = 3/4 (N = 4 values).

Sampled signal at time points t = nT = n/4, n = 0, 1, 2, 3:

f[n] = 6 + 2 cos(πn/2 − π/2) + 3 cos(πn).

f[0] = 9,  f[1] = 5,  f[2] = 9,  f[3] = 1.

Fourier transform:

F[k] = ∑_{n=0}^{3} f[n] e^{i(π/2)kn},   k = 0, 1, 2, 3.

(Figure: plot of f(t) for 0 ≤ t ≤ 2.5.)

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 118 / 239

Example

Continuous signal: f(t) = 6 + 2 cos(2πt − π/2) + 3 cos(4πt),  t ≥ 0.

Sampled signal: f[n] = 6 + 2 cos(πn/2 − π/2) + 3 cos(πn),  n = 0, 1, 2, 3.

Fourier transform: F[k] = ∑_{n=0}^{3} f[n] e^{i(π/2)kn},  k = 0, 1, 2, 3.

W := e^{iπ/2} = i, so W² = −1, W³ = −i, W⁴ = 1.

Matrix representation:

( F[0] )   ( 1  1   1   1  ) ( f[0] )   ( 1   1   1   1 ) ( 9 )   (  24 )
( F[1] ) = ( 1  W   W²  W³ ) ( f[1] ) = ( 1   i  −1  −i ) ( 5 ) = (  4i )
( F[2] )   ( 1  W²  W⁴  W⁶ ) ( f[2] )   ( 1  −1   1  −1 ) ( 9 )   (  12 )
( F[3] )   ( 1  W³  W⁶  W⁹ ) ( f[3] )   ( 1  −i  −1   i ) ( 1 )   ( −4i )

The magnitudes |F[k]| of the coefficients: 24, 4, 12, 4.
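
The same coefficients can be obtained with MATLAB's fft. Note that fft uses the kernel e^{−i2πkn/N}, the complex conjugate of the definition above, so for this real signal the values are conjugated and the magnitudes coincide:

>> f = [9 5 9 1];
>> F = conj(fft(f))        % conjugation matches the e^{+i...} convention of the slides: 24, 4i, 12, -4i
>> abs(F)                  % magnitudes: 24 4 12 4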

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 119 / 239

z-transform of digital signals

Definition. Let f : Z → C be a discrete-time (digital) signal such that f[n] = 0 for n = −1, −2, . . .. The (unilateral) z-transform of f is defined as

F(z) := Z{f[n]} := ∑_{n=0}^{∞} f[n] z^{−n}

for those values of z ∈ C where the series is convergent.

Remark. The z-transformation is a one-to-one correspondence between f[n] and Z{f[n]}.

Example. Let a ∈ C and f[n] := aⁿ, n = 0, 1, 2, . . ., and f[n] = 0, n = −1, −2, . . ..

Z{aⁿ} = ∑_{n=0}^{∞} aⁿ z^{−n} = ∑_{n=0}^{∞} (a/z)ⁿ = z/(z − a)   (for |z| > |a|).

Special case (unit step function): f[n] ≡ 1 = 1ⁿ =: u[n], n = 0, 1, 2, . . ., and f[n] = 0, n = −1, −2, . . ..

Z{u[n]} = z/(z − 1).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 121 / 239

Motivation: sampling

f(t): continuous signal.

p(t): sampling pulse train with magnitude 1/γ and period T. The area corresponding to each pulse equals 1.

Sampled function:

f*_p(t) = p(t) · f(t).

As γ → 0, the pulse train tends to the Dirac impulse train (ideal sampler)

δ_T(t) := ∑_{n=−∞}^{∞} δ(t − nT),

where δ(t) is the Dirac delta function.

(Figure: the continuous signal f(t), the pulse train p(t) and the sampled signal f*_p(t).)
Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 122 / 239

Dirac delta function

u_{−1}(t): unit step function.
δ(t, γ): pulse signal with magnitude 1/γ and length γ,

δ(t, γ) := ( u_{−1}(t) − u_{−1}(t − γ) ) / γ.

Delta function:

δ(t) := lim_{γ→0, γ>0} δ(t, γ) = lim_{γ→0, γ>0} ( u_{−1}(t) − u_{−1}(t − γ) ) / γ.

Formally:

δ(t) = { ∞, t = 0;  0, t ≠ 0, }   or   δ(t) = ”u′_{−1}(t)”.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 123 / 239

Properties of the Dirac delta function

f(t): continuous signal.

∫_{−∞}^{∞} f(τ) δ(τ − t0) dτ = ∫_{t1}^{t2} f(τ) δ(τ − t0) dτ = f(t0),   t0 ∈ R,

for all intervals [t1, t2] ⊆ R with t0 ∈ [t1, t2]. Special case:

∫_{−∞}^{∞} δ(τ) dτ = ∫_{t1}^{t2} δ(τ) dτ = { 1, 0 ∈ [t1, t2];  0, otherwise. }

Laplace transform:

F(s, γ) := L[δ(t, γ)] = (1/γ) ( L[u_{−1}(t)] − L[u_{−1}(t − γ)] ) = (1 − e^{−γs}) / (γs).

F(s) := L[δ(t)] = lim_{γ→0, γ>0} F(s, γ) = lim_{γ→0, γ>0} (1 − e^{−γs}) / (γs) = 1.

The Laplace transform of the Dirac delta function: L[δ(t)] = 1.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 124 / 239

Laplace transform of the sampled signal

f(t): continuous signal with f(t) = 0 for t < 0.
δ_T(t): ideal sampler.

Sampled signal:

f*_{δT}(t) = f(t) · δ_T(t) = ∑_{n=0}^{∞} f(t) δ(t − nT).

Laplace transform of the sampled signal:

F*_{δT}(s) := L[f*_{δT}(t)] = ∫_0^∞ f(t) δ_T(t) e^{−st} dt = ∑_{n=0}^{∞} ∫_0^∞ f(t) δ(t − nT) e^{−st} dt = ∑_{n=0}^{∞} f(nT) e^{−snT}.

Let z = e^{sT}, so s = (1/T) ln z. Then

F*_{δT}(z) := F*_{δT}( (1/T) ln z ) = ∑_{n=0}^{∞} f(nT) z^{−n} = Z{f[nT]}.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 125 / 239

Properties of the z-transform, I

Theorem. Let f, g : Z → C be digital signals with f[n] = g[n] = 0 for n = −1, −2, . . ., and denote by F(z) and G(z) the corresponding z-transforms.

(Linearity) If α and β are constants or are independent of z and n, then

Z{αf[n] + βg[n]} = αZ{f[n]} + βZ{g[n]} = αF(z) + βG(z).

(Convolution) Let h[k] denote the convolution of f[k] and g[k], that is, for k = 0, 1, 2, . . . we have

h[k] = (f ∗ g)[k] := ∑_{ℓ=−∞}^{∞} f[ℓ] g[k − ℓ] = ∑_{ℓ=0}^{k} f[ℓ] g[k − ℓ].

Then

H(z) := Z{h[k]} = Z{f[k]} · Z{g[k]} = F(z) · G(z).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 126 / 239

Properties of the z-transform, II

Theorem. Let f : Z → C be a digital signal with f[n] = 0 for n = −1, −2, . . ., and denote by F(z) the z-transform of f.

(Delay property)

Z{f[n − 1]} = z^{−1} Z{f[n]} = z^{−1} F(z).

In general, for k ∈ N we have

Z{f[n − k]} = z^{−k} Z{f[n]} = z^{−k} F(z).

(Advance property)

Z{f[n + 1]} = z Z{f[n]} − z f[0] = z F(z) − z f[0].

In general, for k ∈ N we have

Z{f[n + k]} = z^k Z{f[n]} − ∑_{ℓ=0}^{k−1} f[ℓ] z^{k−ℓ} = z^k F(z) − ∑_{ℓ=0}^{k−1} f[ℓ] z^{k−ℓ}.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 127 / 239

Examples

1. The discrete Dirac function or Kronecker delta

δ0[n] := { 1, n = 0;  0, n ≠ 0. }

The corresponding z-transform:

Z{δ0[n]} = ∑_{n=0}^{∞} δ0[n] z^{−n} = 1.

2. Find the z-transform of f[n] := n aⁿ, n = 0, 1, 2, . . .,  0 ≠ a ∈ C.

Solution.

Z{aⁿ} = ∑_{n=0}^{∞} aⁿ z^{−n} = z/(z − a).

By taking the derivatives of both sides with respect to z one has

∑_{n=1}^{∞} −n aⁿ z^{−n−1} = −a/(z − a)²   ⟺   ∑_{n=1}^{∞} n aⁿ z^{−n} = za/(z − a)².

Thus

Z{n aⁿ} = za/(z − a)².

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 128 / 239

Examples

1. Let 0 ≠ a ∈ C, 0 ≤ k ∈ Z, and f[n] := C(n, k) aⁿ, where C(n, k) denotes the binomial coefficient “n choose k”.

Z{f[n]} = Z{ C(n, k) aⁿ } = z a^k / (z − a)^{k+1}.

Special case: a = 1.

Z{ C(n, k) } = z / (z − 1)^{k+1}.

More special cases (k = 1, 2):

Z{n} = z/(z − 1)²,   Z{ n(n − 1)/2 } = z/(z − 1)³.

Hence

Z{n²} = 2z/(z − 1)³ + z/(z − 1)² = z(z + 1)/(z − 1)³.

2. z-transformation by MATLAB

>> syms n k a z
>> f=nchoosek(n,k)*a^n;
>> F=ztrans(f,n,z)

F =
piecewise([k == 0, z/(a*(z/a - 1))], [0 < k, z/(a*(z/a - 1)^(k + 1))], [k < 0, 0])

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 129 / 239

Example: the Fibonacci sequence

Fibonacci sequence: every number after the first two is the sum of the two preceding ones.

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, . . .

f[n]: the n-th term of the Fibonacci sequence.
The recurrence equation defining the sequence:

f[n + 2] = f[n + 1] + f[n],   f[0] = f[1] = 1.

The solution of the above equation gives the closed form of f[n].

z-transform of the equation:

Z{f[n + 2]} = Z{f[n + 1] + f[n]}   ⟺   Z{f[n + 2]} = Z{f[n + 1]} + Z{f[n]}.

Let F(z) := Z{f[n]}. z-transforms of the other components:

Z{f[n + 1]} = zF(z) − zf[0] = zF(z) − z;
Z{f[n + 2]} = z²F(z) − z²f[0] − zf[1] = z²F(z) − z² − z.

Substitute the expressions into the equation:

z²F(z) − z² − z = zF(z) − z + F(z)   ⟺   (z² − z − 1) F(z) = z².

z-transform of the solution:

F(z) = z² / (z² − z − 1).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 130 / 239

Example: solution of the Fibonacci recurrence equation

Recurrence equation:

f[n + 2] = f[n + 1] + f[n],   f[0] = f[1] = 1.

z-transform Z{f[n]} =: F(z) of the solution:

F(z) = z² / (z² − z − 1) = z · z / ( (z − (1+√5)/2)(z − (1−√5)/2) ).

Partial fraction decomposition:

F(z) = z ( ((√5 + 1)/(2√5)) · 1/(z − (1+√5)/2) + ((√5 − 1)/(2√5)) · 1/(z − (1−√5)/2) )

     = ((√5 + 1)/(2√5)) · z/(z − (1+√5)/2) + ((√5 − 1)/(2√5)) · z/(z − (1−√5)/2).

As Z{aⁿ} = z/(z − a) for a ∈ C, we have

F(z) = ((√5 + 1)/(2√5)) Z{ ((1 + √5)/2)ⁿ } + ((√5 − 1)/(2√5)) Z{ ((1 − √5)/2)ⁿ }.

The solution is

f[n] = ((√5 + 1)/(2√5)) ((1 + √5)/2)ⁿ + ((√5 − 1)/(2√5)) ((1 − √5)/2)ⁿ.
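
The closed form is easy to check against the first terms of the sequence; a short MATLAB sketch:

>> phi = (1+sqrt(5))/2; psi = (1-sqrt(5))/2;
>> fib = @(n) (sqrt(5)+1)/(2*sqrt(5))*phi.^n + (sqrt(5)-1)/(2*sqrt(5))*psi.^n;
>> round(fib(0:12))       % 1 1 2 3 5 8 13 21 34 55 89 144 233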

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 131 / 239

Linear difference equations with constant coefficients

The Fibonacci recurrence equation

f[n + 2] − f[n + 1] − f[n] = 0

is a special linear difference equation with constant coefficients.

General form of a linear difference equation of order k with constant coefficients:

a_k f[n + k] + a_{k−1} f[n + k − 1] + · · · + a_1 f[n + 1] + a_0 f[n] = g[n],

where a_0, a_1, . . . , a_k ∈ C, a_k ≠ 0, and g[n] is a given digital signal.

Homogeneous equation: g[n] ≡ 0.

Initial conditions:

f[0] = b_0,  f[1] = b_1,  . . . ,  f[k − 1] = b_{k−1},   b_0, b_1, . . . , b_{k−1} ∈ C.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 132 / 239

General solution

Equation: a_k f[n+k] + a_{k−1} f[n+k−1] + · · · + a_1 f[n+1] + a_0 f[n] = g[n].
Initial conditions: f[0] = b_0, f[1] = b_1, . . . , f[k − 1] = b_{k−1}.

z-transform of the equation:

a_k Z{f[n+k]} + a_{k−1} Z{f[n+k−1]} + · · · + a_0 Z{f[n]} = Z{g[n]}.

Let F(z) := Z{f[n]} and G(z) := Z{g[n]}. z-transforms of the components:

Z{f[n + j]} = z^j Z{f[n]} − ∑_{ℓ=0}^{j−1} z^{j−ℓ} f[ℓ] = z^j F(z) − ∑_{ℓ=0}^{j−1} z^{j−ℓ} b_ℓ,   j = 1, 2, . . . , k.

The resulting form of the transformed equation:

F(z) ( ∑_{j=0}^{k} a_j z^j ) − ∑_{j=1}^{k} a_j ∑_{ℓ=0}^{j−1} z^{j−ℓ} b_ℓ = G(z);

F(z) ( ∑_{j=0}^{k} a_j z^j ) − z ∑_{ℓ=0}^{k−1} ( ∑_{j=ℓ+1}^{k} a_j b_{j−ℓ−1} ) z^ℓ = G(z).

z-transform of the solution:

F(z) = ( ∑_{j=0}^{k} a_j z^j )^{−1} ( G(z) + z ∑_{ℓ=0}^{k−1} ( ∑_{j=ℓ+1}^{k} a_j b_{j−ℓ−1} ) z^ℓ ).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 133 / 239

Example

Solve the following difference equation:

f[n + 2] − 2f[n + 1] − 3f[n] = 6n + 6,   f[0] = 0, f[1] = 1.

Solution. Let F(z) := Z{f[n]}. The z-transform of g[n] = 6n + 6 is

G(z) := Z{g[n]} = 6 ( z/(z − 1)² + z/(z − 1) ) = 6z²/(z − 1)².

Further,

Z{f[n + 1]} = zF(z) − zf[0] = zF(z);
Z{f[n + 2]} = z²F(z) − z²f[0] − zf[1] = z²F(z) − z.

The transformed equation:

z²F(z) − z − 2zF(z) − 3F(z) = 6z²/(z − 1)²   ⟺   (z² − 2z − 3) F(z) = z + 6z²/(z − 1)².

z-transform of the solution:

F(z) = ( z(z − 1)² + 6z² ) / ( (z² − 2z − 3)(z − 1)² ) = z (z² + 4z + 1) / ( (z − 3)(z + 1)(z − 1)² ).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 134 / 239

Example

The difference equation to be solved:

f[n + 2] − 2f[n + 1] − 3f[n] = 6n + 6,   f[0] = 0, f[1] = 1.

Partial fraction decomposition of the z-transform of the solution:

F(z) = z (z² + 4z + 1) / ( (z − 3)(z + 1)(z − 1)² ) = (11/8) z/(z − 3) + (1/8) z/(z + 1) − (3/2) z/(z − 1) − (3/2) z/(z − 1)².

As

Z{n} = z/(z − 1)²   and   Z{aⁿ} = z/(z − a),  a ∈ C,

the solution is

f[n] = (11/8) 3ⁿ + (1/8) (−1)ⁿ − 3/2 − (3/2) n.
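
The inverse z-transform can also be computed symbolically with iztrans; a minimal MATLAB sketch (Symbolic Math Toolbox assumed):

>> syms z n
>> F = z*(z^2 + 4*z + 1)/((z - 3)*(z + 1)*(z - 1)^2);
>> f = simplify(iztrans(F, z, n))     % expected to match (11/8)*3^n + (-1)^n/8 - 3/2 - (3/2)*n
>> subs(f, n, 0:4)                    % 0, 1, 8, 31, 104, consistent with the recurrence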

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 135 / 239

Fundamental notions of information theory

Source alphabet: a finite set X = {x1, x2, . . . , xn}, n ≥ 2. Elements of the source alphabet are called source symbols. They can be considered as values of a discrete random variable X called the source.

X*: the set of strings of symbols from X. Elements of X* are called source messages.

Code alphabet: a finite set Y = {y1, y2, . . . , ys}, s ≥ 2. Elements of Y are called code symbols.

Y*: the set of strings of symbols from Y. Elements of Y* are called code messages.

Encoding or code: a mapping f : X → Y*. s = 2: binary code.

The range K = f(X) of the mapping f is also referred to as the code. Elements of K are called codewords.

A code f is called a variable-length code if the corresponding codewords are of different lengths.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 137 / 239

Uniquely decodable codes

Definition. A code f : X → Y* is uniquely decodable if for all u, v ∈ X*, where u = u1u2 . . . uk, v = v1v2 . . . vℓ and u ≠ v, one has

f(u1)f(u2) . . . f(uk) ≠ f(v1)f(v2) . . . f(vℓ).

In other words, any encoded string in a uniquely decodable code has only one possible source string producing it.

Examples.
1. X = {a, b, c}, Y = {0, 1} and f(a) = 1, f(b) = 01, f(c) = 10110.
The encoder f is non-singular, that is, every element of X maps into a different string in Y*, but the code is not uniquely decodable. For instance, f(c)f(a) = 101101 = f(a)f(b)f(a)f(b).
2. X = {a, b, c}, Y = {0, 1} and f(a) = 1, f(b) = 10, f(c) = 100.
The code is uniquely decodable as 1 always indicates the first bit of a new codeword.
3. X = {a, b, c}, Y = {0, 1} and f(a) = 1, f(b) = 00, f(c) = 01.
The code is uniquely decodable.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 138 / 239

Prefix codes

Definition. A code f is called a prefix code or an instantaneous code if no codeword is a prefix of any other codeword.

Remarks.
1. Prefix codes are uniquely decodable.
2. A code with codewords of fixed length is uniquely decodable if all codewords are different.

Examples.
1. X = {a, b, c}, Y = {0, 1} and f(a) = 1, f(b) = 00, f(c) = 01.
The code is prefix.
2. X = {a, b, c}, Y = {0, 1} and f(a) = 1, f(b) = 10, f(c) = 100.
The code is not prefix, but it is uniquely decodable.
3. X = {a, b, c, d, e, f, g}, Y = {0, 1, 2} and f(a) = 0, f(b) = 10, f(c) = 11, f(d) = 20, f(e) = 21, f(f) = 220, f(g) = 221.
The code is prefix.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 139 / 239

Code trees

Prefix codes can be represented by s-ary trees where the branches of the tree represent the symbols of the corresponding codewords. Each codeword is then represented by a leaf of the tree, and the path from the root traces out the symbols of the codeword.

For binary codes one deals with binary trees where, e.g., 0 corresponds to a branch going “up”, whereas 1 corresponds to a branch going “down”.

Example. Give the corresponding code trees.
1. Y = {0, 1}, K = {0, 100, 1010, 1011, 110, 111}.
2. Y = {0, 1, 2}, K = {0, 10, 11, 20, 21, 220, 221}.

Codeword lengths

|f(x)|: codeword length of the code f(x) of the source symbol x ∈ X. In what follows, L denotes the set of codeword lengths of a code f.

Codeword lengths cannot be arbitrary. E.g. there is no uniquely decodable binary code of a source alphabet of 4 letters having codeword lengths {1, 2, 2, 2}.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 140 / 239

McMillan and Kraft inequalities

Theorem (McMillan). For any uniquely decodable code f : X → Y* over an alphabet of size s, the inequality

∑_{i=1}^{n} s^{−|f(xi)|} ≤ 1

holds.

Theorem (Kraft). If the positive integers L1, L2, . . . , Ln satisfy

∑_{i=1}^{n} s^{−Li} ≤ 1,

then there exists a prefix code f such that |f(xi)| = Li, i = 1, 2, . . . , n.

Remark. The McMillan and Kraft inequalities imply that for any uniquely decodable code there exists a prefix code having the same codeword lengths. Thus, it suffices to consider only prefix codes.
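
The inequality is easy to check numerically. For instance, the codeword lengths {1, 2, 2, 2} mentioned earlier violate it, while the lengths {1, 2, 3} of the code {1, 10, 100} satisfy it; in MATLAB:

>> s = 2;
>> sum(s.^(-[1 2 2 2]))     % 1.25 > 1: no uniquely decodable binary code with these lengths
>> sum(s.^(-[1 2 3]))       % 0.875 <= 1: a prefix code with these lengths exists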

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 141 / 239

Measure of information I.

Hartley (1928): the identification of a particular element of a finite set X of n elements requires

I = log2 n

amount of information.

Heuristics. If n = 2^k, then the elements of X can be represented by binary sequences of length k = log2 n. If log2 n ∉ Z, then the number of required binary digits is the smallest integer not smaller than log2 n (⌈log2 n⌉). The binary representation of a block of elements of X of length m (the number of such blocks is n^m) requires a length k satisfying 2^{k−1} < n^m ≤ 2^k. Thus, the average length K = k/m of the representation of a single symbol of X satisfies log2 n < K ≤ log2 n + 1/m. In this way, the lower bound log2 n can be approximated arbitrarily closely.

The formula defines the information content as the lower bound of the length of binary representations. The information content is measured in bits. The identification of the symbols of a set of two elements requires 1 bit of information.

Problem: Hartley assumes that all elements of X are equally likely.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 142 / 239

Measure of information II.

Shannon (1948): the amount of information provided by the occurrence of an event A with probability P(A) equals

I(A) = log2 (1/P(A)) = − log2 P(A).

Heuristics. Requirements on the amount of information I(A):

If P(A) ≤ P(B), then I(A) ≥ I(B).
Corollary: I(A) depends only on the probability P(A), that is I(A) = g(P(A)).

In case of the mutual occurrence of two independent events the amounts of information should be added, that is, if P(A · B) = P(A)P(B), then I(A · B) = I(A) + I(B). Hence, g(p · q) = g(p) + g(q), p, q ∈ ]0, 1].

If P(A) = 1/2, then I(A) := 1, that is g(1/2) = 1.

Theorem. If g : [0, 1] → R is a function satisfying
a) g(p) ≥ g(q), if 0 < p ≤ q ≤ 1;
b) g(p · q) = g(p) + g(q), p, q ∈ ]0, 1];
c) g(1/2) = 1,
then

g(p) = log2 (1/p),   p ∈ ]0, 1].

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 143 / 239

Measure of information III.

Connection between the two definitions:
If all elements of X are equally likely (occur with probability 1/n), then each element provides log2 n information.

Remark. In what follows, for a ≥ 0 and b > 0 we use the conventions

0 log2 (0/a) = 0 log2 (a/0) = 0;   b log2 (b/0) = +∞;   b log2 (0/b) = −∞.

X: a discrete random variable with alphabet X.
p(x): the probability of the source symbol x ∈ X, that is

p(x) := P(X = x),   x ∈ X.

The average amount of information provided by a single value of X is

∑_{i=1}^{n} p(xi) I(X = xi) = − ∑_{i=1}^{n} p(xi) log2 p(xi) = E( − log2 p(X) ).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 144 / 239

Entropy, average codeword length

Definition. The entropy H(X) of a discrete random variable X with range (alphabet) X = {x1, x2, . . . , xn} is defined as

H(X) := E( − log2 p(X) ) = − ∑_{i=1}^{n} p(xi) log2 p(xi).

Remark. The same formula defines the entropy H(X) of a source alphabet X with distribution

P := { p(x1), p(x2), . . . , p(xn) },

that is

H(X) := − ∑_{i=1}^{n} p(xi) log2 p(xi).

Definition. The average codeword length E(f) of a code f : X → Y* is defined as

E(f) := E|f(X)| = ∑_{i=1}^{n} p(xi) |f(xi)|.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 145 / 239

Examples

1. X = {a, b, c}, Y = {0, 1}, and f(a) = 1, f(b) = 00, f(c) = 01.
Distribution: p(a) = 0.6, p(b) = 0.3, p(c) = 0.1.
Short notation: s = 2, K = {1, 00, 01}, L = {1, 2, 2}, P = {0.6, 0.3, 0.1}.

H(X) = −0.6 · log2 0.6 − 0.3 · log2 0.3 − 0.1 · log2 0.1 ≈ 1.295;
E(f) = 0.6 · 1 + 0.3 · 2 + 0.1 · 2 = 1.4.

2. s = 2, L = {1, 3, 3, 3, 4, 4}, P = {1/2, 1/8, 1/8, 1/8, 1/16, 1/16}.

H(X) = −(1/2) · log2 (1/2) − 3 · (1/8) · log2 (1/8) − 2 · (1/16) · log2 (1/16) = 2.125;
E(f) = (1/2) · 1 + 3 · (1/8) · 3 + 2 · (1/16) · 4 = 2.125.

Aim: to determine the lower bound of the average codeword length, as the shorter the average codeword length, the better the code.

Find the code f minimizing the function

E(f) = ∑_{i=1}^{n} p(xi) |f(xi)|   given   ∑_{i=1}^{n} s^{−|f(xi)|} ≤ 1.
Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 146 / 239
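
Both quantities are easy to compute numerically; a short MATLAB sketch for Example 1:

>> P = [0.6 0.3 0.1]; L = [1 2 2];
>> H = -sum(P.*log2(P))      % entropy, approx. 1.2955 bits
>> Ef = sum(P.*L)            % average codeword length, 1.4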

Bounds of the average codeword length

Theorem (Shannon’s noiseless coding theorem). For any uniquely decodable code f : X → Y* we have

E(f) = ∑_{i=1}^{n} p(xi) |f(xi)| ≥ − ∑_{i=1}^{n} p(xi) log_s p(xi) = H(X)/log2 s,

where equality holds if and only if p(xi) = s^{−|f(xi)|}, i = 1, 2, . . . , n.

If p(xi) = s^{−Li}, where Li ∈ N, then there exists a prefix code f such that |f(xi)| = Li, i = 1, 2, . . . , n, and

E(f) = H(X)/log2 s.

For any distribution of the source alphabet X there exists a prefix code f : X → Y* such that

E(f) < H(X)/log2 s + 1.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 147 / 239

Block codes

Source messages are split into blocks of length m and then these blocks are encoded.

Formal definition: a mapping f : X^m → Y*.

Block encoding is the encoding of the source alphabet X^m.

Source block: a random vector X = (X1, X2, . . . , Xm). Distribution of X:

p(x) = p(x1, x2, . . . , xm) = P(X1 = x1, X2 = x2, . . . , Xm = xm).

Entropy:

H(X) = − ∑_{x∈X^m} p(x) log2 p(x).

If X1, X2, . . . , Xm are independent, then H(X) = ∑_{i=1}^{m} H(Xi).
If X1, X2, . . . , Xm are independent and identically distributed (i.i.d.), then H(X) = mH(X1).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 148 / 239

Average codeword length per symbol

The average codeword length per symbol of a code f : X^m → Y* of an m-dimensional source X is

(1/m) E|f(X)| = (1/m) ∑_{x∈X^m} p(x) |f(x)|.

Shannon’s theorem:

E|f(X)| ≥ H(X)/log2 s.

Corollary. If X1, . . . , Xm are independent random variables distributed as X, then there exists a prefix code f : X^m → Y* such that

(1/m) E|f(X)| < H(X)/log2 s + 1/m.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 149 / 239

Optimal codes

Binary case: s = 2.

Theorem. If a prefix code f : X → {0, 1}* is optimal and the probability masses of the symbols of X are given in descending order, that is p(x1) ≥ p(x2) ≥ · · · ≥ p(xn) > 0, then one may assume that f satisfies the following properties.

a) |f(x1)| ≤ |f(x2)| ≤ · · · ≤ |f(xn)|, that is, codeword lengths are ordered inversely with the probabilities.
b) |f(x_{n−1})| = |f(xn)|, that is, the two longest codewords have the same length.
c) Two of the longest codewords differ only in the last bit (siblings) and correspond to the two least likely symbols.

Heuristics. a) If p(xk) > p(xj) and |f(xk)| > |f(xj)|, then swapping the codewords of xj and xk results in a code with shorter average codeword length. Thus, the original code cannot be optimal.
b) If |f(x_{n−1})| < |f(xn)|, then by deleting the last bit of f(xn) we obtain a code with shorter average codeword length, which remains prefix.
c) If there exists a codeword f(xi) such that f(xi) and f(xn) differ only in the last bit, then |f(xi)| = |f(x_{n−1})| = |f(xn)|. If i ≠ n−1, then swap the codes of xi and x_{n−1}. □

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 150 / 239

Binary Huffman code

Theorem. Assume that the symbols {x1, x2, ..., xn} of the source alphabet X are numbered so that p(x1) ≥ p(x2) ≥ ··· ≥ p(xn) > 0. Combine xn−1 and xn into a new symbol x̃n−1 with probability p(x̃n−1) = p(xn−1) + p(xn) and consider the reduced source alphabet X̃ = {x1, x2, ..., xn−2, x̃n−1}. If g is an optimal prefix code of the reduced source alphabet X̃ with distribution {p(x1), p(x2), ..., p(xn−2), p(xn−1)+p(xn)}, then an optimal prefix code f of the original source X with distribution {p(x1), p(x2), ..., p(xn)} can be obtained by appending 0 and 1 to the codeword g(x̃n−1) and leaving the other codewords unchanged.

Example. Find the binary Huffman codes corresponding to the following distributions. Examine the deviation of the average codeword length from the theoretical lower bound.

1. P1 = {0.68, 0.17, 0.04, 0.04, 0.03, 0.03, 0.01}.
2. P2 = {0.49, 0.14, 0.14, 0.07, 0.07, 0.04, 0.02, 0.02, 0.01}.
3. P3 = {0.15, 0.15, 0.14, 0.14, 0.14, 0.14, 0.14}.
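
The reduction step above can be carried out with a priority queue. The following Python sketch is our own illustration (function and variable names are ours, not part of the lecture); it builds a binary Huffman code for P1 and compares the average codeword length with the entropy lower bound.

import heapq, math

def huffman_code(probs):
    # heap items: (probability, tie-breaker, list of symbol indices)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    code = {i: "" for i in range(len(probs))}
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)        # two least likely "symbols"
        p2, _, s2 = heapq.heappop(heap)
        for i in s1: code[i] = "0" + code[i]   # prepend one more bit
        for i in s2: code[i] = "1" + code[i]
        heapq.heappush(heap, (p1 + p2, min(s1 + s2), s1 + s2))
    return code

P1 = [0.68, 0.17, 0.04, 0.04, 0.03, 0.03, 0.01]
code = huffman_code(P1)
avg = sum(p * len(code[i]) for i, p in enumerate(P1))
H = -sum(p * math.log2(p) for p in P1)
print(code, avg, H)   # average length lies between H(X) and H(X) + 1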

Binary Shannon-Fano code

Assume that the symbols of the source alphabet X = {x1, x2, ..., xn} are numbered so that p(x1) ≥ p(x2) ≥ ··· ≥ p(xn) > 0. Let

Li := ⌈−log_s p(xi)⌉, where ⌈a⌉ := min{n ∈ Z : n ≥ a},

and

w1 := 0,  wi := ∑_{ℓ=1}^{i−1} p(xℓ),  i = 2, 3, ..., n.

Let the codeword f(xi) of the source symbol xi be the binary representation of ⌊2^{Li} wi⌋ on Li bits, where ⌊a⌋ := max{n ∈ Z : n ≤ a}.

Theorem. The binary Shannon-Fano code is prefix and the average codeword length satisfies E(f) ≤ H(X) + 1.

Example. Find the binary Shannon-Fano codes corresponding to the following distributions.

1. P1 = {0.68, 0.17, 0.04, 0.04, 0.03, 0.03, 0.01}.
2. P2 = {0.49, 0.14, 0.14, 0.07, 0.07, 0.04, 0.02, 0.02, 0.01}.
3. P3 = {0.15, 0.15, 0.14, 0.14, 0.14, 0.14, 0.14}.
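
A minimal Python sketch of the construction above, assuming s = 2 and that the probabilities are already sorted in descending order (the helper name is ours): each codeword is the first Li bits of the cumulative probability wi.

import math

def shannon_fano(probs):              # probs sorted in descending order
    codes, w = [], 0.0
    for p in probs:
        L = math.ceil(-math.log2(p))                              # codeword length L_i
        codes.append(format(int(w * 2**L), "0{}b".format(L)))     # first L bits of w_i
        w += p                                                    # next cumulative probability
    return codes

P1 = [0.68, 0.17, 0.04, 0.04, 0.03, 0.03, 0.01]
f = shannon_fano(P1)
print(f)
print(sum(p * len(c) for p, c in zip(P1, f)))   # E(f) <= H(X) + 1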

Shannon entropy

Definition. The entropy of a discrete random variable X with range (alphabet) X = {x1, x2, ..., xn} is defined as

H(X) := −∑_{i=1}^n p(xi) log2 p(xi).

Remarks. Entropy is
the amount of information required to determine the value of X;
the level of uncertainty conveyed in the value of X.

Entropy has the same definition for random vectors X = (X1, X2, ..., Xr)⊤ with range X = {x1, x2, ..., xn}, namely

H(X) := −∑_{i=1}^n p(xi) log2 p(xi).

Properties of entropy

X ∈ X, Y ∈ Y: discrete random variables.

Theorem.
a) If the number of elements in the range of X is n, then

0 ≤ H(X) ≤ log2 n.

Equality holds on the left-hand side if and only if X is constant with probability one, whereas the necessary and sufficient condition for equality on the right-hand side is that X is uniformly distributed, that is p(xi) = 1/n, i = 1, 2, ..., n.

b) For discrete random variables X and Y we have

H(X,Y) ≤ H(X) + H(Y),

and equality holds if and only if X and Y are independent.

c) For any function g(X) of X we have

H(g(X)) ≤ H(X),

with equality if and only if g is one-to-one.

Conditional entropy

p(x) := P(X=x);  p(y) := P(Y=y);  p(x, y) := P(X=x, Y=y);

p(x|y) := P(X=x | Y=y) = p(x, y)/p(y);   p(y|x) := P(Y=y | X=x) = p(y, x)/p(x).

Definition. The conditional entropy of X given Y = y is defined as

H(X|Y = y) := −∑_{x∈X} p(x|y) log2 p(x|y).

The conditional entropy of X given Y is

H(X|Y) := ∑_{y∈Y} p(y) H(X|Y = y) = −∑_{y∈Y} ∑_{x∈X} p(x, y) log2 p(x|y).

Properties of conditional entropy

Theorem. Let X, Y and Z be discrete random variables with finite ranges. Then

a) H(X,Y) = H(Y) + H(X|Y) = H(X) + H(Y|X).

b) 0 ≤ H(X|Y) ≤ H(X).
Equality holds on the left-hand side if and only if X is uniquely determined by Y with probability one, whereas the necessary and sufficient condition for equality on the right-hand side is the independence of X and Y.

c) H(X|Z,Y) ≤ H(X|Z), with equality if and only if p(x|z, y) = p(x|z) for all x, y, z where p(x, y, z) > 0.

Properties of conditional entropy

d) For any function f(Y) of Y we have

H(X|Y) ≤ H(X | f(Y)),

with equality if and only if for all fixed z and all values x and y satisfying f(y) = z and p(y) > 0,

p(x|y) = P(X = x | f(Y) = z).

e) The joint entropy of random variables X1, X2, ..., Xn satisfies

H(X1, X2, ..., Xn) = H(X1) + H(X2|X1) + H(X3|X2, X1) + ··· + H(Xn|Xn−1, ..., X1).

Mutual information

Definition. The mutual information of discrete random variables X and Y is defined as

I(X;Y) := H(X) + H(Y) − H(X,Y).

Remark. Mutual information is symmetric and

I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X) = I(Y;X).

Remark.

I(X;Y) = ∑_{x,y} p(x, y) log2 [p(x, y)/(p(x)p(y))] = ∑_{x,y} p(x, y) log2 [p(x|y)/p(x)] = ∑_{x,y} p(x, y) log2 [p(y|x)/p(y)].
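
As a numerical illustration of the definitions above, the following Python sketch (the joint pmf is an arbitrary example of ours, not from the lecture) computes H(X), H(Y), H(X|Y) and I(X;Y) from a joint distribution.

import numpy as np

# joint pmf p(x, y) of two binary variables (illustrative values)
p_xy = np.array([[0.30, 0.20],
                 [0.10, 0.40]])

p_x = p_xy.sum(axis=1)        # marginal of X
p_y = p_xy.sum(axis=0)        # marginal of Y

def H(p):                     # entropy of a pmf given as an array
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

H_X, H_Y, H_XY = H(p_x), H(p_y), H(p_xy.ravel())
print("H(X|Y) =", H_XY - H_Y)          # chain rule: H(X,Y) = H(Y) + H(X|Y)
print("I(X;Y) =", H_X + H_Y - H_XY)    # definition of mutual information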

Properties of mutual information

Theorem. Let X and Y be discrete random variables.
a) I(X;Y) ≥ 0, and I(X;Y) equals 0 if and only if X and Y are independent.
b) I(X;X) = H(X).
c) I(X;Y) ≤ H(X) and I(X;Y) ≤ H(Y).
d) For any functions f and g of X and Y, respectively, we have I(X;Y) ≥ I(f(X); g(Y)).
e) The following three statements are equivalent:
i) I(X;Y) = H(X);
ii) H(X|Y) = 0;
iii) there exists a function g : R → R such that P(X = g(Y)) = 1.

Noiseless channels

Given an information channel which is able to transmit the symbols of the code alphabet Y = {y1, y2, ..., ys}.

X: input signal at the entrance of the channel.
Y: output signal at the exit of the channel, corresponding to X.

We assume that the channel is memoryless, that is, Y depends only on X.

The output signal Y contains I(X;Y) bits of information about the input value X.

Noiseless channel: X = Y, so we have I(X;Y) = H(X).

The maximal value of H(X) is log2 s, that is, a single code symbol can carry at most that much information, and over a noiseless channel all of it is transmitted. Thus, with a single symbol a noiseless channel can transmit at most C = log2 s bits of information, which is the information capacity of the channel.

Channel capacity

Noisy channel: X ≠ Y, so I(X;Y) < H(X).
The behaviour of the channel can be described with the transition probabilities

p_{i|j} := P(Y = yi | X = yj),  i, j = 1, 2, ..., s.

Distribution of the input signal X: qj := P(X = yj), j = 1, 2, ..., s.

The channel capacity of a memoryless information channel is

C := sup I(X;Y),

where the supremum is taken over all possible distributions of X.

Example. Noiseless channel:

p_{i|j} = 1 if i = j, and p_{i|j} = 0 if i ≠ j.

I(X;Y) = H(X) ≤ log2 s, with equality if and only if X is uniformly distributed, that is qj = 1/s, j = 1, 2, ..., s. Capacity: C = log2 s.

Memoryless binary symmetric channel

[Transition diagram: input X ∈ {0, 1} with P(X = 1) = q, output Y; each bit is transmitted correctly with probability 1 − p and flipped with probability p.]

Y = {0, 1}, that is s = 2. Let p, q ∈ [0, 1].

Distribution of the input signal: P(X = 1) = q, P(X = 0) = 1 − q.

Transition probabilities:
p_{0|0} := P(Y = 0 | X = 0) = 1 − p,  p_{1|0} := P(Y = 1 | X = 0) = p,
p_{1|1} := P(Y = 1 | X = 1) = 1 − p,  p_{0|1} := P(Y = 0 | X = 1) = p.

Capacity of the memoryless binary symmetric channel (BSCp):

C = 1 − H2(p), where H2(p) := −p log2 p − (1 − p) log2(1 − p).

Special cases:
C = 1 ⇐⇒ H2(p) = 0 ⇐⇒ p = 0 or p = 1.
p = 0: noiseless channel; p = 1: binary inversion channel.
C = 0 ⇐⇒ H2(p) = 1 ⇐⇒ p = 1/2: random channel.

Information sources

X: information source, an infinite sequence X1, X2, ... of random variables. At time point i the source emits the value Xi. Each random variable has the same range (alphabet) X = {x1, x2, ..., xn}.

X is called memoryless if the random variables X1, X2, ... are independent.

X is called stationary if the sequence X1, X2, ... is stationary, that is, for any positive integers n and k the joint distribution of X1, X2, ..., Xn coincides with the joint distribution of the shifted random vector Xk+1, Xk+2, ..., Xk+n.

X is called ergodic if for any function f(x1, ..., xk) we have

lim_{n→∞} (1/n) ∑_{i=1}^n f(Xi, ..., Xi+k−1) = E f(X1, ..., Xk) with probability one,

provided the limit exists.

Consider a uniquely decodable code f with alphabet Y = {y1, y2, ..., ys}. Block encoding with block length k ≥ 1.
Aim: minimization of the average codeword length per source symbol L.

Source coding with variable length

Let X be memoryless and stationary, with symbol-by-symbol encoding. For the code of a source message of length k we have

L = (1/k) E(|f(X1)| + ··· + |f(Xk)|) = E|f(X1)|.

Shannon's theorem: E|f(X1)| ≥ H(X1)/log2 s.

There exists a prefix code f such that E|f(X1)| < H(X1)/log2 s + 1.

Block encoding f : X^k → Y*:

L = (1/k) E|f(X1, ..., Xk)| ≥ (1/k) H(X1, ..., Xk)/log2 s = H(X1)/log2 s    (by independence).

For any k there exists a prefix code f : X^k → Y* with average codeword length per symbol L satisfying

L < H(X1)/log2 s + 1/k.

Encoding of general information sources

Definition. The source entropy of the source X = X1, X2, ... is defined as

H(X) = lim_{n→∞} (1/n) H(X1, X2, ..., Xn),

provided the limit exists.

Theorem. If the source X = X1, X2, ... is stationary, then the source entropy exists and

H(X) = lim_{n→∞} H(Xn|X1, X2, ..., Xn−1).

Theorem. The average codeword length per symbol L of a uniquely decodable block code f : X^k → Y* of a stationary source X = X1, X2, ... satisfies the inequality

L ≥ H(X)/log2 s.

For a sufficiently large block length k there exists a uniquely decodable code f with average codeword length per symbol arbitrarily close to the above lower bound.

Universal source coding

Costs of data transmission for the previously studied (block) codes:
Fixed costs: e.g. the relative frequencies of the source symbols.
Variable costs: the codewords corresponding to the source message.

Theoretically, sources are of infinite length, so the proportion of the fixed cost vanishes. In practice, source messages have finite length, and the fixed cost might even exceed the variable cost of the codewords.

Adaptive codes: the actual source symbols are encoded with the help of the preceding symbols.

Examples:
Adaptive Huffman code.
Lempel-Ziv algorithms (LZ77, LZ78, LZW).

LZ77 algorithm (sliding window Lempel-Ziv algorithm)

Abraham Lempel and Jacob Ziv (1977)

LZ77 achieves compression by replacing repeated occurrences of data with references to a single copy of that data appearing earlier in the uncompressed data stream. A sliding window of length ha is moved along the source message.

Parts of the sliding window:
dictionary: contains the last hk previously coded source symbols;
lookahead buffer: contains the next he source symbols to be coded.

Example. Source message

. . . cabracadabrarrarrad . . .

Sliding window: ha := 13, hk := 7, he := 6.

c a b r a c a d a b r a r r a r r a d

LZ77 algorithm

Coding:

1 Using a backward pointer the encoder finds in the dictionary the sym-bols which match the first symbol (after the cursor) of the lookaheadbuffer.

2 Checks the lengths of matching strings of the dictionary and the look-ahead buffer.

3 Finds the longest match.4 Output: a triple ⟨t, h, c⟩.

t: position relative to the cursor of the longest match that starts inthe dictionary. If no match is found, t = 0.h: length of the longest match. If no match is found, h = 0.c: code of the next symbol in the lookahead buffer beyond thelongest match.

5 Advances window by h + 1.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 170 / 239


Example

Sliding window: ha := 13, hk := 7, he := 6.

c a b r a c a d a b r a r r a r r a d

d is not in the dictionary. Output: ⟨0, 0, f(d)⟩.

c a b r a c a d a b r a r r a r r a d

a in the dictionary: t = 2, h = 1, t = 4, h = 1 and t = 7, h = 4.Longest match: t = 7, h = 4. Output: ⟨7, 4, f(r)⟩

c a b r a c a d a b r a r r a r r a d

r in the dictionary: t = 1, h = 1 and t = 3, h = 5.Longest match: t = 3, h = 5. Output: ⟨3, 5, f(d)⟩

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 171 / 239
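
A compact Python sketch of the encoder loop described above (our own illustration; tie-breaking and window-handling details are assumptions). With hk = 7 and he = 6 it reproduces the three triples of the worked example.

def lz77_encode(msg, hk=7, he=6, start=7):
    # 'start' marks the cursor; the first hk symbols are assumed already coded
    out, cur = [], start
    while cur < len(msg):
        best_t, best_h = 0, 0
        for t in range(1, min(hk, cur) + 1):          # backward offset into the dictionary
            h = 0
            while (h < he - 1 and cur + h < len(msg) - 1
                   and msg[cur - t + h] == msg[cur + h]):
                h += 1
            if h > best_h:
                best_t, best_h = t, h
        nxt = msg[cur + best_h]                       # first symbol after the match
        out.append((best_t, best_h, nxt))
        cur += best_h + 1
    return out

print(lz77_encode("cabracadabrarrarrad"))
# expected: [(0, 0, 'd'), (7, 4, 'r'), (3, 5, 'd')]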

Properties

Encoding of ⟨t, h, c⟩ using a fixed length binary code requires

⌈log2 hk⌉ + ⌈log2 he⌉ + ⌈log2 n⌉

bits, where n is the size of the source alphabet.

Asymptotically (hk, he → ∞) the efficiency of the encoding equals that of the optimal algorithm, which would require knowing the distribution of the source. For a stationary and ergodic source the limit as hk, he → ∞ of the average codeword length per symbol equals H(X)/log2 s.

Modifications increasing efficiency, e.g.:
Variable length code for the compression of ⟨t, h, c⟩, e.g. adaptive Huffman code.
Dual format: the output is either ⟨t, h⟩ or ⟨c⟩, and the two formats are distinguished by a flag bit (LZSS – Lempel-Ziv-Storer-Szymanski).
Dictionary and lookahead buffer of variable length.

Applications: pkzip, arj

LZ78 algorithm

Both the compressor and the decompressor build and maintain a dictionary of previously seen strings.

1 Starting from the cursor (actual position) the compressor finds the longest match in the dictionary.
2 Output: ⟨i, c⟩.
i: index of the dictionary entry of the match;
c: code of the first non-matching character.
If no match is found, the output is ⟨0, c⟩.
3 Extends the dictionary with the concatenation of dictionary entry i and the first non-matching character (having code c). There is also an eof symbol.

For a stationary and ergodic source the average codeword length per symbol converges to H(X)/log2 s.

Problem: the dictionary grows quickly and without limit.
Solution: use a fixed dictionary after some time, or remove rarely used or unnecessary entries.

Example

Source message:

dabbacdabbacdabbacdabbacdeecdeecdee

output of compressor   index   entry
⟨0, f(d)⟩              1       d
⟨0, f(a)⟩              2       a
⟨0, f(b)⟩              3       b
⟨3, f(a)⟩              4       ba
⟨0, f(c)⟩              5       c
⟨1, f(a)⟩              6       da
⟨3, f(b)⟩              7       bb
⟨2, f(c)⟩              8       ac
⟨6, f(b)⟩              9       dab
⟨4, f(c)⟩              10      bac
⟨9, f(b)⟩              11      dabb
⟨8, f(d)⟩              12      acd
⟨0, f(e)⟩              13      e
⟨13, f(c)⟩             14      ec
⟨1, f(e)⟩              15      de
⟨14, f(d)⟩             16      ecd
⟨13, f(e)⟩             17      ee
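
The following Python sketch of the LZ78 encoder (our own illustration) reproduces the ⟨index, symbol⟩ pairs of the table above; the dictionary is stored as a Python dict mapping strings to 1-based indices.

def lz78_encode(msg):
    dictionary, out = {}, []              # entry string -> index (1-based)
    i = 0
    while i < len(msg):
        p = ""
        # extend the match while p + next symbol is still a dictionary entry
        while i < len(msg) and p + msg[i] in dictionary:
            p += msg[i]
            i += 1
        if i < len(msg):
            out.append((dictionary.get(p, 0), msg[i]))      # <index of match, next symbol>
            dictionary[p + msg[i]] = len(dictionary) + 1    # new dictionary entry
            i += 1
        else:
            out.append((dictionary.get(p, 0), None))        # message ended inside a match
    return out

print(lz78_encode("dabbacdabbacdabbacdabbacdeecdeecdee"))
# first pairs: (0,'d'), (0,'a'), (0,'b'), (3,'a'), (0,'c'), (1,'a'), ...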

LZW algorithm

An effective variant of LZ78. Terry Welch (1984)

In contrast to the pairs ⟨i, c⟩ of LZ78, the output is just the dictionary index i. The dictionary must initially contain the complete source alphabet.

1 Starting from the cursor the compressor reads source symbols having a match in the dictionary into a buffer p. Let c be the first symbol such that pc is not a dictionary entry.
2 Output: the index of the dictionary entry p.
3 Extends the dictionary with the string pc and continues the algorithm from character c.

Application: compress command of UNIX, GIF format.
Adaptive dictionary length: in the case of compress 512 entries, after filling them 1024, etc. The upper bound can be specified up to 2^16 entries.

Example

Source message:

dabbacdabbacdabbacdabbacdeecdeecdee

index   entry   output
1       a
2       b
3       c
4       d
5       e
6       da      4
7       ab      1
8       bb      2
9       ba      2
10      ac      1
11      cd      3
12      dab     6
13      bba     8
14      acd     10
15      dabb    12
16      bac     9
17      cda     11
18      abb     7
19      bacd    16
20      de      4
21      ee      5
22      ec      5
23      cde     11
24      eec     21
25      cdee    23
(final output: 5)
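
A Python sketch of the LZW encoder described above (our own illustration); started from the initial dictionary {a, b, c, d, e} it reproduces the index stream of the table.

def lzw_encode(msg, alphabet):
    dictionary = {ch: i + 1 for i, ch in enumerate(alphabet)}   # 1-based indices
    out, p = [], ""
    for c in msg:
        if p + c in dictionary:
            p += c                        # keep extending the match
        else:
            out.append(dictionary[p])     # emit index of the longest match
            dictionary[p + c] = len(dictionary) + 1
            p = c                         # restart from the non-matching symbol
    out.append(dictionary[p])             # flush the final match
    return out

print(lzw_encode("dabbacdabbacdabbacdabbacdeecdeecdee", "abcde"))
# expected: [4, 1, 2, 2, 1, 3, 6, 8, 10, 12, 9, 11, 7, 16, 4, 5, 5, 11, 21, 23, 5]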


Example

Compress the following source messages:a) abbabbabbbaababa;b) “bed spreaders spread spreads on beds”, where space and eof are

separate symbols.Use

LZ77 algorithm with parameters hk = 7, he = 6;LZ78 algorithm;LZW algorithm.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 177 / 239

Quantization

X = X1, X2, ...: stationary source, Xi ∈ R absolutely continuous.
Q : R → R: a function with discrete range, the quantizer.
Q(X1), Q(X2), ...: quantized signal of X, a sequence of discrete random variables. Source code with block length k = 1.

Measure of distortion for a block of length n:

D(Q) := (1/n) E( ∑_{i=1}^n (Xi − Q(Xi))^2 ) = E(X − Q(X))^2    (by stationarity).

D(Q): mean-squared distortion of the quantizer Q.
X: a random variable distributed as X1, X2, ....
{x1, x2, ..., xN}: range of the quantizer Q; its elements are the levels of quantization.
Quantization regions: Bi := {x ∈ R : Q(x) = xi}, i = 1, 2, ..., N.

All definitions remain valid for discrete sources X as well.

Optimal quantizer

Given the levels of quantization {x1, x2, ..., xN}, the regions of the quantizer Q with smallest mean-squared distortion are

Bi = {x : |x − xi| ≤ |x − xj|, j = 1, 2, ..., N}.

In case of equality, x is assigned to the region with the smallest index. This is the nearest neighbour condition; we only deal with quantizers satisfying it.

If x1 < x2 < ··· < xN, then the boundaries of the quantization regions are

yi = (xi + xi+1)/2,  i = 1, 2, ..., N − 1,  that is

B1 = ]−∞, y1],  Bi = ]yi−1, yi], i = 2, ..., N − 1,  BN = ]yN−1, ∞[.

f(x): PDF corresponding to the stationary source X.
The optimal level corresponding to a given region Bi is the centroid of Bi:

xi = ∫_{Bi} x f(x) dx / ∫_{Bi} f(x) dx = E(X | X ∈ Bi).

Uniform quantizer

The mean-squared distortion of the quantizer

Q(x) = xi, if x ∈ Bi, i = 1, 2, ..., N,

equals

D(Q) = ∫_{−∞}^{∞} (x − Q(x))^2 f(x) dx = ∑_{i=1}^N ∫_{Bi} (x − xi)^2 f(x) dx.

[−A, A]: range of X, that is f(x) = 0 if x ∉ [−A, A].

The N-level uniform quantizer:

QN(x) = −A + (2i − 1)A/N, if −A + 2(i − 1)A/N < x ≤ −A + 2iA/N,  i = 1, 2, ..., N.

Explanation. The regions of QN are obtained by partitioning the interval [−A, A] into N equal parts; the levels are the midpoints of these intervals.

Non-uniform quantizers

Principal idea: on regions with high probability mass a finer quantization is used, i.e. more levels are assigned to such regions.

Aim: for a given random variable (source) X find the quantization levels x1 < x2 < ··· < xN and the quantizer Q minimizing D(Q).

An optimal quantizer satisfies the following two necessary conditions (referred to as the Lloyd-Max conditions).

1 Nearest neighbour condition: |x − Q(x)| = min_{1≤i≤N} |x − xi| for all x ∈ R.
2 Centroid condition: each level xj equals the mean (conditional expectation) of those sample values Xi which are quantized to this particular level (Q(Xi) = xj).

A quantizer satisfying the above conditions is called a Lloyd-Max quantizer.


Example

Not all Lloyd-Max quantizers are optimal.

Let X be uniformly distributed on {1, 2, 3, 4}. Possible 2-level Lloyd-Maxquantizers:

Q1(1) = 1; Q1(2) = Q1(3) = Q1(4) = 3;

Q2(4) = 4; Q2(1) = Q2(2) = Q2(3) = 2;

Q3(1) = Q3(2) = 1.5; Q3(3) = Q3(4) = 3.5.

Mean-squared distortions:

D(Q1) = D(Q2) = 0.5, D(Q3) = 0.25.

Only quantizer Q3 is optimal.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 183 / 239

Lloyd-Max condition for absolutely continuous sources

X: absolutely continuous random variable (source) with PDF f.

Lloyd-Max conditions:

1 Nearest neighbour condition:

y0 = −∞,  yi = (xi + xi+1)/2, i = 1, 2, ..., N − 1,  yN = ∞,

where yi−1 and yi are the boundaries of the quantization region corresponding to xi.

2 Centroid condition:

xi = ∫_{yi−1}^{yi} x f(x) dx / ∫_{yi−1}^{yi} f(x) dx,  i = 1, 2, ..., N.

Theorem (Fleischer, 1964). Let f(x) be log-concave, that is, log f(x) is concave. Then there exists a unique N-level Lloyd-Max quantizer for f(x), which is therefore the optimal quantizer for f(x).

Lloyd-Max algorithm

Find the optimal quantization levels xi and the corresponding quantization regions Bi = ]yi−1, yi].
Stuart P. Lloyd, Bell Laboratories, 1957 (published: 1982); Joel Max, General Telephone and Electronics Lab., Waltham, 1960.

Algorithm (a numerical sketch follows below)
1 Choose an arbitrary set of starting levels x1 < x2 < ··· < xN.
2 Determine the region boundaries yi according to the nearest neighbour condition, that is yi = (xi + xi+1)/2, i = 1, 2, ..., N − 1.
3 Choosing y0 = −∞ and yN = ∞, optimize the quantizer by finding new levels according to the centroid condition.
4 Determine the change in the mean-squared distortion. If it is below a previously specified level, stop; otherwise repeat steps 2 and 3.
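
A numerical sketch of the iteration above (our own illustration; the grid-based integration, the starting levels and the tolerance are assumptions), run here for the log-concave standard normal density.

import numpy as np

def lloyd_max(pdf, levels, grid, tol=1e-10):
    t = grid                                          # fine grid approximating the support
    f = pdf(t)
    x = np.array(levels, float)
    prev_D = np.inf
    while True:
        y = (x[:-1] + x[1:]) / 2                      # nearest neighbour boundaries
        region = np.searchsorted(y, t)                # region index of each grid point
        for i in range(len(x)):                       # centroid condition
            m = region == i
            x[i] = np.sum(t[m] * f[m]) / np.sum(f[m])
        D = np.sum((t - x[region]) ** 2 * f) / np.sum(f)   # mean-squared distortion
        if prev_D - D < tol:
            return x, D
        prev_D = D

# 4-level Lloyd-Max quantizer for the standard normal density
pdf = lambda t: np.exp(-t * t / 2) / np.sqrt(2 * np.pi)
levels, D = lloyd_max(pdf, [-1.5, -0.5, 0.5, 1.5], np.linspace(-6, 6, 200001))
print(levels, D)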

Companding quantizers

Electronic devices usually implement uniform quantizers. However, for signals with a large dynamic range (e.g. speech or audio signals) uniform quantization does not result in efficient coding.

Using a strictly monotone increasing function called the compressor, the source is transformed into the interval [−1, 1], and then a uniform quantizer is applied. The method results in a non-uniform quantization. The quantized values are decoded, and then with the help of the expander (the inverse of the compressor) the original dynamic range is restored.

Compander: compressor and expander.

Applications:
Digital telephony systems: compressing before input to an analog-to-digital converter, then expanding after a digital-to-analog converter.
Professional wireless microphones, as the dynamic range of the microphone audio signal itself is larger than the dynamic range provided by the radio transmission.

Companders in speech coding

8-bit Pulse Code Modulation (PCM) digital telephony systems.

North America and Japan: µ-law (µ = 255).

Gµ(x) = sign(x) · log(1 + µ|x|)/log(1 + µ),  −1 ≤ x ≤ 1,

Gµ⁻¹(x) = sign(x) · (1/µ) · ((1 + µ)^{|x|} − 1),  −1 ≤ x ≤ 1.

Europe: A-law (A = 87.7 or A = 87.6).

GA(x) = sign(x) · A|x|/(1 + log A),             if 0 ≤ |x| < 1/A;
GA(x) = sign(x) · (1 + log(A|x|))/(1 + log A),  if 1/A ≤ |x| ≤ 1.

GA⁻¹(x) = sign(x) · |x|(1 + log A)/A,              if 0 ≤ |x| < 1/(1 + log A);
GA⁻¹(x) = sign(x) · exp{|x|(1 + log A) − 1}/A,     if 1/(1 + log A) ≤ |x| ≤ 1.
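
A small Python sketch of the µ-law compressor/expander pair above, together with a companded uniform quantizer (the quantizer construction is our own illustration); it checks that the expander inverts the compressor.

import numpy as np

MU = 255.0

def mu_compress(x):                    # G_mu, defined on [-1, 1]
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)

def mu_expand(y):                      # inverse of G_mu
    return np.sign(y) * ((1 + MU) ** np.abs(y) - 1) / MU

def companded_quantizer(x, bits=8):
    # uniform quantization of the compressed value on [-1, 1]
    N = 2 ** bits
    step = 2.0 / N
    y = mu_compress(x)
    yq = (np.floor((y + 1) / step) + 0.5) * step - 1    # midpoint of the uniform cell
    return mu_expand(np.clip(yq, -1, 1))

x = np.array([-0.5, -0.01, 0.001, 0.02, 0.7])
print(np.max(np.abs(mu_expand(mu_compress(x)) - x)))    # ~0: expander inverts compressor
print(companded_quantizer(x))                           # finer resolution near zero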

µ-law and A-law

[Figure: the µ-law and A-law compressor curves plotted for x in [−0.1, 0.1]; the two characteristics are nearly indistinguishable.]

Vector quantization

The output of the source can be considered as a random vector. In this way, for a fixed mean-squared distortion one can reach a better compression rate than by quantizing the coordinates separately, especially when the coordinates are correlated.

Example. The RGB values of a color image are quantized not separately but as an element of the 3D color space. Using 3 scalar quantizers the quantization regions are 3D bricks, whereas under vector quantization regions of arbitrary shape can be used.

X: d-dimensional source vector with PDF f(x).
Quantizer: Q : R^d → {x1, x2, ..., xN}, xi ∈ R^d, i = 1, 2, ..., N.
Quantization regions: B1, B2, ..., BN, a partition of R^d, that is, the sets B1, B2, ..., BN are disjoint and ∪_{i=1}^N Bi = R^d.

Q(x) = xi, if x ∈ Bi, i = 1, 2, ..., N.

Lloyd-Max condition for vector quantizers

Mean-squared distortion:

D(Q) = (1/d) E‖X − Q(X)‖^2 = (1/d) ∑_{i=1}^N ∫_{Bi} ‖x − xi‖^2 f(x) dx.

Lloyd-Max conditions:

1 Nearest neighbour condition: the regions of R^d are Voronoi regions, that is

Bi = {x : ‖x − xi‖ ≤ ‖x − xj‖, ∀ j ≠ i}.

2 Centroid condition:

xi = arg min_y ∫_{Bi} ‖x − y‖^2 f(x) dx,

that is, the output vectors are the centroids of the corresponding regions.

Natural generalization of the Lloyd-Max algorithm: the Linde-Buzo-Gray algorithm.

Sampling

X(t): square integrable signal.
{X(kT), k = 0, ±1, ±2, ...}: sample of X(t) with sampling period T > 0.

Reconstruction of X(t):

X̂(t) := ∑_{k=−∞}^{∞} X(kT) sinc(t/T − k),  t ∈ R,

where

sinc(t) := sin(πt)/(πt),  t ∈ R.

Remark. sinc(0) = 1 and sinc(k) = 0 for k = ±1, ±2, ..., so for t = kT one has X̂(t) = X(t).

Problem: under what conditions can the signal X(t) be fully reconstructed from the sample X(kT) (that is, X̂(t) = X(t) for all t ∈ R)?

Nyquist-Shannon sampling theorem

Theorem. If a square integrable signal X(t) is bandlimited to W > 0, that is, the Fourier transform X̂(ω) of X(t) equals 0 for |ω| > W, then

X(t) is continuous at every point t ∈ R, and
X(t) can be completely restored from a sample with period T, that is, for all t ∈ R we have

X(t) = X̂(t) := ∑_{k=−∞}^{∞} X(kT) sinc(t/T − k),   provided T < π/W.

Remark. If the band frequency is W′ = W/(2π), then the sampling frequency should be at least twice W′ (the Nyquist frequency).

Example. In telephony the signal is bandlimited to 3400 Hz and the sampling frequency is 8000 Hz. For CD quality audio the signal is bandlimited to 20 kHz and the sampling frequency is 44100 Hz.
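
A numerical sketch of the reconstruction formula (our own illustration): a signal bandlimited to W is sampled with T < π/W and rebuilt from finitely many samples, so only a small truncation error remains.

import numpy as np

def reconstruct(samples, T, t):
    # X_hat(t) = sum_k X(kT) * sinc(t/T - k), with k indexing the given samples
    k = np.arange(len(samples))
    return np.array([np.sum(samples * np.sinc(tt / T - k)) for tt in t])

# signal bandlimited to W = 2*pi*50 rad/s (components at 15 Hz and 40 Hz)
x = lambda t: np.sin(2 * np.pi * 15 * t) + 0.5 * np.cos(2 * np.pi * 40 * t)
W = 2 * np.pi * 50
T = 0.8 * np.pi / W                      # sampling period below the limit pi/W

k = np.arange(0, 1000)                   # finitely many samples approximate the infinite sum
samples = x(k * T)
t = np.linspace(2.0, 2.5, 7)             # points well inside the sampled interval
print(np.max(np.abs(reconstruct(samples, T, t) - x(t))))   # small reconstruction error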

Transform coding

1. The source signal is divided into blocks and a reversible transformation is applied to each block, resulting in the corresponding transform coefficients.
2. The transform coefficients are quantized.
3. The quantized transform coefficients are coded using a binary code.

x = (x0, x1, ..., xk−1)⊤: source block to be transformed.
y = (y0, y1, ..., yk−1)⊤: transform coefficients.
A = (ai,j): k × k dimensional orthonormal transform matrix.

Forward and inverse transform:

y = Ax and x = By, where B = A⁻¹ = A⊤.

In the two-dimensional case (image compression) both the source block and the transform coefficients are matrices (X and Y):

Y = AXA⊤ and X = A⊤YA.

Example. In the case of JPEG compression 8 × 8 pixel blocks are used.

Special transformations

1. Discrete Cosine Transform (DCT), Ahmed, Natarajan, Rao (1974). Entries of A:

a_{1,j} = 1/√k,   a_{i,j} = √(2/k) cos((2j−1)(i−1)π/(2k)),  i = 2, ..., k, j = 1, 2, ..., k.

The most popular transformation. Applications: JPEG, MPEG.

2. Discrete Walsh-Hadamard Transform (DWHT, 1923), Jacques Hadamard, Joseph L. Walsh. Entries of A are obtained by recursion:

A_{2^k} = [ A_{2^{k−1}}  A_{2^{k−1}} ;  A_{2^{k−1}}  −A_{2^{k−1}} ],  where A_1 = 1.

A_{2^k} A_{2^k}⊤ = 2^k I_{2^k}, so the transform matrix is (1/√(2^k)) A_{2^k}.

Applications: JPEG XR (JPEG extended range, 2009; Microsoft HD Photo) and MPEG-4 AVC or H.264 (MPEG-4 Part 10 Advanced Video Coding, 2003; e.g. Blu-ray discs).
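
A Python sketch (our own illustration) that builds the k × k DCT matrix from the entries above and checks the orthonormality used in the forward and inverse transforms.

import numpy as np

def dct_matrix(k):
    A = np.zeros((k, k))
    for i in range(1, k + 1):
        for j in range(1, k + 1):
            if i == 1:
                A[i - 1, j - 1] = 1 / np.sqrt(k)
            else:
                A[i - 1, j - 1] = np.sqrt(2 / k) * np.cos((2 * j - 1) * (i - 1) * np.pi / (2 * k))
    return A

A = dct_matrix(8)                               # 8x8 blocks as in JPEG
print(np.allclose(A @ A.T, np.eye(8)))          # orthonormal: A^{-1} = A^T
x = np.arange(8.0)
y = A @ x                                       # forward transform
print(np.allclose(A.T @ y, x))                  # inverse transform recovers x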

Subband Coding

1. The source is passed through a filter bank which divides its spectrum into some number of subbands (analysis filters); e.g. one can take M subbands of equal width.
2. In order to stay synchronized, the filter outputs are then subsampled according to the ratio of input and output bandwidths (decimation or downsampling); e.g. one can keep every Mth sample value.
3. The subsampled signals are separately coded and transmitted (compressed).
4. The encoded samples from each subband are decoded, and the decoded values are then upsampled by inserting an appropriate number of zeros between samples.
5. The upsampled signals are passed through a filter bank (synthesis filters). The outputs of the reconstruction filters are added to give the final output.

Human sensory organs are very sensitive to frequencies. More important frequencies should be reconstructed more precisely, whereas less important ones can tolerate larger distortion.

Example

Input of the filter bank: (x1, x2, ..., xn).
Outputs: (y1, y2, ..., yn) and (z1, z2, ..., zn), where, assuming x0 = 0,

yi = (xi + xi−1)/2;  zi = xi − yi = (xi − xi−1)/2,  i = 1, 2, ..., n.

Both output sequences are smoother (have smaller dynamic range) than the original signal, so they can be compressed with smaller distortion. The amount of data, however, is doubled.

Downsampling: transmit only the signals with even indices, that is y2i and z2i.

Synthesis:

x2i = y2i + z2i,   x2i−1 = y2i − z2i.

[Block diagram: both y1, y2, ... and z1, z2, ... are downsampled by 2, upsampled by inserting zeros, and the two branches are added and subtracted to give y2i + z2i and y2i − z2i.]
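
A Python sketch of the two-band example above (the input block is our own illustrative data): the even-indexed subband samples suffice to reconstruct the signal exactly.

import numpy as np

x = np.array([10.0, 12, 11, 9, 8, 8, 12, 14])      # illustrative input block, x_1 ... x_8
xprev = np.concatenate(([0.0], x[:-1]))            # x_0 = 0 convention of the example

y = (x + xprev) / 2                                 # lowpass subband (moving average)
z = (x - xprev) / 2                                 # highpass subband (difference)

y_down, z_down = y[1::2], z[1::2]                   # keep even-indexed samples y_2, y_4, ...
x_even = y_down + z_down                            # synthesis: x_{2i}   = y_{2i} + z_{2i}
x_odd  = y_down - z_down                            #            x_{2i-1} = y_{2i} - z_{2i}
print(x_even, x_odd)                                # recovers x_2, x_4, ... and x_1, x_3, ...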

Delta coding

A special case of predictive coding. It is advantageous if the differences between subsequent signal values are small, e.g. in the case of digital images, provided we are not close to an edge.

Example. 8 subsequent pixel values of an 8-bit intensity image:
147, 145, 141, 146, 149, 147, 143, 145.

Fixed bit length encoding on 8 bits: 64 bits.

First value and differences: 147, −2, −4, 5, 3, −2, −4, 2.
The largest absolute difference is 5, which can be stored using 3 bits, so each difference can be encoded on 4 bits (3 + 1 for the sign). 8 bits are used to store the length of the binary representation of the differences.

Length of the delta encoding: 8 + 8 + 7 · 4 = 44 bits, a gain of about 31%. Lossless compression.
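
A Python sketch reproducing the bit count of the example above (our own illustration; the packing into an actual bit stream is omitted, only the lengths are computed).

pixels = [147, 145, 141, 146, 149, 147, 143, 145]

diffs = [pixels[i] - pixels[i - 1] for i in range(1, len(pixels))]
bits_per_diff = max(abs(d) for d in diffs).bit_length() + 1     # magnitude bits + sign bit

fixed = 8 * len(pixels)                                         # plain 8-bit coding
delta = 8 + 8 + bits_per_diff * len(diffs)    # first value + field width + coded differences
print(diffs)                  # [-2, -4, 5, 3, -2, -4, 2]
print(fixed, delta)           # 64 vs 44 bits, about 31% saved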

Lossy compression, example

Output of the source:

5.4, 10.1, 7.2, 4.6, 6.9, 12.5, 6.2, 5.3.

Differences:

5.4, 4.7, −2.9, −2.6, 2.3, 5.6, −6.3, −0.9.

Uniform quantizer with 7 levels: −6, −4, −2, 0, 2, 4, 6.Quantized values:

6, 4, −2, −2, 2, 6, −6, 0.

Restored values:6, 10, 8, 6, 8, 14, 8, 8.

Errors:−0.6, 0.1, −0.8, −1.4, −1.1, −1.5, −1.8, −2.7.

Longer sequences may result in even larger errors.

Quantization errors

{xn}, {x̂n}: input and reconstructed signal, respectively.
{dn}: sequence of differences, dn = xn − xn−1.
Sequence of quantized differences: d̂n = Q(dn) = dn + qn.
Reconstruction: x̂0 = x0 and x̂n = x̂n−1 + d̂n.

d1 = x1 − x0;  d̂1 = Q(d1) = d1 + q1;
x̂1 = x̂0 + d̂1 = x0 + d1 + q1 = x1 + q1;
d2 = x2 − x1;  d̂2 = Q(d2) = d2 + q2;
x̂2 = x̂1 + d̂2 = x1 + q1 + d2 + q2 = x2 + q1 + q2;
...
x̂n = xn + ∑_{k=1}^n qk.

The quantization errors accumulate.

Predictive coding

At the nth step the encoder knows the previously restored value x̂n−1.
Modified differences: dn = xn − x̂n−1.

d1 = x1 − x̂0;  d̂1 = Q(d1) = d1 + q1;
x̂1 = x̂0 + d̂1 = x0 + d1 + q1 = x1 + q1;
d2 = x2 − x̂1;  d̂2 = Q(d2) = d2 + q2;
x̂2 = x̂1 + d̂2 = x̂1 + d2 + q2 = x2 + q2;
...
x̂n = xn + qn.

Aim: to keep the differences dn as small as possible.
The value of xn is approximated with a function of the previously reconstructed signal values, pn = f(x̂n−1, x̂n−2, ..., x̂0), called the predictor:

dn = xn − pn = xn − f(x̂n−1, x̂n−2, ..., x̂0).

The method is called differential pulse code modulation (DPCM).
Patent: C. Chapin Cutler, Bell Laboratories, 1950.
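
A Python sketch contrasting the two schemes above (the test signal and the uniform quantizer are our own assumptions): quantizing xn − xn−1 lets the errors accumulate, while the DPCM loop keeps the reconstruction error at a single quantization error.

import numpy as np

def quantize(d, step=2.0):                      # coarse uniform quantizer (our own choice)
    return step * np.round(d / step)

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(0.0, 1.5, 200))        # a slowly wandering test signal

# open-loop coding: quantize d_n = x_n - x_{n-1}; errors q_1 + ... + q_n accumulate
rec_open = [x[0]]
for n in range(1, len(x)):
    rec_open.append(rec_open[-1] + quantize(x[n] - x[n - 1]))

# DPCM: quantize d_n = x_n - x_hat_{n-1}; reconstruction error is a single q_n
rec_dpcm = [x[0]]
for n in range(1, len(x)):
    rec_dpcm.append(rec_dpcm[-1] + quantize(x[n] - rec_dpcm[-1]))

print(np.max(np.abs(x - rec_open)), np.max(np.abs(x - rec_dpcm)))   # DPCM error <= step/2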

DPCM

[Block diagram. Encoder: the prediction pn is subtracted from xn, the difference dn is quantized to d̂n, and pn + d̂n feeds the predictor. Decoder: d̂n is added to the prediction pn of an identical predictor to give x̂n.]

Input signals might change their character: adaptive DPCM.
1. Adaptation to the input signal xn of the encoder: forward-adaptive method. The decoder does not know the signal xn, so the new decoding parameters have to be transferred.
2. Adaptation to the output signal x̂n: backward-adaptive method. Both the encoder and the decoder know its value.

Quantization can also be adaptive. In the forward-adaptive case the source is divided into blocks and the parameters of the optimal quantizer for each block are transferred.

Jayant quantizer

Backward-adaptive quantizer. Nikil S. Jayant, Bell Laboratories, 1973.

If the current input falls in the inner levels (close to the origin), contract the step size; otherwise expand it. Each quantization interval (level) has a multiplier, which is less than 1 for inner levels and greater than 1 for outer levels. The multipliers are symmetric about the origin.

Mk: multiplier of the kth level. Outer level: Mk > 1; inner level: Mk < 1.
∆n: step size of the quantizer at time n (the input is xn).
If xn−1 falls in region ℓ(n − 1), then the step size adapts as

∆n = Mℓ(n−1) ∆n−1.

Due to finite precision arithmetic, one has to specify ∆min and ∆max.

Example. Multipliers of a quantizer with 8 levels (3-bit quantizer):
M1 = M8 = 1.2, M2 = M7 = 1, M3 = M6 = 0.9, M4 = M5 = 0.8.

Output levels of a 3-bit Jayant quantizer

[Staircase plot of the quantizer characteristic: inputs in (0, ∆], (∆, 2∆], (2∆, 3∆] and (3∆, ∞) map to the output levels ∆/2, 3∆/2, 5∆/2, 7∆/2 (levels 5-8), and symmetrically for negative inputs (levels 4-1).]

Multipliers are symmetric: M1 = M8, M2 = M7, M3 = M6, M4 = M5.

Example

Inner levels: M4 = M5 = 0.8, M3 = M6 = 0.9. Outer levels: M2 = M7 = 1, M1 = M8 = 1.2.
Initial step size: ∆0 = 0.5.
Input: 0.1, −0.2, 0.2, 0.1, −0.3, 0.1, 0.2, 0.5, 0.9, 1.5, 1.0, 0.9.

Quantization process (error = output − input):

n   ∆n       Input   Level   Output    Error     Step size update
0   0.5       0.1     5       0.25      0.15     ∆1 = M5 × ∆0
1   0.4      −0.2     4      −0.2       0.0      ∆2 = M4 × ∆1
2   0.32      0.2     5       0.16     −0.04     ∆3 = M5 × ∆2
3   0.256     0.1     5       0.128     0.028    ∆4 = M5 × ∆3
4   0.2048   −0.3     3      −0.3072   −0.0072   ∆5 = M3 × ∆4
5   0.1843    0.1     5       0.0922   −0.0078   ∆6 = M5 × ∆5
6   0.1475    0.2     6       0.2212    0.0212   ∆7 = M6 × ∆6
7   0.1328    0.5     8       0.4646   −0.0354   ∆8 = M8 × ∆7
8   0.1594    0.9     8       0.5578   −0.3422   ∆9 = M8 × ∆8
9   0.1913    1.5     8       0.6696   −0.8304   ∆10 = M8 × ∆9
10  0.2296    1.0     8       0.8036   −0.1964   ∆11 = M8 × ∆10
11  0.2755    0.9     8       0.9643    0.0643   ∆12 = M8 × ∆11
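
A Python sketch of the 3-bit Jayant adaptation (our own illustration, using the output levels ±∆/2, ..., ±7∆/2 of the preceding figure and the multipliers of the example); it reproduces the rows of the table above.

M = {1: 1.2, 2: 1.0, 3: 0.9, 4: 0.8, 5: 0.8, 6: 0.9, 7: 1.0, 8: 1.2}   # level multipliers

def jayant(inputs, delta0=0.5, dmin=1e-4, dmax=10.0):
    delta, rows = delta0, []
    for x in inputs:
        i = min(int(abs(x) // delta), 3)              # 0..3: inner to outer magnitude
        level = 5 + i if x >= 0 else 4 - i            # levels 5..8 positive, 4..1 negative
        sgn = 1 if x >= 0 else -1
        out = sgn * (2 * i + 1) * delta / 2           # +-delta/2, +-3*delta/2, ...
        rows.append((delta, x, level, out, out - x))
        delta = min(max(M[level] * delta, dmin), dmax)   # step size update
    return rows

inputs = [0.1, -0.2, 0.2, 0.1, -0.3, 0.1, 0.2, 0.5, 0.9, 1.5, 1.0, 0.9]
for r in jayant(inputs):
    print("%.4f  % .2f  %d  % .4f  % .4f" % r)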

Delta modulation

When a continuous source is sampled at a high frequency, the differences between neighbouring sample values are small.

Delta modulation (DM): DPCM with a 2-level (1-bit) quantizer. In order to decrease distortion, the sampling frequency is increased, even up to a hundred times the band frequency.

Output values: ±∆. Fixed step size ∆: linear delta modulation.
Problem: for a flat input the output oscillates (granular noise), whereas a steeply increasing input cannot be followed (overload noise).

Source: John Edward Abate: Linear and adaptive delta modulation. PhD thesis, Newark College of Engineering, 1967.

Adaptive delta modulation

John Edward Abate, AT&T and Bell Laboratories, 1967.

Nearly flat output: small step size ∆; fast changes: large ∆.
sn: DM "step" at the nth time point, sn = ±∆n.

Step size update based on one step:

∆n+1 = M1 ∆n, if sign sn = sign sn−1;
∆n+1 = M2 ∆n, if sign sn ≠ sign sn−1;      where 1 < M1 = 1/M2 < 2.

Source: John Edward Abate: Linear and adaptive delta modulation. PhD thesis, Newark College of Engineering, 1967.

Continuous variable slope delta modulation

Johannes Anton Griefkes, Karel Riemens, Philips, 1970.

Continuous variable slope delta modulation (CVSD):

∆n = β∆n−1 + αn∆0.

β: constant, slightly less than 1;αn ∈ {0, 1}: αn = 1, if J of the previous K outputs of the quantizer haveequal signs, otherwise αn = 0. Typical values: J = K = 3.Encodes at 1 bit per sample, e.g. audio sampled at 16 kHz is encoded at16 kbit/s.

Applications:16 and 32 kbit/s CVSD: military TRI-TAC digital telephones.16 kbit/s: US Army; 32 kbit/s: US Air Force.64 kbit/s CVSD: telephone-related bluetooth (e.g. wireless headsets,communication between mobile phones).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 208 / 239

Predictors

pn = f(x̂n−1, x̂n−2, ..., x̂0): predictor of a predictive encoder (e.g. DPCM).
Aim: find the optimal function f minimizing the mean squared error

σd² = E(Xn − pn)².

In the general case the problem is very complicated. For a fine enough quantization x̂n ≈ Xn, so one can consider

pn = f(Xn−1, Xn−2, ..., X0).

σd² is minimal if

f(Xn−1, Xn−2, ..., X0) = E(Xn | Xn−1, Xn−2, ..., X0),

however, this requires the knowledge of the corresponding conditional distributions. Conditional distributions are usually not known. In the case of a normally distributed source the conditional expectation is a linear function of the values Xn−1, Xn−2, ..., X0.

Linear prediction

Linear predictor of order N: pn := ∑_{i=1}^N ai x̂n−i.

For a fine enough quantization one has to minimize

σd² = E(Xn − ∑_{i=1}^N ai Xn−i)².

R(k) = E(Xn Xn+k): autocovariance function of a centered weakly stationary source Xk (constant mean, autocovariances depending only on the lag). From the equations ∂σd²/∂aj = 0, j = 1, 2, ..., N, we obtain

∑_{i=1}^N ai R(i − j) = R(j),  j = 1, 2, ..., N.

The solution of the above system of equations gives the coefficients of the predictor.

Problem: the Wiener-Hopf equations to be solved have been derived under a stationarity assumption, which might hold only locally.
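
A Python sketch (our own illustration) that estimates the autocovariances of a synthetic AR(2) signal and solves the normal equations above with numpy; the estimated predictor coefficients come out close to the true ones.

import numpy as np

rng = np.random.default_rng(1)
a_true = [0.75, -0.5]                               # X_n = 0.75 X_{n-1} - 0.5 X_{n-2} + noise
x = np.zeros(50000)
for n in range(2, len(x)):
    x[n] = a_true[0] * x[n - 1] + a_true[1] * x[n - 2] + rng.normal()

def autocov(x, k):                                  # sample R(k) for a centred signal
    return np.mean(x[:len(x) - k] * x[k:])

N = 2
R = np.array([[autocov(x, abs(i - j)) for j in range(N)] for i in range(N)])   # R(i - j)
r = np.array([autocov(x, j + 1) for j in range(N)])                            # R(j), j = 1..N
a = np.linalg.solve(R, r)                           # predictor coefficients a_1, ..., a_N
print(a)                                            # close to [0.75, -0.5]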

Adaptive predictor

Forward adaptive case: the input is divided into blocks.
Speech encoding: blocks of length 16 ms; 8000 Hz sampling results in 128 sample values per block.
Image compression: 8 × 8 pixel blocks.

Sample autocovariance of the ℓth block of length M:

R^(ℓ)(k) = (1/(M − |k|)) ∑_{i=(ℓ−1)M+1}^{ℓM−|k|} Xi Xi+|k|,   R^(ℓ)(−k) = R^(ℓ)(k).

The input has to be buffered, which adds a delay to the system. As the decoder does not know the input signal, it needs some additional information.

Backward adaptive case: using the output signal of the encoder, a recursive formula is applied for the minimization of

dn² = (Xn − ∑_{i=1}^N ai x̂n−i)².


Waveform based speech compressionReference method: pulse code modulation (PCM) codec.Analog speech signal with a bandwidth limited to 300 to 3400 Hz issampled with rate 8000 Hz and quantized using an 8-bit quantizer.Transmission bit rate: 8000 × 8 = 64 kbit/s.ITU-T (International Telecommunication Union – TelecommunicationStandardization Sector) G.711 telecommunication standard (1972): PCMcoding with companded quantizer (A-law or µ-law).Adaptive DPCM (ADPCM): utilizes the correlation between the differentvoice samples (Jayant, Bell Laboratories, 1974)Quantization: 5, 4, 3, 2 bits; bit rate: 40, 32, 24, 16 kbit/s.ITU-T G.726 telecommunication standard (speech codec, 1990): superse-des both G.721 (32 kbit/s, 1984) and G.723 (24 and 40 kbit/s, 1988)standards.Most commonly used mode: 32 kbit/s, standard codec in DECT (digitalenhanced cordless telecommunications) wireless phone systems (e.g. Pana-sonic KX-TG1100).

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 213 / 239

Speech formation

Air from the lungs is pushed through the vocal tract (consisting of the laryngeal cavity, pharynx, oral and nasal cavities) and out of the mouth to produce a sound. The vocal tract modulates the voice produced by the vocal cords. Speech generators modulate a generated signal in the same way.

Waveform for the word "Decision". Source: Sun, L., Mkwawa, I.-H., Jammeh, E., Ifeachor, E. Guide to Voice and Video over IP. Springer, 2013 (p. 21, fig. 2.3).


Voiced and unvoiced soundsVoiced sounds: all vowels and e.g. consonants b, d, g, j, v, z. Vocal cordsvibrate (open and close) at a given frequency (fundamental frequency,pitch frequency) and the speech samples show a quasi-periodic pattern.

Sample of voiced speech. Source: Sun et al. (2013); p. 22., fig. 2.4.

Unvoiced sounds: e.g. f, k, p, s, t, ch. Vocal cords do not vibrate, remainopen during the sound production. The waveform is more like noise.

Sample of unvoiced speech. Source: Sun et al. (2013); p. 23., fig. 2.5.Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 215 / 239


Parametric compression codingSpeech signal is stationary as the shape of the vocal tract is stable in shortperiod of time (around 20 ms).In a stationary segment (frame) the vocal tract can be modeled by a filter.The encoder analyzes the different speech segments:

classifies whether the speech segment is voiced or unvoiced;determines the parameters of the voice generating filter;estimates the gain (energy) of the speech excitation signal and forvoiced segments the pitch frequency (males: ≈ 125 Hz; females:≈ 250 Hz).

The parameters are coded into a binary bit stream and transmitted to thedecoder. The decoder will produce its excitation signal and reconstruct thespeech (carry out speech synthesis) based on the received parameters.

Parametric encoders are more complex than waveform based ones.The quality of parametric based speech codecs is low, with mechanicsound but fair intelligibility.Have very low transmission bit rate: 1.2 − 4.8 kbit/s.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 216 / 239

Linear prediction coding

Linear prediction coding (LPC), Bishnu S. Atal, Bell Labs, 1971. Uses a p-order linear filter.

εn: excitation signal. Voiced case: periodic with a given frequency; unvoiced case: white noise (random, independent, stationary).
G: gain of the signal.
xn: output speech signal.

xn = ∑_{i=1}^p ai xn−i + G εn.

[Block diagram of the LPC speech synthesis model: for voiced segments a periodic pulse train with pitch period T, for unvoiced segments white noise, is scaled by the energy (gain G) and drives the vocal tract model, a time-varying filter with the LPC coefficients ai, producing the speech signal xn.]


LPC-10 algorithmDepartment of Defense, USA. Federal Standard (FS) 1015 (1984). Mainlyused in radio communications with secure voice transmissions.Length of a segment (frame): 22.5 ms. Information to be transmitted: 54bits per frame.

Type of excitation (voiced/unvoiced): 1 bit.Pitch frequency (period): 6 bits (quantizer with logarithmic compan-ding).Filter parameters: 41 bits. Sensitive on the errors of parameter valuesaround 1. Instead of a1 and a2 values gi = (1+ai)/(1−ai), i = 1, 2,are quantized.

▶ Voiced case: 10-order predictive filter. Uniform quantizer,g1, g2, a3, a4: 5 bits; a5, . . . , a8: 4 bits; a9: 3 bits, a10: 2 bits.

▶ Unvoiced case: 4-order predictive filter. Uniform quantizer,g1, g2, a3, a4: 5 bits; error correction: 21 bits.

Energy: 5 bits (quantizer with logarithmic companding).Synchronization: 1 bit.

Bit rate: 54 bits / 22.5 ms = 2.4 kbit/s. The compression ratio is 26.7 compared with 64 kbit/s PCM. Enhanced variants may achieve 800 bit/s.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 218 / 239


Code excited linear predictionAnalysis-by-synthesis (AbS): a synthesizer is included at the encoder side.A closed-loop search is carried out in order to find the best match excita-tion signal, that is the one minimizing the error between the original andthe synthesized speech signal. The parameters of this excitation signal aretransferred to the decoder.Code excited linear prediction (CELP), Manfred R. Schroeder and BishnuS. Atal, 1985.Optimal excitation signal is chosen from a code book with a size of 256 to1024, the index of the chosen signal is sent to the decoder.Fair quality speech transmission on a bit rate of 4.8 kbit/s.Slow search in the code book. Large code book is split into smaller ones.Original algorithm (Schroeder and Atal, 1983) implemented on a Cray-1supercomputer (80 MFLOPS; DE HPC: 254 TFLOPS): coding of 1s spe-ech signal took 150s.Standards: ITU-T G.728 (16 kbit/s), G.729 (8 kbit/s)Application: part of RealAudio and MPEG-4 Audio formats.

Sándor Baran Mathematics and Information Theory 2018/19, 2. sem. 219 / 239


Audio compressionCD quality: bandwidth limited to 20 kHz, 44100 Hz sampling frequency,16-bit uniform quantizer (e.g. WAV: waveform audio file format, 1991).Transmission bit rate: 44100 × 16 × 2 ≈ 1400 kbit/s (×2: stereo).

Factors should be taken into account during audio compression.Frequencies between 2 and 4 kHz are the easiest to perceive. As thefrequencies change towards the ends of the audible bandwidth, thevolume must also be increased to detect them. Hence, in this regioneven a larger distortion is more tolerated.A high intensity dominant sound on a given frequency makes inaudible(masks out) the neighbouring frequencies (simultaneous masking).A high intensity dominant sound on a given frequency masks out theweaker sounds in neighbourig frequencies which are present immedia-tely preceding (≈ 2 ms) or following (≈ 15 ms) it (temporal masking).

The audio signal to be compressed is analyzed in the frequency domain. Not all frequency components are transferred, and quantizers with different distortions are applied to the transferred ones.


MPEG-2 Audio Layer III (MP3) compression, I

Input signal: uncompressed PCM audio (e.g. a WAV file), divided into blocks (frames) of 1152 samples. Two processes start simultaneously.

1a. The samples of the given frame are filtered into 32 equally spaced frequency subbands. For a 44.1 kHz sampling rate each subband will be approximately 22050/32 ≈ 689 Hz wide.

1b. Fast Fourier transform: the input signals are transformed from the time domain to the frequency domain.

2a. Subband signals are windowed to reduce the artifacts caused by the edges of the time-limited signal segment. The MPEG standard uses 4 window types. After windowing, by applying a modified discrete cosine transform (MDCT), each of the 32 subbands is split into 18 finer subbands, resulting in a total of 576 frequency lines (a minimal MDCT sketch follows this list).

2b. Psychoacoustic model: models human sound perception. Provides information about which parts of the audio signal can be omitted due to masking, which window types the MDCT should apply, and how to quantize the different frequency lines.
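A minimal sketch of the MDCT in step 2a, assuming the long-block case where 36 windowed subband samples are mapped to 18 frequency lines (the 50% overlap between consecutive blocks and the four MPEG window types are not modelled):

```python
import numpy as np

def mdct(x):
    """MDCT of a length-2N block (N outputs); a sine window is applied first.
    For MP3 long blocks N = 18, so 36 subband samples give 18 frequency lines."""
    N = len(x) // 2
    n = np.arange(2 * N)
    window = np.sin(np.pi / (2 * N) * (n + 0.5))
    k = np.arange(N)[:, None]
    basis = np.cos(np.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
    return basis @ (x * window)

subband_block = np.random.randn(36)     # 36 consecutive samples of one subband
print(mdct(subband_block).shape)        # (18,); 32 subbands x 18 = 576 lines per frame
```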


MPEG-2 Audio Layer III (MP3) compression, II

The two parallel processes, now both in the frequency domain, are joined.

3. Based on the information provided by the psychoacoustic model, the 576 frequency lines form 22 bands, and the different bands are quantized with different scale factors. Masking also takes place here.

4. Huffman encoding: the quantized values are Huffman coded. Using a constant bit rate (CBR), each block has the same code length, whereas with a variable bit rate (VBR), some blocks have shorter codes and the unused bytes are passed to the next block.

5. Coding of side information: codes all parameters generated by the encoder which should be transmitted to the decoder.

6. Multiplexer: generates the bit stream representing the 1152 encoded PCM samples. The frame header, CRC (cyclic redundancy code) codeword (error detection), side information and Huffman coded frequency lines are put together to form a transferable frame.


MP3 encoding scheme

[Figure: block diagram of the MP3 encoder, combining the steps described above.]
Source: Rassol Raissi, The theory behind mp3. www.mp3-tech.org


Visual perception

Color and brightness perception: cone cells (cones) and rod cells.
Rod cells: role in peripheral vision, a key function in night vision.
Cone cells: three types corresponding to three different light wavelength ranges (s(λ): short; m(λ): medium; ℓ(λ): long).
Tristimulus vector corresponding to light with spectral density L(λ):

(S, M, L)⊤ = ∫ (s(λ), m(λ), ℓ(λ))⊤ L(λ) dλ.

This response determines the perceived brightness and hue.
Metameric colors: colors with different spectral distributions resulting in matching S, M and L values; one cannot distinguish them.
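A numerical illustration of the tristimulus integral; the Gaussian cone sensitivity curves and the flat spectrum used here are made up, only the structure of the computation matters:

```python
import numpy as np

lam = np.linspace(380, 780, 401)                      # wavelengths in nm

def toy_sensitivity(center, width):                   # made-up Gaussian cone response
    return np.exp(-0.5 * ((lam - center) / width) ** 2)

s = toy_sensitivity(445, 25)                          # "short" cones
m = toy_sensitivity(540, 35)                          # "medium" cones
l = toy_sensitivity(565, 40)                          # "long" cones
L_spec = np.ones_like(lam)                            # flat spectral density L(lambda)

S, M, L_resp = (np.trapz(c * L_spec, lam) for c in (s, m, l))
print(S, M, L_resp)                                   # the tristimulus vector (S, M, L)
```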


Color space

Separation of luminance and chromaticity:

(X, Y, Z)⊤ = M (S, M, L)⊤.

Y: luminance (brightness); X and Z: chromaticity (hue and saturation).
M: a linear transform (3 × 3 matrix) always resulting in a vector with non-negative components.
In practice, instead of (X, Y, Z)⊤ one uses (Y, x, y)⊤, where

x = X/(X + Y + Z),    y = Y/(X + Y + Z).

Representation of chromaticity: the (x, y) color space chromaticity diagram.
If (xi, yi) corresponds to spectral density Li(λ), i = 1, 2, then the combination µ1 L1(λ) + µ2 L2(λ), µ1, µ2 > 0, is represented by a point on the segment between (x1, y1) and (x2, y2).
Monochromatic (spectral) colors consist of a single wavelength of light λ0. They form a curve giving the boundary of the horseshoe-shaped chromaticity diagram of (x, y) values.
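A small sketch of the chromaticity coordinates and of the mixing property just stated: the (X, Y, Z) vectors are arbitrary, and the vanishing 2-D cross product confirms that the mixture's (x, y) lies on the segment between the two chromaticities.

```python
import numpy as np

def chromaticity(XYZ):
    X, Y, Z = XYZ
    s = X + Y + Z
    return np.array([X / s, Y / s])          # (x, y) coordinates

# Two arbitrary stimuli and an additive mixture of them.
XYZ1 = np.array([20.0, 30.0, 50.0])
XYZ2 = np.array([60.0, 25.0, 15.0])
mix = 0.3 * XYZ1 + 0.7 * XYZ2

p1, p2, pm = chromaticity(XYZ1), chromaticity(XYZ2), chromaticity(mix)
# pm lies on the segment p1-p2: the 2-D cross product of (p2-p1) and (pm-p1) vanishes.
u, v = p2 - p1, pm - p1
print(u[0] * v[1] - u[1] * v[0])             # ~0 up to rounding
```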


Chromaticity diagram

[Figure: the CIE 1931 (x, y) chromaticity diagram.]
International Commission on Illumination (CIE: Commission internationale de l’éclairage), 1931.


RGB color space

Defined by the three chromaticities of the red (R), green (G) and blue (B) additive primaries.
Can produce any chromaticity within the triangle defined by the primary colors.
(R′, G′, B′): ratios of the primary colors on the integer scale 0–255, obtained after gamma correction.

[Figures: various RGB color spaces (left); color calibration of an LG 42LB731V smart TV (right).]


YCbCr color space

(Y, x, y) separates luminance and chromaticity.
Problem: human visual perception is not uniform in the Y coordinate.

Y′ = 16 + ( 65.738 R′ + 129.057 G′ + 25.064 B′)/256,
Cb = 128 + (−37.945 R′ − 74.494 G′ + 112.439 B′)/256,
Cr = 128 + ( 112.439 R′ − 94.154 G′ − 18.285 B′)/256.

Y′: luma component, a grayscale copy of the image, perceived uniformly.
Cb, Cr: blue-difference and red-difference chroma components.
All coordinates are on the integer scale 0–255.
ITU-R BT.601 SDTV standard (formerly CCIR 601, 1982).
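A sketch of the conversion applied to a single pixel, using the BT.601 coefficients above (the rounding to integers is an implementation choice):

```python
import numpy as np

# BT.601 conversion of an 8-bit R'G'B' triple to Y'CbCr, using the matrix above.
OFFSET = np.array([16.0, 128.0, 128.0])
M = np.array([[ 65.738, 129.057,  25.064],
              [-37.945, -74.494, 112.439],
              [112.439, -94.154, -18.285]]) / 256.0

def rgb_to_ycbcr(rgb):
    return np.rint(OFFSET + M @ np.asarray(rgb, dtype=float)).astype(int)

print(rgb_to_ycbcr([255, 255, 255]))   # white -> [235 128 128]
print(rgb_to_ycbcr([0, 0, 0]))         # black -> [ 16 128 128]
```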


Graphics Interchange Format (GIF)

Colors used in an image have their RGB values defined in a palette table. The image data refer to the colors by their indices in the table.
GIF: at most an 8-bit palette (256 colors) chosen from the 24-bit RGB color space (3 × 8 bits). CompuServe, 1987.
Horizontal scan from the top left, lossless LZW compression.
Main applications: compression of icons and simple graphics. Properties:
Small number of colors.
Lots of large, monochrome areas and repeated patterns, which can be efficiently compressed using the LZW algorithm.
Problem: not applicable to the compression of photographs.


Joint Photographic Experts Group (JPEG), I

Lossy compression (1993). Input image: 24-bit YCbCr color space.
The image is split into 3 channels according to the coordinates, and each channel is compressed separately. As humans can see considerably more fine detail in the brightness of an image than in the hue and color saturation, a reduction of the spatial resolution of Cb and Cr is allowed (downsampling). Ratios: 4 : 4 : 4 (no downsampling), 4 : 2 : 2 (horizontal reduction by a factor of 2), 4 : 2 : 0 (horizontal and vertical reduction by a factor of 2).

1. Each channel is split into 8 × 8 blocks. If the image size is not a multiple of 8, it is extended by repeating the last column/row.

2. A two-dimensional DCT is applied to each 8 × 8 block in order to convert it into the frequency domain. Elements of the transformed block: harmonics corresponding to different frequencies. Upper left corner: low frequencies, where the human eye is more sensitive to differences. (0, 0) entry: DC component (basic hue of the block). Different blocks often have similar DC components.

3. The DC component is compressed using delta coding with respect to the DC component of the preceding block.


Joint Photographic Experts Group (JPEG), II

4. The various harmonics are quantized uniformly, however, using different quantization steps: the more sensitive the human vision, the finer the quantization used.
Quantization matrix: its entries specify the quantization steps. It depends on the compression rate; a higher compression rate requires larger values. E.g. for a 50% compression the proposed quantization matrix is

Q =
 16  11  10  16  24  40  51  61
 12  12  14  19  26  58  60  55
 14  13  16  24  40  57  69  56
 14  17  22  29  51  87  80  62
 18  22  37  56  68 109 103  77
 24  35  55  64  81 104 113  92
 49  64  78  87 103 121 120 101
 72  92  95  98 112 100 103  99
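How the table is scaled for other quality settings is not specified on the slide; a common convention (used, e.g., by the IJG/libjpeg implementation, stated here as an assumption) derives the matrix for quality q from the quality-50 table above:

```python
import numpy as np

Q50 = np.array([[16, 11, 10, 16, 24, 40, 51, 61],
                [12, 12, 14, 19, 26, 58, 60, 55],
                [14, 13, 16, 24, 40, 57, 69, 56],
                [14, 17, 22, 29, 51, 87, 80, 62],
                [18, 22, 37, 56, 68, 109, 103, 77],
                [24, 35, 55, 64, 81, 104, 113, 92],
                [49, 64, 78, 87, 103, 121, 120, 101],
                [72, 92, 95, 98, 112, 100, 103, 99]])

def scaled_table(quality):
    """Quality-scaled quantization matrix (IJG/libjpeg convention, assumed here)."""
    scale = 5000 / quality if quality < 50 else 200 - 2 * quality
    q = np.floor((Q50 * scale + 50) / 100)
    return np.clip(q, 1, 255).astype(int)

print(scaled_table(10)[0])   # coarser steps -> stronger compression
print(scaled_table(90)[0])   # finer steps  -> better quality
```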


Example

Input matrix:
 187 130 113  31  19 125  69 112
 170  52  52 162 207 206 149  51
 188 129  48 160 228  15 185  92
  36  25  26 210 166 217 105 246
  38 149 210 189 198 200  45  30
 114 237  37 222  49 193 168 236
 186 115 251 183 197  22  43  87
  63 112 216 135  47 139 130  22

DCT matrix:
 1021.7   16.6 −104.4   24.3   43.5   21.8    0.2  −57.2
  −40.4  −61.6   74.2  149.4   73.5   −4.0  −18.8   13.9
  −83.2  137.5   94.3    6.4  −85.3   67.3   56.8   82.1
   29.1   47.7   74.0  −32.9  −56.8  −61.2   52.7 −100.2
  −86.7  −40.2  −46.0  −40.5 −114.0  −30.3  121.6  −42.8
  −13.0   −1.0   96.6  −76.6  113.9  −20.8   17.1   33.1
   −2.3  −20.4  157.5  −26.4  −49.9    7.5 −102.6  −72.7
  −25.4  198.6  −71.4  −27.9  −13.1  −16.5  −14.5  168.8

Quantized DCT:
 1024   22 −100   32   48   40    0  −61
  −36  −60   70  152   78    0    0    0
  −84  143   96    0  −80   57   69   56
   28   51   66  −29  −51  −87   80 −124
  −90  −44  −37  −56 −136    0  103  −77
  −24    0  110  −64   81    0    0    0
    0    0  156    0    0    0 −120 −101
    0  184  −95    0    0    0    0  198
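The quantized values can be reproduced from the DCT matrix and Q: each coefficient is divided by the corresponding quantization step, rounded, and multiplied back (i.e. the table shows the reconstruction levels). A check on the first row:

```python
import numpy as np

# First row of the example: quantize (round of DCT/Q), then map back to the
# reconstruction levels (value * Q), which is what the "Quantized DCT" shows.
dct_row = np.array([1021.7, 16.6, -104.4, 24.3, 43.5, 21.8, 0.2, -57.2])
q_row   = np.array([16, 11, 10, 16, 24, 40, 51, 61])
print(np.rint(dct_row / q_row) * q_row)
# -> [1024.   22. -100.   32.   48.   40.    0.  -61.]  (first row above)
```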


Joint Photographic Experts Group (JPEG), III

5. The differences of the DC components and the DCT values are rearranged in a zigzag order.

6. The obtained sequence is split into runs, each consisting of a sequence of zeros followed by a single non-zero element. The run-length code of a run is ({n, s}, ν), where
n: number of zeros before the non-zero element;
s: number of bits required to represent the non-zero element;
ν: (signed) bit representation of the non-zero element.

7. The pairs {n, s} are encoded using either Huffman or arithmetic encoding; their codes are followed by the concatenation of the bit representations ν.

Example.
81, 0, 0, 0, 0, 0, −6, 0, 0, 0, −12, 0, 0, . . .
Run-length codes: ({0, 8}, 01010001); ({5, 4}, 1001); ({3, 5}, 10011).
Concatenated bit representations: 01010001100110011.
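A small sketch reproducing the run-length codes of the example; the signed bit representation is inferred from the example itself (a sign position plus one's complement for negative values), and the end-of-block marker used in real JPEG streams is omitted:

```python
def runlength(seq):
    """Split a coefficient sequence into (zero run, size, bits) triples,
    matching the convention of the example above: s counts the bits of the
    signed representation, negatives are stored in one's complement."""
    codes, zeros = [], 0
    for v in seq:
        if v == 0:
            zeros += 1
            continue
        s = abs(v).bit_length() + 1          # one extra position for the sign
        bits = v if v > 0 else v + (1 << s) - 1
        codes.append((zeros, s, format(bits, f"0{s}b")))
        zeros = 0
    return codes

print(runlength([81, 0, 0, 0, 0, 0, -6, 0, 0, 0, -12, 0, 0]))
# -> [(0, 8, '01010001'), (5, 4, '1001'), (3, 5, '10011')]
```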


Properties

Standards: ISO/IEC 10918, ITU-T T.81, T.83, T.84, T.86.
Efficient if there are no contrasting edges. At high compression rates, quantization results in artifacts caused by noise around contrasting edges.

[Figure illustrating JPEG compression artifacts.]
Source: http://www.gimp.org/tutorials/GIMP_Quickies/


Lossless JPEG

Extension to the JPEG format. Joint Photographic Experts Group, 1993.
Two-dimensional predictive encoding (DPCM).
Horizontal scan from the top left. Prediction X̂i,j of the value Xi,j of pixel (i, j) using the values Xi−1,j, Xi,j−1 and Xi−1,j−1.
Eight different prediction schemes:

0: X̂i,j = 0;
1: X̂i,j = Xi−1,j;
2: X̂i,j = Xi,j−1;
3: X̂i,j = Xi−1,j−1;
4: X̂i,j = Xi−1,j + Xi,j−1 − Xi−1,j−1;
5: X̂i,j = Xi,j−1 + (Xi−1,j − Xi−1,j−1)/2;
6: X̂i,j = Xi−1,j + (Xi,j−1 − Xi−1,j−1)/2;
7: X̂i,j = (Xi−1,j + Xi,j−1)/2.

Any one of the eight predictors can be used, but the same one for the entire image.
Adaptive arithmetic or Huffman encoding.
Compression ratio in the predictive case (all schemes but 0): around 2 : 1.
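The eight schemes written out as code (a sketch: borders are handled crudely by substituting 0 for missing neighbours, and integer division stands in for the standard's integer arithmetic):

```python
# The eight lossless JPEG prediction schemes, as functions of the upper
# (Xi-1,j), left (Xi,j-1) and upper-left (Xi-1,j-1) neighbours.
PREDICTORS = {
    0: lambda up, left, upleft: 0,
    1: lambda up, left, upleft: up,
    2: lambda up, left, upleft: left,
    3: lambda up, left, upleft: upleft,
    4: lambda up, left, upleft: up + left - upleft,
    5: lambda up, left, upleft: left + (up - upleft) // 2,
    6: lambda up, left, upleft: up + (left - upleft) // 2,
    7: lambda up, left, upleft: (up + left) // 2,
}

def residuals(image, scheme):
    """Prediction residuals Xi,j - X̂i,j for one scheme; the residuals are
    what gets entropy coded (missing border neighbours are treated as 0)."""
    h, w = len(image), len(image[0])
    pred = PREDICTORS[scheme]
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            up = image[i - 1][j] if i > 0 else 0
            left = image[i][j - 1] if j > 0 else 0
            upleft = image[i - 1][j - 1] if i > 0 and j > 0 else 0
            out[i][j] = image[i][j] - pred(up, left, upleft)
    return out

print(residuals([[100, 102], [101, 103]], scheme=4))   # [[100, 2], [1, 0]]
```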


Moving Picture Experts Group (MPEG)

Working group generating specifications for the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Standards: MPEG-1, MPEG-2, MPEG-4, MPEG-7, MPEG-21.
Lossy video compression (1993). MPEG-1 video layers:

video sequence → group of pictures → frames → macroblocks → blocks

Group of pictures (GOP): independently encoded unit.
Three types of frames:
I-frame (intra frame): independent picture, JPEG compression;
P-frame (predictive coded frame): encoded with the help of the previous I- or P-frame;
B-frame (bidirectionally predictive coded frame): encoded with the help of the previous/subsequent I- or P-frame.

Macroblock: a set of 6 blocks covering a resolution of 16 × 16 pixels; block resolution: 8 × 8 pixels; 4 luma (Y′) blocks and a pair of downsampled chroma (Cr, Cb) blocks. P- and B-frames are encoded by macroblocks.


Frame reordering

During encoding, B-frames are moved forward, to be encoded after the neighbouring I- and P-frames. This simplifies buffering.

Source and display order:         0 1 2 3 4 5 6 7 8 9
Frame type:                       I B B P B B P B B I
Position in the coded bit stream: 0 2 3 1 5 6 4 8 9 7

Typical pattern: two P-frames are encoded using a single I-frame, with two B-frames between them.
I-frames are encoded independently; high-speed seeking through an MPEG-1 video is only possible to the nearest I-frame.
Compression of P- and B-frames is based on macroblock motion estimation.
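The reordering can be expressed as a small routine: B-frames are buffered until the next I- or P-frame has been emitted. Applied to the GOP above it yields the display indices in transmission order, consistent with the positions listed in the table.

```python
# Frame reordering for the GOP pattern in the table above: every B-frame is
# moved after the following I- or P-frame it depends on.
def coded_order(frame_types):
    order, pending_b = [], []
    for idx, t in enumerate(frame_types):
        if t == "B":
            pending_b.append(idx)      # wait for the next anchor frame
        else:                          # I or P: emit it, then the buffered B-frames
            order.append(idx)
            order.extend(pending_b)
            pending_b = []
    return order + pending_b

display = list("IBBPBBPBBI")
print(coded_order(display))            # [0, 3, 1, 2, 6, 4, 5, 9, 7, 8]
# Read as a permutation, this places frames 0..9 at coded positions
# 0 2 3 1 5 6 4 8 9 7, as in the table above.
```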


Predictions

P-frames: for each macroblock the encoder finds the best matching macroblock of the previous I- or P-frame (reference macroblock). Only the motion vector (distance and direction) and the difference from the reference block (residual error) are encoded. Motion vector: Huffman code; error: JPEG-like encoding. If there is no reference block: JPEG encoding.
B-frames: matches are searched for in the previous or subsequent I- or P-frames. If the match is bidirectional, the error with respect to the mean of the matching blocks and the two motion vectors are encoded. For a unidirectional match the algorithm is similar to the compression of P-frames.
MPEG-1 compression for a 356 × 260 resolution and 24-bit color space:

Type      Size     Rate
I         18 Kb    7 : 1
P         6 Kb     20 : 1
B         2.5 Kb   50 : 1
Average   4.6 Kb   27 : 1

Video bit rate at 30 frames/s: 1.2 Mbit/s; with audio: 1.45 Mbit/s.