

Applications of the wavelet transform in image processing

Øyvind Ryan
Department of informatics, University of Oslo

e-mail: oyvindry@ifi.uio.no

12 nov 2004

Abstract

Mathematical methods applied in the most recent image formats are presented. First, the application of the wavelet transform in JPEG2000 is reviewed. JPEG2000 is a standard established by the same group that created the widely used JPEG standard, and it was established to address some of the shortcomings of JPEG. Other recently established image formats that have wavelet transforms as part of the codec are also presented. Finally, other components in modern image compression systems are reviewed, together with the mathematical and statistical methods used.

1 Preliminaries

All image formats reviewed here use an image transform, quantization and coding. All of these are described for the different formats in question. The transforms mentioned below can be separably extended to two dimensions for applications to image processing. Therefore, we state our results in one dimension only, for the sake of simplicity. We will describe this separation process later. We use the value m for the block dimension (or more precisely, the number of channels); for our wavelet transforms we will always have m = 2. We will associate a transform with m filters, so that for m = 2 we will have only two filters. In signal processing terms, these are to be interpreted as low-pass and high-pass filters.

If we denote the signal by x, the block dimension (or number of channels) describes the size of the block partitioning of the signal. JPEG applies only block transforms, meaning that if the signal is split into blocks x = x[i] (each x[i] a vector of dimension m), the transformed signal y = y[i] is given by

y[i] = A∗x[i]

for some m × m matrix A. This way we lose the influence between different blocks, for instance for pixels on block boundaries. This is what gives rise to the blockiness artifact in JPEG. Important block transforms are sketched below.
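The block-wise nature of such a transform can be illustrated with a minimal sketch. The function name `block_transform` and the 2 × 2 sum/difference matrix are hypothetical choices made for illustration, and the product is taken as a plain real matrix-vector product; the point is only that each output block depends on a single input block.

```python
def block_transform(x, A):
    """Apply the m x m matrix A to consecutive length-m blocks of x."""
    m = len(A)
    assert len(x) % m == 0, "signal length must be a multiple of m"
    y = []
    for start in range(0, len(x), m):
        block = x[start:start + m]
        # one output block depends on exactly one input block
        y.extend(sum(A[r][c] * block[c] for c in range(m)) for r in range(m))
    return y

# Hypothetical 2 x 2 example: unnormalized sum/difference matrix.
A = [[1, 1],
     [1, -1]]
x = [3, 5, 10, 2]
y = block_transform(x, A)

# Changing a sample in the second block leaves the first output block untouched.
x_changed = [3, 5, 10, 7]
y_changed = block_transform(x_changed, A)
```

Because no information crosses a block boundary, errors introduced later (for instance by quantization) are also confined to blocks, which is one way to picture where blockiness comes from.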

1 Sponsored by the Norwegian Research Council, project nr. 160130/V30


1.1 KLT (Karhunen-Loeve Transform)

KLT is the unique transform which decorrelates its input. To be precise, define the covariance matrix $C_X$ of a random vector $X$ by

$$C_X = E\left((X - \mu_X)(X - \mu_X)^*\right).$$

If the KLT transform is called $K$, then the random vector $Y = K^*X$ should have uncorrelated components, i.e. $C_Y = K^*C_XK$ is a diagonal matrix. This transform is what gives rise to principal component analysis (PCA). Among linear transforms, KLT minimizes the MSE when keeping a given number of its principal components (when principal components are ranked in decreasing order).

The drawback of the KLT is that we need to recompute the transform each time the statistics of the source change. By its nature, it cannot be separable either.
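As a small illustration of the decorrelation property, the following sketch computes the KLT for two-dimensional real data, where the eigenvectors of the 2 × 2 covariance matrix can be written as a rotation. The sample data and all function names are made up for the example; the covariance is estimated from the samples, which is an assumption on top of the definition above.

```python
import math

def covariance_2d(samples):
    """Sample covariance entries (c_xx, c_xy, c_yy) of 2-D data."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    cxx = sum((s[0] - mx) ** 2 for s in samples) / n
    cyy = sum((s[1] - my) ** 2 for s in samples) / n
    cxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / n
    return cxx, cxy, cyy

def klt_2d(samples):
    """Rotation K whose columns are eigenvectors of the covariance matrix."""
    cxx, cxy, cyy = covariance_2d(samples)
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply_klt(K, samples):
    """Y = K^* X; K is real here, so K^* is the transpose."""
    return [(K[0][0] * x + K[1][0] * y, K[0][1] * x + K[1][1] * y)
            for x, y in samples]

# Made-up, strongly correlated data.
data = [(1, 2), (2, 4.1), (3, 5.9), (4, 8.2), (5, 9.8)]
K = klt_2d(data)
transformed = apply_klt(K, data)
var_first, cross, var_second = covariance_2d(transformed)
```

After the transform the cross-covariance `cross` vanishes up to rounding, and almost all the variance sits in the first component, which is the one a PCA would keep.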

1.2 DFT (Discrete Fourier Transform)

The DFT is defined by the transform

$$A = \left(e^{i\frac{2\pi}{m}pq}\right)_{p,q},$$

which is unitary once it is scaled by $1/\sqrt{m}$. DFT has an efficient implementation through the FFT. It is also separable.

One drawback of DFT is that the transform works badly when the end points ($x_0$ and $x_{m-1}$) are far apart. If the full Fourier transform were applied in this case, many higher Fourier components would be introduced to compensate for this.

1.3 DCT

DCT is defined by

$$A = \left(c_q \cos\left(2\pi f_q\left(p + \frac{1}{2}\right)\right)\right)_{q,p}; \quad f_q = \frac{q}{2m},$$

where

$$c_q = \begin{cases} \sqrt{\frac{1}{m}} & \text{if } q = 0 \\ \sqrt{\frac{2}{m}} & \text{if } q \neq 0. \end{cases}$$

DCT can be constructed through DFT by symmetrically extending the input sequence about the last point, applying the 2m point DFT, and recovering the first m points afterwards. It is separable since the DFT is, and the FFT can be used as an efficient implementation of the DCT. There is no need to adapt the transform to the statistics of the source material (as with KLT). DCT is a robust approximation to the KLT for natural image sources. It is used in JPEG, MPEG and CCITT H.261.

One drawback is that there is no way to use the DCT for lossless compression, since outputs of the transform are not integers.
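The definition above can be checked numerically. The sketch below builds the m × m matrix directly from the formula (the function names are illustrative, and no fast algorithm is attempted) and applies it to a constant block.

```python
import math

def dct_matrix(m):
    """Rows q, columns p: c_q * cos(2*pi*f_q*(p + 1/2)) with f_q = q/(2m)."""
    rows = []
    for q in range(m):
        c_q = math.sqrt(1.0 / m) if q == 0 else math.sqrt(2.0 / m)
        f_q = q / (2.0 * m)
        rows.append([c_q * math.cos(2 * math.pi * f_q * (p + 0.5))
                     for p in range(m)])
    return rows

def apply(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def orthonormality_error(A):
    """Largest deviation of A A^T from the identity matrix."""
    m = len(A)
    worst = 0.0
    for r in range(m):
        for s in range(m):
            dot = sum(A[r][k] * A[s][k] for k in range(m))
            worst = max(worst, abs(dot - (1.0 if r == s else 0.0)))
    return worst

A = dct_matrix(8)
coeffs = apply(A, [100.0] * 8)   # a constant block
```

For the constant block only the first (q = 0) coefficient is nonzero, and it is not an integer, which illustrates the remark about lossless compression.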

2 JPEG (baseline)

2.1 Transform

DCT is used as the transform of the input signal, after the input has been level shifted. If the elements of the input signal are 8 bits, the level shift means subtracting 128 from numbers in [0, 256), producing numbers in [−128, 128). The block dimension is always 8.


2.2 Quantization

Uniform midtread quantization (midtread means that 0 is the midpoint of the quantization interval about 0) is used. A quantization table, consisting of the step sizes of each coefficient quantizer, is used (the table has size 8×8). This table is emitted with the data itself. Use smaller values in this table if you want less loss during encoding. The coefficients after quantization are called labels. The first label in a block is called the DC coefficient, the rest are called AC coefficients. Higher AC coefficients are typically rounded to 0. This provides us with good compression.
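A uniform midtread quantizer can be sketched as rounding to the nearest multiple of the step size. The coefficient values and step sizes below are invented for illustration and are not taken from any real quantization table.

```python
import math

def quantize(coeff, step):
    """Midtread quantizer: label = coeff / step rounded to the nearest integer."""
    return int(math.copysign(math.floor(abs(coeff) / step + 0.5), coeff))

def dequantize(label, step):
    """Reconstruct the midpoint of the quantization interval."""
    return label * step

# Invented coefficients and step sizes (one step size per coefficient).
coeffs = [-415.4, 12.2, -3.1, 0.8]
steps = [16, 11, 10, 16]
labels = [quantize(c, s) for c, s in zip(coeffs, steps)]
restored = [dequantize(l, s) for l, s in zip(labels, steps)]
```

Small coefficients fall in the interval around 0 and get the label 0, and the reconstruction error of any coefficient is at most half a step.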

2.3 Coding

AC and DC coefficients are coded differently. Differences between successive DC labels are coded, instead of the DC label itself. This difference is not coded for AC labels. Labels are partitioned into categories {0}, {−1, 1}, {−3, −2, 2, 3}, ..., of sizes 2^0, 2^1, 2^2, ..., numbered 0, 1, 2, 3, .... Category numbers are Huffman coded. Coding is done in zig-zag scan order. Each label is coded with first the Huffman code of the category number, followed by the value within the category (the number of bits required equals the category number). Zig-zag scan order ensures that many coefficients are zero near the end of the traversal; these are skipped with an end-of-block code.
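The category partition can be made concrete with a short sketch. The category number of a label is the number of bits needed for its magnitude. For the bits within a category, the sketch assumes that labels are indexed in the order they are listed above (−3, −2, 2, 3 maps to 00, 01, 10, 11); the function names are illustrative, and the Huffman code for the category number itself is left out.

```python
def category(label):
    """Category number: 0 -> 0, +-1 -> 1, +-2..3 -> 2, +-4..7 -> 3, ..."""
    return abs(label).bit_length()

def value_bits(label):
    """Bits identifying a label inside its category (category() bits long)."""
    cat = category(label)
    if cat == 0:
        return ""
    # assumed ordering: negative labels first, then positive ones
    index = label if label > 0 else label + (1 << cat) - 1
    return format(index, "0{}b".format(cat))

labels = [0, -1, 1, -3, -2, 2, 3, 5]
coded = [(category(v), value_bits(v)) for v in labels]
```

Each label is then described by the pair (category number, extra bits), where only the category number would go through the Huffman coder.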

Drawbacks and lacks in generality of the JPEG standard are

1. blockiness, due to splitting the image into independent parts,

2. blocks are always processed sequentially (no way to obtain other types of progression),

3. lossy version only, since quantization is always performed on floating point values.

3 JPEG2000

3.1 Wavelet transform basics

We follow the introduction to subband transforms and wavelets used in [4]. A lighter introduction with more examples can be found in [5]. Going through these two references together can be very helpful. Instead of applying a block transform, one can try a transform where one block influences many other (surrounding) blocks. This may reduce the blockiness, even if the transformed signal is in the end partitioned into independent blocks anyway. We will consider a subband transform as our candidate for such a transform:

Definition 1 An (analysis) subband transform for the set of m × m matrices $\{A[i]\}_{i\in\mathbb{Z}}$ is defined by

$$y[n] = \sum_{i\in\mathbb{Z}} A^*[i]\,x[n-i].$$

Definition 2 A (synthesis) subband transform for the set of m × m matrices $\{S[i]\}_{i\in\mathbb{Z}}$ is defined by

$$x[n] = \sum_{i\in\mathbb{Z}} S[i]\,y[n-i].$$

Figure 1: One dimensional convolutional transform. [Diagram: the input blocks x[n−2], x[n−1], x[n], x[n+1] (each of dimension m) are mapped to the output blocks y[n−2], y[n−1], y[n], y[n+1], with A[1], A[0] and A[−1] labelling the contributions to one output block.]

These definitions can be thought of as one dimensional convolutional transforms, as shown in figure 1. The analysis transform produces a transformed signal from an input signal, while the synthesis transform should recover the input signal from the transformed signal. We say that we have perfect reconstruction if there exists a synthesis transform exactly inverting the analysis transform. JPEG2000 applies subband transforms with only two channels (m = 2), as opposed to JPEG's block transform with eight channels. So artifacts like blocking may be removed when using subband transforms, even if the number of channels is decreased.

3.2 Expressing transforms with filter banks

One can write

$$y[nm+q] = (x \star h_q)[mn], \quad 0 \le q < m, \tag{1}$$

where the filter bank $\{h_q\}_{0\le q<m}$ is defined by $h_q[mi-j] = (A^*[i])_{q,j}$. This expresses the analysis operation through filter banks.

One can also write

$$x = \sum_{q=0}^{m-1} (\tilde{y}_q \star g_q),$$

where the filter bank $\{g_q\}_{0\le q<m}$ is defined by $g_q[mi+j] = (S[i])_{j,q}$, and where

$$\tilde{y}_q[i] = \begin{cases} y_q\left[\frac{i}{m}\right] & \text{if } m \text{ divides } i \\ 0 & \text{otherwise.} \end{cases}$$

This expresses the synthesis operation through filter banks. When constructing subband transforms from wavelets, we will construct the transform by first finding a filter bank from the scaling function of the wavelet.

3.3 Expressing transforms in terms of vectors

Let $\langle\cdot,\cdot\rangle$ be the inner product in $\ell^2(\mathbb{Z})$. One can write $y_q[n] = \langle x, a_q^{(n)}\rangle$, where

$$a_q[k] = h_q^*[-k], \qquad a_q^{(n)}[k] = a_q[k-mn]$$

are the analysis vectors. This expresses the analysis operation in terms of the analysis vectors.

One can also write $x = \sum_{q=0}^{m-1}\sum_n y_q[n]\,s_q^{(n)}$, where

$$s_q[k] = g_q[k], \qquad s_q^{(n)}[k] = s_q[k-mn]$$

are the synthesis vectors. This expresses the synthesis operation in terms of the synthesis vectors.

3.4 Orthonormal and biorthogonal transforms

Definition 3 An orthonormal subband transform is a transform for which the synthesis vectors are orthonormal.

It is easy to show that for orthonormal subband transforms, the analysis and synthesis vectors are equal ($s_q = a_q$ for all $q$), and the analysis and synthesis matrices are reversed versions of one another, $A[i] = S[-i]$. Orthonormal subband transforms are the natural extension of orthonormal (unitary) block transforms.

If the analysis system is given by filters $h_0$, $h_1$, and the synthesis system is given by filters $g_0$, $g_1$, one can calculate the end-to-end transfer function of analysis combined with synthesis. In order to avoid aliasing, one will find that

$$h_0(\omega+\pi)g_0(\omega) + h_1(\omega+\pi)g_1(\omega) = 0, \tag{2}$$

and if we in addition want perfect reconstruction,

$$h_0(\omega)g_0(\omega) + h_1(\omega)g_1(\omega) = 2. \tag{3}$$

Example 1 Let's take a look at a popular definition of orthonormal subband transforms through filter banks. This is an alternative definition of what are called Quadrature Mirror Filters (QMF). Given a low-pass prototype $f$,

$$h_0[k] = g_0[-k] = f[k],$$

$$h_1[k] = g_1[-k] = (-1)^{k+1} f[-(k+1)].$$

Note that

$$g_1[n] = (-1)^{n+1} g_0[-(n-1)], \tag{4}$$

or $g_1(\omega) = e^{-i\omega} g_0(\omega+\pi)^*$ in the Fourier domain. Note also that

$$h_1[n] = (-1)^{n+1} h_0[-(n+1)], \tag{5}$$

or $h_1(\omega) = e^{i\omega} h_0(\omega+\pi)^*$ in the Fourier domain. These relations will be used when we construct biorthogonal transforms below.

It is not hard to see that the alias condition (2) is satisfied, and that perfect reconstruction (3) is satisfied if

$$\left|h_0(\omega)\right|^2 + \left|h_0(\omega \pm \pi)\right|^2 = 2. \tag{6}$$

It is also not hard to show that the $\{h_0[i-2n]\}_n$, $\{h_1[i-2n]\}_n$ are orthonormal and give rise to an orthonormal subband transform if $f = h_0$ satisfies this.
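The QMF construction can be tried out numerically. The sketch below uses the four-tap Daubechies low-pass filter as the prototype f; this choice is an assumption made for the example and is not prescribed by the text. Filters are stored as hypothetical `{index: tap}` dictionaries, and the helper names are illustrative.

```python
import math

def qmf_highpass(f):
    """Build h1[k] = (-1)**(k+1) * f[-(k+1)] from a prototype {index: tap}."""
    h1 = {}
    for j, tap in f.items():
        k = -(j + 1)
        sign = 1 if (k + 1) % 2 == 0 else -1
        h1[k] = sign * tap
    return h1

def shifted_inner(a, b, n):
    """Inner product of a[i] with b[i - 2n] over all i."""
    return sum(v * b.get(i - 2 * n, 0.0) for i, v in a.items())

# Assumed prototype: the four-tap Daubechies low-pass filter.
s3, d = math.sqrt(3), 4 * math.sqrt(2)
h0 = {0: (1 + s3) / d, 1: (3 + s3) / d, 2: (3 - s3) / d, 3: (1 - s3) / d}
h1 = qmf_highpass(h0)

norm_h0 = shifted_inner(h0, h0, 0)
norm_h1 = shifted_inner(h1, h1, 0)
cross = [shifted_inner(h0, h1, n) for n in range(-3, 4)]
```

The even translates of `h0` and `h1` come out orthonormal up to rounding, which is the statement at the end of example 1 for this particular prototype.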

Example 2 Lapped orthogonal transform with cosine modulated filter bank: Analysis vectors are defined by

$$a_q[k] = c_q[k] = \begin{cases} \frac{1}{\sqrt{m}} \cos\left(2\pi f_q\left(k - \frac{m-1}{2}\right)\right) & \text{if } -m \le k < m \\ 0 & \text{otherwise} \end{cases}$$

where the cosine frequencies are $f_q = \frac{q + \frac{1}{2}}{2m}$; $0 \le q < m$. It is easy to verify that these analysis vectors give rise to an orthonormal subband transform, where all analysis matrices are 0, except for A[0] and A[1]. Such transforms are called lapped transforms.

One can get a more general family of lapped transforms by defining

$$a_q[k] = c_q[k]\,w[k],$$

where the windowing sequence $w$ satisfies

$$w[k] = w[-1-k]; \quad 0 \le k < m,$$

$$w^2[k] + w^2[m-1-k] = 2; \quad 0 \le k < m.$$

One can show that any window sequence satisfying these assumptions gives rise to a new lapped orthonormal transform. These transforms work well, and by choosing a windowing sequence wisely, one can obtain very good frequency discrimination between the subbands.
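The orthonormality claimed in example 2 can be verified numerically for small m. The sketch below covers only the unwindowed case (w[k] = 1); the helper names are illustrative and the vectors are stored as `{index: value}` dictionaries.

```python
import math

def lot_vector(q, m):
    """a_q[k] = cos(2*pi*f_q*(k - (m-1)/2)) / sqrt(m) for -m <= k < m."""
    f_q = (q + 0.5) / (2.0 * m)
    return {k: math.cos(2 * math.pi * f_q * (k - (m - 1) / 2.0)) / math.sqrt(m)
            for k in range(-m, m)}

def inner(a, b, shift):
    """Inner product of a with b translated by `shift` samples."""
    return sum(v * b.get(k - shift, 0.0) for k, v in a.items())

def max_orthonormality_error(m):
    """Largest deviation from <a_q1, a_q2 shifted by n*m> = delta."""
    vecs = [lot_vector(q, m) for q in range(m)]
    worst = 0.0
    for q1 in range(m):
        for q2 in range(m):
            for n in (-1, 0, 1):   # supports overlap only for neighbouring blocks
                expected = 1.0 if (q1 == q2 and n == 0) else 0.0
                got = inner(vecs[q1], vecs[q2], n * m)
                worst = max(worst, abs(got - expected))
    return worst

errors = {m: max_orthonormality_error(m) for m in (2, 4, 8)}
```

Each vector has length 2m but is translated in steps of m, so neighbouring vectors overlap; the check confirms that they are nevertheless orthonormal.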

The only thing we miss in the above example is linear phase (linear phase means that the filter sequence is symmetric or anti-symmetric about some point). Linear phase will ensure that filter applications will preserve the support of the filter, which is a very nice property to use in an implementation. As it turns out, we can't get this in addition to orthonormality: one can show that there exist no two channel (m = 2) nontrivial, linear phase, finitely supported orthonormal subband transforms. We therefore extend our transforms to the following class.

Definition 4 A biorthogonal subband transform is a transform for which

$$\langle s_{q_1}^{(n_1)}, a_{q_2}^{(n_2)}\rangle = \delta[q_1-q_2]\,\delta[n_1-n_2], \quad 0 \le q_1, q_2 < m,\ n_1, n_2 \in \mathbb{Z}. \tag{7}$$

Contrary to the case for orthonormal transforms, there exist two channel nontrivial, linear phase, finitely supported biorthogonal subband transforms. Biorthogonal transforms are important in image compression also because they may approximate orthonormal transforms well. It is not hard to see that biorthogonality is equivalent to perfect reconstruction.

Example 3 We will construct biorthogonal subband transforms in the following way: We start with filters $h_0$, $g_0$, and construct filters $h_1$, $g_1$ using equations 5 and 4. To get a biorthogonal transform, we must construct $h_0$, $g_0$ jointly so that equation 7 is satisfied.

Alias cancellation and perfect reconstruction in this case reduce to

$$h_0(\omega+\pi)g_0(\omega) = h_0(\omega)^* g_0(\omega+\pi)^*,$$

$$h_0(\omega)g_0(\omega) + h_0(\omega+\pi)^* g_0(\omega+\pi)^* = 2.$$

These equations are normally specified in another way, using functions associated with wavelets: $m_0(\omega) = \frac{1}{\sqrt{2}} g_0(\omega)$, $\tilde{m}_0(\omega) = \frac{1}{\sqrt{2}} h_0(-\omega)$.

3.5 Multi-resolution analysis (MRA)

We turn now to the concept of constructing biorthogonal/orthonormal transforms from wavelets.

Definition 5 A multi-resolution analysis (MRA) on $L^2(\mathbb{R})$ is a set of subspaces

$$\cdots \subset V^{(2)} \subset V^{(1)} \subset V^{(0)} \subset V^{(-1)} \subset V^{(-2)} \subset \cdots$$

satisfying the following properties.

(MR-1) $\overline{\bigcup_{m\in\mathbb{Z}} V^{(m)}} = L^2(\mathbb{R})$.

(MR-2) $\bigcap_{m\in\mathbb{Z}} V^{(m)} = \{0\}$.

(MR-3) $x(t) \in V^{(0)} \iff x(2^{-m}t) \in V^{(m)}$.

(MR-4) $x(t) \in V^{(0)} \iff x(t-n) \in V^{(0)}$.

(MR-5) There exists an orthonormal basis $\{\phi_n\}_{n\in\mathbb{Z}}$ for $V^{(0)}$ such that $\phi_n(t) = \phi(t-n)$. The function $\phi(t)$ is called the scaling function.

Since $V^{(0)} \subset V^{(-1)}$, MR-3 and MR-4 show that we can write

$$\phi(t) = \sqrt{2}\sum_{n=-\infty}^{\infty} g_0[n]\,\phi(2t-n)$$

for some sequence $g_0$. $g_0$ is to be thought of as a low-pass prototype. This equation is called the two-scale equation. From the MR properties and the two-scale equation one can deduce that the vectors $\{g_0[i-2n]\}_n$ are orthonormal. As in example 1, we can associate $g_0$ with a function $f$, and obtain an orthonormal subband transform this way. The high-pass filter obtained in this way, $g_1$, can be used to construct a function

$$\psi(t) = \sqrt{2}\sum_{n=-\infty}^{\infty} g_1[n]\,\phi(2t-n).$$

This function is called the mother wavelet, and has the nice property that its translated dilations $\psi_n^{(m)}(t) = 2^{-m/2}\psi(2^{-m}t-n)$ are orthonormal functions spanning $L^2(\mathbb{R})$.

3.5.1 Interpretation of MRA in image processing

MRA has the following interpretation with respect to image processing. The input signal is represented as an element in $V^{(0)}$, by using the components of the signal as coefficients for the translates of the scaling function:

$$x(t) = \sum_{n\in\mathbb{Z}} y_0^{(0)}[n]\,\phi(t-n).$$

Define $W^{(m)}$ as the span of the $\{\psi_n^{(m)}(t)\}_n$. It is not difficult to show that

1. V(m) and W(m) are orthogonal subspaces,

2. V(m) = V(m+1) ⊕W(m+1),

3. the coefficients in such a decomposition can be obtained by filteringwith h0 and h1 respectively.

Note that points 1 and 2 imply that $V^{(0)} = \bigoplus_{i>0} W^{(i)}$. We need to explain point 3 further. Equation 1 produces, through filtering with $h_0$, $h_1$, two sequences (i.e. the two polyphase components of the transformed signal) from an input signal. We let $y_0^{(0)}$ be the input signal, and let the two sequences produced be $y_0^{(1)}$ and $y_1^{(1)}$. Then one can show (by also using the two-scale equation) that

$$\underbrace{\sum_{n\in\mathbb{Z}} y_0^{(0)}[n]\,\phi(t-n)}_{\in V^{(0)}} = \underbrace{\sum_{n\in\mathbb{Z}} y_0^{(1)}[n]\,\phi_n^{(1)}(t)}_{\in V^{(1)}} + \underbrace{\sum_{n\in\mathbb{Z}} y_1^{(1)}[n]\,\psi_n^{(1)}(t)}_{\in W^{(1)}},$$

which explains point 3. This can be done iteratively, by writing

$$\underbrace{\sum_{n\in\mathbb{Z}} y_0^{(1)}[n]\,\phi_n^{(1)}(t)}_{\in V^{(1)}} = \underbrace{\sum_{n\in\mathbb{Z}} y_0^{(2)}[n]\,\phi_n^{(2)}(t)}_{\in V^{(2)}} + \underbrace{\sum_{n\in\mathbb{Z}} y_1^{(2)}[n]\,\psi_n^{(2)}(t)}_{\in W^{(2)}},$$

and so on. Therefore, we obtain wavelet coefficients (i.e. coefficients in $W^{(m)}$) by iterative applications of the filters $h_0$, $h_1$.

The interpretation of the wavelet subspaces $W^{(i)}$ in image processing is in terms of resolution: wavelet subspaces at higher indices can be thought of as image content at lower resolution, and the subspace $V^{(m)}$ for high m can be thought of as a base for obtaining a low resolution approximation of the image. If one writes

$$V^{(0)} = V^{(2)} \oplus W^{(2)} \oplus W^{(1)},$$

and decomposes a signal $x = y_0^{(2)} + y_1^{(2)} + y_1^{(1)}$ into components in these subspaces, one has in addition two approximations to the signal: $y_0^{(2)} + y_1^{(2)}$ and $y_0^{(2)}$, where the first one is a better approximation than the last one. One can view these approximations as versions of the signal with higher frequencies dropped, since the coefficients are obtained through filtering with $h_0$, which can be viewed as a low-pass filter. The effect of dropping high frequencies in the approximation can be seen especially at sharp edges in an image. These get more blurred, since they can't be represented exactly at lower frequencies.
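The iterated decomposition and the two approximations can be illustrated with the Haar wavelet, which is used here only because its filters are the simplest possible orthonormal pair (an assumption of this sketch; it is not one of the JPEG2000 filters). The function names and the test signal are made up.

```python
import math

R2 = math.sqrt(2)

def haar_step(x):
    """One level: low-pass (sums) and high-pass (differences), downsampled."""
    s = [(x[2 * i] + x[2 * i + 1]) / R2 for i in range(len(x) // 2)]
    d = [(x[2 * i] - x[2 * i + 1]) / R2 for i in range(len(x) // 2)]
    return s, d

def haar_unstep(s, d):
    x = []
    for a, b in zip(s, d):
        x.extend([(a + b) / R2, (a - b) / R2])
    return x

def decompose(x, levels):
    """Returns (approximation, details) with details[0] belonging to level 1."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

def reconstruct(approx, details, keep_levels):
    """Rebuild the signal, zeroing detail levels not in keep_levels (1-based)."""
    x = list(approx)
    for level in range(len(details), 0, -1):
        d = details[level - 1]
        x = haar_unstep(x, d if level in keep_levels else [0.0] * len(d))
    return x

def error(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

x = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 0.0, 2.0]
approx, details = decompose(x, 2)
full = reconstruct(approx, details, {1, 2})   # all components
better = reconstruct(approx, details, {2})    # level 1 details dropped
coarse = reconstruct(approx, details, set())  # only the V(2) component
```

Here `coarse` is constant on blocks of four samples, so the jump between the first and second half of the signal survives but all finer detail is smoothed away, while `better` recovers more of the structure.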

Transforms in image processing are two-dimensional, so we need a few comments on how we implement a separable transform. When a two-dimensional transform is separable, we can calculate it by applying the corresponding one-dimensional transform to the columns first, and then to the rows. When filtering, we have four possibilities:

1. low-pass filter to rows, followed by low-pass filter to columns (LL-coefficients)

2. low-pass filter to rows, followed by high-pass filter to columns (HL-coefficients)

3. high-pass filter to rows, followed by low-pass filter to columns (LH-coefficients)

4. high-pass filter to rows, followed by high-pass filter to columns (HH-coefficients)

When a separable transform is applied, only the LL-coefficients may need further decomposition. When this decomposition is done at many levels, we get the subband decomposition in figure 2. A similar type of decomposition is sketched for FBI fingerprint compression in section 3.12. The wavelet subspace decomposition in two dimensions has a similar form:

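$$V^{(m)} = V^{(m+1)} \oplus W_{0,1}^{(m+1)} \oplus W_{1,0}^{(m+1)} \oplus W_{1,1}^{(m+1)},$$

and the mother wavelet basis functions are expressed in terms of the synthesis filters by

$$\psi_{0,1}(s_1, s_2) = 2\sum_{n_1,n_2\in\mathbb{Z}} g_0[n_1]\,g_1[n_2]\,\phi(2s_1-n_1, 2s_2-n_2),$$

$$\psi_{1,0}(s_1, s_2) = 2\sum_{n_1,n_2\in\mathbb{Z}} g_1[n_1]\,g_0[n_2]\,\phi(2s_1-n_1, 2s_2-n_2),$$

$$\psi_{1,1}(s_1, s_2) = 2\sum_{n_1,n_2\in\mathbb{Z}} g_1[n_1]\,g_1[n_2]\,\phi(2s_1-n_1, 2s_2-n_2).$$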
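One level of a separable two-dimensional transform, with the four filter combinations listed above, can be sketched as follows. The Haar filters are again assumed purely for brevity, the function names are illustrative, and the subband labels follow the list in this section (first letter for the row filter, second for the column filter as listed there).

```python
import math

R2 = math.sqrt(2)

def haar_pairs(v):
    """Low-pass and high-pass outputs of a 1-D Haar step."""
    low = [(v[2 * i] + v[2 * i + 1]) / R2 for i in range(len(v) // 2)]
    high = [(v[2 * i] - v[2 * i + 1]) / R2 for i in range(len(v) // 2)]
    return low, high

def transpose(a):
    return [list(col) for col in zip(*a)]

def filter_rows(image):
    """Filter every row; returns the low-pass and high-pass half images."""
    lows, highs = [], []
    for row in image:
        lo, hi = haar_pairs(row)
        lows.append(lo)
        highs.append(hi)
    return lows, highs

def filter_columns(image):
    lows, highs = filter_rows(transpose(image))
    return transpose(lows), transpose(highs)

def separable_step(image):
    row_low, row_high = filter_rows(image)
    ll, hl = filter_columns(row_low)    # low rows, then low / high columns
    lh, hh = filter_columns(row_high)   # high rows, then low / high columns
    return {"LL": ll, "HL": hl, "LH": lh, "HH": hh}

bands = separable_step([[1.0, 2.0], [3.0, 4.0]])
flat = separable_step([[7.0] * 4 for _ in range(4)])
```

For a constant image all energy ends up in the LL band, which is the band one would decompose further at the next level.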

Figure 2: Passband structure for a two dimensional subband transform with three levels. [The figure shows the subbands LL3, HL3, LH3, HH3, HL2, LH2, HH2, HL1, LH1 and HH1.]

Example 4 With three different resolutions, our subband decomposition can be written

$$\begin{aligned} V^{(0)} &= V^{(2)} \oplus W_{0,1}^{(2)} \oplus W_{1,0}^{(2)} \oplus W_{1,1}^{(2)} \oplus W_{0,1}^{(1)} \oplus W_{1,0}^{(1)} \oplus W_{1,1}^{(1)} \\ &= LL_2 \oplus HL_2 \oplus LH_2 \oplus HH_2 \oplus HL_1 \oplus LH_1 \oplus HH_1. \end{aligned}$$

Contributions from these subspaces appear in the same order as above in a JPEG2000 file, and including more of the subspaces results in higher resolution. We demonstrate this here with a computer generated file. Figures 3 through 9 show file sizes and images at all decomposition levels in this case. The subbands which are dropped are also shown graphically; these are blacked out. We see that we gradually lose resolution when dropping more and more wavelet subband spaces, but that even at the lowest resolution the image is recognizable, even though the file size is reduced from 105 kb to 17 kb. This is a very nice property for usage in web browsers. Note that these file sizes are calculated by replacing the contents of the dropped subbands with zeroes. They are close to the number of bytes we would get if the subbands were dropped in their entirety.

If we can find a wavelet with nice properties, most wavelet coefficients are close to 0, and can thus be dropped in a lossy compression scheme.

3.5.2 Biorthogonal wavelets

Given that orthonormality in (MR-5) is replaced with linear independence, we can follow the same reasoning as for orthonormality to create a mother wavelet function. The wavelet coefficient subspaces will not be orthogonal in this case. When iterating the filters $g_0$, $g_1$, we decompose into subspaces spanned by the scaling function and mother wavelet $\phi$, $\psi$. When we iterate the filters $h_0$, $h_1$, we decompose into subspaces spanned by the dual scaling function $\tilde{\phi}$ and dual mother wavelet $\tilde{\psi}$. Scaling and dual scaling functions differ only in the biorthogonal case. We can deduce a biorthogonal subband transform under appropriate conditions. Expressed with mother wavelets, the criterion for constructing a biorthogonal wavelet becomes

$$\langle \psi_n^{(m)}, \tilde{\psi}_{\tilde{n}}^{(\tilde{m})}\rangle = \delta[n-\tilde{n}]\,\delta[m-\tilde{m}], \quad \forall n, m, \tilde{n}, \tilde{m}.$$

Figure 3: File with no loss, i.e. all wavelet subband spaces included. Its size is 105 kb.

Figure 4: We then remove the $W_{1,1}^{(1)}$-coefficients. Its size is 94 kb.

Figure 5: We then remove the $W_{1,0}^{(1)}$-coefficients also. Its size is 84 kb.

Figure 6: We then remove the $W_{0,1}^{(1)}$-coefficients also. Its size is 73 kb.

Figure 7: We then remove the $W_{1,1}^{(2)}$-coefficients also. Its size is 57 kb.

Figure 8: We then remove the $W_{1,0}^{(2)}$-coefficients also. Its size is 37 kb.

Figure 9: Finally, we remove the $W_{0,1}^{(2)}$-coefficients also. Its size is 17 kb.

Since we assume linear phase, we will work with biorthogonal wavelets from now on.

Not all biorthogonal subband transforms and orthonormal transforms give rise to wavelets. In order for a filter bank to give rise to a wavelet, one can show that $m_0$ (defined in example 3) must have a zero at π, and that the number of zeroes affects the smoothness of the scaling function: more zeroes means a smoother scaling function.

Daubechies found all FIR filters with N zeroes at π which give rise to orthonormal wavelets. Using similar calculations, Cohen, Daubechies, and Feauveau [3] found FIR filters of odd length, linear phase and with delay-normalized transforms with $N$, $\tilde{N}$ zeroes at π (these must both be even), which give rise to biorthogonal wavelets. These are:

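$$m_0(\omega) = \left(\cos\frac{\omega}{2}\right)^{N} p_0(\cos\omega), \quad \text{and} \quad \tilde{m}_0(\omega) = \left(\cos\frac{\omega}{2}\right)^{\tilde{N}} \tilde{p}_0(\cos\omega),$$

where $p_0(x)\tilde{p}_0(x)$ is an arbitrary factorization of the polynomial (set $M = \frac{N+\tilde{N}}{2}$)

$$P(x) = \sum_{n=0}^{M-1} \binom{M+n-1}{n} \left(\frac{1-x}{2}\right)^n.$$

The factorization of the polynomial $P(x)$ is not completely arbitrary, since we must group complex conjugate roots together to get real-valued filter coefficients.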

Example 5 If we set $p_0(x) \equiv 1$, we obtain biorthogonal wavelets with filter banks consisting of dyadic fractions only. If we in addition set $N = \tilde{N} = 2$, we obtain the Spline 5/3 transform. This is used by JPEG2000 for lossless compression. One can show that definition 1 simplifies to

$$y[n] = \sqrt{2}\left( \begin{pmatrix} -\frac{1}{8} & -\frac{1}{4} \\ 0 & -\frac{1}{8} \end{pmatrix} x[n+1] + \begin{pmatrix} \frac{3}{4} & -\frac{1}{4} \\ \frac{1}{4} & \frac{3}{4} \end{pmatrix} x[n] + \begin{pmatrix} -\frac{1}{8} & 0 \\ \frac{1}{4} & -\frac{1}{8} \end{pmatrix} x[n-1] \right)$$

in this case. Similarly, definition 2 simplifies to

$$x[n] = \sqrt{2}\left( \begin{pmatrix} 0 & 0 \\ \frac{1}{4} & 0 \end{pmatrix} y[n+1] + \begin{pmatrix} \frac{1}{2} & -\frac{1}{4} \\ \frac{1}{4} & \frac{1}{2} \end{pmatrix} y[n] + \begin{pmatrix} 0 & -\frac{1}{4} \\ 0 & 0 \end{pmatrix} y[n-1] \right)$$

Example 6 If we split the zeroes at π as equally as we can, and the factors in $p_0(x)$ and $\tilde{p}_0(x)$ equally also, and also set $N = \tilde{N} = 4$, we obtain the wavelet JPEG2000 uses for lossy compression. This is the CDF 9/7 transform. For the lossless transform above, only A[−1], A[0], A[1], S[−1], S[0], S[1] were nonzero. For this lossy transform, only A[−2], A[−1], A[0], A[1], A[2], S[−2], S[−1], S[0], S[1], S[2] are nonzero.

5/3 and 9/7 above refer to the number of nonzero coefficients in the corresponding filters.
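To illustrate why a 5/3 transform is suitable for lossless compression, the sketch below uses the integer lifting formulation that is commonly given for the reversible 5/3 transform. This is a different formulation from the matrix form above and is not a transcription of those matrices; the function names, the restriction to even-length input and the mirroring at the signal ends are assumptions made for this example.

```python
def fwd53(x):
    """Forward integer 5/3 lifting on an even-length list of ints."""
    n = len(x)
    assert n >= 2 and n % 2 == 0
    half = n // 2
    d = []
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[2 * i]   # mirror at the end
        # predict: odd sample minus the floored mean of its even neighbours
        d.append(x[2 * i + 1] - ((left + right) >> 1))
    s = []
    for i in range(half):
        d_left = d[i - 1] if i > 0 else d[0]                  # mirror at the start
        # update: even sample plus a rounded quarter of the neighbouring details
        s.append(x[2 * i] + ((d_left + d[i] + 2) >> 2))
    return s, d

def inv53(s, d):
    """Exact inverse of fwd53: undo the update step, then the predict step."""
    half = len(s)
    x = [0] * (2 * half)
    for i in range(half):
        d_left = d[i - 1] if i > 0 else d[0]
        x[2 * i] = s[i] - ((d_left + d[i] + 2) >> 2)
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < 2 * half else x[2 * i]
        x[2 * i + 1] = d[i] + ((left + right) >> 1)
    return x

signal = [12, 15, 9, -3, 0, 7, 7, 200]
low, high = fwd53(signal)
restored = inv53(low, high)
```

All intermediate values are integers, and the inverse repeats the same rounded expressions with opposite sign, so the input is recovered exactly no matter how the rounding falls.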

3.6 Transform

Wavelet transform, with different wavelet kernels. The transform may be skipped in its entirety (typically done in lossless compression). The transform may also be applied fully or only partially. The DWT has an efficient implementation, both in the lossy and the lossless case.

3.7 Quantization

Deadzone scalar quantization is the quantization method used by JPEG2000. Extensions to the standard open up for another quantization scheme.
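A deadzone scalar quantizer differs from the midtread quantizer of JPEG in that the interval mapped to 0 is twice as wide as the others. A minimal sketch, where the function names and the reconstruction offset `r` are assumptions made for illustration:

```python
import math

def deadzone_quantize(coeff, step):
    """Label = sign(coeff) * floor(|coeff| / step); the zero bin is (-step, step)."""
    return int(math.copysign(math.floor(abs(coeff) / step), coeff))

def deadzone_dequantize(label, step, r=0.5):
    """Reconstruct inside the bin; r = 0.5 picks the bin midpoint (assumed)."""
    if label == 0:
        return 0.0
    return math.copysign((abs(label) + r) * step, label)

samples = [-2.7, -0.9, 0.4, 0.9, 1.0, 3.2]
labels = [deadzone_quantize(c, 1.0) for c in samples]
restored = [deadzone_dequantize(l, 1.0) for l in labels]
```

Every coefficient with magnitude below one step is mapped to 0, which fits well with wavelet coefficients that are mostly close to 0.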

3.8 Coding

MQ coding. The image is split into tiles (not smaller blocks as in JPEG). Typical size is 512×512. Each tile is decomposed into constituent parts, using particular filter banks. Demands:

1. Spatial random access into bitstream

2. Distortion scalable bitstream

3. progression scalability

4. resolution scalability

3.9 Applications to video compression

MPEG4

3.10 Applications to speech recognition

PCA (principal component analysis) is a common technique in speech recognition.

3.11 Applications to face recognition

Elastic bunch graph matching. Gabor wavelets.

Figure 10: Decomposition structure employed in the FBI fingerprint compression standard.

3.12 Applications to fingerprint compression and matching

FBI uses its own standard [1] for compressing fingerprint images. Compression algorithms with wavelet-based transformations were selected in competition with compression using fractal transformations. FBI's standard has similarities with the JPEG2000 standard, and especially with an extension to the JPEG2000 standard. It uses another subband decomposition, which is shown in figure 10.

Further decomposition of the LH-, HL- and HH-bands like this may improve compression somewhat, since the effect of the filter bank application may be thought of as an "approximative orthonormalization process". The extension to the JPEG2000 standard also opens up for this type of more general subband decomposition. In FBI's standard we may also use many different wavelets, with the coefficients of the corresponding filter banks signalled in the code-stream. The only constraint on the filters is that there should be no more than 32 nonzero coefficients. This is much longer than for lossy compression in JPEG2000 (9 nonzero coefficients). This may be necessary, since fingerprint images have many more sharp edges than most natural images. FBI has their own set of filters which they recommend. The JPEG2000 extension opens up for user-definable wavelet kernels also. The coding is done differently with FBI's standard. They use Huffman coding, with tables calculated on a per image basis. It turns out to be impossible (at least so far) to find a lossless compression algorithm for fingerprint images with a compression ratio of more than 2 : 1. Similar phenomena can be observed with JPEG2000: if an image with many sharp edges is wavelet-transformed, the compressed data may be larger than when the image is not wavelet-transformed (many small coefficients are obtained after wavelet transformation, and we would obtain compression in the lossy case only, since these would be quantized to 0 for lossy transformation). JPEG2000 solves this by not making the wavelet transform mandatory: it can be applied down to a given level only, or skipped altogether.


3.13 Other applications of wavelets

Blending of multiple images [2]. If several lighting sources are combined, the result may be obtained by combining a set of basis images. The combination can be done very fast in the wavelet domain.

References

[1] WSQ Gray-scale Fingerprint Image Compression Specification.

[2] I. Drori and D. Lischinski. Fast multiresolution image operations in the wavelet domain. IEEE Transactions on Visualization and Computer Graphics, 9(3):395–412, 2003.

[3] A. Cohen, I. Daubechies and J.-C. Feauveau. Biorthogonal bases of compactly supported wavelets. Communications on Pure and Appl. Math., 45(5):485–560, June 1992.

[4] David S. Taubman and Michael W. Marcellin. JPEG2000. Image compression. Fundamentals, standards and practice. Kluwer Academic Publishers, 2002.

[5] Khalid Sayood. Introduction to Data Compression. Academic Press, 2000.