Coherent Multiscale Image Processing using Dual-Tree Quaternion Wavelets

Wai Lam Chan, Hyeokho Choi, Richard G. Baraniuk *

Department of Electrical and Computer Engineering, Rice University

6100 S. Main St., Houston, TX 77005, USA

Submitted to IEEE Transactions on Image Processing, October 2006

Revised, November 2007

EDICS Numbers: MRP-WAVL, MDE-TRNS

Abstract

The dual-tree quaternion wavelet transform (QWT) is a new multiscale analysis tool for geometric

image features. The QWT is a near shift-invariant tight frame representation whose coefficients sport

a magnitude and three phases: two phases encode local image shifts while the third contains image

texture information. The QWT is based on an alternative theory for the 2-D Hilbert transform and can

be computed using a dual-tree filter bank with linear computational complexity. To demonstrate the

properties of the QWT’s coherent magnitude/phase representation, we develop an efficient and accurate

procedure for estimating the local geometrical structure of an image. We also develop a new multiscale

algorithm for estimating the disparity between a pair of images that is promising for image registration

and flow estimation applications. The algorithm features multiscale phase unwrapping, linear complexity,

and sub-pixel estimation accuracy.

Email: [email protected], [email protected]. Web: www.dsp.rice.edu. Phone: 713.348.5132. Fax: 713.348.5685. This workwas

supported by NSF grant CCF–0431150, ONR grant N00014-02-1-0353, AFOSR grant FA9550-04–1-0148, AFRL grant FA8650-

05-1850, and the Texas Instruments Leadership University Program.


I. INTRODUCTION

The encoding and estimation of the relative locations of image features play an important role in

many image processing applications, ranging from feature detection and target recognition to image

compression. In edge detection, for example, the goal is to locate object boundaries in an image. In image

denoising or compression, state-of-the-art techniques achieve significant performance improvements by

exploiting information on the relative locations of large transform coefficients [1]–[4].

An efficient way to compute and represent relative location information in signals is through the

phase of the Fourier transform. The Fourier shift theorem provides a simple linear relationship between
the signal shift and the Fourier phase. When only a local region of the signal is of interest, the short-time
Fourier transform (STFT) provides a local Fourier phase for each windowed portion of the signal. The

use of the Fourier phase to decipher the relative locations of image features is well established in the

image processing and computer vision communities for applications such as stereo matching and image

registration [5]–[7]. Indeed, the classic experiment of Lim and Oppenheim [8] demonstrated that for

natural images the Fourier phase contains a wealth of information beyond the magnitude. By the Fourier

shift theorem, estimating location information using local phase provides more robust estimates with sub-

pixel accuracy and requires less computational effort compared to purely amplitude-based approaches.

For signals containing isolated singularities, such as piecewise smooth functions, the discrete wavelet

transform (DWT) has proven to be more efficient than the STFT. The locality and zooming properties of
the wavelet basis functions lead to a sparse representation of such signals that compacts the signal energy

into a small number of coefficients. Wavelet coefficient sparsity is the key enabler of algorithms such

as wavelet-based denoising by shrinkage [9]. Many natural images consist of smooth or textured regions

separated by edges and are well-suited to wavelet analysis and representation. Other advantages of wavelet

analysis include its multiscale structure, invertibility, and linear complexity filter-bank implementation.

2-D DWT basis functions are easily formed as the tensor products of 1-D DWT basis functions along the

vertical and horizontal directions; see Figure 1.

The conventional, real-valued DWT, however, suffers from two drawbacks. The first drawback is shift-

variance: a small shift of the signal causes significant fluctuations in wavelet coefficient energy, making

it difficult to extract or model signal information from the coefficient values. The second drawback is

the lack of a notion of phase to encode signal location information as in the Fourier case.

Complex wavelet transforms (CWTs) provide an avenue to remedy these two drawbacks of the DWT.

It is interesting to note that the earliest modern wavelets, those of Grossmann and Morlet [10], were in


Fig. 1. Three real wavelets (from the horizontal, vertical, and diagonal subbands, respectively) from the 2-D DWT

basis generated using the length-14 Daubechies filter.

fact complex, and Grossmann continually emphasized the power of the CWT phase for signal analysis

and representation. Subsequent researchers have developed orthogonal or biorthogonal CWTs; see, for

example [11]–[16].

A productive line of research has developed over the past decade on the dual-tree CWT, which in 1-D

combines two orthogonal or biorthogonal wavelet bases using complex algebra into a single system, with

one basis corresponding to the “real part” of the complex wavelet and the other to the “imaginary part”

[17]. Ideally, the real and imaginary wavelets are a Hilbert transform pair (90° out of phase) and form an

analytic wavelet supported on only the positive frequencies in the Fourier domain, just like the cosine and

sine components of a complex sinusoid. The 1-D dual-tree CWT is a slightly (2×) redundant tight frame,

and the magnitudes of its coefficients are nearly shift invariant [17]. There also exists an approximately

linear relationship between the dual-tree CWT phase and the locations of 1-D signal singularities [18]

as in the Fourier shift theorem.

The 2-D dual-tree CWT for images is based on the theory of the 2-D Hilbert transform (HT) and

2-D analytic signal as first suggested by Hahn [19]. In particular, a 2-D dual-tree complex wavelet is

formed using the 1-D HT of the usual 2-D real DWT wavelets in the horizontal and/or vertical directions.

The result is a 4× redundant tight frame with six directional subbands oriented at ±15°, ±45°, and ±75°; see

Figure 2 [17], [20]. The 2-D CWT is near shift-invariant, and its magnitude-phase representation has a

complex phase component that encodes shifts of local 1-D structures in images such as edges and ridges

[21]. As a result, the 2-D dual-tree CWT has proved useful for a variety of tasks in image processing

[3], [4], [21], [22].

Each 2-D dual-tree CWT basis function has a single phase angle, which encodes the 1-D shift of image

features perpendicular to its orientation. This may be sufficient for analyzing local 1-D structures such

as edges. However, when the feature under analysis is intrinsically 2-D [25] — for example an image

T-junction [26] — then its relative location is defined in both the horizontal and vertical directions.

This causes ambiguity in the CWT phase shift, whereby we cannot resolve the image shifts in both the


Fig. 2. Six complex wavelets from the 2-D dual-tree CWT frame generated from orthogonal near-symmetric filters

[23] in the first stage and Q-shift filters [24] in subsequent stages. (a) Real parts, with approximate even symmetry;

(b) imaginary parts, with approximate odd symmetry.

horizontal and vertical directions from the change of only one CWT coefficient phase. To overcome this

ambiguity, we must conduct a joint analysis with two CWT phases from differently oriented subbands,

which can complicate image analysis and modeling considerably.

In this paper, we explore an alternative theory of the 2-D HT and analytic signal due to Bulow [25],

[27] and show that it leads to an alternative to the 2-D dual-tree CWT. In Bulow’s HT, the 2-D analytic

signal is defined by limiting the 2-D Fourier spectrum to a single quadrant. Applying this theory within

the dual-tree framework, we develop and study a new dual-tree quaternion wavelet transform (QWT),

where each quaternion wavelet consists of a real part (a usual real DWT wavelet) and three imaginary

parts that are organized by quaternion algebra; see Figure 3. Our QWT, first proposed in [28] and [29],

is a 4× redundant tight frame with three subbands (horizontal, vertical, and diagonal). It is also near

shift-invariant.

The QWT inherits a quaternion magnitude-phase representation from the quaternion Fourier transform (QFT). The first two QWT phases (θ1, θ2) encode the shifts of image features in the absolute horizontal/vertical coordinate system, while the third phase θ3 encodes edge orientation mixtures and texture information. One major focus of this paper is to demonstrate coherent, multiscale processing

using the QWT, or in other words, the use of its magnitude and phase for multiscale image analysis.

To illustrate the power of coherent processing, we consider two image processing applications. In the first application, we develop a new magnitude-and-phase-based algorithm for edge orientation and offset estimation in local image blocks. Our algorithm is entirely based on the QWT shift theorem and

the interpretation of the QWT as a local QFT analysis. In the second application, we design a new

multiscale image disparity estimation algorithm. The QWT provides a natural multiscale framework


Fig. 3. Three quaternion wavelets from the 2-D dual-tree QWT frame. Each quaternion wavelet comprises four components that are 90° phase shifts of each other in the vertical, horizontal, and both directions. (a) Horizontal subband, from left to right: φh(x)ψh(y) (a usual, real DWT tensor wavelet), φh(x)ψg(y), φg(x)ψh(y), φg(x)ψg(y), |ψ^H(x, y)|. (b) Vertical subband, from left to right: ψh(x)φh(y) (a usual, real DWT tensor wavelet), ψh(x)φg(y), ψg(x)φh(y), ψg(x)φg(y), |ψ^V(x, y)|. (c) Diagonal subband, from left to right: ψh(x)ψh(y) (a usual, real DWT tensor wavelet), ψh(x)ψg(y), ψg(x)ψh(y), ψg(x)ψg(y), |ψ^D(x, y)|. The image on the far right of each row is the quaternion wavelet magnitude for that subband, a non-oscillating function. The same dual-tree wavelet filters are used as in the 2-D dual-tree CWT of Figure 2.

for measuring and adjusting local disparities and performing phase unwrapping from coarse to fine

scales with linear computational efficiency. The convenient QWT encoding of location information in

the absolute horizontal/vertical coordinate system facilitates averaging across subband estimates for more

robust performance. Our algorithm offers sub-pixel estimation accuracy and runs faster than existing

disparity estimation algorithms like block matching and phase correlation [30]. When many sharp edges

and features are present and the underlying image disparity field is smooth, our method also exhibits

superior performance over these existing techniques.

Previous work in quaternions and the theory of 2-D HT and analytic signals for image processing

includes Bulow’s extension of the Fourier transform and complex Gabor filters to the quaternion Fourier

transform (QFT) [25]. Our QWT can be interpreted as a local QFT and thus inherits many of its interesting


and useful theoretical properties such as the quaternion phase representation, symmetry properties, and the

shift theorem. In addition, the dual-tree QWT sports a linear-time and invertible computational algorithm.

A different extension of the QFT yields the quaternion wavelet pyramid introduced by Bayro-Corrochano [31]; however, the use of Gabor filters limits its performance and renders it non-invertible. There are also interesting connections between the dual-tree QWT and the (non-redundant) quaternion wavelet representations of Ates and Orchard [32] and Hua and Orchard [33].

Finally, we note that there exists a third alternative 2-D HT called the Riesz transform and its associated analytic signal called the monogenic signal [34]. The monogenic signal, generated by spherical quadrature filters, has a vector-valued phase that encodes both the orientations of intrinsically 1-D (edge-like) image features and their shifts normal to the edge orientation. Felsberg's extension of the monogenic signal can analyze intrinsically 2-D signals with two phase angles but requires complicated processing such as local

orientation estimation and steering of basis filters [35].

This paper is organized as follows. We start by briefly reviewing the DWT and dual-tree CWT in

Section II. Section III develops the dual-tree QWT, and Section IV discusses some of its important

properties, in particular its phase response to singularities. We develop and demonstrate the QWT-based

edge geometry and disparity estimation algorithms in Section V. Section VI concludes the paper with

a discussion of the QWT’s potential for future applications. The appendix contains detailed derivations

and proofs of some of the QWT properties and theorems from Sections IV and V.

II. REAL AND COMPLEX WAVELET TRANSFORMS

This section overviews the real DWT and the dual-tree CWT. We also develop a new formulation

for the 2-D dual-tree CWT using the theory of 2-D HTs.

A. Real DWT

The real DWT represents a 1-D real-valued signal f(t) in terms of shifted versions of a scaling function φ(t) and shifted and scaled versions of a wavelet function ψ(t) [36]. The functions φ_{L,p}(t) = 2^{L/2} φ(2^L t − p) and ψ_{ℓ,p}(t) = 2^{ℓ/2} ψ(2^ℓ t − p), ℓ ≥ L, p ∈ Z, form an orthonormal basis, and we can represent any f(t) ∈ L²(R) as

f(t) = Σ_{p∈Z} c_{L,p} φ_{L,p}(t) + Σ_{ℓ≥L, p∈Z} d_{ℓ,p} ψ_{ℓ,p}(t),    (1)


where c_{L,p} = ∫ f(t) φ_{L,p}(t) dt and d_{ℓ,p} = ∫ f(t) ψ_{ℓ,p}(t) dt are the scaling and wavelet coefficients, respectively. The parameter L sets the coarsest scale space that is spanned by φ_{L,p}(t). Behind each wavelet transform is a filter bank based on lowpass and highpass filters.
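To make the analysis/synthesis pair in (1) concrete, the following sketch computes the coefficients of a 1-D signal with an off-the-shelf filter-bank implementation. PyWavelets and the 'db7' (length-14 Daubechies) filter are illustrative choices, not part of the paper's construction.

```python
import numpy as np
import pywt  # PyWavelets: generic DWT filter banks (illustrative stand-in)

# Analyze a 1-D signal f(t) as in Eq. (1): coarse scaling coefficients c_{L,p}
# plus wavelet coefficients d_{l,p} at the finer scales.
f = np.random.randn(512)
coeffs = pywt.wavedec(f, 'db7', level=4)   # [c_L, d_L, ..., d_1], coarse to fine
c_L, details = coeffs[0], coeffs[1:]

# The orthonormal basis guarantees perfect reconstruction with linear complexity.
f_rec = pywt.waverec(coeffs, 'db7')
assert np.allclose(f, f_rec)
```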

The standard real 2-D DWT is obtained using tensor products of 1-D DWTs over the horizontal and

vertical dimensions. The result is the scaling function φ(x)φ(y) and three subband wavelets φ(x)ψ(y), ψ(x)φ(y), and ψ(x)ψ(y) that are oriented in the horizontal, vertical, and diagonal directions, respectively

[36] (see Figure 1).

The real wavelet transform suffers from shift variance; that is, a small shift in the signal can greatly

perturb the magnitude of wavelet coefficients around singularities. It also lacks a notion of phase to

encode signal location information and suffers from aliasing [37]. These issues complicate modeling and

information extraction in the wavelet domain.

B. Dual-tree CWT

The 1-D dual-tree CWT expands a real-valued signal in terms of two sets of wavelet and scaling

functions obtained from two independent filterbanks [17], as shown in Figure 4. We will use the notation

φh(t) and ψh(t) to denote the scaling and wavelet functions and c^h_{L,p} and d^h_{ℓ,p} to denote their corresponding coefficients, where h specifies a particular set of wavelet filters. The wavelet functions ψh(t) and ψg(t) from the two trees play the role of the real and imaginary parts of a complex analytic wavelet ψc(t) = ψh(t) + jψg(t). The imaginary wavelet ψg(t) is the 1-D HT of the real wavelet ψh(t). The combined system is a 2× redundant tight frame that, by virtue of the fact that |ψc(t)| is non-oscillating, is near shift-invariant.¹

It is useful to recall that the Fourier transform of the imaginary wavelet, Ψg(ω), equals −jΨh(ω) when ω > 0 and jΨh(ω) when ω < 0. Thus, the Fourier transform of the complex wavelet function, Ψh(ω) + jΨg(ω) = Ψc(ω), has no energy (or little in practice) in the negative frequency region,² making

it an analytic signal [17].
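A minimal sketch of the dual-tree idea follows: run two real DWTs in parallel and pair their outputs as the real and imaginary parts d^h + j d^g. The wavelet names 'db7' and 'sym7' are placeholders only; a true dual tree requires filter pairs designed so that ψg is approximately the 1-D HT of ψh (such as the filters used for Figure 2), which generic wavelets are not.

```python
import numpy as np
import pywt  # illustrative generic DWT backend

def dual_tree_cwt_1d(signal, wav_h='db7', wav_g='sym7', level=4):
    """Sketch of the 1-D dual-tree CWT of Figure 4: two real wavelet
    transforms whose detail coefficients are paired as d^h + j d^g.
    wav_h / wav_g are stand-ins, NOT an approximate Hilbert pair."""
    tree_h = pywt.wavedec(signal, wav_h, level=level)
    tree_g = pywt.wavedec(signal, wav_g, level=level)
    scaling = (tree_h[0], tree_g[0])                 # c^h_{L,p}, c^g_{L,p}
    complex_coeffs = [dh + 1j * dg                   # d^h_{l,p} + j d^g_{l,p}
                      for dh, dg in zip(tree_h[1:], tree_g[1:])]
    return scaling, complex_coeffs
```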

¹A finitely supported function can never be exactly analytic [37]. In practice, we can only design finite-length complex wavelets that are approximately analytic, and thus the CWT is only approximately shift-invariant [17], [20].
²Note that the Fourier transform of the complex wavelet function, Ψh(ω) + jΨg(ω) = Ψc(ω), is only approximately analytic in practice, and so its support will leak into the negative frequency region.


Fig. 4. The 1-D dual-tree CWT is implemented using a pair of filter banks operating on the same data simultaneously. Outputs of the filter banks are the dual-tree scaling coefficients, c^h_p and c^g_p, and the wavelet coefficients, d^h_{ℓ,p} and d^g_{ℓ,p}, at scale ℓ and shift p. The CWT coefficients are then obtained as d^h_{ℓ,p} + j d^g_{ℓ,p}.

C. Hilbert transforms and 2-D CWT

Extending the 1-D CWT to 2-D requires an extension of the HT and analytic signal. There exist not

one but several different definitions of the 2-D analytic signal that each zero out a different portion of the

2-D frequency plane [27]. We will consider two definitions. The first, proposed by Hahn in [19], employs

complex algebra and zeros out frequencies on all but a single quadrant (kx, ky > 0, for example, where (kx, ky) indexes the 2-D frequency plane). In this formulation, the complete 2-D analytic signal consists of two parts: one having spectrum on the upper right quadrant (kx, ky > 0) and the other on the upper

left quadrant (kx < 0, ky > 0) [27].

Definition 1 [19] Let f be a real-valued, 2-D function. The complete 2-D complex analytic signal is defined in the space domain, x = (x, y), as the pair of complex signals

f_{A1}(x) = [f(x) − f_{Hi}(x)] + j[f_{Hi1}(x) + f_{Hi2}(x)],    (2)

f_{A2}(x) = [f(x) + f_{Hi}(x)] + j[f_{Hi1}(x) − f_{Hi2}(x)],    (3)

where

f_{Hi1}(x) = f(x) ∗∗ δ(y)/(πx),    (4)

f_{Hi2}(x) = f(x) ∗∗ δ(x)/(πy),    (5)

f_{Hi}(x) = f(x) ∗∗ 1/(π²xy).    (6)

The function f_{Hi} is the total HT; the functions f_{Hi1} and f_{Hi2} are the partial HTs; δ(x) and δ(y) are impulse sheets along the y-axis and x-axis, respectively; and ∗∗ denotes 2-D convolution.


The 2-D complex analytic signal in (2)–(3) is the notion behind the 2-D dual-tree CWT [17], [20].

Each 2-D CWT basis function is a 2-D complex analytic signal consisting of a standard DWT tensor

wavelet plus three additional real wavelets obtained by 1-D HTs along either or both coordinates. For example, starting from the real DWT's diagonal-subband tensor product wavelet f(x) = ψh(x)ψh(y), we obtain from equations (4)–(6) its partial and total HTs

(f_{Hi1}, f_{Hi2}, f_{Hi}) = (ψg(x)ψh(y), ψh(x)ψg(y), ψg(x)ψg(y)).

From Definition 1, we then obtain the two complex wavelets

ψ_{c1}(x, y) = (ψh(x)ψh(y) − ψg(x)ψg(y)) + j(ψh(x)ψg(y) + ψg(x)ψh(y)),    (7)

ψ_{c2}(x, y) = (ψh(x)ψh(y) + ψg(x)ψg(y)) + j(ψh(x)ψg(y) − ψg(x)ψh(y)),    (8)

having orientations +45° and −45°, respectively. Similar expressions can be obtained for the other two subbands (±15° and ±75°) based on ψh(x)φh(y) and φh(x)ψh(y).

Each 2-D CWT coefficient has only a single phase angle, which encodes the 1-D shift of image

features perpendicular to its subband direction. Figure 5(a) illustrates this phase-shift property. This

encoding may be sufficient for local 1-D structures such as edges, since we can define edge shifts

uniquely by a single value, say r, in the direction perpendicular to the edge. However, even in this case, the analysis is not so straightforward when the edge does not align with the six orientations of the CWT

subbands. Moreover, shifts of intrinsically 2-D (non-edge) image features such as in Figure 5(a) require

two values (r1, r2) in the x and y directions, respectively. This creates ambiguity in the CWT phase shift.

We can resolve this ambiguity by using the coefficients from two CWT subbands, but this complicates

the use of the CWT for image analysis, modeling, and other image processing applications. In contrast,

Figure 5(b) illustrates a more convenient encoding of image shifts in absolute x, y-coordinates (with two

phase angles) using the quaternion phases of our new QWT, to which we now turn our attention.

III. QUATERNION WAVELET TRANSFORM (QWT)

A. Quaternion Hilbert transform

There are several alternatives to the 2-D analytic signal of Definition 1; we focus here on one due to Bulow [27]. It combines the partial and total HTs from (4)–(6) to form an analytic signal comprising a real part and three imaginary components that are manipulated using quaternion algebra [25].

The set of quaternions H = {a + j1 b + j2 c + j3 d | a, b, c, d ∈ R} has multiplication rules j1 j2 = −j2 j1 = j3 and j1² = j2² = −1, as well as component-wise addition and multiplication by real numbers [38].


Fig. 5. (a) The CWT coefficient's single phase angle responds linearly to image shift r in a direction orthogonal to the wavelet's orientation. (b) Two of the QWT coefficient's three phase angles respond linearly to image shifts (r1, r2) in an absolute horizontal/vertical coordinate system.

Additional multiplication rules include j3² = −1, j2 j3 = −j3 j2 = j1, and j3 j1 = −j1 j3 = j2. Note that quaternionic multiplication is not commutative. The conjugate q* of a quaternion q = a + j1 b + j2 c + j3 d is defined by q* = a − j1 b − j2 c − j3 d, while the magnitude is defined as |q| = √(q q*).

An alternative representation for a quaternion is through its magnitude and three phase angles: q = |q| e^{j1 θ1} e^{j3 θ3} e^{j2 θ2} [25], where (θ1, θ2, θ3) are the quaternion phase angles, computed using the following formulae (for q normalized, i.e., |q| = 1):

θ3 = −(1/2) arcsin(2(bc − ad)),    (9)

and, in the regular case (i.e., when θ3 ∈ (−π/4, π/4)),

θ1 = (1/2) arctan( 2(ab + cd) / (a² − b² + c² − d²) ),    (10)

θ2 = (1/2) arctan( 2(ac + bd) / (a² + b² − c² − d²) ).    (11)

In the singular case, i.e., when θ3 = ±π/4, θ1 and θ2 are not uniquely defined: only the sum (if θ3 = −π/4) or the difference (if θ3 = π/4) of θ1 and θ2 is unique [25]. If the (θ1, θ2, θ3) calculated from (9)–(11) satisfy e^{j1 θ1} e^{j3 θ3} e^{j2 θ2} = −q, subtract π from θ1 if θ1 ≥ 0 and add π to θ1 if θ1 < 0. As a result, each quaternion phase angle is uniquely defined within the range (θ1, θ2, θ3) ∈ [−π, π) × [−π/2, π/2) × [−π/4, π/4].

The operation of conjugation in the usual set of complex numbers, C = {a + jb | a, b ∈ R, j² = −1}, is a so-called algebra involution that fulfills the following two properties for any z, w ∈ C:


(z*)* = z and (wz)* = w*z*. In H, there are three nontrivial algebra involutions:

α1(q) = −j1 q j1 = a + j1 b − j2 c − j3 d,    (12)

α2(q) = −j2 q j2 = a − j1 b + j2 c − j3 d,    (13)

α3(q) = −j3 q j3 = a − j1 b − j2 c + j3 d.    (14)

Using these involutions we can extend the definition of Hermitian symmetry. A function f : R² → H is called quaternionic Hermitian if, for each (x, y) ∈ R²,

f(−x, y) = α2(f(x, y)),  f(x, −y) = α1(f(x, y)),  and  f(−x, −y) = α3(f(x, y)).    (15)

Bulow introduces an alternative definition of the 2-D analytic signal based on the quaternion Fourier transform (QFT) [25]. The QFT of a 2-D signal f is given by

F^q{f} = F^q(u) = ∫_{R²} e^{−j1 2πux} f(x) e^{−j2 2πvy} dx,    (16)

where F^q{·} denotes the QFT operator, u = (u, v) indexes the QFT domain, and the quaternion exponential

e^{−j1 2πux} e^{−j2 2πvy}    (17)

is the QFT basis function. The real part of (17) is cos(2πux) cos(2πvy), while the other three quaternionic components are its partial and total HTs as defined in (4)–(6). Note that the QFT of a real-valued signal is quaternionic Hermitian, and each QFT basis function satisfies the definition of a 2-D quaternion analytic

signal.
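As an aside on numerics, the four components of the QFT of a real image can be assembled from two passes of the standard FFT, one along each axis; this is a sketch under the convention of (16)–(17), not code from the paper.

```python
import numpy as np

def qft(img):
    """Discrete QFT of a real image f[y, x] per Eq. (16): returns four real
    arrays (A, B, C, D) such that F^q = A + j1 B + j2 C + j3 D."""
    gx = np.fft.fft(img, axis=1)      # over x: real part = sum f cos, imag = -sum f sin
    fa = np.fft.fft(gx.real, axis=0)  # then over y
    fb = np.fft.fft(gx.imag, axis=0)
    A, C = fa.real, fa.imag           # A =  sum f cos(2pi ux) cos(2pi vy),  C = -sum f cos*sin
    B, D = fb.real, fb.imag           # B = -sum f sin(2pi ux) cos(2pi vy),  D =  sum f sin*sin
    return A, B, C, D
```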

Definition 2 [27] Let f be a real-valued 2-D signal. The 2-D quaternion analytic signal is defined as

f_{qA}(x) = f(x) + j1 f_{Hi1}(x) + j2 f_{Hi2}(x) + j3 f_{Hi}(x),    (18)

where the functions f_{Hi1}, f_{Hi2}, and f_{Hi} are defined as in (4)–(6).

B. QWT construction

Our new 2-D dual-tree QWT rests on the quaternion definition of 2-D analytic signal. By organizing

the four quadrature components of a 2-D wavelet (the real wavelet and its 2-D HTs) as a quaternion, we

obtain a 2-D analytic wavelet and its associated quaternion wavelet transform (QWT). For example, for the diagonal subband, with (f, f_{Hi1}, f_{Hi2}, f_{Hi}) = (ψh(x)ψh(y), ψg(x)ψh(y), ψh(x)ψg(y), ψg(x)ψg(y)), we obtain the quaternion wavelet

ψ^D(x, y) = ψh(x)ψh(y) + j1 ψg(x)ψh(y) + j2 ψh(x)ψg(y) + j3 ψg(x)ψg(y).    (19)


Fig. 6. Quaternion Fourier domain relationships among the four quadrature components of a quaternion wavelet ψ^D(x, y) in the diagonal subband: (a) F^q{ψh(x)ψh(y)}; (b) F^q{ψg(x)ψh(y)}; (c) F^q{ψh(x)ψg(y)}; (d) F^q{ψg(x)ψg(y)}. The QFT spectra of the real wavelet ψh(x)ψh(y) in the first to fourth quadrants are denoted by (F_{S1}, F_{S2}, F_{S3}, F_{S4}), respectively. The partial and total Hilbert transform operations are equivalent to multiplying the quadrants of F^q{ψh(x)ψh(y)} in (a) by ±j1, or ±j2, or both.

To compute the QWT coefficients, we can use a separable 2-D implementation [20] of the dual-tree

filter bank in Figure 4. During each stage of filtering, we independently apply the two sets of h and g wavelet filters to each dimension (x and y) of a 2-D image; for instance, applying the set of filters h to both dimensions yields the scaling coefficients c^{hh}_{L,p} and the diagonal, vertical, and horizontal wavelet coefficients d^{D,hh}_{ℓ,p}, d^{V,hh}_{ℓ,p}, and d^{H,hh}_{ℓ,p}, respectively. Therefore, the resulting 2-D dual-tree implementation comprises four independent filter banks (corresponding to all possible combinations of wavelet filters applied to each dimension: hh, hg, gh, and gg) operating on the same 2-D image. We combine the wavelet coefficients of the same subband from the output of each filter bank using quaternion algebra to obtain the QWT coefficients; for example, for the diagonal subband, d^D_{ℓ,p} = d^{D,hh}_{ℓ,p} + j1 d^{D,gh}_{ℓ,p} + j2 d^{D,hg}_{ℓ,p} + j3 d^{D,gg}_{ℓ,p}.
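The sketch below mirrors this organization with generic separable filter banks: four separable 2-D DWTs (one per filter combination hh, gh, hg, gg) whose same-subband outputs are stacked as the four quaternion components. The wavelet names are placeholders with equal filter lengths; the paper's dual-tree filters, for which g is an approximate Hilbert pair of h, would be substituted in practice.

```python
import numpy as np
import pywt  # generic DWT filters used as placeholders for the dual-tree pair (h, g)

def sep_dwt2(img, wav_x, wav_y):
    """One level of a separable 2-D DWT, allowing different filters per axis.
    Returns (ll, (H, V, D)): H = phi(x)psi(y), V = psi(x)phi(y), D = psi(x)psi(y)."""
    lo_x, hi_x = pywt.dwt(img, wav_x, axis=1)   # filter/decimate along x
    ll, H = pywt.dwt(lo_x, wav_y, axis=0)       # then along y
    V, D = pywt.dwt(hi_x, wav_y, axis=0)
    return ll, (H, V, D)

def qwt_level(img, h='db7', g='sym7'):
    """One scale of QWT components per subband, keyed by filter combination
    (first letter = filter along x, second = filter along y), plus the
    quaternion magnitude of the diagonal subband."""
    out = {}
    for key, (wx, wy) in {'hh': (h, h), 'gh': (g, h), 'hg': (h, g), 'gg': (g, g)}.items():
        _, (H, V, D) = sep_dwt2(img, wx, wy)
        out[key] = {'H': H, 'V': V, 'D': D}
    # |q| = sqrt(a^2 + b^2 + c^2 + d^2) for d^D = d_hh + j1 d_gh + j2 d_hg + j3 d_gg
    mag_D = np.sqrt(sum(out[k]['D'] ** 2 for k in ('hh', 'gh', 'hg', 'gg')))
    return out, mag_D
```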

Figure 3(c) illustrates the four components of a quaternion wavelet and its quaternion magnitude for the diagonal subband. The partial and total HT components resemble ψh(x)ψh(y) but are phase-shifted by 90° in the horizontal, vertical, and both directions. The magnitude of each quaternion wavelet (square root of the sum-of-squares of all four components) is a smooth bell-shaped function. We can also interpret the four components of ψ^D(x, y) in the QFT domain as multiplying the quadrants of the QFT of ψh(x)ψh(y) by ±j1 or ±j2, or both, as shown in Figure 6. Note that the order of multiplication is important because quaternion multiplication is non-commutative. This quaternion wavelet, ψ^D(x, y), has

support in only a single quadrant of the QFT domain (see Appendix A).

The construction and properties are similar for the other two subband quaternion wavelets based on

φh(x)ψh(y) and ψh(x)φh(y) (see the horizontal and vertical subband wavelets, ψ^H(x, y) and ψ^V(x, y), in


Figures 3(a) and (b), respectively). In summary, in contrast with the six complex pairs of CWT wavelets

(12 functions in total), the QWT sports three quaternion sets of four QWT wavelets (12 functions in

total).

Finally, note that the quaternion wavelet transform is approximately a windowed quaternion Fourier

transform (QFT) [25]. In contrast to the QFT in (16), the basis functions for the QWT are scaled and

shifted versions of the quaternion wavelets (ψh(x) + j1 ψg(x))(φh(y) + j2 φg(y)), (φh(x) + j1 φg(x))(ψh(y) + j2 ψg(y)), and (ψh(x) + j1 ψg(x))(ψh(y) + j2 ψg(y)).

IV. QWT PROPERTIES

Since the dual-tree QWT is based on combining 1-D CWT functions, it preserves many of the

attractive properties of the CWT. Furthermore, the quaternion organization and manipulation provide

new features not present in either the 2-D DWT or CWT. In this section, we discuss some of the key

properties of the QWT with special emphasis on its phase.

A. Tight frame

The QWT comprises four orthonormal basis sets and thus forms a 4× redundant tight frame. The components of the QWT wavelets at each scale can be organized in matrix form as

G = [ ψh(x)ψh(y)   ψh(x)φh(y)   φh(x)ψh(y)
      ψg(x)ψh(y)   ψg(x)φh(y)   φg(x)ψh(y)
      ψh(x)ψg(y)   ψh(x)φg(y)   φh(x)ψg(y)
      ψg(x)ψg(y)   ψg(x)φg(y)   φg(x)ψg(y) ].    (20)

The frame contains shifted and scaled versions of the functions in G plus the scaling functions. Each column of the matrix G contains the four components of the quaternion wavelet corresponding to a subband of the QWT. For example, the first column contains the quaternion wavelet components in Figure 3(c), that is, the tensor product wavelet ψh(x)ψh(y) and its 2-D partial and total HTs in (19). Each row of G contains the wavelet functions necessary to form one orthonormal basis set. Since G has four rows, the total system is a 4× redundant tight frame. An important consequence is that the QWT is stably invertible. The wavelet coefficients corresponding to the projections onto the functions in G can

be computed using a 2-D dual-tree filter bank with linear computational complexity.


B. Relationship to the 2-D CWT

A unitary transformation links the QWT frame elements and coefficients and the 2-D CWT frame

elements and coefficients. The components of the CWT wavelets at each scale can be organized in matrix

form as

C = (1/√2) [ ψh(x)ψh(y) + ψg(x)ψg(y)   ψh(x)φh(y) + ψg(x)φg(y)   φh(x)ψh(y) + φg(x)ψg(y)
             ψg(x)ψh(y) + ψh(x)ψg(y)   ψg(x)φh(y) + ψh(x)φg(y)   φg(x)ψh(y) + φh(x)ψg(y)
             ψg(x)ψh(y) − ψh(x)ψg(y)   ψg(x)φh(y) − ψh(x)φg(y)   φg(x)ψh(y) − φh(x)ψg(y)
             ψh(x)ψh(y) − ψg(x)ψg(y)   ψh(x)φh(y) − ψg(x)φg(y)   φh(x)ψh(y) − φg(x)ψg(y) ].    (21)

The columns of C contain the complex wavelets oriented at ±45°, ±15°, and ±75°, respectively.

We obtain the CWT wavelets by multiplying the matrix G in (20) by the unitary matrix

U = (1/√2) [ 1   0   0   1
             0   1   1   0
             0   1  −1   0
             1   0   0  −1 ].    (22)

Since C = UG, the CWT also satisfies the tight-frame property with the same 4× redundancy factor. As we will see in Section IV-D, both the CWT phase and the QWT phases encode 2-D image feature shifts; however, there exists no straightforward relationship between the phase angles of the QWT and

CWT coefficients.

C. QWT and QFT

To make concrete the interpretation of the QWT as a local QFT, we derive a QFT Plancherel theorem

and an inner product formula in the QFT domain.

QFT Plancherel Theorem. Let f(x) be a real-valued 2-D signal, let w(x) be a quaternion-valued 2-D signal, and let F^q(u) and W^q(u) be their respective QFTs. Then the inner product of f(x) and w(x) in the space domain equals the following inner product in the QFT domain:

∫_{R²} f(x) w(x) dx = ∫_{R²} [ α3(F^q_e(u)) W^q(u) + α3(F^q_o(u)) α2(W^q(u)) ] du,    (23)

where α3(·) and α2(·) are the algebra involutions defined in (12)–(14). The functions F^q_e and F^q_o are, respectively, the even and odd components of F^q with respect to the spatial coordinate y, as defined in


the QFT convolution theorem [25]

F^q_e(u) = ∫_{R²} e^{−j1 2πux} f(x) cos(2πvy) dx,    (24)

F^q_o(u) = ∫_{R²} e^{−j1 2πux} f(x) j2 sin(−2πvy) dx.    (25)

We call the right side of (23) the QFT inner product between F^q(u) and W^q(u). The proof appears

in Appendix B.

The QFT Plancherel Theorem enables us to interpret the QWT as a local or windowed QFT. Let w(x) be a quaternion wavelet at a particular scale and subband and f(x) be a real image under analysis. Their QFT inner product in (23) gives the corresponding QWT coefficient. Since quaternion wavelets have a single-quadrant QFT spectrum as shown in Appendix A, the integration limit of the QFT inner product reduces to S1 = {(u, v) : u ≥ 0, v ≥ 0} in the QWT case.

D. QWT phase properties

Recall from Section III-A that each QWT coefficient can be expressed in terms of its magnitude and

phase as q = |q| e^{j1 θ1} e^{j3 θ3} e^{j2 θ2}. We seek a shift theorem for the QWT phase that is analogous to that

for the CWT. Since the QWT performs a local QFT analysis, the shift theorem for the QFT [25] holds

approximately for the QWT. When we shift an image from f(x) to f(x − r), the QFT phase undergoes the following transformation:

(θ1(u), θ2(u), θ3(u)) → (θ1(u) − 2πu r1, θ2(u) − 2πv r2, θ3(u)),    (26)

where r = (r1, r2) denotes the shift in the horizontal/vertical spatial coordinate system.

To transfer the shift theorem from the QFT to the QWT, we exploit the fact that the QWT is

approximately a windowed QFT. That is, each quaternion wavelet is approximately a windowed quaternion

exponential from (17), and each QWT coefficient is the inner product (as in (23)) between this windowed

quaternion exponential and the image. The scale ℓ of analysis controls the center frequency (u′, v′) of

the windowed exponential in the QFT plane.

The magnitude and phase of the resulting coefficient are determined by two factors: the spectral

content of the image F^q(u) and the center frequency (u′, v′) of the wavelet. These two factors determine the frequency parameters (u, v) we should use in the shift theorem for the QFT (26). We term (u, v) the effective center frequency for the corresponding wavelet coefficient. Fortunately, for images having a


Fig. 7. Effect of varying θ3 on the structure of the corresponding weighted quaternion wavelet d^D_{ℓ,p} ψ^D_{ℓ,p} for the diagonal subband (from left to right): θ3 = −π/4, −π/8, 0, π/8, π/4. The corresponding weighted wavelet changes from textured (θ3 = 0) to oriented (θ3 = ±π/4).

smooth spectrum over the support of the quaternion wavelet in the QFT domain, (u, v) ≈ (u′, v′). Note that (u, v) should always lie in the first quadrant of the QFT domain (that is, u, v ≥ 0).

Thus, to apply the shift theorem in QWT image analysis, we first estimate the effective center frequency for each QWT coefficient (or assume that (u, v) ≈ (u′, v′)). Then, using (26), we can estimate the shift (r1, r2) of one image relative to a second image from the phase change (Δθ1, Δθ2). Conversely,

we can estimate the phase shift once we know the image shift.
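In code, both directions of this argument are simple evaluations of (26); the sketch below assumes a known effective center frequency (u, v) and no phase wrap-around.

```python
import numpy as np

def shift_from_phase(dtheta1, dtheta2, u, v):
    """Invert the shift theorem (26).  dtheta is taken as theta(original) -
    theta(shifted), which equals (2*pi*u*r1, 2*pi*v*r2) for a shift (r1, r2)."""
    return dtheta1 / (2 * np.pi * u), dtheta2 / (2 * np.pi * v)

def phase_change_from_shift(r1, r2, u, v):
    """Forward direction of (26): theta(shifted) - theta(original)."""
    return -2 * np.pi * u * r1, -2 * np.pi * v * r2
```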

Finally, a quick word on the quirky third QWT phase angle θ3. We can interpret θ3 as the relative amplitude of image energy along two orthogonal directions as in [25], which is useful for analyzing texture information. For example, Figure 7 depicts the function d^D_{ℓ,p} ψ^D_{ℓ,p} for the diagonal subband as we adjust θ3 of the wavelet coefficient d^D_{ℓ,p}. We see a gradual change in appearance from an oriented function

to texture-like and back. This property could prove useful for the analysis of images with rich textures

[25]. As we describe below in Section V-A, the third phase also relates to the orientation of a single

edge.

V. QWT APPLICATIONS

In this section, we demonstrate the utility of the QWT with two applications in geometrical edge

structure and image disparity estimation.

A. Edge geometry estimation

Edges are the fundamental building blocks of many real-world images. Roughly speaking, a typical

natural image can be considered as a piecewise smooth signal in 2-D that contains blocks of smooth

regions separated by edge discontinuities due to object occlusions. Here we use the QWT magnitude and

phase to extend the multiscale edge geometry analysis of [18], [21].


Fig. 8. (a) Parameterization of a single edge in a dyadic image block (a wedgelet [39]). (b) QFT spectrum of the edge; shaded squares represent the quaternion wavelets in the horizontal, vertical, and diagonal subbands. The energy of the edge is concentrated along the two dark lines crossing at the origin and is captured by the vertical subband with effective center frequency at quaternion frequency (u, v). The region bounded by the dashed line demonstrates the spectral support of the QWT basis "leaking" into the neighboring quadrant.

1) Theory: Consider an image f(x) containing a dyadic image block that features a single step edge, as parameterized in Figure 8(a). Note that for an edge oriented at angle β, any shift (r1, r2) in the (x, y) directions satisfying the constraint

r1 cos β + r2 sin β = r    (27)

is identical to a shift from the center of the block by r in the direction perpendicular to β.

Our goal in this section is to analyze the phase (θ1, θ2, θ3) of the quaternion wavelet coefficient (e.g., d^V_{ℓ,p} for the vertical subband) corresponding to the quaternion wavelet ψ^V_{ℓ,p}(x) whose support aligns with the dyadic image block containing the edge (the other subbands behave similarly). We will show that (θ1, θ2) and θ3 provide an accurate means with which to estimate the edge offset r and edge orientation β, respectively.

First, we establish a linear relationship between (θ1, θ2) and the edge offset r using the shift theorem in (26). Recall from Section IV-D that the effective center frequency (u, v) depends on both the image QFT spectral content F^q(u) and the center frequency of the quaternion wavelet, and it always lies in the first quadrant. Since the spectral energy of the edge QFT, F^q(u), concentrates along two impulse ridges through the origin having orientations 90° − β and β − 90° in the QFT domain (see Figure 8(b) and Appendix C), we can write (u, v) = c(|cos β|, |sin β|), where c is a positive constant that depends on β, the subband, and the scale of analysis ℓ. When the edge passing through the image block center


displaces perpendicularly by r, the changes in phase angles (Δθ1, Δθ2) satisfy r1 = Δθ1/(2πu) and r2 = Δθ2/(2πv). Plugging in (u, v) = c(|cos β|, |sin β|) and using (27), we obtain the concise formula

r = (Δθ1 ± Δθ2) / (2πc),    (28)

where we choose Δθ1 + Δθ2 when tan β > 0 and Δθ1 − Δθ2 when tan β < 0. We have verified this

relationship via experimental analysis of straight edges in detail in an earlier paper [28].

Based on the interpretation of the QWT as a local QFT, we use the inner product formula in (23) to

analyze the behavior of θ3 for the same edge block. The QWT coefficient d^V_{ℓ,p} can be computed from the QFT inner product between ψ^V_{ℓ,p}(x) and the QFT of the edge image f(x) (similarly for the other subbands). Our analysis in Appendix D states that if the quaternion wavelet is perfectly analytic, then, regardless of (β, r), θ3 = π/4 when tan β > 0 and θ3 = −π/4 when tan β < 0. Note that this corresponds

to the singular case in the quaternion phase calculation in Section III-A.

However, practical quaternion wavelets will not be perfectly analytic, and so their QFT support will

leak into other quadrants of the QFT domain as in Figure 8(b). This necessitates the more in-depth analysis of Appendix E, which shows that in this case

θ3 = (1/2) arcsin( (1 − ε) / (1 + ε) ),    (29)

where ε is a measure of the ratio of the local signal energy in the leakage quadrant to the energy in the positive quadrant. For the vertical subband as shown in Figure 8(b), when the edge orientation β changes from 0° to 45°, this ratio ε changes from 1 to 0 and thus θ3 changes from 0 to π/4. We model this behavior of θ3 in the horizontal and vertical subbands to design an edge orientation estimation scheme in the next section. Since the diagonal subband wavelet has QFT support distant from the leakage quadrants, these QWT subband coefficients d^D_{ℓ,p} are almost unaffected by leakage (i.e., ε ≈ 0). Their corresponding θ3 approximately equal ±π/4 and do not vary with β.

2) Practice: Based on the above analysis, we propose a hybrid algorithm to estimate the edge geometry (β, r) based on the QWT phase (θ1, θ2, θ3) and the magnitude ratios between the three subbands. We generate a set of wedgelets with known β and r (see Figure 8(a) [39]) for analysis and testing. Our

algorithm is reminiscent of the edge estimation scheme in [21].

To estimate the edge orientation β, we use both the magnitude ratios among the three subbands and θ3 of the subband with the largest magnitude. The subband with the largest magnitude gives the approximate orientation of the edge (±45° for diagonal, ±15° for horizontal, and ±75° for vertical); the sign of θ3

tells whether the direction is positive or negative. We experimentally analyze the QWT magnitude ratios


Fig. 9. Local edge geometry estimation using the QWT. (a) Several edgy regions from the "cameraman" image are shown on the left; (b)–(e) on the right are edge estimates from the corresponding QWT coefficients. The upper row shows the original image region; the lower row shows a wedgelet (see Figure 8(a)) having the edge parameter estimates (β, r). (No attempt is made to capture the texture within the block.)

and θ3 of the set of generated wedgelets corresponding to changing edge orientations β by multiples of 5°. Using standard curve-fitting techniques, we develop a simple relationship between these parameters and β for our orientation estimation scheme. The resulting orientation estimation algorithm achieves a maximum error in β of only ±3° in practice for ideal edges.

To estimate the offset r of the edge, we apply the relationship between (θ1, θ2) and r in (28). We use (θ1, θ2) from either the horizontal or vertical subband (whichever has a larger magnitude). Depending on tan β, we compute the sum (or difference) of the change in phase angles Δθ1 ± Δθ2 for the edge under analysis, using r = 0 as the reference edge. Upon analysis of the simulated wedgelets with known r, we estimate c ≈ 0.7, to be used in (28). Our final edge offset estimation algorithm achieves a maximum error of approximately ±0.02 relative to the normalized unit edge length of the dyadic block (that is, sub-pixel accuracy). More details of the experimental analysis for the wedgelet model can be found in [28]. According to (28), within a 2π-range of Δθ1 ± Δθ2, the range of r is limited to an interval of length 1/c ≈ 1.43, which ensures that the edge stays within the image block under analysis. Therefore, in our offset estimation, we need only consider one 2π-range of Δθ1 ± Δθ2 and do not need to perform

any “phase-unwrapping”.
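A sketch of this offset step under the conventions above (c ≈ 0.7 as fitted from the wedgelets, Δθ measured against the centered reference edge):

```python
import numpy as np

def edge_offset(dtheta1, dtheta2, beta, c=0.7):
    """Edge offset r from Eq. (28).  beta is the orientation estimate in radians;
    dtheta1/dtheta2 are the phase changes of the dominant horizontal or vertical
    subband relative to the r = 0 reference edge; c = 0.7 is the fitted constant."""
    combined = dtheta1 + dtheta2 if np.tan(beta) > 0 else dtheta1 - dtheta2
    return combined / (2 * np.pi * c)
```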

Finally, we estimate the polarity of the edge. We first obtain an offset estimate for each polarity of the edge with the orientation β estimated above, namely r+ and r−, and then use the inner product between the image block and two wedgelets with the estimated edge parameters (β, r+) and (β, r−) to determine

the correct polarity. Although our calculation of estimation accuracy is based on the wedgelet model, our


algorithm also works well for real-world images such as the popular “cameraman” image in Figure 9.

Our results demonstrate the close relationship between edge geometry and QWT magnitude and

phases, in particular, the encoding of edge location in the QWT phases (θ1, θ2) and the encoding of edge orientation in the QWT magnitude and third phase θ3.

B. Image disparity estimation

In this section, as another example of QWT-based data processing, we present an algorithm to

estimate the local disparities between the reference image A(x, y) and the target image B(x, y). Disparity estimation is the process of determining the local translations needed to align different regions in two images, that is, the amount of 2-D translation required to move a local region of a target image centered at pixel (x′, y′) to align with the region in a reference image centered at the same location (x′, y′). This

problem figures prominently in a range of image processing and computer vision tasks, such as video

processing to estimate motion between successive frames, time-lapse seismic imaging to study changes

in a reservoir over time, medical imaging to monitor a patient’s body, super-resolution, etc.

Recall that the QWT phase property states that a shift (r1, r2) in an image changes the QWT phase from (θ1, θ2, θ3) to (θ1 − 2πu r1, θ2 − 2πv r2, θ3). Thus, for each QWT coefficient, if we know (u, v), the effective center frequency of the local image region analyzed by the corresponding QWT basis functions, then we can estimate the local image shifts (r1, r2) from the phase differences.

However, the center frequency (u, v) is image dependent in general. To be able to estimate image shifts from QWT phase differences, we first need to estimate (u, v) for each QWT coefficient. For this

estimate, we can again use the QWT phase properties. If we know the image shifts and measure the

phase difference, then we can compute(u, v).

By manually translating the reference image A(x, y) by known small amounts both horizontally and vertically, we obtain two images A(x, y) and A(x − r1, y − r2). After computing the QWTs of A(x, y) and A(x − r1, y − r2), we can use the phase differences (Δθ1, Δθ2) between the QWT coefficients to obtain estimates for the effective spectral center (u, v) for each dyadic block across all scales as u = Δθ1/(2πr1) and v = Δθ2/(2πr2). The range of QWT phase angles limits our estimates (u, v) to [−1/(2R), 1/(2R)) and [−1/(4R), 1/(4R)) for horizontal and vertical shifts, respectively, where R is the length of one side of the dyadic block

corresponding to each coefficient.

Once we know the center frequency (u, v) for each QWT coefficient, we can estimate the local image shifts by measuring the differences between the QWT phases corresponding to the same local blocks in


images A(x, y) and B(x, y).

A key challenge in phase-based disparity estimation is resolving the phase wrap-around effect due to

the limited range of phase angles. Due to phase wrapping, each observed phase difference can be mapped

to more than one disparity estimate. Specifically, for QWT phase differences (Δθ1, Δθ2) between the reference and target images, we can express the possible image shifts of each dyadic block as

r1 = (Δθ1 + π(2n + k)) / (2πu),    r2 = (Δθ2 + mπ) / (2πv),    (30)

where n, m ∈ Z and k ∈ {0, 1}. Depending on m, k is chosen such that it equals 0 when m is even and equals 1 when m is odd. The special wrap-around effect in (30) is due to the limited ranges of θ1 and θ2 (to [−π, π) and [−π/2, π/2), respectively).

In our multiscale disparity estimation technique, we use coarse scale shift estimates to help unwrap

the phase in the finer scales. If we assume that the true image shift is small compared to the size of

dyadic squares at the coarsest scale L, then we can set m = n = k = 0 in (30) at this scale (no phase wrap-around) and obtain correct estimates for r1 and r2. Effectively, this assumption of no phase

wrap-around at the coarsest scale limits the maximum image shift that we can estimate correctly. Once

we have shift estimates at scale L, for each block at scale ℓ = L − 1 we estimate the shifts as follows:

1) interpolate the estimates from the previous scale(s) to obtain predicted estimates (r^p_1, r^p_2),
2) substitute (Δθ1, Δθ2) into (30) and determine the (n, k, m) such that (r1, r2) is closest to (r^p_1, r^p_2),
3) remove any unreliable (r1, r2),
4) repeat Steps 1–3 for the finer scales ℓ = L − 2, L − 3, . . .

Step 1 uses either nearest-neighbor interpolation (which gives higher estimation accuracy) or bilinear

interpolation (which results in a smoother disparity field for better visual quality). We choose the latter

in our simulations in this paper. In Step 3, we use a similar reliability measure as in the confidence mask

[25] to threshold unreliable phase and offset estimates. We also threshold based on the magnitude of the QWT coefficients. We iterate the above process until a fine enough scale (e.g., ℓ = 2), since estimates typically become unreliable at this scale and below. The QWT coefficients for the small dyadic blocks

have small magnitudes, and so their phase angles are very sensitive to noise.
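The candidate selection in Step 2 can be written compactly; the sketch below enumerates a small set of integers (n, m) in (30), with k tied to the parity of m, and keeps the candidate closest to the prediction interpolated from the coarser scale. The search range is an assumption made here; it only needs to cover shifts up to the block size.

```python
import numpy as np

def unwrap_shift(dtheta1, dtheta2, u, v, r_pred, search=range(-4, 5)):
    """Resolve the wrap-around of Eq. (30): among candidate shifts
    r1 = (dtheta1 + pi*(2n + k)) / (2*pi*u),  r2 = (dtheta2 + m*pi) / (2*pi*v),
    with k = 0 for even m and k = 1 for odd m, return the pair closest to the
    coarser-scale prediction r_pred = (rp1, rp2)."""
    best, best_err = (None, None), np.inf
    for m in search:
        k = m % 2
        r2 = (dtheta2 + m * np.pi) / (2 * np.pi * v)
        for n in search:
            r1 = (dtheta1 + np.pi * (2 * n + k)) / (2 * np.pi * u)
            err = np.hypot(r1 - r_pred[0], r2 - r_pred[1])
            if err < best_err:
                best, best_err = (r1, r2), err
    return best
```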

We can improve upon the basic iterative algorithm by fusing estimates across subbands and scales.

First, with proper interpolation, we can average over estimates from all scales containing the same image

block. Second, we can average estimates from the three QWT subbands for the same block to yield more

accurate estimates, but we need to discard some unreliable subband estimates (for example, horizontal


(a) reference image (b) target image (c) disparity estimates

Fig. 10. Multiscale QWT phase-based disparity estimation results. (a), (b) Reference and target images from the

“Rubik’s cube” image sequence [25]. (c) Disparity estimates between two images in the sequence, shown as arrows

overlaid on top of the reference image (zoomed in for better visualization, arrow lengths proportional to magnitude of

disparity).

disparity r1 in the horizontal subband and r2 in the vertical subband). We incorporate these subband/scale

averaging steps into each iteration of Steps 1–4.3

Figure 10 illustrates the result of our QWT phase-based disparity estimation scheme for two frames

from the rotating Rubik’s cube video sequence [25]. This is an interesting sequence, because a rotation

cannot be captured by a single global translation but can be closely approximated by local translations.

The arrows indicate both the directions and magnitudes of the local shifts, with the magnitudes stretched

for better visibility. We can clearly see the rotation of the Rubik's cube on the circular stage, with larger

translations closer to the viewer (bottom of the image) and smaller translations further from the viewer

(top of the image). In our experiments, we obtained the most robust estimations by averaging over both

scales and subbands.

Figure 11 demonstrates the multiscale nature of our disparity estimation approach on an image

sequence of a living heart [40]. The presence of sharp edges plus the smooth cardiac motion in these

images is well-matched by our wavelet-based analysis. Our algorithm averages over several scales

ℓ to obtain motion fields at various levels of detail. Figures 11(b) and (c) clearly show the coarse-scale contraction and relaxation motions of the heart, while Figures 11(e) and (f) display more detailed movements of the heart muscles, arteries, and blood flow (in particular, see the arrows toward the left region

of the heart near the artery).

In addition to visualizing the changes from one image to another, we can use our algorithm as the

3Matlab software is available at http://www.dsp.rice.edu/software/qwt.shtml.


(a) reference image (heart during contraction); (b) coarse-scale disparity estimates (during contraction); (c) coarse-scale disparity estimates (during relaxation); (d) target image (heart during contraction); (e) fine-scale disparity estimates (during contraction); (f) fine-scale disparity estimates (during relaxation)

Fig. 11. Multiscale QWT phase-based disparity estimation results for the "heart" image sequence. (a), (d) Reference and target "heart" images from two frames during heart contraction (systole). Estimation results show (b)–(c) coarse-scale and (e)–(f) fine-scale detailed motion of the heart and blood flow during the contraction and expansion (systole and diastole) phases of the cardiac cycle, illustrating the multiscale nature of our algorithm.

feature matching step of an image registration algorithm, using the disparity information to build a warping function to align the images. One important note is that our QWT-based method is region-based in that it does not require any detection of feature points or other image landmarks. Traditional region-based feature matching methods, which estimate the spatial correlation (or correspondence) between two images, include block matching and Fourier methods. For comparison, we consider the exhaustive search (ES) block matching technique, which is very computationally demanding but features the best performance among all general block-matching techniques. We also compare to a Fourier sub-pixel motion estimation method known as Gradient Correlation (GC), which

has been shown to have better PSNR performance than other recent Fourier methods [30].
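For reference, a minimal sketch of the ES baseline follows (an illustration only; the block size, search range, and sum-of-absolute-differences cost are arbitrary choices that the text does not specify):

import numpy as np

def es_block_matching(ref, tgt, block=8, search=7):
    # Exhaustive-search block matching: for every block of the reference frame,
    # try all integer offsets in [-search, search]^2 and keep the offset with
    # the smallest sum of absolute differences (SAD).
    H, W = ref.shape
    vecs = np.zeros((H // block, W // block, 2), dtype=int)
    for by in range(H // block):
        for bx in range(W // block):
            y0, x0 = by * block, bx * block
            patch = ref[y0:y0 + block, x0:x0 + block].astype(float)
            best_cost, best_vec = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y1, x1 = y0 + dy, x0 + dx
                    if 0 <= y1 and y1 + block <= H and 0 <= x1 and x1 + block <= W:
                        cand = tgt[y1:y1 + block, x1:x1 + block].astype(float)
                        cost = np.abs(patch - cand).sum()
                        if cost < best_cost:
                            best_cost, best_vec = cost, (dy, dx)
            vecs[by, bx] = best_vec
    return vecs

Such a search returns only integer-pixel displacements, a limitation noted later in this section.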

As a performance measure, we use the Peak Signal-to-Noise Ratio (PSNR) between the motion


(a) PSNR (in dB, roughly 35–42) versus frame number (0–20) for the "Rubik" sequence; curves: QWT, GradCorr, ES

                    QWT      GC            ES
Rubik               39.4     36.7          37.2
Heart               35.7     35.5          38.2
Taxi                36.2     36.6          37.0
Computation cost    O(N)     O(N log N)    O(N²)

(b) average PSNR performance (in dB)

Fig. 12. Comparison of multiscale QWT phase-based disparity estimation with two motion estimation algorithms, Gradient Correlation (GC) [30] and exhaustive search (ES). The performance measure is PSNR (in dB) between the motion-compensated image and the target image for the test image sequences "Rubik", "Heart", and "Taxi". (a) Frame-by-frame PSNR performance comparison for the "Rubik" sequence. (b) Table of average PSNR performance (over all frames) for each test sequence. The multiscale QWT phase-based method demonstrates the best performance among the three test algorithms for the "Rubik" sequence and shows comparable performance to the other algorithms for the "Heart" and "Taxi" sequences. The last row of the table shows the computational complexity of each algorithm.

compensated image C(x, y) and the target image B(x, y), which is given by

10 \log_{10}\left( \frac{(255)^2 N}{\sum_{\mathbf{x}} \left( B(\mathbf{x}) - C(\mathbf{x}) \right)^2} \right), \qquad (31)

where N is the number of image pixels. The motion-compensated image C(x, y) is obtained by shifting each image block in the reference image A(x, y) according to the estimated motion vectors.
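The measure in (31) is straightforward to reproduce. The sketch below, which assumes integer block motion vectors such as those produced by the ES baseline above and an arbitrary sign convention for the shift, forms the motion-compensated image C from the reference A and evaluates the PSNR against the target B:

import numpy as np

def motion_compensate(ref, vecs, block=8):
    # Build C by copying, for each block, the reference pixels displaced by the
    # block's motion vector (sign convention chosen arbitrarily for this sketch).
    H, W = ref.shape
    comp = np.zeros((H, W), dtype=float)
    for by in range(vecs.shape[0]):
        for bx in range(vecs.shape[1]):
            y0, x0 = by * block, bx * block
            dy, dx = vecs[by, bx]
            ys = int(np.clip(y0 + dy, 0, H - block))
            xs = int(np.clip(x0 + dx, 0, W - block))
            comp[y0:y0 + block, x0:x0 + block] = ref[ys:ys + block, xs:xs + block]
    return comp

def psnr(target, comp):
    # PSNR in dB as in (31): 10 log10( 255^2 * N / sum_x (B(x) - C(x))^2 ).
    err = (np.asarray(target, dtype=float) - np.asarray(comp, dtype=float)) ** 2
    return 10 * np.log10(255.0 ** 2 * err.size / err.sum())

In practice one would iterate over the frames of a sequence and average the per-frame PSNRs to obtain figures comparable to those reported in Figure 12(b).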

Figure 12 compares the results for four image sequences: the "Rubik" and "Taxi" sequences commonly used in the

optical flow literature, and the “Heart” sequence from Figure 11. Figure 12(a) demonstrates the superior

performance of our QWT phase-based algorithm over the other algorithms for the "Rubik" sequence,

which has piecewise-smooth image frames and a smooth underlying disparity flow. While its PSNR

performance is relatively far from ES for the “Heart” sequence, we note that the QWT phase-based

approach provides a motion field that is more useful for patient monitoring and diagnosis. For the “Taxi”

and “Mobcal” sequences, which contain discontinuities in their underlying flows, the QWT phase-based

algorithm sports comparable performance (see the table in Figure 12(b)). Since the multiscale averaging

step in our algorithm tends to smooth out the estimated flow, it should not be expected to perform as

well for discontinuous motion fields of rigid objects moving past each other.

Additional advantages of our QWT-based algorithm include its speed (linear computational complex-


ity) and sub-pixel estimation accuracy. For an N-pixel image, our O(N) algorithm is more efficient than the O(N log N) FFT-based GC and significantly faster than ES, which can take up to O(N²) when the search parameter is on the order of N. General block-matching techniques such as ES can only recover disparities to an integer number of pixels. In contrast, our QWT-based algorithm achieves sub-pixel estimation and demonstrates greater accuracy for the "Rubik" sequence than existing phase-based

sub-pixel estimation methods such as GC.

Besides gradient correlation, there exist other phase-based algorithms for disparity estimation and

image registration [25], [31], [41]–[44]. These approaches use phase as a feature map Φ(x), where the phase function Φ maps 2-D (x, y)-coordinates to phase angles. They assume that the phase function stays constant upon a shift from the reference image to the target image; that is, Φ1(x) = Φ2(x + r), where Φ1 is the phase function for the reference image and Φ2 for the target image. Then, the disparity estimation problem is simplified to calculating the optical flow for these phase functions [41], [45]. In contrast, our algorithm is entirely based on the multiscale dual-tree QWT and its shift theorem.
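As a schematic of the phase-constancy idea behind these alternative methods (an illustration only, not any of the cited algorithms), linearizing Φ1(x) = Φ2(x + r) gives ∇Φ2(x) · r ≈ Φ1(x) − Φ2(x), which can be solved for r by least squares over local windows:

import numpy as np

def phase_flow(phi1, phi2, win=8):
    # Lucas-Kanade-style least-squares solve of the linearized phase-constancy
    # constraint  grad(phi2) . r = phi1 - phi2  over each win x win window.
    # Phase wrapping and singularities, which the cited methods handle
    # explicitly, are ignored in this schematic.
    gy, gx = np.gradient(phi2)          # gradients along y (rows) and x (columns)
    dt = phi1 - phi2
    H, W = phi1.shape
    flow = np.zeros((H // win, W // win, 2))
    for by in range(H // win):
        for bx in range(W // win):
            sl = (slice(by * win, (by + 1) * win), slice(bx * win, (bx + 1) * win))
            A = np.stack([gx[sl].ravel(), gy[sl].ravel()], axis=1)
            b = dt[sl].ravel()
            r, *_ = np.linalg.lstsq(A, b, rcond=None)
            flow[by, bx] = r            # r = (horizontal, vertical) shift
    return flow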

Our approach is similar to Bayro-Corrochano’s QWT disparity estimation algorithm in [31] in its use

of quaternion phase angles. However, the latter approach requires the design of a special filter to compute

the phase derivative function in advance, while our approach need only estimate the local frequencies

(u, v). Our implementation also uses a dual-tree filterbank, as compared to the quaternion wavelet pyramid

of Gabor filters in [31]. Provided a continuous underlying disparity flow, our algorithm yields a denser

and more accurate disparity map, even for smooth regions within an image. Incorporating an optimization

procedure such as in [44] or a statistical model into our current algorithm can further improve estimation

accuracy, particularly for blocks with phase singularities, but requires extra computation time.

Kingsbury et al. have developed a multiscale displacement estimation algorithm based on the 2-D

CWT [41], [42]. Their approach combines information from all six CWT subbands in an optimization

framework based on the optical flow assumptions. In addition to disparity estimation, it simultaneously

registers the target image to the reference image. In comparison, both the QWT and CWT methods

are multiscale and wavelet-based and are thus, in general, best for smooth underlying disparity flows.

However, our algorithm is much simpler and easier to use because it does not involve the tuning of many

input parameters for the iterative optimization procedures as in the CWT algorithm. While our method

estimates local disparities without warping the image, we can apply any standard warping procedure to

easily register the two images from the estimated disparities. Thanks to the ability of the QWT to encode

location information in absolute horizontal/vertical coordinates, we can easily combine the QWT subband


estimates to yield more accurate flow estimation results. Combining subband location information in the

2-D CWT is more complicated, since each subband encodes the disparities by complex phase angles in

a reference frame rotated from other subbands. Based on our experimental results and comparison of the

design of our flow estimation algorithm with previous approaches, the QWT demonstrates its ability to

efficiently represent, encode and process location information in images.

VI. CONCLUSIONS

We have introduced a new 2-D multiscale wavelet representation, the dual-tree QWT, that is par-

ticularly efficient for coherent processing of relative location information in images. This tight-frame

transform generalizes complex wavelets to higher dimensions and inspires new processing and analysis

methods for wavelet phase.

Our development of the dual-tree QWT is based on an alternative definition of the 2-D HT and 2-D

analytic signal and on quaternion algebra. The resulting quaternion wavelets have three phase angles;

two of them encode phase shifts in an absolute horizontal/vertical coordinate system while the third

encodes textural information. The QWT's approximate shift theorem enables efficient and easy-to-use

analysis of the phase behavior around edge regions. We have developed a novel multiscale phase-based

disparity estimation scheme. Through efficient combination of disparity estimates across scale and wavelet

subbands, our algorithm clearly demonstrates the advantages of coherent processing in this new QWT

domain. Inherited from its complex counterpart, the QWT also features near shift-invariance and linear

computational complexity through its dual-tree implementation.

Beyond 2-D, the generalization of the Hilbert transform to n-D signals using hypercomplex numbers

can be used to develop higher dimensional wavelet transforms suitable for signals containing low-

dimensional manifold structures [46]. The QWT developed here could play an interesting role in the

analysis of (n − 2)-D manifold singularities in n-D space. This efficient hypercomplex wavelet represen-

tation could bring us new ways to solve high-dimensional signal compression and processing problems.

APPENDIX

A. Single Quadrant QFT of a QWT basis function

For a real-valued signal f(x) with QFT F^q(u), we can derive the following relationships from (16):

F^q\{ j_1 f(\mathbf{x}) \} = j_1 F^q(\mathbf{u}) \qquad (32)

F^q\{ j_2 f(\mathbf{x}) \} = F^q(\mathbf{u})\, j_2 \qquad (33)

F^q\{ j_3 f(\mathbf{x}) \} = j_1 F^q(\mathbf{u})\, j_2. \qquad (34)


We begin by taking the QFT of (19). Equations (32)–(34) apply to the second to fourth components on the right-hand side of (19), respectively, because (ψg(x)ψh(y), ψh(x)ψg(y), ψg(x)ψg(y)) are real-valued signals. Given the QFT relationship of the four quadrature components of ψD(x, y) in Figure 6, F^q{ψD(x, y)} has support only on a single quadrant S1. The same also holds for the quaternion wavelets in other subbands and scales.
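As one illustration (not spelled out in the original), relation (33) can be checked directly from the sandwich form of the QFT kernel that also appears in the first line of (35): the real-valued f(x) commutes with j_2, and j_2 commutes with the factor e^{-j_2 2\pi v y}, so

F^q\{ j_2 f(\mathbf{x}) \} = \int_{\mathbb{R}^2} e^{-j_1 2\pi u x}\, j_2 f(\mathbf{x})\, e^{-j_2 2\pi v y}\, d^2\mathbf{x} = \left( \int_{\mathbb{R}^2} e^{-j_1 2\pi u x}\, f(\mathbf{x})\, e^{-j_2 2\pi v y}\, d^2\mathbf{x} \right) j_2 = F^q(\mathbf{u})\, j_2 .

Relations (32) and (34) follow in the same way, using the fact that j_1 commutes with e^{-j_1 2\pi u x} and that j_3 = j_1 j_2.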

B. Proof of QFT Plancherel Theorem

Starting from the space domain inner product in (23), we have

\int_{\mathbb{R}^2} f(\mathbf{x})\, w(\mathbf{x})\, d^2\mathbf{x}
= \int_{\mathbb{R}^2} f(\mathbf{x}) \left( \int_{\mathbb{R}^2} e^{-j_1 2\pi u x}\, W^q(-\mathbf{u})\, e^{-j_2 2\pi v y}\, d^2\mathbf{u} \right) d^2\mathbf{x}
= \int_{\mathbb{R}^2} f(\mathbf{x}) \left( \int_{\mathbb{R}^2} e^{-j_1 2\pi u x} \left( \cos(2\pi v y)\, W^q(-\mathbf{u}) + j_2 \sin(-2\pi v y)\, \alpha_2(W^q(-\mathbf{u})) \right) d^2\mathbf{u} \right) d^2\mathbf{x}
= \int_{\mathbb{R}^2} \left( \left( \int_{\mathbb{R}^2} e^{-j_1 2\pi u x} f(\mathbf{x}) \cos(2\pi v y)\, d^2\mathbf{x} \right) W^q(-\mathbf{u}) + \left( \int_{\mathbb{R}^2} e^{-j_1 2\pi u x} f(\mathbf{x})\, j_2 \sin(-2\pi v y)\, d^2\mathbf{x} \right) \alpha_2(W^q(-\mathbf{u})) \right) d^2\mathbf{u}
= \int_{\mathbb{R}^2} F^q_e(\mathbf{u})\, W^q(-\mathbf{u}) + F^q_o(\mathbf{u})\, \alpha_2(W^q(-\mathbf{u}))\, d^2\mathbf{u}
= \int_{\mathbb{R}^2} \alpha_3(F^q_e(\mathbf{u}))\, W^q(\mathbf{u}) + \alpha_3(F^q_o(\mathbf{u}))\, \alpha_2(W^q(\mathbf{u}))\, d^2\mathbf{u}. \qquad (35)

One can also use the QFT convolution theorem (Theorem 2.6) in [25] for this proof. The integrand on the right-hand side of (23) is the QFT of the function g(x) = f(x) ∗∗ w(−x), where ∗∗ denotes 2-D convolution.

C. QFT of a Step Edge

First, we express the step edge in Figure 8(a) as a 2-D separable function (a constant function along the y-direction multiplied by a 1-D step function along the x-direction). The QFT of such a function is −j1 δ(v)/u. Then, applying the QFT affine theorem (Theorem 2.12 in [25]) with the transformation matrix involving rotation β and using the QFT shift theorem with offset (r1, r2) satisfying (27) yield the QFT of the step edge:

F^q(\mathbf{u}) = e^{-j_1 2\pi u r_1} \left( \frac{\delta(-u\sin\beta + v\cos\beta)}{2(u\cos\beta + v\sin\beta)}\,(-j_1 + j_2) + \frac{\delta(u\sin\beta + v\cos\beta)}{2(u\cos\beta - v\sin\beta)}\,(-j_1 - j_2) \right) e^{-j_2 2\pi v r_2}. \qquad (36)

D. QWT Phase Angles for a Step Edge

This calculation combines the results from (23) and (36). Let G^q(u) be the integrand of the inner product formula in (23) in the QFT domain. The integration limit only involves S1 because of the single-quadrant support of the


QWT basis. Consider the special case when the edge signal f(x) has zero offset (r = 0). When tanβ > 0, its QFT component in S1 is (−j1 + j2)H(u), where H(u) is the component involving the δ-ridge in (36). Let the QFT of a QWT basis, w(x), be W^q(u) = a(u) + j1 b(u) + j2 c(u) + j3 d(u). Substituting W^q(u) and F^q(u) = (−j1 + j2)H(u)

into (23), the QWT coefficient can be expressed as

\int_{S_1} G^q(\mathbf{u})\, d^2\mathbf{u}
= \int_{S_1} H(\mathbf{u})\,(c(\mathbf{u}) - b(\mathbf{u}))\, d^2\mathbf{u} - j_1 \int_{S_1} H(\mathbf{u})\,(a(\mathbf{u}) + d(\mathbf{u}))\, d^2\mathbf{u}
\;\; + j_2 \int_{S_1} H(\mathbf{u})\,(a(\mathbf{u}) + d(\mathbf{u}))\, d^2\mathbf{u} + j_3 \int_{S_1} H(\mathbf{u})\,(c(\mathbf{u}) - b(\mathbf{u}))\, d^2\mathbf{u}
= A_1 + j_1 B_1 - j_2 B_1 + j_3 A_1, \qquad (37)

where A_1 and B_1 are the integrals involving a(u), b(u), etc. After normalizing (37) by its magnitude \sqrt{2(A_1^2 + B_1^2)}, we compute the third phase angle as

\theta_3 = -\frac{1}{2}\arcsin\left( \frac{-2(A_1^2 + B_1^2)}{2(A_1^2 + B_1^2)} \right) = \frac{\pi}{4}. \qquad (38)

When tanβ < 0, F^q(u) = (−j1 − j2)H(u) in S1, which gives θ3 = −π/4. Moreover, this special quaternion in (37) with θ3 = π/4 is in the singular case; as described in [25], its other phase angles (θ1, θ2) are non-unique, but the difference θ1 − θ2 is unique, with the following expression

\theta_1 - \theta_2 = \frac{1}{2}\arcsin\left( \frac{-2 A_1 B_1}{A_1^2 + B_1^2} \right), \qquad (39)

whose value largely depends on the QFT spectrum of the basis w(x) and on the edge orientation and offset (β, r).

E. Theoretical Analysis of Leakage Effect

Consider the inner product of an edge signal f(x) and a QWT basis w(x) in both the main quadrant (S1) and the leakage quadrant (S2) when tanβ > 0.4 We can express this inner product in S2 in a similar fashion as in (37), yielding

\int_{S_2} G^q(\mathbf{u})\, d^2\mathbf{u} = A_2 + j_1 B_2 + j_2 B_2 - j_3 A_2. \qquad (40)

Again, A_2 and B_2 are integrals involving a(u), b(u), etc. Combining equations (37) and (40) gives the QWT coefficient, i.e., the inner product between the non-ideal basis and the edge signal:

\int_{S_1} G^q(\mathbf{u})\, d^2\mathbf{u} + \int_{S_2} G^q(\mathbf{u})\, d^2\mathbf{u} = (A_1 + A_2) + j_1(B_1 + B_2) + j_2(-B_1 + B_2) + j_3(A_1 - A_2), \qquad (41)

whose magnitude is \sqrt{2(A_1^2 + A_2^2 + B_1^2 + B_2^2)}. Its third phase angle can be expressed as

\theta_3 = -\frac{1}{2}\arcsin\left( \frac{-2\left[(A_1^2 + B_1^2) - (A_2^2 + B_2^2)\right]}{2(A_1^2 + A_2^2 + B_1^2 + B_2^2)} \right) = \frac{1}{2}\arcsin\left( \frac{1 - \varepsilon}{1 + \varepsilon} \right), \quad \text{where } \varepsilon = \frac{A_2^2 + B_2^2}{A_1^2 + B_1^2}. \qquad (42)

In spite of leakage, the relationship between the QWT phase angles (θ1, θ2) and the edge offset r still holds as in the case without leakage, i.e., θ1 ± θ2 varies linearly with the edge offset r. Since θ3 is not necessarily π/4 for the QFT of the edge signal, i.e., the singular case no longer holds, there exist unique expressions for both θ1 and θ2. However, our derivations show that the difference of θ1 and θ2 is the same as in the ideal case (without leakage), as in (39).

4The leakage quadrant can be either S2 or S4, depending on the spectral support of the basis element w(x).
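A minimal numerical check of (38) and (42), assuming a third-phase-angle convention θ3 = −(1/2) arcsin(2(bc − ad)) for the normalized quaternion a + j1 b + j2 c + j3 d (this convention is consistent with (38) and (42); the formal definition is given in [25] and in the earlier sections of the paper):

import numpy as np

def theta3(a, b, c, d):
    # Third quaternion phase angle of q = a + j1 b + j2 c + j3 d, using the
    # convention theta3 = -0.5 * arcsin(2(bc - ad)) after normalizing q
    # (an assumption of this sketch; it reproduces (38) and (42)).
    n2 = a * a + b * b + c * c + d * d
    return -0.5 * np.arcsin(2.0 * (b * c - a * d) / n2)

rng = np.random.default_rng(1)
A1, B1, A2, B2 = rng.uniform(0.1, 1.0, 4)

# Ideal case (37): q = A1 + j1 B1 - j2 B1 + j3 A1  =>  theta3 = pi/4, as in (38).
assert np.isclose(theta3(A1, B1, -B1, A1), np.pi / 4)

# Leakage case (41): q = (A1+A2) + j1(B1+B2) + j2(-B1+B2) + j3(A1-A2).
eps = (A2**2 + B2**2) / (A1**2 + B1**2)
assert np.isclose(theta3(A1 + A2, B1 + B2, -B1 + B2, A1 - A2),
                  0.5 * np.arcsin((1 - eps) / (1 + eps)))   # right side of (42)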


ACKNOWLEDGMENTS

While this paper was in final preparation, Hyeokho Choi passed away. We will forever remember his broad

vision, his keen insights, and our lively discussions. His legacy will live on through his many contributions to the

signal processing community. We thank Ivan Selesnick for many discussions and his Matlab dual-tree CWT code,

Nick Kingsbury for discussions on CWT-based image registration, T. Gautama et al. for providing us with their

image sequences, J. Barron et al. for making their image sequences publicly available, and V. Argyriou et al. who

generously shared their code for gradient correlation motion estimation for video sequences. Thanks also to the

reviewers for their insightful comments and for providing a more concise derivation for the single-quadrant QFT of a QWT basis in Appendix A and a more general Plancherel theorem in Appendix B.

REFERENCES

[1] J. Shapiro, "Embedded Image Coding using Zerotrees of Wavelet Coefficients," IEEE Trans. Signal Processing, vol. 41, pp. 3445–3462, April 1993.

[2] M. Crouse, R. Nowak, and R. Baraniuk, "Wavelet-Based Statistical Signal Processing Using Hidden Markov Models," IEEE Trans. Signal Processing, vol. 46, pp. 886–902, April 1998.

[3] H. Choi, J. Romberg, R. Baraniuk, and N. Kingsbury, "Hidden Markov tree modeling of Complex Wavelet Transforms," IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, pp. 133–136, June 2000.

[4] L. Sendur and I. W. Selesnick, "Bivariate shrinkage functions for wavelet-based denoising exploiting interscale dependency," IEEE Trans. Signal Processing, vol. 50, pp. 2744–2756, November 2002.

[5] J. Weng, "Image matching using the windowed Fourier phase," Int. J. of Computer Vision, vol. 11, no. 3, pp. 211–236, 1993.

[6] B. Zitova and J. Flusser, "Image Registration Methods: A Survey," Image and Vision Computing, vol. 21, pp. 977–1000, 2003.

[7] D. J. Fleet and A. D. Jepson, "Computation of component image velocity from local phase information," Int. J. of Computer Vision, vol. 5, no. 1, pp. 77–104, 1990.

[8] A. Oppenheim and J. Lim, "The importance of phase in signals," in Proceedings of the IEEE, vol. 69, pp. 529–541, 1981.

[9] D. L. Donoho, "De-noising by soft-thresholding," IEEE Trans. Info. Theory, vol. 41, no. 3, pp. 613–627, 1995.

[10] A. Grossmann, R. Kronland-Martinet, and J. Morlet, "Reading and understanding continuous wavelet transforms," in Wavelets: Time-Frequency Methods and Phase Space, pp. 2–20, Berlin: Springer-Verlag, 1989.

[11] J. M. Lina and M. Mayrand, "Complex Daubechies wavelets," J. App. Harm. Analysis, vol. 2, pp. 219–229, 1995.

[12] B. Belzer, J.-M. Lina, and J. Villasenor, "Complex, linear-phase filters for efficient image coding," IEEE Trans. on Signal Processing, vol. 43, pp. 2425–2427, Oct. 1995.

[13] H. F. Ates and M. T. Orchard, "A nonlinear image representation in wavelet domain using complex signals with single quadrant spectrum," in Proc. Asilomar Conf. Signals, Systems, and Computers, 2003.

[14] R. van Spaendonck, T. Blu, R. Baraniuk, and M. Vetterli, "Orthogonal Hilbert transform filter banks and wavelets," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, vol. 6, pp. 505–508, April 6–10, 2003.

[15] M. Wakin, M. Orchard, R. G. Baraniuk, and V. Chandrasekaran, "Phase and magnitude perceptual sensitivities in nonredundant complex wavelet representations," in Proc. Asilomar Conf. Signals, Systems, and Computers, 2003.

[16] F. Fernandes, M. Wakin, and R. G. Baraniuk, "Non-redundant, linear-phase, semi-orthogonal, directional complex wavelets," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, (Montreal), May 2004.

[17] N. G. Kingsbury, "Complex wavelets for shift invariant analysis and filtering of signals," J. App. Harm. Analysis, vol. 10, pp. 234–253, May 2001.

[18] J. Romberg, H. Choi, and R. Baraniuk, "Multiscale edge grammars for complex wavelet transforms," in Proc. of Intl. Conf. on Image Processing, (Thessaloniki, Greece), pp. 614–617, Oct. 2001.

[19] S. L. Hahn, "Multidimensional complex signals with single-orthant spectra," in Proceedings of the IEEE, vol. 80, pp. 1287–1300, August 1992.

[20] I. W. Selesnick and K. Y. Li, "Video denoising using 2-D and 3-D dual-tree complex wavelet transforms," in Proc. of SPIE Wavelets X, vol. 76, (San Diego, CA), August 4–8, 2003.

[21] J. Romberg, M. Wakin, H. Choi, and R. Baraniuk, "A geometric hidden Markov tree wavelet model," in Proc. of SPIE Wavelets X, (San Diego, CA), August 2003.

[22] J. Romberg, H. Choi, R. Baraniuk, and N. Kingsbury, "A hidden Markov tree model for the Complex Wavelet Transform," tech. rep., Rice University, ECE, September 2002.


[23] A. F. Abdelnour and I. W. Selesnick, "Design of 2-band orthogonal near-symmetric CQF," IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, pp. 3693–3696, May 2001.

[24] N. G. Kingsbury, "A dual-tree complex wavelet transform with improved orthogonality and symmetry properties," in Proc. of IEEE Int. Conf. on Image Processing, vol. 2, (Vancouver), pp. 375–378, September 2000.

[25] T. Bulow, Hypercomplex Spectral Signal Representations for the Processing and Analysis of Images. PhD thesis, Christian Albrechts University, Kiel, Germany, 1999.

[26] P. Perona, "Steerable-scalable kernels for edge detection and junction analysis," in European Conference on Computer Vision, pp. 3–18, 1992.

[27] T. Bulow and G. Sommer, "A novel approach to the 2-D analytic signal," in CAIP, (Ljubljana, Slovenia), 1999.

[28] W. L. Chan, H. Choi, and R. G. Baraniuk, "Quaternion wavelets for image analysis and processing," in Proc. of IEEE Int. Conf. on Image Processing, vol. 5, (Singapore), pp. 3057–3060, Oct. 2004.

[29] W. L. Chan, H. Choi, and R. G. Baraniuk, "Coherent image processing using quaternion wavelets," in Proc. of SPIE Wavelets XI, vol. 5914, 2005.

[30] V. Argyriou and T. Vlachos, "Using gradient correlation for sub-pixel motion estimation of video sequences," IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, pp. 329–332, May 2004.

[31] E. Bayro-Corrochano, "The Theory and Use of the Quaternion Wavelet Transform," J. Mathematical Imaging and Vision, vol. 24, no. 1, pp. 19–35, 2006.

[32] H. Ates, Modeling location information for wavelet image coding. PhD thesis, Princeton University, EE, 2003.

[33] G. Hua, "Noncoherent image denoising," Master's thesis, Rice University, ECE, Houston, Texas, 2005.

[34] M. Felsberg and G. Sommer, "The monogenic signal," IEEE Trans. on Signal Processing, vol. 49, pp. 3136–3144, December 2001.

[35] M. Felsberg, Low-Level Image Processing with the Structure Multivector. PhD thesis, Christian Albrechts University, Kiel, Germany, 2002.

[36] M. Vetterli and J. Kovacevic, Wavelets and Subband Coding. Englewood Cliffs, NJ: Prentice Hall, 1995.

[37] I. Selesnick, R. Baraniuk, and N. Kingsbury, "The Dual-tree Complex Wavelet Transform," IEEE Sig. Proc. Mag., pp. 123–151, November 2005.

[38] I. L. Kantor and A. S. Solodovnikov, Hypercomplex Numbers. Springer-Verlag, 1989.

[39] M. Wakin, J. Romberg, H. Choi, and R. Baraniuk, "Rate-Distortion Optimized Image Compression using Wedgelets," in IEEE International Conference on Image Processing, September 2002.

[40] http://sampl.ece.ohio-state.edu/data/motion/Heart/.

[41] M. Hemmendroff, M. Anderson, T. Kronander, and H. Knutsson, "Phase-based multidimensional volume registration," IEEE Trans. Medical Imaging, vol. 21, pp. 1536–1543, December 2002.

[42] N. Kingsbury, "Dual Tree Complex Wavelets (HASSIP Workshop)." http://www.eng.cam.ac.uk/~ngk, September 2004.

[43] M. Felsberg, "Optical flow estimation from monogenic phase," in 1st International Workshop on Complex Motion (IWCM), vol. LNCS 3417, (Gunzburg, Germany), October 2004. In press.

[44] J. Zhou, Y. Xu, and X. K. Yang, "Quaternion wavelet phase based stereo matching for uncalibrated images," Pattern Recognition Letters, vol. 28, pp. 1509–1522, March 2007.

[45] B. K. Horn and B. G. Schunck, "Determining optical flow," Artificial Intelligence, vol. 17, pp. 185–203, 1981.

[46] W. L. Chan, H. Choi, and R. Baraniuk, "Directional hypercomplex wavelets for multidimensional signal analysis and processing," IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, pp. 996–999, May 2004.
