IPAM MGA Tutorial on Feature Extraction and Denoising: A Saga of u + v Models

Naoki Saito

http://www.math.ucdavis.edu/~saito/

Department of Mathematics

University of California, Davis

Sep. 2004


Outline

What Are Features and What Are Noise?

Some History

u + v Models

Very Briefly, Harlan-Claerbout-Rocca Model

Mumford-Shah Model

Rudin-Osher-Fatemi Model and Total Variation

DeVore-Lucier Model and Besov Spaces

Sparsity


Outline . . .

Basis Pursuit Denoising

Sparsity vs. Statistical Independence

BV via Cohen-Dahmen-Daubechies-DeVore

BV vs Besov

Very Briefly, Meyer-Vese-Osher Model for Texture

My Comments and Summary


Acknowledgment

Yves Meyer

Hyeokho Choi (Rice)

Other authors of articles

IPAM/UCLA

NSF & ONR


What Are Features and What Are Noise?

To answer those questions, we need to specify our aim:

Approximation

Compression

Noise Removal (Denoising)

Object Detection

Classification/Discrimination

Regression


Some History

Satosi Watanabe (circa 1981) characterized pattern recognition as a quest for minimum entropy by saying, “the essential nature of pattern recognition is . . . a conceptual adaptation to the empirical data in order to see a form in them. The form means a structure which always entails small entropy values.”

Raphy Coifman (circa 1991) suggested that “noise” should be defined as the incoherent components of the data, whereas “signal” or “features” are the coherent components, both relative to the waveform libraries used to represent the data. Coherent ≈ sparse ≈ focused.


u + v Models

Yves Meyer (2001) describes the so-called u + v models:

The u component is aimed at modeling the objects or important features.

The v component represents textures and noise.

The refined model is the u + v + w model, where v and w represent textures and noise, respectively.

Examples include:

Harlan, Claerbout, & Rocca (1984)

Mumford & Shah (1985)

Rudin, Osher, & Fatemi (1992)

u + v Models . . .

DeVore & Lucier (1992)

Chen, Donoho, & Saunders (1995)

Olshausen & Field (1996)

Coifman & Sowa (1998)

Donoho, Huo, & Starck (2000)

Cohen, Dahmen, Daubechies, DeVore (2000)

Meyer, Vese, Osher (2002)

many others . . .


Common Intuitions

Should be able to represent recognizable patterns/structures in data efficiently and compactly via some invertible transform

u = signal ≈ features ⇐⇒ sharply focused ≈ sparse

v = noise ⇐⇒ defocused/diffused


What is u?

Requires some regularity (e.g., smoothness), i.e., $\|u\|_B < C$, where $B$ is some appropriate function space and $C > 0$.

More general approach: $\|Au\|_{B'} < C$, where $A : B \to B'$ is some invertible transform (e.g., $A$ = Radon transform)

An important problem (modeling) is what $B$ should be for various natural images.


Various Viewpoints

Harmonic Analysis approach (Cohen, Coifman, Daubechies, DeVore, Donoho, Meyer, . . . )

PDE approach (Chan, Osher, Meyer, Morel, Sapiro, Vese, . . . )

Deterministic approach (Cohen, DeVore, Donoho, Osher, Terzopoulos, . . . )

Stochastic approach (Mumford, Grenander, Donoho, Zhu, Wu, . . . )

Highly active area, with more and more interactions among various schools


Very Briefly, Harlan, Claerbout, & Rocca (1984)

A stacked seismic section = $\sum$ of:

Geologic component ≈ linear events

Diffraction component ≈ hyperbolic events

Noise component ≈ white Gaussian noise + α.

Very Briefly, Harlan, Claerbout, & Rocca (1984) . . .

[Figure: decomposition of a stacked seismic section. Panels: (a) Original, (b) Geology, (c) Diffraction, (d) Noise]

Mumford & Shah (1985)

Motivation: simultaneous image segmentation and denoising

Let $\Omega = [0, 1] \times [0, 1] \subset \mathbb{R}^2$


Mumford & Shah (1985) . . .

Let the u component be smooth everywhere except on a compact set $K \subset \Omega$, which is unknown.

Find $u$ and $K$ from the data $f = u + v$ by minimizing:

$$J_{MS}(u, K) = \int_{\Omega} |f(x) - u(x)|^2 \, dx + \lambda \int_{\Omega \setminus K} |\nabla u(x)|^2 \, dx + \mu \, \mathcal{H}^1(K),$$

where $\lambda, \mu$ are positive weights and $\mathcal{H}^1(K)$ is the 1D Hausdorff measure (total length) of $K$.

The three terms measure fidelity, smoothness of $u$, and simplicity of $K$, respectively.
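The roles of the three terms can be seen in a small discrete analogue. The following is a minimal sketch, assuming a 1D signal in place of the 2D image, a finite set of break indices in place of $K$, and a count of breaks in place of $\mathcal{H}^1(K)$. The name `ms_energy_1d` and the parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np

def ms_energy_1d(f, u, breaks, lam, mu):
    """Discrete 1D analogue of J_MS (an illustrative assumption, not the 2D model).

    breaks: indices i where the jump u[i+1] - u[i] is excluded from the
    smoothness term, playing the role of the edge set K.
    """
    d = np.diff(u)
    keep = np.ones(d.shape, dtype=bool)
    keep[np.array(sorted(breaks), dtype=int)] = False
    fidelity = np.sum((f - u) ** 2)
    smoothness = lam * np.sum(d[keep] ** 2)
    simplicity = mu * len(breaks)          # counting measure instead of length
    return float(fidelity + smoothness + simplicity)

# A unit step with u = f: declaring one break at the jump costs mu,
# while not declaring it costs lam * (jump height)^2.
f = np.concatenate([np.zeros(50), np.ones(50)])
no_break = ms_energy_1d(f, f, set(), lam=10.0, mu=1.0)
one_break = ms_energy_1d(f, f, {49}, lam=10.0, mu=1.0)
print(no_break, one_break)
```

With these weights the broken description is cheaper, which is the trade-off that the choice of $\lambda$ and $\mu$ controls.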


Mumford & Shah (1985) . . .

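Since $u|_{\Omega \setminus K} \in H^1(\Omega \setminus K) = W^{1,2}(\Omega \setminus K)$, the objective is: find $u \in L^2(\Omega)$ and $K \subset \Omega$ attaining

$$\inf_{u \in L^2(\Omega)} \|f - u\|_{L^2(\Omega)} \quad \text{subject to} \quad \|u\|_{H^1(\Omega \setminus K)} < C \ \text{ and } \ \mathcal{H}^1(K) < C'.$$

Here $H^1$ denotes the Sobolev space and $\mathcal{H}^1$ the 1D Hausdorff measure.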


Mumford & Shah (1985) . . .

Very influential in Computer Vision; refined by Ivan Leclerc using MDL formalism; a faster numerical algorithm, the ‘Graduated Non-Convexity’ (GNC) algorithm, by Andrew Blake and Andrew Zisserman, . . .

Numerical optimization was and still is an issue

Choice of λ and µ

Representation/basis functions for u were not used


Rudin, Osher, & Fatemi (1992)

Motivation: image enhancement and denoising

The MS functional is modified to:

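$$J_{ROF}(u) = \int_{\Omega} |f(x) - u(x)|^2 \, dx + \lambda \int_{\Omega} |\nabla u(x)| \, dx = \|f - u\|^2_{L^2(\Omega)} + \lambda |u|_{BV(\Omega)},$$

where $|u|_{BV(\Omega)}$ is the so-called total variation of $u$.

In other words, find $u \in L^2(\Omega)$ attaining

$$\inf_{u \in L^2(\Omega)} \|f - u\|_{L^2(\Omega)} \quad \text{subject to} \quad |u|_{BV(\Omega)} < C.$$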


Rudin, Osher, & Fatemi (1992) . . .

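Solve the following Euler-Lagrange equation

$$u = f + \frac{\lambda}{2} \nabla \cdot \left( \frac{\nabla u}{|\nabla u|} \right),$$

by forming the evolution equation and computing the solution as $t \to \infty$.

The coarea formula links MS to ROF: If $|u|_{BV(\Omega)} < \infty$,

$$|u|_{BV(\Omega)} = \int_{-\infty}^{\infty} \mathcal{H}^1(\partial E_t[u]) \, dt,$$

where $E_t[u] \triangleq \{x \in \mathbb{R}^n : u(x) > t\}$ is the level set of $u$ at $t$, $\partial E_t[u]$ is its boundary, and $\mathcal{H}^1(\partial E_t[u])$ is its perimeter.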
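The evolution-equation idea can be sketched numerically. This is a minimal illustration, not the scheme of the original paper: it assumes forward differences with zero-flux boundaries, replaces $|\nabla u|$ by $\sqrt{|\nabla u|^2 + \varepsilon^2}$ to avoid division by zero, and uses plain explicit time stepping. The function names, `eps`, the step size rule, and the test image are all assumptions made here.

```python
import numpy as np

def grad(u):
    """Forward differences; the last row/column difference is set to zero."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(px, py):
    """Discrete divergence, defined as the negative adjoint of grad."""
    dx = np.zeros_like(px)
    dy = np.zeros_like(py)
    dx[0, :] = px[0, :]
    dx[1:-1, :] = px[1:-1, :] - px[:-2, :]
    dx[-1, :] = -px[-2, :]
    dy[:, 0] = py[:, 0]
    dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]
    dy[:, -1] = -py[:, -2]
    return dx + dy

def rof_energy(u, f, lam, eps):
    """Smoothed discrete version of J_ROF."""
    gx, gy = grad(u)
    tv = np.sum(np.sqrt(gx**2 + gy**2 + eps**2))
    return float(np.sum((f - u) ** 2) + lam * tv)

def rof_descent(f, lam, eps=0.1, n_iter=300):
    """Explicit time stepping of u_t = 2 (f - u) + lam * div(grad u / |grad u|_eps)."""
    tau = 1.0 / (2.0 + 8.0 * lam / eps)   # conservative step: 1 / (Lipschitz bound)
    u = f.copy()
    for _ in range(n_iter):
        gx, gy = grad(u)
        mag = np.sqrt(gx**2 + gy**2 + eps**2)
        u = u + tau * (2.0 * (f - u) + lam * div(gx / mag, gy / mag))
    return u

rng = np.random.default_rng(0)
clean = np.zeros((32, 32))
clean[8:24, 8:24] = 1.0                       # a cartoon-like square
f = clean + 0.1 * rng.standard_normal(clean.shape)
u = rof_descent(f, lam=0.3)
print(rof_energy(f, f, 0.3, 0.1), rof_energy(u, f, 0.3, 0.1))
```

A stationary point of this iteration satisfies $u = f + \frac{\lambda}{2} \nabla \cdot (\nabla u / |\nabla u|_\varepsilon)$, the smoothed form of the Euler-Lagrange equation. The small step size makes the smoothed energy decrease at every iteration, at the price of slow convergence.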


Rudin, Osher, & Fatemi (1992) . . .

Boundaries K are not explicit.

Choice of λ: dynamic, i.e., λ(t)

Strictly speaking, total variation is defined more generally, and $|u|_{BV(\Omega)} = \int_{\Omega} |\nabla u| \, dx$ is true for $u \in L^1_1(\Omega) = W^{1,1}(\Omega) \subset BV(\Omega)$.

More about $BV(\Omega)$ (the space of functions of bounded variation) will be discussed later.


Comments on MS and ROF models

Very influential; generated a new field, “PDE-based image processing”

Characterization of the constraint space via basis functions was not used

Improved over the years, yet still computationally intensive


DeVore & Lucier (1992)

Now, the functional becomes:

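$$J_{DL}(u) = \|f - u\|^2_{L^2(\Omega)} + \lambda \|c[u]\|_{\ell^1},$$

where $c[u] = (c_\nu[u])_{\nu \in \Gamma}$ are the expansion coefficients of $u$ relative to an orthonormal wavelet basis $\{\psi_\nu\}_{\nu \in \Gamma}$, i.e., $c_\nu[u] = \langle u, \psi_\nu \rangle$.

In other words, find $u \in L^2(\Omega)$ attaining

$$\inf_{u \in L^2(\Omega)} \|f - u\|_{L^2(\Omega)} \quad \text{subject to} \quad \|c[u]\|_{\ell^1} < C.$$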


DeVore & Lucier (1992) . . .

Optimization leads to soft thresholding (or wavelet shrinkage) of the empirical wavelet coefficients.

Let $f(x) = \sum_{\nu \in \Gamma} c_\nu[f] \psi_\nu(x)$. Then,

$$J_{DL}(u) = \sum_{\nu \in \Gamma} \left( (c_\nu[f] - c_\nu[u])^2 + \lambda |c_\nu[u]| \right),$$

whose minimization leads to:

$$c_\nu[u] = \begin{cases} c_\nu[f] + \lambda/2 & \text{if } c_\nu[f] < -\lambda/2, \\ 0 & \text{if } |c_\nu[f]| \le \lambda/2, \\ c_\nu[f] - \lambda/2 & \text{if } c_\nu[f] > \lambda/2. \end{cases}$$
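Because the functional decouples over the coefficients, the rule can be checked one coefficient at a time. The sketch below implements the shrinkage rule with threshold $\lambda/2$ and compares it against a brute-force grid minimization of the scalar problem $(c - x)^2 + \lambda |x|$. The function name `soft_threshold` and the test values are assumptions made here for illustration.

```python
import numpy as np

def soft_threshold(c, lam):
    """Entrywise minimizer of (c - x)^2 + lam * |x|: shrink toward 0 by lam / 2."""
    c = np.asarray(c, dtype=float)
    return np.sign(c) * np.maximum(np.abs(c) - lam / 2.0, 0.0)

lam = 1.0
coeffs = np.array([-3.0, -0.2, 0.4, 2.0])
shrunk = soft_threshold(coeffs, lam)

# Brute-force check of the scalar minimization on a fine grid.
grid = np.linspace(-4.0, 4.0, 8001)
brute = np.array([grid[np.argmin((c - grid) ** 2 + lam * np.abs(grid))] for c in coeffs])
print(shrunk, brute)
```

Small coefficients are set exactly to zero and large ones are pulled toward zero by $\lambda/2$, which is why the minimizer of $J_{DL}$ has a sparse wavelet expansion.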


DeVore & Lucier (1992) . . .

[Figure: (a) Noisy Lena, (b) Linear, (c) Nonlinear]


DeVore & Lucier (1992) . . .

In the Besov space language:

$$\|c[u]\|_{\ell^1} \asymp \|u\|_{B^{1,1}_1(\Omega)}.$$

Thus, the constraint in optimization is $\|u\|_{B^{1,1}_1(\Omega)} < C$.

If $v$ is WGN with mean 0 and variance $\sigma^2$, then the choice is $\lambda \approx \mathrm{const} \cdot \frac{\sigma}{N} \log N^2$, where $N$ is the number of samples in each direction in $\Omega$.

Lots of effort has gone into deriving (near-)optimal thresholds, e.g., DeVore-Lucier, Donoho-Johnstone, and others, most recently Johnstone-Silverman.


Besov Spaces

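Let $f \in L^p(\Omega)$, $0 < p \le \infty$. Let $0 < \alpha < \infty$, and $0 < q \le \infty$.

Then, roughly speaking, functions belonging to the homogeneous Besov space $\dot{B}^{\alpha,q}_p(\Omega)$ have “$\alpha$ derivatives” measured in $L^p(\Omega)$. The parameter $q$ makes finer distinctions in smoothness.

The inhomogeneous Besov space:

$$B^{\alpha,q}_p(\Omega) = \dot{B}^{\alpha,q}_p(\Omega) \cap L^p(\Omega)$$

$$\|f\|_{B^{\alpha,q}_p(\Omega)} = \|f\|_{L^p(\Omega)} + \|f\|_{\dot{B}^{\alpha,q}_p(\Omega)}$$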


Besov Spaces . . .

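Generalization of Lipschitz/Hölder and $L^2$-Sobolev spaces because:

$$B^{\alpha,2}_2(\Omega) = W^{\alpha,2}(\Omega) = H^\alpha(\Omega)$$

$$B^{\alpha,\infty}_\infty(\Omega) = \Lambda^\alpha(\Omega) = C^\alpha(\Omega)$$

Easy to characterize via wavelet coefficients:

$$\|f\|_{B^{\alpha,\tau}_\tau} \asymp \|c[f]\|_{\ell^\tau} \quad \text{for } \tau = 2/(1 + \alpha) \text{ in 2D.}$$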


Besov Spaces . . .

Thus, in the specific DL model with $\alpha = 1 = \tau$, we have:

$$\|u\|_{B^{1,1}_1} \asymp \|c[u]\|_{\ell^1}.$$

This norm equivalence means that this DL model is really seeking a function whose wavelet expansion is sparse, since $\|c[u]\|_{\ell^1} < C$ is a form of sparsity constraint.


Sparsity via $\ell^p$ (quasi-)norm ($0 < p \le 1$)

Consider a vector or sequence $x = (x_j)_{j \in \mathbb{N}}$.

Then consider the so-called $\ell^0$ (quasi-)norm as the measure of the sparsity of $x$:

$$\|x\|_{\ell^0} = \#\{j \in \mathbb{N} : x_j \ne 0\}.$$

This counts the number of nonzero components in $x$.

Thus, under, say, $\|x\|_{\ell^2} = 1$, the smaller $\|x\|_{\ell^0}$ is, the sparser $x$ is; a precise definition of sparsity.

However, this norm is too fragile to use (e.g., sensitivity to noise).
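The fragility is easy to see numerically. The sketch below, with hypothetical helper names `l0_count` and `lp_quasinorm` and an arbitrary test vector, perturbs a 3-sparse vector by noise of size $10^{-6}$: the $\ell^0$ count jumps to the full length while the $\ell^1$ norm barely moves.

```python
import numpy as np

def l0_count(x):
    """Number of nonzero entries, i.e., the l^0 (quasi-)norm."""
    return int(np.count_nonzero(x))

def lp_quasinorm(x, p):
    """(sum |x_j|^p)^(1/p); a norm for p >= 1, a quasi-norm for 0 < p < 1."""
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

rng = np.random.default_rng(0)
x = np.zeros(100)
x[[3, 40, 77]] = [1.0, -2.0, 0.5]             # a 3-sparse vector
noisy = x + 1e-6 * rng.standard_normal(100)   # tiny perturbation

print(l0_count(x), l0_count(noisy))
print(lp_quasinorm(x, 1.0), lp_quasinorm(noisy, 1.0))
```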


Sparsity via $\ell^p$ (quasi-)norm ($0 < p \le 1$) . . .

Thus, consider the $\ell^p$ (quasi-)norm, $0 < p \le 1$, instead:

$$\|x\|_{\ell^p} = \left( \sum_{j \in \mathbb{N}} |x_j|^p \right)^{1/p}.$$

Instead of explicitly stating the number of nonzeros in the sequence via the $\ell^0$ quasi-norm, we can say:

$$\|x\|_{\ell^p} \le C \implies |x|_{(k)} \le C k^{-1/p},$$

where $|x|_{(k)}$ is the $k$th largest magnitude in $x$ ($\because k |x|^p_{(k)} \le \sum_{j=1}^{k} |x|^p_{(j)} \le C^p$). This relates sparsity to the decay of the magnitudes of the rearranged sequence. The smaller $p$, the faster the decay.


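Sparsity via $\ell^p$ (quasi-)norm ($0 < p \le 1$) . . .

$w\ell^p$, i.e., the weak $\ell^p$ space, is defined as

$$w\ell^p(\mathbb{N}) \triangleq \{x = (x_1, x_2, \ldots) : \exists C > 0 \text{ such that } |x|_{(k)} \le C k^{-1/p}, \ \forall k \in \mathbb{N}\}.$$

Clearly, $\ell^p \subsetneq w\ell^p$, e.g., $x_n = n^{-1/p} \in w\ell^p$, but not in $\ell^p$.

Later $w\ell^1$ will be used to “almost” characterize the space $BV(\Omega)$.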
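Both statements can be checked numerically. The following sketch, with hypothetical helper names and an arbitrary random test vector, verifies the rearrangement bound $|x|_{(k)} \le \|x\|_{\ell^p} k^{-1/p}$ for $p = 1/2$, and then shows for $x_n = n^{-1/p}$ that $k^{1/p} |x|_{(k)}$ stays equal to 1 while the partial sums of $|x_n|^p$ (a harmonic series) keep growing.

```python
import numpy as np

def sorted_magnitudes(x):
    """Decreasing rearrangement |x|_(1) >= |x|_(2) >= ..."""
    return np.sort(np.abs(x))[::-1]

def lp_quasinorm(x, p):
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

def partial_lp_sum(n_terms, p):
    """sum_{n <= n_terms} |x_n|^p for the sequence x_n = n^(-1/p)."""
    n = np.arange(1, n_terms + 1, dtype=float)
    return float(np.sum((n ** (-1.0 / p)) ** p))

p = 0.5
rng = np.random.default_rng(0)
x = rng.standard_normal(200) * rng.random(200) ** 4
k = np.arange(1, x.size + 1)
bound = lp_quasinorm(x, p) * k ** (-1.0 / p)
bound_holds = bool(np.all(sorted_magnitudes(x) <= bound * (1 + 1e-12)))

print(bound_holds)
print(partial_lp_sum(100, p), partial_lp_sum(10_000, p))   # grows like log(n_terms)
```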


Disadvantages of $B^{1,1}_1$

Even very simple cartoon-like images such as $\chi_E(x)$, where $\partial E$ is a smooth closed curve, do not belong to $B^{1,1}_1$.

Oscillatory patterns do not belong to $B^{1,1}_1$ either.


Disadvantages of $B^{1,1}_1$ . . .

Choi and Baraniuk on the Besov spaces vs wavelet-based statistical models (the parameters of the generalized Gaussian distribution ⇐⇒ the Besov parameters)

[Figure: (a) Original $u$, (b) Random shuffles of $c_{j,\cdot}$, (c) Random sign flips of $c$]


Basis Pursuit Denoising (Chen, Donoho, & Saunders, 1995)

Under the discrete setting, assume that $f = u + v \in \mathbb{R}^n$, where $v = \sigma z$ is a WGN vector with variance $\sigma^2$. The functional is similar to $J_{DL}$:

$$J_{BP}(u) = \|f - u\|^2_{\ell^2} + \lambda \|\alpha[u]\|_{\ell^1}.$$

The coefficient vector $\alpha[u] \in \mathbb{R}^p$ is in the form:

$$u = \sum_{\gamma \in \Gamma} \alpha_\gamma[u] \phi_\gamma,$$

where $\{\phi_\gamma\}_{\gamma \in \Gamma}$, $|\Gamma| = p \ge n$, is a dictionary of bases (i.e., redundant) such as stationary wavelets, wavelet packets, local Fourier bases, or other frames, etc.


Basis Pursuit Denoising . . .

Choice of $\lambda = \sigma \sqrt{2 \log p}$.

Solution of this convex non-quadratic optimization by “primal-dual log-barrier linear programming”

Specific combination of basis dictionaries, the uncertainty principle, equivalence with the $\ell^0$ minimization problem =⇒ ask Donoho, Candès, Huo, Elad, Starck, who are all participating in this program!
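A small numerical illustration of the functional follows. It does not use the primal-dual log-barrier solver mentioned on the slide; instead it swaps in iterative soft thresholding (ISTA), a simpler method that also minimizes $\|f - \Phi \alpha\|^2_{\ell^2} + \lambda \|\alpha\|_{\ell^1}$. The dictionary (spikes plus an orthonormal DCT basis), the function names, and the test signal are all assumptions made here.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II basis; column k is the k-th cosine atom."""
    k = np.arange(n)[None, :]
    t = np.arange(n)[:, None]
    B = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    B[:, 0] = np.sqrt(1.0 / n)
    return B

def bpdn_objective(alpha, Phi, f, lam):
    return float(np.sum((f - Phi @ alpha) ** 2) + lam * np.sum(np.abs(alpha)))

def bpdn_ista(Phi, f, lam, n_iter=500):
    """Iterative soft thresholding (a stand-in solver, not the log-barrier LP)."""
    L = 2.0 * np.linalg.norm(Phi, 2) ** 2      # Lipschitz constant of the gradient
    alpha = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = alpha + (2.0 / L) * (Phi.T @ (f - Phi @ alpha))
        alpha = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return alpha

n = 64
Phi = np.hstack([np.eye(n), dct_basis(n)])     # redundant dictionary, p = 2n atoms
rng = np.random.default_rng(0)
alpha_true = np.zeros(2 * n)
alpha_true[[5, 40]] = [3.0, -2.0]              # two spikes
alpha_true[[n + 3, n + 10]] = [4.0, 2.5]       # two cosines
sigma = 0.05
f = Phi @ alpha_true + sigma * rng.standard_normal(n)
lam = sigma * np.sqrt(2.0 * np.log(Phi.shape[1]))   # lambda = sigma * sqrt(2 log p)
alpha = bpdn_ista(Phi, f, lam)
print(bpdn_objective(np.zeros(2 * n), Phi, f, lam), bpdn_objective(alpha, Phi, f, lam))
```

With step size $1/L$ the objective is non-increasing along the iterations, so the printed value for `alpha` is below the value at the zero vector.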


Sparsity vs. Statistical Independence

Independent Component Analysis

Stochastic setting; a collection of images

Can find the least statistically-dependent basis (LSDB), i.e., the basis that provides the least statistical dependence (the best one), out of the wavelet packet library or local Fourier library

Better off pursuing sparsity than independence, except for problems that really have statistically independent sources

Read my articles as well as Donoho & Flesia for more info.


BV (Ω): Functions of Bounded Variation

Definition of total variation

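$$|u|_{BV(\Omega)} \triangleq \sup_{g} \left\{ \int_{\Omega} u \, \nabla \cdot g \, dx : g \in C^1_c(\Omega; \mathbb{R}^2), \ |g(x)| \le 1 \ \forall x \in \Omega \right\}$$

$BV(\Omega) \subset L^1(\Omega)$ is a Banach space with the norm:

$$\|u\|_{BV(\Omega)} = \|u\|_{L^1(\Omega)} + |u|_{BV(\Omega)}.$$

If $u \in W^{1,1}(\Omega) \subset BV(\Omega)$, then

$$|u|_{BV(\Omega)} = \int_{\Omega} |\nabla u(x)| \, dx,$$

via integration by parts.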
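A discrete sketch makes the total variation concrete. It assumes an isotropic forward-difference approximation of $\int_\Omega |\nabla u| \, dx$ on a uniform grid over the unit square; the function name `total_variation` and the two test images are choices made here. For the indicator of a square, the result is close to the perimeter of the square; for a linear ramp, it is close to the integral of the gradient magnitude.

```python
import numpy as np

def total_variation(u, h):
    """Isotropic forward-difference approximation of the total variation
    of u sampled on a grid of spacing h (last row/column difference is 0)."""
    dx = np.diff(u, axis=0, append=u[-1:, :])
    dy = np.diff(u, axis=1, append=u[:, -1:])
    # |grad u| ~ sqrt(dx^2 + dy^2) / h, and each pixel has area h^2.
    return float(h * np.sum(np.sqrt(dx**2 + dy**2)))

N = 128
square = np.zeros((N, N))
square[32:96, 32:96] = 1.0                     # indicator of [1/4, 3/4]^2, perimeter 2
ramp = np.tile(np.arange(N)[:, None] / N, (1, N))   # u(x, y) = x, |grad u| = 1

tv_square = total_variation(square, 1.0 / N)
tv_ramp = total_variation(ramp, 1.0 / N)
print(tv_square, tv_ramp)
```

The indicator has finite total variation even though it has no integrable gradient in the classical sense.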


BV (Ω) . . .

In this tutorial, $\Omega \subset \mathbb{R}^2$ is bounded. Thus,

$$W^{1,1}(\Omega) = \mathcal{BV}(\Omega) \subset BV(\Omega) \subset L^2(\Omega) \subset L^1(\Omega).$$

Unfortunately, $W^{1,1}(\Omega)$ does not contain cartoon-like images such as $\chi_E(x)$, where $E \subset \Omega$ and $\partial E$ is smooth, whereas $BV(\Omega)$ does.

A minimizer $u^*$ exists in $BV(\Omega)$ for the corrected version of $J_{ROF}(u)$ (Vese, 2001)

However, the original version of the ROF criterion is not guaranteed to return $(u, v) = (\chi_E, 0)$ from $f = \chi_E$ (Y. Meyer 2001).


Besov vs BV

Embedding (DeVore-Lucier, Donoho, Meyer, . . . ):

$$B^{1,1}_1(\Omega) \subset BV(\Omega) \subset B^{1,\infty}_1(\Omega)$$

$B^{1,1}_1$ does not contain cartoon-like images while $BV(\Omega)$ does.

On the other hand, $BV(\Omega)$ does not possess any unconditional basis while $B^{1,1}_1$ does.


Unconditional Bases

Let $\{\phi_\nu\}_{\nu \in \Gamma}$ be a basis for a Banach space $B$.

Let $f = \sum_{\nu \in \Gamma} c_\nu \phi_\nu \in B$.

Let $\|f\|_B$ be a functional norm of $f \in B$ and let $\|c[f]\|_b$ be a discrete sequence norm of $c[f] = (c_\nu[f])$.

Suppose $\|f\|_B \asymp \|c[f]\|_b$.

This is already not a trivial condition because . . .


Fourier is not an unconditional basis for $L^p(T)$, $p \ne 2$

Let $f \in B = L^p(T)$, $T = [0, 2\pi)$, and $\phi_\nu(x) = e^{i\nu x}$. Let $c_\nu[f]$ be the Fourier coefficients of $f$.

Then, $\|f\|_{L^2(T)} = \|c[f]\|_{\ell^2(\mathbb{Z})}$ (Plancherel)

However, $\|c[f]\|_{\ell^p(\mathbb{Z})}$ does not give information about $\|f\|_{L^p(T)}$ if $p \ne 2$.

For example, $\|f\|_{L^4(T)}$ tells you some info about the distribution of the energy of $f$ over $T$ (∼ kurtosis).

However, $|c_\nu[f]|$ does not tell you anything about $\|f\|_{L^4(T)}$.

=⇒ the Littlewood-Paley theory
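A discrete analogue illustrates the point. The sketch below assumes finite vectors and the DFT in place of functions on $T$ and Fourier series; the helper name `lp_norm` and the signal length are choices made here. It builds two signals whose DFT coefficients have identical magnitudes, one a single spike and the other with random phases, and compares their $\ell^2$ and $\ell^4$ norms.

```python
import numpy as np

def lp_norm(x, p):
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

N = 1024
rng = np.random.default_rng(0)

spike = np.zeros(N, dtype=complex)
spike[0] = 1.0                                   # all DFT magnitudes equal 1

phases = np.exp(2j * np.pi * rng.random(N))      # unit-modulus coefficients
spread = np.fft.ifft(phases)                     # same DFT magnitudes, random phases

same_magnitudes = bool(np.allclose(np.abs(np.fft.fft(spike)), np.abs(np.fft.fft(spread))))
print(same_magnitudes)
print(lp_norm(spike, 2), lp_norm(spread, 2))     # equal, by Plancherel
print(lp_norm(spike, 4), lp_norm(spread, 4))     # very different
```

The two signals are indistinguishable from their coefficient magnitudes alone, yet one concentrates all its energy at a single sample and the other spreads it out, which is exactly what the 4-norm detects.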


Unconditional Bases . . .

Then, $\{\phi_\nu\}_{\nu \in \Gamma}$ is called an unconditional basis of $B$ if any sequence $\tilde{c} = (\tilde{c}_\nu)$ satisfying $|\tilde{c}_\nu| \le |c_\nu[f]|$, $\forall \nu \in \Gamma$, yields a new function $\tilde{f} = \sum_{\nu \in \Gamma} \tilde{c}_\nu \phi_\nu$ that belongs to $B$.

In other words, operations on the coefficients, such as shrinking and sign flips, do not change membership in $B$.

Examples: Fourier: $L^2$; Wavelets: $L^p$, $1 < p < \infty$; $B^{\alpha,q}_p$, $\alpha > 0$, $1 \le p, q \le \infty$; . . .


Unconditional Bases . . .

Advantages: $\{\phi_\nu\}_{\nu \in \Gamma}$ ⇐⇒ axes of symmetry for the ball in $B$, e.g., $\|f\|_B < C$.

“Rotation” into a coordinate system where the norm is“diagonalized” even if the norm is not quadratic.

Read articles by Donoho as well as Meyer’s books!


Cohen, Dahmen, Daubechies, DeVore, Meyer, Petrushev, & Xu

$BV(\Omega)$ does not have an unconditional basis, but we can say the following:

$$u \in BV(\Omega) \implies c[u] \in w\ell^1(\Gamma).$$

In other words, the sorted wavelet coefficients decay as $O(k^{-1})$.

This implies that the error of the $k$-term approximation of a $BV$ function using wavelets is $O(k^{-1/2})$.
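The step from coefficient decay to approximation rate can be spelled out as follows. This is a short derivation assuming an orthonormal wavelet basis, so that the $L^2$ error of keeping the $k$ largest coefficients equals the $\ell^2$ norm of the discarded ones; $\sigma_k(u)$ is notation introduced here for that error, and $C$ is the weak-$\ell^1$ constant with $|c|_{(j)} \le C j^{-1}$.

```latex
\sigma_k(u)^2 \;=\; \sum_{j>k} |c|_{(j)}^2
  \;\le\; C^2 \sum_{j>k} j^{-2}
  \;\le\; C^2 \int_k^\infty t^{-2}\,dt \;=\; \frac{C^2}{k}
\quad\Longrightarrow\quad
\sigma_k(u) \;\le\; C\,k^{-1/2}.
```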


Cohen, Dahmen, Daubechies, DeVore, Meyer, Petrushev, & Xu . . .

Embeddings in the sequence spaces:

$$\ell^1(\Gamma) \subset bv(\Gamma) \subset w\ell^1(\Gamma),$$

where $bv(\Gamma)$ is the space of vectors consisting of the wavelet coefficients of $BV(\Omega)$ functions, and its norm is defined to be the $BV$ norm of the corresponding function.

This allows wavelet shrinkage on the coefficients.


Cohen, Dahmen, Daubechies, DeVore, Meyer, Petrushev, & Xu . . .

The Haar case by C-De-P-X (1998), and the general wavelet case by Meyer (1998).

The stronger versions by C-Dah-Dau-De (2000).

Of course, using other methods such as ridgelets and curvelets, one can get better decay =⇒ Lectures by Donoho and Candès tomorrow


BV for Image Modeling?

Study of Gousseau & Morel

$BV(\Omega)$ may be well adapted for large scale geometric structures

But natural images are not in $BV(\Omega)$.

∵ Natural images often contain too many small objects and textures =⇒ the sum of the lengths of the perimeters of the level sets may blow up, i.e.,

$$\int_{-\infty}^{\infty} \mathcal{H}^1(\partial E_t[u]) \, dt = \infty.$$


BV for Image Modeling?

[Figure: ten panels (a) to (j), each plotted on axes with ticks from 20 to 120, labeled (23,43), (43,63), (63,83), (83,103), (103,123), (123,143), (143,163), (163,183), (183,203), (203,226)]

Very Briefly, Meyer, Vese, & Osher (2002)

All the previous models with the $\|f - u\|^2_{L^2(\Omega)}$ fidelity term anticipated WGN for the v component; they are also very much related to statistical estimation methods such as MLE, Bayes, MDL, etc. with prior information on the u component.

Improve the ROF model by changing the $L^2$ norm of the $v = f - u$ component to

$$J_{MVO}(u) = \|f - u\|_{G(\Omega)} + \lambda |u|_{BV(\Omega)},$$

where $G(\Omega)$ is the dual space of $\mathcal{BV}(\Omega) = W^{1,1}(\Omega)$.


Very Briefly, Meyer, Vese, & Osher (2002)

G(Ω) contains oscillatory patterns (textures).

Y. Meyer’s book for precise definition of G(Ω)

Vese-Osher (2003) for numerical algorithm.


Compare with the ROF model . . .


Summary

Reviewed u + v models

The u component is often modeled by the constraint $\|u\|_B < C$ for some function space $B$.

This constraint corresponds to the prior information in Bayesian statistics and MDL formalism, and to the regularization term in inverse problems.

The v component is often assumed to be i.i.d. WGN, yielding the $L^2$ fidelity term in the functional to be optimized =⇒ not good for texture


Summary . . .

Wavelet shrinkage: works very well for $B = B^{1,1}_1$, reasonably well for $B = BV$, and is computationally very fast.

PDE-based approach: works well for $B = W^{1,1}$; more computationally intensive, but allows more flexible modeling with non-$L^2$ error criteria for v.


My Comments

Use of function spaces for image modeling:

Mathematically sound

Can get deep results

Extremely hard to find a good one for natural images

Use of orthonormal bases:

Mathematically tractable

Good affinity with function spaces

Fast algorithms

But too restrictive

My Comments . . .

Use of overcomplete dictionaries

Mathematically more challenging

Can get better results

Can develop fast algorithms

Still in the form of linear combinations in most cases


My Comments . . .

$\|u\|_B < C$ is mathematically great, but restrictive for image modeling

Need more interaction with the stochastic modeling community

Explore more about wavelet shrinkage after the invertible transform à la Harlan-Claerbout-Rocca

Yves Meyer: “Sparsity does not open the gate to feature extraction.”

My reaction: “Sparsity can still open the gate to feature extraction.”


References: Books and Survey Articles

R. DeVore: “Nonlinear Approximation,” in Acta Numerica, Cambridge Univ. Press, 1998.

D. Donoho, M. Vetterli, R. DeVore, & I. Daubechies: “Data compression and harmonic analysis,” IEEE Trans. Info. Theory, vol.44, pp.2435–2476, 1998.

S. Jaffard, Y. Meyer, & R. D. Ryan: Wavelets: Tools for Science & Technology, SIAM, 2001.

S. Mallat: A Wavelet Tour of Signal Processing, 2nd ed., Academic Press, 1999.

Y. Meyer: Oscillating Patterns in Image Processing and Nonlinear Evolution Equations, University Lecture Series Vol.22, AMS, 2001.

References: Articles

B. Bénichou & N. Saito: “Sparsity vs. statistical independence in adaptive signal representations: A case study of the spike process,” in Beyond Wavelets (G. V. Welland, ed.), Chap.9, pp.225–257, Academic Press, 2003.

H. Choi & R. Baraniuk: “Wavelet statistical models and Besov spaces,” in Nonlinear Estimation and Classification (D. Denison ed.), Springer-Verlag, 2003.

A. Cohen, R. DeVore, P. Petrushev, & H. Xu: “Nonlinear approximation and the space $BV(\mathbb{R}^2)$,” Amer. J. Math., vol.121, pp.587–628, 1999.


References: Articles . . .

A. Cohen, W. Dahmen, I. Daubechies, & R. DeVore: “Harmonic analysis of the space $BV$,” Revista Matemática Iberoamericana, vol.19, pp.235–263, 2003.

S. Chen, D. L. Donoho, & M. A. Saunders: “Atomic decomposition by basis pursuit,” SIAM J. Sci. Comput., vol.20, pp.33–61, 1999.

R. A. DeVore & B. J. Lucier: “Fast wavelet techniques for near-optimal image processing,” IEEE Military Communications Conference Record, pp.1129–1135, 1992.

D. L. Donoho: “Unconditional bases are optimal bases for data compression and for statistical estimation,” Appl. Comput. Harm. Anal., vol.1, pp.100–115, 1993.

References: Articles . . .

D. L. Donoho: “Sparse components of images and optimal atomic decomposition,” Constr. Approx., vol.17, pp.353–382, 2001.

D. L. Donoho & A. G. Flesia: “Can recent innovations in harmonic analysis ‘explain’ key findings in natural image statistics?” Network: Comput. Neural Syst., vol.12, pp.371–393, 2001.

Y. Gousseau & J.-M. Morel: “Are natural images of bounded variation?” SIAM J. Math. Anal., vol.33, pp.634–648, 2001.

W. S. Harlan, J. F. Claerbout, & F. Rocca: “Signal/noise separation and velocity estimation,” Geophysics, vol.49, pp.1869–1880, 1984.

References: Articles . . .

D. Mumford & J. Shah: “Boundary detection by minimizing functionals, I,” IEEE Conf. Computer Vision & Pattern Recognition, pp.22–26, 1985.

L. Rudin, S. Osher, & E. Fatemi: “Nonlinear total variation based noise removal algorithms,” Physica D, vol.60, pp.259–268, 1992.

N. Saito: “Simultaneous noise suppression and signal compression using a library of orthonormal bases and the minimum description length criterion,” in Wavelets in Geophysics (E. Foufoula-Georgiou and P. Kumar, eds.), chap. XI, pp.299–324, Academic Press, 1994.


References: Articles . . .

N. Saito: “Image approximation and modeling via least statistically-dependent bases,” Pattern Recognition, vol.34, pp.1765–1784, 2001.

N. Saito: “The generalized spike process, sparsity, and statistical independence,” in Modern Signal Processing (D. Rockmore and D. Healy, Jr. eds.), MSRI Publications, vol.46, pp.317–340, Cambridge Univ. Press, 2004.

L. Vese: “A study in the BV space of a denoising-deblurring variational problem,” Applied Mathematics & Optimization, vol.44, pp.131–161, 2001.


References: Articles . . .

L. Vese & S. J. Osher: “Modeling textures with total variation minimization and oscillating patterns in image processing,” J. Sci. Comput., vol.19, pp.553–572, 2003.

S. Watanabe: “Pattern recognition as a quest for minimum entropy,” Pattern Recognition, vol.13, no.5, pp.381–387, 1981.