fast nonstationary preconditioned iterative methods for ill-posed

Fast nonstationary preconditioned iterative methods for ill-posed problems, with application to image deblurring

Marco Donatelli

Dipartimento di Scienza e Alta Tecnologia, Università dell’Insubria - Como

Joint work with Martin Hanke (U. Mainz, Germany)

SLA 2014, Kalamata, 8-12 September 2014

Outline

Iterative solution of ill-posed equations

A nonstationary preconditioned iteration

Application to image deblurring

The model problem

Consider the solution of ill-posed equations

Tx = y , (1)

where T : X → Y is a linear operator between Hilbert spaces. Assume that problem (1) has a solution x† of minimal norm.

Goal: Compute an approximation of x† starting from approximate data $y^\delta \in Y$, instead of the exact data y ∈ Y, with

$\|y^\delta - y\| \le \delta$, (2)

where δ ≥ 0 is the corresponding noise level.

Regularization methods

The solution of $Tx = y^\delta$ requires some form of regularization:

◮ Tikhonov regularization: usually expensive (one linear system to solve per approximation), and the regularization parameter is difficult to optimize.

◮ Iterative regularization: cheap (only matrix-vector products), and the regularization parameter is determined on the fly.

◮ Hybrid regularization: Tikhonov regularization on Krylov subspaces of increasing dimension (O’Leary, Nagy, Reichel, Novati, ...).

Iterative regularization

◮ Landweber iteration: slow convergence.

◮ CGLS (conjugate gradient for least squares): faster, but still not fast enough...

Preconditioned regularization

Replace the original problem $Tx = y^\delta$ with

$P^{-1} T x = P^{-1} y^\delta$

such that

◮ inversion of P is cheap

◮ $P \approx T$, but not too closely ($T^{-1}$ is unbounded, while $P^{-1}$ must be bounded!)

Preconditioners can be used to accelerate the convergence, but an imprudent choice of preconditioner may spoil the achievable quality of the computed restorations.

Classical preconditioner

◮ Historically, the first attempt of this sort was by Hanke, Nagy, and Plemmons (1993). In that work

$P = C_\varepsilon$,

where $C_\varepsilon$ is the optimal doubly circulant approximation of T, with eigenvalues set to one for frequencies above $1/\varepsilon$. Very fast, but the choice of ε is delicate and not robust.

◮ Subsequently, other regularizing preconditioners have been suggested: Kilmer and O’Leary (1999), Egger and Neubauer (2005), Brianzi, Di Benedetto, and Estatico (2008).

Nonstationary iterated Tikhonov regularization

Given $x_0$, compute for n = 0, 1, 2, ...

$z_n = (T^*T + \alpha_n I)^{-1} T^* r_n$, $r_n = y^\delta - T x_n$, (3a)

$x_{n+1} = x_n + z_n$. (3b)

This is some sort of regularized iterative refinement.

Choices of αn:

◮ αn = α > 0, ∀n, stationary.

◮ $\alpha_n = \alpha q^n$, where α > 0 and 0 < q ≤ 1, geometric sequence (fastest convergence), Groetsch and Hanke (1998).

$T^*T + \alpha I$ and $TT^* + \alpha I$ could be expensive to invert!
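A minimal NumPy sketch of (3a)-(3b) with the geometric choice of $\alpha_n$, for a small real matrix T (so that $T^*$ is the transpose). The function name, the starting guess $x_0 = 0$, the values α = 1 and q = 0.5, and the toy Vandermonde problem are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np

def iterated_tikhonov(T, y_delta, alpha0=1.0, q=0.5, n_iter=10):
    """Nonstationary iterated Tikhonov, steps (3a)-(3b), alpha_n = alpha0 * q**n."""
    n_cols = T.shape[1]
    x = np.zeros(n_cols)                  # x_0 = 0 (assumed starting guess)
    residual_norms = []
    for n in range(n_iter):
        r = y_delta - T @ x               # r_n = y^delta - T x_n
        residual_norms.append(np.linalg.norm(r))
        alpha = alpha0 * q**n             # geometric sequence of parameters
        # z_n = (T^T T + alpha_n I)^{-1} T^T r_n: one dense solve per step
        z = np.linalg.solve(T.T @ T + alpha * np.eye(n_cols), T.T @ r)
        x = x + z                         # x_{n+1} = x_n + z_n
    return x, residual_norms

# Hypothetical ill-conditioned toy problem
rng = np.random.default_rng(0)
T = np.vander(np.linspace(0.0, 1.0, 20), 8, increasing=True)
x_true = rng.standard_normal(8)
y_delta = T @ x_true + 1e-3 * rng.standard_normal(20)
x_rec, res = iterated_tikhonov(T, y_delta)
```

Each step needs a solve with $T^*T + \alpha_n I$, which is the cost the last remark on this slide refers to.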

The starting idea

Approximate T by C and iterate

$h_n = (C^*C + \alpha_n I)^{-1} C^* r_n$, $r_n = y^\delta - T x_n$, (4)

$x_{n+1} = x_n + h_n$. (5)

Preconditioner ⇒ $P = (C^*C + \alpha_n I)^{-1} C^*$

Differences to previous preconditioners:

◮ gradual approximation of the optimal regularization parameter

◮ nonstationary scheme, not to be used in combination with CGLS

◮ essentially as fast as nonstationary iterated Tikhonov regularization
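When C is circulant, the step (4) costs only a few FFTs. The sketch below illustrates this in 1D under the assumption of a periodic Gaussian PSF; the function name, the PSF, and the value of α are hypothetical choices, not the authors' code.

```python
import numpy as np

def circulant_tikhonov_step(c_eigs, r, alpha):
    """h = (C^* C + alpha I)^{-1} C^* r for a circulant C with eigenvalues c_eigs."""
    r_hat = np.fft.fft(r)
    # In the Fourier basis C is diagonal, so the solve is a pointwise division.
    h_hat = np.conj(c_eigs) * r_hat / (np.abs(c_eigs) ** 2 + alpha)
    return np.real(np.fft.ifft(h_hat))

n = 64
psf = np.exp(-0.5 * (np.arange(-3, 4) / 1.5) ** 2)   # hypothetical 7-point PSF
psf /= psf.sum()
col = np.zeros(n)
col[:4] = psf[3:]          # centre and right half of the PSF
col[-3:] = psf[:3]         # left half wraps around (periodic BCs)
c_eigs = np.fft.fft(col)   # eigenvalues of C = FFT of its first column

rng = np.random.default_rng(0)
r = rng.standard_normal(n)
alpha = 1e-2
h = circulant_tikhonov_step(c_eigs, r, alpha)
```

The same idea carries over to 2D images with `np.fft.fft2`, where C is doubly circulant.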

Nonstationary preconditioning

The iterative refinement applied to the error equation $T e_n \approx r_n$ is correct up to noise, hence consider instead

$C e_n \approx r_n$, (6)

possibly tolerating a slightly larger misfit.

Adaptive parameter choice strategy for αn

Choose $\alpha_n$ such that (6) is solved up to a certain relative amount:

$\|r_n - C h_n\| = q_n \|r_n\|$, (7)

where $q_n < 1$, but not too small.

Choice of qn and approximation of T

Assumption: To derive the parameter $q_n$ in (7) we impose

$\|(C - T)z\| \le \rho\,\|Tz\|$, z ∈ X, (8)

for some 0 < ρ < 1/2.

Consequence: Choose $\alpha_n$ such that

$\|r_n - C h_n\| = q_n \|r_n\|$,

with $q_n > \rho + (1 + \rho)\,\delta/\|r_n\|$ (discrepancy principle).
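The lower bound on $q_n$ can be motivated by a short estimate, sketched here with $e_n = x^\dagger - x_n$ and using only $Tx^\dagger = y$, the noise bound (2) and the assumption (8). This is a reconstruction of the reasoning, not a quotation from the slides.

```latex
\begin{align*}
r_n - C e_n &= (y^\delta - y) + (T - C)\,e_n
  && \text{since } r_n = (y^\delta - y) + T e_n, \\
\|r_n - C e_n\| &\le \delta + \rho\,\|T e_n\|
  = \delta + \rho\,\|r_n - (y^\delta - y)\|
  \le (1 + \rho)\,\delta + \rho\,\|r_n\|, \\
\frac{\|r_n - C e_n\|}{\|r_n\|} &\le \rho + (1 + \rho)\,\frac{\delta}{\|r_n\|}.
\end{align*}
```

So the true error $e_n$ solves (6) up to a relative misfit of at most $\rho + (1+\rho)\delta/\|r_n\|$, and asking $h_n$ for a smaller misfit than that would mean fitting noise and model error.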

The Algorithm

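Choose $\tau = (1 + 2\rho)/(1 - 2\rho)$ and fix $q \in (2\rho, 1)$. While $\|r_n\| > \tau\delta$, let $\tau_n = \|r_n\|/\delta$, and compute $\alpha_n$ such that

$\|r_n - C h_n\| = q_n \|r_n\|$, $q_n = \max\{\, q,\ 2\rho + (1 + \rho)/\tau_n \,\}$. (9a)

Then, update

$h_n = (C^*C + \alpha_n I)^{-1} C^* r_n$, (9b)

$x_{n+1} = x_n + h_n$, $r_{n+1} = y^\delta - T x_{n+1}$. (9c)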

Details

◮ The parameter q prevents $\|r_n\|$ from decreasing too rapidly.

◮ The unique αn can be computed by Newton iteration.
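The loop (9a)-(9c) can be sketched in 1D with a circulant C applied by FFT. Several things here are assumptions for illustration: $\alpha_n$ is found by bisection on log α instead of the Newton iteration named above, $x_0 = 0$, and the test operator is the contrived choice T = 1.005 C, for which (8) holds exactly with ρ ≈ 0.005. All names and data are hypothetical.

```python
import numpy as np

def solve_alpha(abs2, r_hat, q_n, lo=1e-16, hi=1e16, iters=200):
    """Solve ||r - C h(alpha)|| = q_n ||r|| for alpha by bisection on log(alpha).

    In Fourier space r - C h has components alpha / (|c_k|^2 + alpha) * r_hat_k,
    so the relative misfit grows monotonically with alpha.
    """
    r_norm = np.linalg.norm(r_hat)
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        misfit = np.linalg.norm(mid / (abs2 + mid) * r_hat) / r_norm
        if misfit < q_n:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)

def nonstationary_preconditioned(apply_T, c_eigs, y_delta, delta, rho, q=0.7, max_iter=200):
    tau = (1 + 2 * rho) / (1 - 2 * rho)
    abs2 = np.abs(c_eigs) ** 2
    x = np.zeros_like(y_delta)                    # x_0 = 0 (assumed)
    r = y_delta - apply_T(x)
    alphas = []
    for _ in range(max_iter):
        r_norm = np.linalg.norm(r)
        if r_norm <= tau * delta:                 # discrepancy principle
            break
        q_n = max(q, 2 * rho + (1 + rho) * delta / r_norm)    # (9a), tau_n = r_norm / delta
        r_hat = np.fft.fft(r)
        alpha = solve_alpha(abs2, r_hat, q_n)
        h = np.real(np.fft.ifft(np.conj(c_eigs) * r_hat / (abs2 + alpha)))  # (9b)
        x = x + h                                 # (9c)
        r = y_delta - apply_T(x)
        alphas.append(alpha)
    return x, alphas

# Hypothetical 1D test problem with a periodic Gaussian PSF
n = 128
t = np.arange(n)
col = np.exp(-0.5 * (np.minimum(t, n - t) / 1.5) ** 2)
col /= col.sum()
c_eigs = np.fft.fft(col)

def apply_C(v):
    return np.real(np.fft.ifft(c_eigs * np.fft.fft(v)))

def apply_T(v):
    return 1.005 * apply_C(v)     # contrived T: ||(C - T) z|| = (0.005 / 1.005) ||T z||

x_true = np.exp(-0.5 * ((t - 64) / 8.0) ** 2)
y = apply_T(x_true)
noise = np.random.default_rng(1).standard_normal(n)
delta = 0.01 * np.linalg.norm(y)
y_delta = y + delta * noise / np.linalg.norm(noise)
x_rec, alphas = nonstationary_preconditioned(apply_T, c_eigs, y_delta, delta, rho=0.01)
```

Only products with T and FFTs with the eigenvalues of C are needed; the adjoint of T never appears.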

Theoretical results

◮ The norm of the iteration error $e_n = x^\dagger - x_n$ decreases monotonically as long as

$\|r_n\| \le \tau\delta \le \|r_{n-1}\|$, τ > 1 fixed

⇒ discrepancy principle.

◮ For exact data (δ = 0) the iterates $x_n$ converge to a solution x† of Tx = y.

◮ For noisy data (δ > 0), with the above stopping criterion, the resulting approximation $x^\delta$ converges to x† as δ → 0.

Image deblurring problems

yδ = T ∗ x + e

◮ T is large and severely ill-conditioned (discretization of a Fredholm integral equation of the first kind)

◮ yδ known, measured data (blurred and noisy image)

◮ e (white Gaussian) noise, s.t. ‖e‖ = δ

Boundary Conditions (BCs)

[Figure: the same image extended with four boundary conditions: zero Dirichlet, periodic, reflective, antireflective]

The matrix C

Space invariant point spread function (PSF)

⇓

T has a doubly Toeplitz-like structure that carries the “correct” boundary conditions.

◮ C is a doubly circulant matrix, diagonalizable by FFT, which corresponds to periodic BCs.

◮ The boundary conditions have a very local effect:

$T - C = E + R$, (10)

where E is of small norm and R of small rank.
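The local effect of the boundary conditions is easy to see in 1D. The sketch below builds a banded Toeplitz T (zero Dirichlet BCs) and the circulant C (periodic BCs) from the same compactly supported PSF; the PSF values and sizes are hypothetical. In this special case E = 0 and R = T − C lives only in two corners of the matrix.

```python
import numpy as np

n, b = 32, 2
psf = np.array([0.1, 0.2, 0.4, 0.2, 0.1])   # hypothetical PSF with half-width b = 2
offsets = np.arange(-b, b + 1)

T = np.zeros((n, n))    # zero Dirichlet BCs: banded Toeplitz
C = np.zeros((n, n))    # periodic BCs: circulant
for i in range(n):
    for h, k in zip(psf, offsets):
        j = i - k                      # (Tx)_i = sum_k psf_k * x_{i-k}
        if 0 <= j < n:
            T[i, j] += h               # samples outside the image are zero
        C[i, j % n] += h               # samples outside the image wrap around

D = T - C
rank_D = np.linalg.matrix_rank(D)      # at most 2*b: only the first and last b rows differ

# C is diagonalized by the FFT: its eigenvalues are the FFT of its first column
v = np.random.default_rng(0).standard_normal(n)
fft_of_Cv = np.fft.fft(C @ v)
eigs_times_fft_v = np.fft.fft(C[:, 0]) * np.fft.fft(v)
```

For a PSF without compact support, the entries dropped when truncating it would contribute the small-norm term E.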

Choice of parameters

◮ We do not know whether a rigorous estimate of the form

$\|(T - C)z\| \le \rho\,\|Tz\|$

holds for all (relevant) vectors z with ρ < 1/2. According to our numerical tests, the estimate (8) is satisfied with small ρ, on the order of $10^{-3}$ or $10^{-2}$.

◮ q = 0.7, but all q ∈ [0.6, 0.8] give comparable results.

Comparison with other methods:

◮ CGLS, P-CGLS, our algorithm with $\alpha_n = 0.5\,q^n$ ($\|T\|_\infty \approx 1$).

◮ Stopping by discrepancy principle with τ = 1.01.

Numerical results

◮ We add white Gaussian noise to the blurred image, with relative noise level

$\nu = \delta / \|y\|$

◮ To compare the quality of the restorations, we evaluate their relative restoration errors (RRE), i.e.,

$\mathrm{RRE} = \|x - x^\dagger\| / \|x^\dagger\|$,

where x is the computed solution.

◮ CGLS and P-CGLS are taken from the Matlab toolbox RestoreTools (J. Nagy et al., http://www.mathcs.emory.edu/~nagy/RestoreTools).
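The two quantities above are one-liners in NumPy. The helper names below are hypothetical, and the data is random, for illustration only.

```python
import numpy as np

def add_noise(y, nu, rng):
    """Return y_delta = y + e, with white Gaussian e scaled so that ||e|| = nu * ||y||."""
    e = rng.standard_normal(y.shape)
    e *= nu * np.linalg.norm(y) / np.linalg.norm(e)
    return y + e

def rre(x, x_true):
    """Relative restoration error ||x - x_true|| / ||x_true||."""
    return np.linalg.norm(x - x_true) / np.linalg.norm(x_true)

rng = np.random.default_rng(0)
y = rng.random((16, 16))                 # stand-in for a blurred image
y_delta = add_noise(y, nu=0.005, rng=rng)
delta = np.linalg.norm(y_delta - y)      # noise level used in the stopping rule
```

`np.linalg.norm` on a 2D array returns the Frobenius norm, which matches the Euclidean norm of the vectorized image.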

Example 1

Size 256 × 256, ν = 0.5%, $\rho = 10^{-3}$, zero Dirichlet BCs.

[Figure: true image, PSF, observed image]

RRE and αn

[Figure, left: RRE vs iterations for Algor. I, Geometric, CGLS, PCGLS. Right: $\alpha_n$ vs n (logarithmic scale) for Algor. I and Geometric]

◮ Algor. I is our nonstationary scheme with the automatic selection of $\alpha_n$ by Newton.

◮ Geometric is our iteration but with $\alpha_n = 0.5\,q^n$.

Check of the assumption

Recall: $\alpha_n$ is chosen such that $\|r_n - C h_n\| = q_n \|r_n\|$, where $q_n = \max\{\, q,\ 2\rho + (1 + \rho)/\tau_n \,\}$, under the assumption

$\|(C - T)z\| \le \rho\,\|Tz\|$, $\rho = 10^{-3}$

[Figure: $\|r_n - C e_n\|/\|r_n\|$ and $q_n$ vs n, for q = 0.7 (left) and q = 0.6 (right)]

Example 2

◮ Size 452 × 452, ν = 1%, Antireflective BCs.

◮ The structure of T is more involved ⇒ $\rho = 10^{-2}$


Avoid $T^*$

◮ Restorations in the range of $T^*$ come with unwanted boundary artefacts [D., Serra-Capizzano IP 2005].

◮ RestoreTools implements the reblurring strategy: it replaces $T^*$ with $T'$, the matrix built from the PSF rotated by 180°, but $T'T$ is not symmetric.
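The loss of symmetry can be reproduced on a small example. The sketch below is an illustration under assumptions: it uses 1D reflective BCs as a simpler stand-in for the antireflective BCs of this example, a hypothetical nonsymmetric 3-point PSF, and a hypothetical helper that assembles the dense matrix column by column.

```python
import numpy as np

def blur_matrix_reflective(psf, n):
    """Dense 1D blurring matrix with reflective BCs (psf of odd length, centre in the middle)."""
    b = len(psf) // 2
    T = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = 1.0
        padded = np.pad(e, b, mode="symmetric")            # mirror the signal at both edges
        T[:, j] = np.convolve(padded, psf, mode="valid")   # column j = blurred j-th unit vector
    return T

n = 8
psf = np.array([0.5, 0.3, 0.2])                 # hypothetical nonsymmetric PSF
T = blur_matrix_reflective(psf, n)
T_prime = blur_matrix_reflective(psf[::-1], n)  # PSF rotated by 180 degrees (flipped in 1D)

M = T_prime @ T
is_adjoint = np.allclose(T_prime, T.T)          # False: T' differs from T^T near the boundary
is_symmetric = np.allclose(M, M.T)              # False: T'T is not symmetric
```

Away from the boundary the rows of $T'$ and $T^T$ coincide; the two matrices differ only in the entries touched by the boundary conditions.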

[Figure, left: RRE vs iterations. Right: $\|r_n\|/\|y^\delta\|$ (in %) vs n. Curves: Algor. I, Geometric, CGLS]

Restored images

CGLS stopped at the minimum RRE.

[Figure: the three restored images]

Our algorithm: RRE = 0.110, it. 6

Geometric sequence: RRE = 0.110, it. 9

CGLS: RRE = 0.131, it. 5

Conclusions

◮ Under the assumption that an approximation C of T is available, our new scheme turns out to be fast and stable.

◮ The choice of ρ reflects how much we trust this approximation.

◮ A value of ρ that is too small can be detected from the behaviour of $\alpha_n$ or $\|r_n\|$.

◮ The choice of q is not crucial (q = 0.7 balances speed and stability).

◮ Our scheme does not require $T^*$.

References

M. Donatelli and M. Hanke, Fast nonstationary preconditioned iterative methods for ill-posed problems, with application to image deblurring, Inverse Problems, 29 (2013), 095008.

The choice of $\alpha_n$ is inspired by

M. Hanke, A regularizing Levenberg-Marquardt scheme, with applications to inverse groundwater filtration problems, Inverse Problems, 13 (1997), pp. 79–95.

Example 3

Size 237 × 237, ν = 0.1%, $\rho = 10^{-3}$, zero Dirichlet BCs.


Detecting that $\rho = 10^{-3}$ is too small for the assumption

[Figure, left: $\|r_n\|$ (in %) vs n. Right: $\alpha_n$ vs n (logarithmic scale). Curves: Geometric, ρ = 0.01, ρ = 0.001]

◮ With $\rho = 10^{-3}$ the iteration zigzags, but the algorithm often recovers (RRE = 0.254).

◮ Increasing ρ to $10^{-2}$ ⇒ the zigzagging disappears and the algorithm terminates with RRE = 0.261.

◮ A different ρ only affects the final stage of the iteration.
