
Page 1: Towards Solution of Large Scale Image Restoration and

Motivation Statistical Results for Least Squares Newton algorithm Large Scale Problems Stopping the EM Algorithm Statistically Edge Detection for PSF Estimation

Towards Solution of Large Scale Image Restoration and Reconstruction Problems

Rosemary Renaut
Joint work with Anne Gelb, Aditya Viswanathan, Hongbin Guo, Doug Cochran, Youzuo Lin, Arizona State University
Jodi Mead, Boise State University

November 4, 2009

National Science Foundation: Division of Computational Mathematics 1 / 49

Page 2:

Outline

1 Motivation
  Quick Review

2 Statistical Results for Least Squares
  Summary of LS Statistical Results
  Implications of Statistical Results for Regularized Least Squares

3 Newton algorithm
  Algorithm with LSQR (Paige and Saunders)
  Results

4 Large Scale Problems
  Application in Image Reconstruction and Restoration

5 Stopping the EM Algorithm Statistically

6 Edge Detection for PSF Estimation

Page 3:

Signal/Image Restoration:

Integral model of signal degradation: b(t) = ∫ K(t, s) x(s) ds

K(t, s) describes the blur of the signal.

Convolutional model: shift-invariant K(t, s) = K(t − s) is the Point Spread Function (PSF).

Typically sampling includes noise e(t), so the model is

b(t) = ∫ K(t − s) x(s) ds + e(t)

Discrete model: given discrete samples b, find samples x of x(s). Let A discretize K, assumed known; the model is

b = Ax + e.

Naively invert the system to find x!
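The discrete model b = Ax + e and the failure of naive inversion are easy to see numerically. The following is an illustrative Python sketch, not the talk's code; the grid size, PSF width, and noise level are arbitrary choices.

```python
import numpy as np

def gaussian_blur_matrix(n, width=0.05):
    """Discretize a Gaussian PSF K(t - s) on a uniform grid of [0, 1]; rows sum to 1."""
    t = np.linspace(0, 1, n)
    A = np.exp(-((t[:, None] - t[None, :]) ** 2) / (2 * width**2))
    return A / A.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n = 200
x = np.zeros(n)
x[60:90], x[120:150] = 1.0, 0.5             # piecewise-constant test signal
A = gaussian_blur_matrix(n)
b = A @ x + 1e-3 * rng.standard_normal(n)   # blurred signal plus white noise e

x_naive = np.linalg.solve(A, b)             # "naively invert the system"
rel_err = np.linalg.norm(x_naive - x) / np.linalg.norm(x)
# A is severely ill-conditioned, so even 0.1% noise is amplified enormously
```

Even at this tiny noise level the naive solution is useless, which is exactly the point of the next slides.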

Page 4:

Example 1-D Original and Blurred Noisy Signal

Original signal x. Blurred and noisy signal b, Gaussian PSF.

Page 5:

The Solution: Regularization is needed

Naive Solution. A Regularized Solution.

Page 6:

Least Squares for Ax = b: A Quick Review

Background

Consider discrete systems: A ∈ R^(m×n), b ∈ R^m, x ∈ R^n

Ax = b + e,

Classical Approach: Linear Least Squares (A full rank)

x_LS = arg min_x ‖Ax − b‖₂²

Difficulty: x_LS is sensitive to changes in the right-hand side b when A is ill-conditioned.

For convolutional models the system is ill-posed.

Page 7:

Introduce Regularization to Find Acceptable Solution

Weighted Fidelity with Regularization

• Regularize:

x_RLS(λ) = arg min_x {‖b − Ax‖²_{W_b} + λ² R(x)},

with weighting matrix W_b.

• R(x) is a regularization term.

• λ is a regularization parameter, which is unknown.

The solution x_RLS(λ) depends on λ, on the regularization operator R, and on the weighting matrix W_b.

Page 8:

The Weighting Matrix:

Some Assumptions for Multiple Data Measurements. Given multiple measurements of data b:

The error e in b is an m-vector of random measurement errors with mean 0 and positive definite covariance matrix C_b = E(e e^T).

For uncorrelated heteroskedastic measurements C_b is a diagonal matrix of the error variances. (Colored noise)

For white noise C_b = σ² I.

Weighting by W_b = C_b^{-1} in the data-fit term makes ẽ = W_b^{1/2} e uncorrelated, theoretically.

Difficulty if W_b increases the ill-conditioning of A!

For images, find W_b from the image data.

Page 9:

Formulation: Generalized Tikhonov Regularization With Weighting

Use R(x) = ‖D(x − x₀)‖²

x̂ = argmin J(x) = argmin {‖Ax − b‖²_{W_b} + λ²‖D(x − x₀)‖²}. (1)

D is a suitable operator, often a derivative approximation.

Assume N(A) ∩ N(D) = {0}. x₀ is a reference solution, often x₀ = 0; it might need to be an average solution.

Having found λ, the posterior inverse covariance matrix is

W_x = A^T W_b A + λ² I

Posterior information can give some confidence on parameter estimates.
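For small dense problems, the generalized Tikhonov solution in (1) can be computed directly from the normal equations. This is a minimal sketch (the function name is hypothetical) assuming all matrices fit in memory; large-scale problems need the iterative machinery described later in the talk.

```python
import numpy as np

def tikhonov_solve(A, b, lam, D=None, Wb=None, x0=None):
    """Solve min ||Ax - b||^2_Wb + lam^2 ||D (x - x0)||^2 via the normal equations."""
    m, n = A.shape
    D = np.eye(n) if D is None else D
    Wb = np.eye(m) if Wb is None else Wb
    x0 = np.zeros(n) if x0 is None else x0
    DtD = D.T @ D
    lhs = A.T @ Wb @ A + lam**2 * DtD       # SPD when N(A) and N(D) intersect trivially
    rhs = A.T @ Wb @ b + lam**2 * DtD @ x0
    return np.linalg.solve(lhs, rhs)

# Example with W_b = I, D = I, x0 = 0 on a small synthetic system
rng = np.random.default_rng(2)
A = rng.standard_normal((40, 20))
b = A @ rng.standard_normal(20) + 0.01 * rng.standard_normal(40)
x_hat = tikhonov_solve(A, b, lam=0.1)
```

The solve sets the gradient of the objective to zero, which is a convenient way to verify the implementation.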

Page 10:

Choice of λ is crucial: an example with D = I .

Page 23:

Choice of λ crucial: Different algorithms - different solutions.

Discrepancy Principle

Suppose noise is white: C_b = σ_b² I.

Find λ such that the regularized residual satisfies

σ_b² = (1/m) ‖b − A x(λ)‖₂².

Can be implemented by a Newton root-finding algorithm.

But the discrepancy principle typically oversmooths.
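The slide's Newton root-finder needs the derivative of the residual in λ; a cruder but robust sketch is geometric bisection, using the fact that the residual ‖b − Ax(λ)‖² increases monotonically with λ. Illustrative code only (W_b = I, D = I, dense solves, hypothetical function name):

```python
import numpy as np

def discrepancy_lambda(A, b, sigma2, lam_lo=1e-8, lam_hi=1e4, iters=200):
    """Find lam with ||b - A x(lam)||^2 / m = sigma2; the residual is monotone in lam."""
    m, n = A.shape
    def mean_sq_residual(lam):
        x = np.linalg.solve(A.T @ A + lam**2 * np.eye(n), A.T @ b)
        return np.sum((b - A @ x) ** 2) / m
    for _ in range(iters):
        lam = np.sqrt(lam_lo * lam_hi)       # geometric bisection on the bracket
        if mean_sq_residual(lam) < sigma2:
            lam_lo = lam                     # residual too small: undersmoothing
        else:
            lam_hi = lam
    return np.sqrt(lam_lo * lam_hi)
```

This assumes the bracket actually straddles the target, i.e. the unregularized residual is below m σ_b² and ‖b‖² is above it.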

Others [Vog02]:

L-Curve

Generalized Cross Validation (GCV)

Unbiased Predictive Risk (UPRE)

Page 24:

Some standard approaches I: L-curve - Find the corner

Let r(λ) = (A(λ) − A)b, with influence matrix A(λ) = A(A^T W_b A + λ² D^T D)^{-1} A^T.

Plot

(log ‖Dx‖, log ‖r(λ)‖)

Trade off contributions.

Expensive - requires range of λ.

GSVD makes calculations efficient.

Not statistically based

Find corner

No corner

Page 25:

Generalized Cross-Validation (GCV)

Let A(λ) = A(A^T W_b A + λ² D^T D)^{-1} A^T.

Can pick W_b = I.

Minimize the GCV function

‖b − A x(λ)‖²_{W_b} / [trace(I_m − A(λ))]²,

which estimates predictive risk.

Expensive - requires range of λ.

GSVD makes calculations efficient.

Requires minimum

Multiple minima

Sometimes flat
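For W_b = I and D = I a plain SVD suffices to evaluate the GCV function on a grid (the slide's GSVD handles general D). A sketch, with hypothetical names; the trace identity trace(I_m − A(λ)) = m − Σ_i σ_i²/(σ_i² + λ²) follows from the SVD.

```python
import numpy as np

def gcv_choose_lambda(A, b, lambdas):
    """Evaluate the GCV function on a grid via the SVD (Wb = I, D = I); return the minimizer."""
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b
    m = A.shape[0]
    b_perp2 = b @ b - beta @ beta            # part of b outside range(A)
    def gcv(lam):
        f = s**2 / (s**2 + lam**2)           # Tikhonov filter factors
        resid = np.sum(((1 - f) * beta) ** 2) + b_perp2
        return resid / (m - np.sum(f)) ** 2  # trace(I_m - A(lam)) = m - sum(f)
    vals = [gcv(lam) for lam in lambdas]
    return lambdas[int(np.argmin(vals))], vals
```

Grid evaluation makes the "multiple minima / sometimes flat" behavior noted on the slide directly visible by inspecting `vals`.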

Page 26:

Unbiased Predictive Risk Estimation (UPRE)

Minimize the expected value of the predictive risk: minimize the UPRE function

‖b − A x(λ)‖²_{W_b} + 2 trace(A(λ)) − m

Expensive - requires range of λ.

GSVD makes calculations efficient.

Need estimate of the trace.

Minimum needed.

Page 27:

Background: Statistics of the Least Squares Problem

Theorem (Rao, 1973)

Let r be the rank of A and b ∼ N(Ax, σ_b² I) (errors in measurements are normally distributed with mean 0 and covariance σ_b² I). Then

J = min_x ‖Ax − b‖² ∼ σ_b² χ²(m − r).

J follows a χ² distribution with m − r degrees of freedom: basically the Discrepancy Principle.

Corollary (Weighted Least Squares)

For b ∼ N(Ax, C_b), W_b = C_b^{-1}, then

J = min_x ‖Ax − b‖²_{W_b} ∼ χ²(m − r).
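Rao's result is easy to sanity-check numerically: the minimum residual of a least squares problem with Gaussian noise should average σ_b²(m − r). A Monte Carlo sketch (sample sizes arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, sigma_b = 100, 10, 0.5
A = rng.standard_normal((m, n))            # full rank, r = n, almost surely
x_true = rng.standard_normal(n)
J = []
for _ in range(2000):
    b = A @ x_true + sigma_b * rng.standard_normal(m)
    x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
    J.append(np.sum((A @ x_ls - b) ** 2))
mean_dof = np.mean(J) / sigma_b**2          # should be close to m - r = 90
```

The sample mean of J/σ_b² concentrates at m − r with standard error √(2(m − r)/2000) ≈ 0.3 here.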

Page 28:

Extension: Statistics of the Regularized Least Squares Problem

Thm: χ² distribution of the regularized functional (Renaut/Mead 2008). (NOTE: Weighting matrix on the regularization term.)

x̂ = argmin J_D(x) = argmin {‖Ax − b‖²_{W_b} + ‖x − x₀‖²_{W_D}},  W_D = D^T W_x D. (2)

Assume:

W_b and W_x are symmetric positive definite.

The problem is uniquely solvable: N(A) ∩ N(D) = {0}. The Moore-Penrose generalized inverse of W_D is C_D.

Statistics: errors in the right-hand side e ∼ N(0, C_b), and x₀ is known so that x − x₀ = f ∼ N(0, C_D);

x₀ is the mean vector of the model parameters.

Then J_D(x(W_D)) ∼ χ²(m + p − n).

Page 29:

Significance of the χ2 result

J_D ∼ χ²(m + p − n)

For sufficiently large m̃ = m + p − n,

E(J(x(W_D))) = m + p − n,  Var(J(x(W_D))) = 2(m + p − n).

Moreover

m̃ − √(2m̃) z_{α/2} < J(x(W_D)) < m̃ + √(2m̃) z_{α/2}. (3)

z_{α/2} is the relevant z-value for a χ²-distribution with m̃ = m + p − n degrees of freedom.

Page 30:

General Result [MR09b], [RHM09], [MR09a]

The Cost Functional follows a χ² Statistical Distribution

If x₀ is not the mean value, then we obtain a non-central χ² distribution with centrality parameter c.

If the problem is rank deficient, the degrees of freedom are reduced.

Suppose degrees of freedom m̃ and centrality parameter c; then

E(J_D) = m̃ + c,  Var(J_D) = 2m̃ + 4c.

Suggests: try to find W_D so that E(J) = m̃ + c.

First find λ only: find W_x = λ² I.

Page 31:

What do we need to apply the Theory?

Requirements: covariance C_b on data parameters b (or on model parameters x!)

A priori information x₀, the mean x̄.

But x̄ (and hence x₀) are not known.

If not known, use repeated data measurements to calculate C_b and the mean b̄.

Hence estimate the centrality parameter: E(b) = A E(x) implies b̄ = A x̄. Hence

c = ‖c‖₂² = ‖Q U^T W_b^{1/2} (b̄ − A x₀)‖₂²

E(J_D) = E(‖Q U^T W_b^{1/2} (b − A x₀)‖₂²) = m + p − n + ‖c‖₂²

Given the GSVD, estimate the degrees of freedom m̃.

Then we can use E(J) to find λ.

Page 32:

Assume x0 is the mean (experimentalists know something about model parameters)

DESIGNING THE ALGORITHM: I

Recall: if C_b and C_x are good estimates of the covariance,

|J_D(x) − (m + p − n)|

should be small. Thus, let m̃ = m + p − n; then we want

m̃ − √(2m̃) z_{α/2} < J(x(W_D)) < m̃ + √(2m̃) z_{α/2}.

z_{α/2} is the relevant z-value for a χ²-distribution with m̃ degrees of freedom.

GOAL: Find W_x to make (3) tight. Single-variable case: find λ with

J_D(x(λ)) ≈ m̃

Page 33:

A Newton line-search algorithm to find λ = 1/σ. (Basic algebra)

Newton to solve F(σ) = J_D(σ) − m̃ = 0

We use σ = 1/λ, and y(σ^(k)) is the current solution, for which

x(σ^(k)) = y(σ^(k)) + x₀

Then

∂J/∂σ = −(2/σ³) ‖D y(σ)‖² < 0

Hence we have a basic Newton iteration

σ^(k+1) = σ^(k) (1 + (1/2) (σ^(k)/‖Dy‖)² (J_D(σ^(k)) − m̃)).
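For W_b = I, D = I, x₀ = 0 (so p = n and m̃ = m) each σ step is cheap once an SVD is available. The sketch below implements the slide's Newton update; the bracketing safeguard is my addition, not part of the slide, and the function name is hypothetical.

```python
import numpy as np

def chi2_newton_sigma(A, b, mtilde, sigma=1.0, tol=1e-10, maxit=100):
    """Newton iteration sigma <- sigma(1 + (1/2)(sigma/||y||)^2 (J_D - mtilde)),
    for W_b = I, D = I, x0 = 0, safeguarded by bisection on a bracket."""
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b
    b_perp2 = b @ b - beta @ beta
    def J_and_y2(sig):
        lam2 = 1.0 / sig**2
        y = s * beta / (s**2 + lam2)                     # solution coefficients
        resid = np.sum((lam2 * beta / (s**2 + lam2)) ** 2) + b_perp2
        return resid + lam2 * np.sum(y**2), np.sum(y**2)
    lo, hi = 1e-8, 1e8                                   # F is decreasing in sigma
    for _ in range(maxit):
        J, y2 = J_and_y2(sigma)
        F = J - mtilde
        if abs(F) < tol * mtilde:
            return sigma
        if F > 0:
            lo = max(lo, sigma)
        else:
            hi = min(hi, sigma)
        sigma_new = sigma * (1.0 + 0.5 * (sigma**2 / y2) * F)
        if not (lo < sigma_new < hi):
            sigma_new = np.sqrt(lo * hi)                 # fall back to bisection
        sigma = sigma_new
    return sigma
```

At the returned σ the functional J_D should sit at its expected value m̃.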

Page 34:

Practical Details of Algorithm: Large Scale problems

Algorithm Initialization:

Convert the generalized Tikhonov problem to standard form. (If L is not invertible you just need to know how to find Ax and A^T x, and the null space of L.)

Use the LSQR (Paige and Saunders) algorithm to find the bidiagonal matrix for the projected problem.

Obtain a solution of the bidiagonal problem for a given initial σ.

Subsequent Steps:

Increase the dimension of the space if needed, with reuse of the existing bidiagonalization. May also use a smaller size system if appropriate.

Each σ calculation of the algorithm reuses saved information from the Lanczos bidiagonalization.

Page 35:

Illustrating the Results for Problem Size 512: Two Standard Test Problems

Comparison for noise level 10%. On the left D = I and on the right D is the first derivative.

Notice L-curve and χ2-LSQR perform well.

UPRE does not perform well.

Page 36:

Real Data: Seismic Signal Restoration

The Data Set and Goal: a real data set of 48 signals of length 3000.

The point spread function is derived from the signals.

Calculate the signal variance pointwise over all 48 signals.

Goal: restore the signal x from Ax = b, where A is the PSF matrix and b is a given blurred signal.

Method of comparison (no exact solution known): use convergence with respect to downsampling.

Page 37:

Comparison High Resolution White noise

Greater contrast with χ². UPRE is insufficiently regularized. L-curve severely undersmooths (not shown). Parameters are not consistent across resolutions.

Page 38:

THE UPRE SOLUTION: x0 = 0 White Noise

Regularization parameters are consistent: σ = 0.01005 at all resolutions.

Page 39:

THE LSQR Hybrid SOLUTION: White Noise

Regularization quite consistent, resolutions 2 to 100: σ = 0.0000029, 0.0000029, 0.0000029, 0.0000057, 0.0000057.

Page 40:

Illustrating the Deblurring Result: Problem Size 65536

Example taken from RestoreTools, Nagy et al. 2007-8: 15% noise.

Computational Cost is Minimal: Projected Problem Size is 15, λ = .58

Page 41:

Problem Grain, noise 15% added: validating against increasing subproblem size.

(a) Signal to noise ratio 10 log₁₀(1/e), relative error e. (b) Regularization parameter against problem size.

Page 42:

Illustrating the progress of the Newton algorithm post LSQR

Page 43:

Illustrating the progress of the Newton algorithm with LSQR

Page 44:

Problem Grain, noise 15% added, for increasing subproblem size

Figure: Signal to noise ratio 10 log₁₀(1/e), relative error e

Page 45:

An Alternative Direction For Large Scale Problems : Domain Decomposition [Ren98]

Domain decomposition of x into several domains:

x = (x₁^T, x₂^T, ..., x_p^T)^T.

Corresponding to the splitting of the image x, the kernel operator A is split:

A = (A₁, A₂, ..., A_p).

e.g.: Different Splitting Schemes

Page 46:

Formulation - Regularized Least Squares [LRG09]

The linear system Ax ≈ b is replaced with the split systems

A_i y_i ≈ b_i(x),  b_i(x) = b − Σ_{j≠i} A_j x_j = b − Ax + A_i x_i.

Locally solve Ax ≈ b:

min_{y_i ∈ R^{n_i}} ‖A_i y_i − b_i(x)‖²,  1 ≤ i ≤ p.

If the problem is ill-posed we have the regularized problem

min_x { ‖Ax − b‖₂² + λ²‖Dx‖₂² }.

Similarly, we will have a splitting of the operator, assuming local regularization:

(A; ΛD) = ( (A₁; λ₁D₁) (A₂; λ₂D₂) · · · (A_p; λ_pD_p) ).

Solve iteratively, using novel updates for changing right-hand sides [Saa87], [CW97].

Update Scheme - Global solution update from local solutions at step k to step k + 1:

x^(k+1) = Σ_{i=1}^p τ_i^(k+1) (x_local)_i^(k+1),

where (x_local)_i^(k+1) = ((x₁^(k))^T, ..., (x_{i−1}^(k))^T, (y_i^(k+1))^T, (x_{i+1}^(k))^T, ..., (x_p^(k))^T)^T.
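A minimal sketch of this splitting, illustrative only: D_i = I, equal weights τ_i = 1/p, dense local solves (the talk's LSQR-based updates for changing right-hand sides are not reproduced, and the function name is hypothetical).

```python
import numpy as np

def dd_tikhonov(A, b, p=4, lam=0.5, sweeps=300):
    """Damped block-Jacobi domain decomposition for min ||Ax - b||^2 + lam^2 ||x||^2."""
    m, n = A.shape
    x = np.zeros(n)
    blocks = np.array_split(np.arange(n), p)     # p column subdomains
    for _ in range(sweeps):
        x_new = x.copy()
        for idx in blocks:
            Ai = A[:, idx]
            bi = b - A @ x + Ai @ x[idx]         # b_i(x) = b - A x + A_i x_i
            yi = np.linalg.solve(Ai.T @ Ai + lam**2 * np.eye(len(idx)), Ai.T @ bi)
            x_new[idx] = yi
        x = ((p - 1) * x + x_new) / p            # x = sum_i tau_i (x_local)_i with tau_i = 1/p
    return x
```

With equal weights this is damped block Jacobi on the regularized normal equations; it should agree with the direct global Tikhonov solution at convergence.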

Page 47:

Feasibility of the Approach: 1-D Phillips, size 1024, noise level 6%, regularization 0.25

(a) No splitting, relative error 0.0525. (b) 4 subproblems, relative error 0.0499.

Page 48:

2-D PET Reconstruction, size 64 × 64, noise level 1.5%, regularization 0.2

(c) No splitting, SNR 11.73 dB. (d) 4 subproblems, SNR 12.24 dB.

Page 49:

A New Stopping Rule for the EM Method [GR09]

Quick Rationale, PET: it is well known that the ML-EM method converges to overly noisy solutions; iterative methods have to stop before convergence [SV82], [HHL92], [HL94].

Detected counts in tube i are Poisson with mean b_i = Σ_j a_ij x_j. Hence the basic relationship is b ≈ Ax; A is the projection matrix, b are the counts, and x is the density to be reconstructed.

EM iteration: x^(k+1) = (A^T (b ./ (A x^(k)))) .* x^(k).
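The EM iteration on the slide (Richardson-Lucy form) can be run on synthetic Poisson data; here the columns of A are normalized to sum to one so the usual A^T 1 sensitivity denominator is identically one. Illustrative sketch with arbitrary sizes:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 80, 40
A = rng.random((m, n))
A /= A.sum(axis=0, keepdims=True)            # column sums 1: A^T 1 = 1
x_true = 50.0 * rng.random(n)
b = rng.poisson(A @ x_true).astype(float)    # Poisson counts with mean A x_true

def loglike(x):
    Ax = A @ x
    return float(np.sum(b * np.log(Ax) - Ax))  # Poisson log-likelihood (up to a constant)

x = np.full(n, b.sum() / n)                  # positive start matching total counts
ll0 = loglike(x)
for k in range(100):
    x = x * (A.T @ (b / (A @ x)))            # x^(k+1) = (A^T (b ./ A x^(k))) .* x^(k)
ll = loglike(x)
```

EM guarantees the likelihood is non-decreasing at every step, which the final values reflect; the noise build-up at large k that motivates the stopping rule is in the iterate itself, not in the likelihood.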

(e) True (f) k = 95 (g) k = 500

Page 50:

A New Estimate of the Stopping Level

Algorithm: for k until converged, for m tubes:

Calculate a step of EM, x^(k).

Update the tube means b^(k) = A x^(k). Bin all tubes to have b^(k) > 20.

Calculate y = (b − b^(k)) ./ √(b^(k)); then y ∼ N(0, 1).

Calculate the mean ȳ and sample standard deviation s for y_i, i = 1 : m.

Calculate α = √(m − 1) ȳ / s and p_t(α), α ∼ t(m − 1) (Student's t density with m − 1 degrees of freedom).

Calculate β = (m − 1) s² and p_N(β), β ∼ N(m − 1, √(2(m − 1))) (Gaussian density with mean m − 1).

Calculate the likelihood of sampling α and β from the two distributions: l^(k) = p_t(α) p_N(β).

When l^(k) is past its maximum, l^(k) < l^(k−1): STOP. The solution is x^(k−1).
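The statistic l^(k) can be sketched without SciPy using `math.lgamma` for the Student-t density. This is an unbinned sketch (the binning step b^(k) > 20 is omitted) and the function name is hypothetical:

```python
import math
import numpy as np

def stopping_likelihood(b, bk):
    """l = p_t(alpha) p_N(beta): plausibility that (b - b_k)/sqrt(b_k) is N(0,1) noise."""
    y = (b - bk) / np.sqrt(bk)
    m = len(y)
    ybar, s = y.mean(), y.std(ddof=1)
    alpha = math.sqrt(m - 1) * ybar / s           # ~ t(m-1) under the null
    beta = (m - 1) * s**2                         # ~ approx N(m-1, sqrt(2(m-1)))
    nu = m - 1
    # log of the Student-t density with nu degrees of freedom at alpha
    log_pt = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
              - 0.5 * math.log(nu * math.pi)
              - (nu + 1) / 2 * math.log1p(alpha**2 / nu))
    # log of the Gaussian density with mean nu and standard deviation sqrt(2 nu) at beta
    log_pn = -0.5 * math.log(4 * math.pi * nu) - (beta - nu) ** 2 / (4 * nu)
    return math.exp(log_pt + log_pn)
```

Residuals consistent with the Poisson noise model score a much larger l than over- or under-dispersed ones, which is what drives the stopping decision.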

Page 51:

Simulations: Validation

Table: The best and the predicted stopping step for 11 simulations

Best: 95 90 89 90 90 95 90 92 89 94 91
Pred: 96 88 94 89 89 95 100 100 95 93 94

Figure: l^(k) = p_t(α) p_N(β) for 500 steps. Maximized at k = 96.

Page 52:

Extension for Mammogram Denoising: Early Ideas

The Model: assume blurring of the mammogram by PSF kernel K; the measured image is b.

Deconvolve in the Fourier domain and invert to give a noisy estimate of the optical density d.

Each entry of d is a linear combination of x-ray energy with Poisson noise, and √d is close to normally distributed [ANS48].

To find the true optical density x, denoise the deconvolved d. Use total variation:

min_x Σ_{i=1}^m (√(d_i) − √(x_i))² + λ²‖x‖_TV,  s.t. x ≥ 0.

Given knowledge of the variance of the noise in the x-ray, automatically select λ using the statistical estimation approach.

Page 53:

Trial Experiment: use a data set from the University of Florida (DDSM), cancer case 0001, left breast with CC scanning angle

(a) Original Image (b) Restored Image

Figure: Total yellow (calcification) reduced by deblurring. Rectangle at bottom right indicates deblurring.

Page 54:

PSF Estimation in Blurring Problems using Edge Detection (Cochran, Gelb, Viswanathan, Renaut, Stefan)

Given the blurring model (PSF convolution operator K) and x ∈ L²(−π, π) piecewise smooth, we estimate the PSF starting with 2N + 1 blurred Fourier coefficients b(j), j = −N, ..., N.

b = K ∗ x + e

Principle: apply a linear edge detector, denoted T. We shall assume that the edge detector can be written as a convolution with an appropriate kernel:

T ∗ (K ∗ x + e) = (K ∗ x + e) ∗ T
               = x ∗ K ∗ T + e ∗ T
               = (x ∗ T) ∗ K + e ∗ T
               ≈ [x] ∗ K + e

Here [x](s) is a jump function. For a jump discontinuity in a function, the jump function at any point s only depends on the values of x at s⁺ and s⁻:

[x](s) := x(s⁺) − x(s⁻)

Hence, we observe shifted and scaled replicates of the PSF.
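The "replicates of the PSF" principle can be seen with the simplest linear edge detector, a derivative: differentiating K ∗ x for a unit step x reproduces K at the jump location. Illustrative sketch only (the talk uses concentration edge detectors on Fourier data, not a plain derivative; periodic FFT convolution also introduces a second, negative replicate at the domain boundary):

```python
import numpy as np

n = 1024
h = 2 * np.pi / n
t = -np.pi + h * np.arange(n)
x = (t > 0).astype(float)                        # unit jump at t = 0: [x](0) = 1
psf = np.exp(-t**2 / (2 * 0.1**2))
psf /= psf.sum() * h                             # normalize as a density on the grid
# circular convolution K * x via the FFT (ifftshift centers the kernel at index 0)
blurred = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(np.fft.ifftshift(psf)))) * h
edge = np.gradient(blurred, h)                   # d/dt (K * step) = K at the jump
```

Away from the periodic boundary, `edge` matches the PSF shifted to the jump location and scaled by the jump height, here 1.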

Page 55:

Example (No Noise)

[Figure panels: (a) True function, (b) Motion Blur PSF, (c) Blurred Function, (d) After applying edge detection, with the blur after edge detection overlaid on the true blur (normalized).]

Figure: Function subjected to motion blur, N = 128

Figure: Function subjected to Gaussian blur, N = 128. Panels: (a) true function f(x); (b) Gaussian blur PSF h(x); (c) blurred function g(x); (d) after applying edge detection, overlaying the blur recovered by edge detection on the true blur (normalized).


Representative Examples: Gaussian PSF

Figure: Function subjected to Gaussian blur, N = 128. Panels: (a) noisy blur estimation; (b) after low-pass filtering. Each panel overlays the function, the blurred noisy function, the blur after edge detection, and the true blur (normalized).

Complex noise distribution on the Fourier coefficients: e ∼ N(0, 1.5/(2N+1)²)

Second picture subjected to low-pass (Gaussian) filtering

It is conceivable that parameter estimation for a Gaussian PSF can take into account the effect of Gaussian filtering
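The remark above rests on a standard fact that is easy to verify numerically: the convolution of two Gaussians is again Gaussian, with summed variances, so a Gaussian low-pass filter widens a Gaussian PSF in a predictable way. A small sketch (the widths sig_psf and sig_filt are illustrative values, not from the experiments):

```python
import numpy as np

# Convolving a Gaussian PSF with a Gaussian low-pass filter yields a Gaussian
# whose variance is the sum of the two variances, so the filter's effect can
# in principle be absorbed into the PSF parameter estimate.
n = 2048
s = np.linspace(-np.pi, np.pi, n, endpoint=False)
ds = s[1] - s[0]

def gauss(sig):
    g = np.exp(-0.5 * (s / sig) ** 2)
    return g / (g.sum() * ds)                     # unit integral

sig_psf, sig_filt = 0.15, 0.08                    # illustrative widths
spec = (np.fft.fft(np.fft.ifftshift(gauss(sig_psf)))
        * np.fft.fft(np.fft.ifftshift(gauss(sig_filt))))
filtered = np.fft.fftshift(np.real(np.fft.ifft(spec))) * ds   # recentred at s = 0

# Empirical standard deviation of the filtered kernel:
sig_eff = np.sqrt(np.sum(s ** 2 * filtered) * ds)  # ~ sqrt(0.15^2 + 0.08^2)
```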


Representative Examples: Motion Blur

Figure: Function subjected to motion blur, N = 128. Panels: (a) noisy blur estimation; (b) after TV denoising. Each panel overlays the function, the blurred noisy function, the blur after edge detection, and the true blur (normalized).

Cannot perform conventional low-pass filtering, since the blur is piecewise-smooth

We compute the noisy blur estimate from the Fourier expansion of the blurred jump: S_N[b] ≈ [x] ∗ K + e

Denoising problem formulation:

min_x ‖x − S_N[b]‖₂² + λ²‖Dx‖₁.
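A minimal sketch of this formulation (not the authors' solver): S_N[b] is replaced by synthetic noisy jump data, the ℓ₁ term is smoothed, and plain gradient descent stands in for a proper TV solver. All parameter values (lam2, eps, step, iteration count) are illustrative choices.

```python
import numpy as np

# min_x ||x - b||_2^2 + lam2 * ||D x||_1, with D the first-difference operator,
# minimized via a smoothed |.| term and gradient descent (illustrative only).
rng = np.random.default_rng(0)
n = 200
clean = np.where(np.arange(n) < n // 2, 0.0, 1.0)   # piecewise-constant truth
b = clean + 0.1 * rng.standard_normal(n)            # noisy jump-function data

lam2, eps, step = 0.3, 1e-2, 0.05
x = b.copy()
for _ in range(2000):
    dx = np.diff(x)                                  # (D x)_i = x_{i+1} - x_i
    w = dx / np.sqrt(dx ** 2 + eps)                  # smoothed sign(D x)
    grad_tv = np.concatenate(([0.0], w)) - np.concatenate((w, [0.0]))  # D^T w
    x -= step * (2.0 * (x - b) + lam2 * grad_tv)
```

The TV penalty flattens the noise in the constant regions while leaving the jump largely intact, which is exactly why it suits a piecewise-smooth blur estimate.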


Future Work Combining Approaches

Extend the parameter selection methods to the domain decomposition problems for large scale.

Use efficient schemes for large scale problems - e.g. right-hand-side updates

Extend to edge detection approaches

Use tensor products of the PSF for extension to 2D - is it feasible?

Use parameter estimation techniques for the 2D problem

Further development of statistical techniques for estimating acceptable solutions.


Bibliography I

F. J. Anscombe. The transformation of Poisson, binomial and negative-binomial data. Biometrika, 35:246–254, 1948.

T. F. Chan and W. L. Wan. Analysis of projection methods for solving linear systems with multiple right-hand sides. 1997.

H. Guo and R. A. Renaut. Revisiting stopping rules for iterative methods used in emission tomography: Analysis and developments. Physics in Medicine and Biology, submitted, 2009.

H. M. Hudson, B. F. Hutton, and R. Larkin. Accelerated EM reconstruction using ordered subsets. J. Nucl. Med., 33:960, 1992.

H. M. Hudson and R. Larkin. Accelerated image reconstruction using ordered subsets of projection data. IEEE Trans. Med. Imag., 13(4):601–609, 1994.


Bibliography II

Y. Lin, R. A. Renaut, and H. Guo. Multisplitting for regularized least squares. In preparation, 2009.

J. Mead and R. A. Renaut. Least squares problems with inequality constraints as quadratic constraints. Linear Algebra and its Applications, 2009.

J. Mead and R. A. Renaut. A Newton root-finding algorithm for estimating the regularization parameter for solving ill-conditioned least squares problems. Inverse Problems, 25, 2009.

R. A. Renaut. A parallel multisplitting solution of the least squares problem. BIT, 1998.

R. A. Renaut, I. Hnetynkova, and J. Mead. Regularization parameter estimation for large scale Tikhonov regularization using a priori information. Computational Statistics and Data Analysis, 54(1), 2009.


Bibliography III

Y. Saad. On the Lanczos method for solving symmetric linear systems with several right-hand sides. 1987.

L. A. Shepp and Y. Vardi. Maximum likelihood reconstruction for emission tomography. IEEE Trans. Med. Imag., MI-1(2):113–122, Oct. 1982.

C. R. Vogel. Computational Methods for Inverse Problems. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2002.


Future Work

Other Results and Future Work

Software Package!

Diagonal Weighting Schemes

Edge-preserving regularization - Total Variation

Better handling of Colored Noise.

Residual Periodogram for large scale.


Algorithm Using the GSVD

GSVD

Use the GSVD of [W_b^{1/2}A; D].

For γ_i the generalized singular values, and s̃ = Uᵀ W_b^{1/2} r, let

m̃ = m − n + p,  s_i = s̃_i/(γ_i² σ_x² + 1), i = 1, …, p,  t_i = s_i γ_i.

Find the root of

F(σ_x) = Σ_{i=1}^{p} s̃_i²/(γ_i² σ_x² + 1) + Σ_{i=n+1}^{m} s̃_i² − m̃ = 0.

Equivalently: solve F = 0, where

F(σ_x) = sᵀ s̃ − m̃  and  F′(σ_x) = −2 σ_x ‖t‖₂².


Practical Details of the Algorithm: Finding the Parameter

Step 1: Bracket the root by a logarithmic search on σ to handle the asymptotes; this yields σ_max and σ_min.

Step 2: Calculate the step, with steepness controlled by tolD. Let t = Dy/σ^(k), where y is the current update; then

step = (1/2) (1/max{‖t‖, tolD})² (J_D(σ^(k)) − m̃)

Step 3: Introduce a line search α^(k) in the Newton update:

σ_new = σ^(k)(1 + α^(k) step)
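These steps can be sketched as a safeguarded Newton iteration, using the F and F′ pair from the GSVD slide. Here gamma and s_tilde are synthetic stand-ins for the generalized singular values and s̃ = Uᵀ W_b^{1/2} r; the sizes and scalings are illustrative (chosen so F changes sign), and a bisection fallback plays the role of the line search.

```python
import numpy as np

# Illustrative bracketed Newton iteration for the root of F(sigma); synthetic
# data only, not a particular test problem from the talk.
rng = np.random.default_rng(1)
m, n, p = 50, 30, 30
gamma = np.sort(rng.uniform(0.1, 10.0, p))
s_tilde = rng.standard_normal(m)
s_tilde[:p] *= 2.0                       # ensure F(0) > 0
s_tilde[n:] *= 0.5                       # ensure F(sigma) < 0 for large sigma
m_tilde = m - n + p
tail = np.sum(s_tilde[n:] ** 2)

def F(sigma):
    return np.sum(s_tilde[:p] ** 2 / (gamma ** 2 * sigma ** 2 + 1)) + tail - m_tilde

def Fprime(sigma):
    t = gamma * s_tilde[:p] / (gamma ** 2 * sigma ** 2 + 1)   # t_i = gamma_i s_i
    return -2.0 * sigma * np.sum(t ** 2)

# Step 1: bracket the root by logarithmic search on sigma (F is decreasing).
sigmas = np.logspace(-6, 6, 25)
vals = np.array([F(t) for t in sigmas])
i = int(np.argmax(vals < 0))             # first sigma with F < 0
lo, hi = sigmas[i - 1], sigmas[i]

# Steps 2-3: Newton step, falling back to bisection to stay in the bracket.
sigma = 0.5 * (lo + hi)
for _ in range(100):
    f = F(sigma)
    if abs(f) < 1e-12:
        break
    lo, hi = (sigma, hi) if f > 0 else (lo, sigma)
    trial = sigma - f / Fprime(sigma)
    sigma = trial if lo < trial < hi else 0.5 * (lo + hi)
```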


Key Aspects of the Proof I: The Functional J

Algebraic simplifications: rewrite the functional as a quadratic form.

The regularized solution is given in terms of the regularization matrix R(W_D):

x̂ = x₀ + (Aᵀ W_b A + Dᵀ W_x D)⁻¹ Aᵀ W_b r,  r = b − A x₀,   (4)
  = x₀ + R(W_D) W_b^{1/2} r
  = x₀ + y(W_D).   (5)

R(W_D) = (Aᵀ W_b A + Dᵀ W_x D)⁻¹ Aᵀ W_b^{1/2}   (6)

The functional is given in terms of the influence matrix Ã(W_D):

Ã(W_D) = W_b^{1/2} A R(W_D)   (7)

J_D(x̂) = rᵀ W_b^{1/2} (I_m − Ã(W_D)) W_b^{1/2} r; let r̃ = W_b^{1/2} r,   (8)
       = r̃ᵀ (I_m − Ã(W_D)) r̃, a quadratic form.   (9)


Key Aspects of the Proof II: Properties of a Quadratic Form

χ² distribution of quadratic forms xᵀPx for normal variables (Fisher–Cochran theorem):

The components x_i are independent normal variables, x_i ∼ N(0, 1), i = 1 : n.

A necessary and sufficient condition that xᵀPx has a central χ² distribution is that P is idempotent, P² = P. In that case the number of degrees of freedom of the χ² is rank(P) = trace(P).

When the means of the x_i are µ_i ≠ 0, xᵀPx has a non-central χ² distribution, with non-centrality parameter c = µᵀPµ.

A χ² random variable with n degrees of freedom and non-centrality parameter c has mean n + c and variance 2(n + 2c).
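These facts are easy to confirm by simulation. A small Monte Carlo sketch (sizes and trial count are arbitrary choices): for x ∼ N(0, I_n) and an idempotent P, here an orthogonal projector onto a k-dimensional subspace, the quadratic form xᵀPx is central χ² with k = rank(P) = trace(P) degrees of freedom, so its sample mean is close to k and its sample variance close to 2k.

```python
import numpy as np

# Monte Carlo check of the quoted chi^2 facts for an idempotent P.
rng = np.random.default_rng(2)
n, k, trials = 10, 4, 200_000

V, _ = np.linalg.qr(rng.standard_normal((n, k)))  # orthonormal columns
P = V @ V.T                                       # idempotent: P @ P == P
x = rng.standard_normal((trials, n))
q = np.sum((x @ P) * x, axis=1)                   # quadratic forms x^T P x
```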


Key Aspects of the Proof III: Requires the GSVD

Lemma

Assume invertibility and m ≥ n ≥ p. There exist unitary matrices U ∈ R^{m×m}, V ∈ R^{p×p}, and a nonsingular matrix X ∈ R^{n×n} such that

A = U [Υ; 0_{(m−n)×n}] Xᵀ,  D = V [M, 0_{p×(n−p)}] Xᵀ,   (10)

Υ = diag(υ₁, …, υ_p, 1, …, 1) ∈ R^{n×n},  M = diag(µ₁, …, µ_p) ∈ R^{p×p},
0 ≤ υ₁ ≤ ⋯ ≤ υ_p ≤ 1,  1 ≥ µ₁ ≥ ⋯ ≥ µ_p > 0,  υ_i² + µ_i² = 1, i = 1, …, p.   (11)

The functional with the GSVD: let Q̃ = diag(µ₁, …, µ_p, 0_{n−p}, I_{m−n});

then J = r̃ᵀ (I_m − Ã(W_D)) r̃ = ‖Q̃ Uᵀ r̃‖₂² = ‖k‖₂².


Proof IV: Statistical Distribution of the Weighted Residual

Covariance structure

Errors in b are e ∼ N(0, C_b). Since b depends on x through b = Ax, we can show b ∼ N(A x₀, C_b + A C_D Aᵀ) (x₀ is the mean of x).

Residual r = b − A x₀ ∼ N(0, C_b + A C_D Aᵀ).

r̃ = W_b^{1/2} r ∼ N(0, I + Ã C_D Ãᵀ),  Ã = W_b^{1/2} A.

Use the GSVD:

I + Ã C_D Ãᵀ = U Q⁻² Uᵀ,  Q = diag(µ₁, …, µ_p, I_{n−p}, I_{m−n}).

Now let k̃ = Q Uᵀ r̃; then k̃ ∼ N(0, Q Uᵀ (U Q⁻² Uᵀ) U Q) = N(0, I_m).

But J = ‖Q̃ Uᵀ r̃‖² = ‖k‖², where k is k̃ with components p+1 : n excluded. Thus

J_D ∼ χ²(m + p − n).


When the mean of the parameters is not known, or x₀ = 0 is not the mean

Corollary: non-central χ² distribution of the regularized functional. Recall

x̂ = argmin J_D(x) = argmin { ‖Ax − b‖²_{W_b} + ‖x − x₀‖²_{W_D} },  W_D = Dᵀ W_x D.

Make all the assumptions as before, but suppose the mean vector of the model parameters is x̄ ≠ x₀. Let

c = ‖c‖₂² = ‖Q̃ Uᵀ W_b^{1/2} A (x̄ − x₀)‖₂².

Then

J_D ∼ χ²(m + p − n, c).

The functional at the optimum follows a non-central χ² distribution.


A further result when A is not of full column rank

The Rank-Deficient Solution: Suppose A is not of full column rank, so that only r of the p generalized singular values are nonzero. Then the filtered solution can be written in terms of the GSVD:

x_FILT(λ) = Σ_{i=p+1−r}^{p} [γ_i² / (υ_i (γ_i² + λ²))] s_i x_i + Σ_{i=p+1}^{n} s_i x_i
          = Σ_{i=1}^{p} (f_i/υ_i) s_i x_i + Σ_{i=p+1}^{n} s_i x_i.

Here f_i = 0, i = 1 : p − r, and f_i = γ_i²/(γ_i² + λ²), i = p − r + 1 : p. This yields

J(x_FILT(λ)) ∼ χ²(m − n + r, c);

notice that the degrees of freedom are reduced.
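A minimal sketch of the filter factors quoted above for a rank-deficient A: the p − r zero generalized singular values receive f_i = 0, and the remaining r receive the usual Tikhonov factor γ_i²/(γ_i² + λ²). The values of p, r, lam and gamma are illustrative, not from a particular problem.

```python
import numpy as np

# Tikhonov filter factors for a rank-deficient problem: zero generalized
# singular values are filtered out entirely (f_i = 0); the rest are damped
# smoothly, with f_i -> 1 as gamma_i grows past lambda.
p, r = 8, 5
lam = 0.1
gamma = np.concatenate((np.zeros(p - r),          # rank-deficient part
                        np.logspace(-2, 1, r)))   # nonzero gen. sing. values

f = np.zeros(p)
f[p - r:] = gamma[p - r:] ** 2 / (gamma[p - r:] ** 2 + lam ** 2)
# With only r effective directions, J(x_FILT) has m - n + r degrees of freedom.
```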
