Motivation Statistical Results for Least Squares Newton algorithm Large Scale Problems Stopping the EM Algorithm Statistically Edge Detection for PSF Estimation
Towards Solution of Large Scale Image Restoration and Reconstruction Problems
Rosemary Renaut
Joint work with Anne Gelb, Aditya Viswanathan, Hongbin Guo, Doug Cochran, Youzuo Lin, Arizona State University
Jodi Mead, Boise State University
November 4, 2009
National Science Foundation: Division of Computational Mathematics 1 / 49
Outline
1 Motivation: Quick Review
2 Statistical Results for Least Squares: Summary of LS Statistical Results; Implications of Statistical Results for Regularized Least Squares
3 Newton Algorithm: Algorithm with LSQR (Paige and Saunders); Results
4 Large Scale Problems: Application in Image Reconstruction and Restoration
5 Stopping the EM Algorithm Statistically
6 Edge Detection for PSF Estimation
Signal/Image Restoration:
Integral model of signal degradation: b(t) = ∫ K(t, s) x(s) ds
K(t, s) describes the blur of the signal.
Convolutional model: for an invariant kernel, K(t, s) = K(t − s) is the Point Spread Function (PSF).
Typically sampling includes noise e(t); the model is
b(t) = ∫ K(t − s) x(s) ds + e(t)
Discrete model: given discrete samples b, find samples x of x(s). Let A discretize K, assumed known; the model is
b = Ax + e.
Naïvely invert the system to find x!
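A minimal 1-D sketch of why naïve inversion fails (the blur width, noise level, and grid here are invented for the demo): a Gaussian blur matrix is only mildly smoothing visually, yet numerically ill-conditioned, so A⁻¹b amplifies even 0.1% noise into a useless solution.

```python
import numpy as np

np.random.seed(0)
n = 64
t = np.linspace(-1, 1, n)

# Discretized Gaussian PSF: entry (i, j) holds K(t_i - s_j), rows normalized
A = np.exp(-((t[:, None] - t[None, :]) ** 2) / (2 * 0.05**2))
A /= A.sum(axis=1, keepdims=True)

x = (np.abs(t) < 0.5).astype(float)        # true signal: a box
b = A @ x + 1e-3 * np.random.randn(n)      # b = Ax + e, 0.1% noise

x_naive = np.linalg.solve(A, b)            # naive inversion of b = Ax

rel_err = np.linalg.norm(x_naive - x) / np.linalg.norm(x)
print(f"cond(A) = {np.linalg.cond(A):.1e}, relative error = {rel_err:.1e}")
```

Despite the tiny noise, the relative error of the naïve solution is orders of magnitude above the noise level, which is exactly why the regularized solutions on the next slides are needed.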
Example 1-D Original and Blurred Noisy Signal
Original signal x; blurred and noisy signal b (Gaussian PSF).
The Solution: Regularization is needed
Naïve solution vs. a regularized solution.
Least Squares for Ax = b: A Quick Review
Background
Consider discrete systems: A ∈ ℝ^{m×n}, b ∈ ℝ^m, x ∈ ℝ^n, with
b = Ax + e.
Classical Approach: Linear Least Squares (A full rank)
xLS = argmin_x ‖Ax − b‖₂²
Difficulty: xLS is sensitive to changes in the right-hand side b when A is ill-conditioned.
For convolutional models the system is ill-posed.
Introduce Regularization to Find Acceptable Solution
Weighted Fidelity with Regularization
• Regularize:
xRLS(λ) = argmin_x { ‖b − Ax‖²_{Wb} + λ² R(x) },
with weighting matrix Wb.
• R(x) is a regularization term.
• λ is a regularization parameter, which is unknown.
The solution xRLS(λ) depends on λ, on the regularization operator R, and on the weighting matrix Wb.
The Weighting Matrix:
Some Assumptions for Multiple Data Measurements
Given multiple measurements of data b:
Usually there is error in b: e is an m-vector of random measurement errors with mean 0 and positive definite covariance matrix Cb = E(eeᵀ).
For uncorrelated heteroskedastic measurements, Cb is the diagonal matrix of the error variances (colored noise).
For white noise Cb = σ²I.
Weighting by Wb = Cb⁻¹ in the data-fit term, theoretically ẽ = Wb^{1/2} e is uncorrelated.
Difficulty if Wb increases the ill-conditioning of A!
For images, find Wb from the image data.
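A small numerical check of the whitening claim (the standard deviations are invented for the demo): weighting draws of e by Wb^{1/2} = Cb^{-1/2} yields errors with identity covariance.

```python
import numpy as np

np.random.seed(1)
m, nsamp = 5, 200000
stds = np.array([0.5, 1.0, 2.0, 4.0, 8.0])    # heteroskedastic, uncorrelated errors
Cb = np.diag(stds**2)                          # diagonal covariance: entries are variances
Wb_half = np.diag(1.0 / stds)                  # Wb^{1/2} = Cb^{-1/2}

E = stds[:, None] * np.random.randn(m, nsamp)  # many draws of the error vector e
Ew = Wb_half @ E                               # whitened errors Wb^{1/2} e

C_emp = Ew @ Ew.T / nsamp                      # empirical covariance of whitened errors
print(np.round(C_emp, 2))                      # close to the identity matrix
```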
Formulation: Generalized Tikhonov Regularization With Weighting
Use R(x) = ‖D(x − x0)‖²:
x̂ = argmin J(x) = argmin { ‖Ax − b‖²_{Wb} + λ² ‖D(x − x0)‖² }. (1)
D is a suitable operator, often a derivative approximation.
Assume N(A) ∩ N(D) = {0}. x0 is a reference solution, often x0 = 0; it may need to be an average solution.
Having found λ, the posterior inverse covariance matrix is
Wx = AᵀWbA + λ²I
Posterior information can give some confidence on the parameter estimates.
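One way to compute x̂ in (1) is to stack the fidelity and regularization blocks into a single least squares problem; a sketch with invented sizes, Wb = I, and a first-difference D:

```python
import numpy as np

np.random.seed(2)
m, n = 40, 30
A = np.random.randn(m, n)
D = np.eye(n - 1, n) - np.eye(n - 1, n, k=1)     # first-difference operator, p x n
x_true = np.cumsum(np.random.randn(n)) * 0.1     # a smooth-ish signal
b = A @ x_true + 0.01 * np.random.randn(m)
x0 = np.zeros(n)                                 # reference solution
lam = 1.0                                        # Wb = I (white noise) for simplicity

# Stack the blocks: min_y || [A; lam D] y - [b - A x0; 0] ||^2, then x = x0 + y
K = np.vstack([A, lam * D])
rhs = np.concatenate([b - A @ x0, np.zeros(n - 1)])
x_rls = x0 + np.linalg.lstsq(K, rhs, rcond=None)[0]

# Check the regularized normal equations A^T(Ax - b) + lam^2 D^T D (x - x0) = 0
res = A.T @ (A @ x_rls - b) + lam**2 * D.T @ D @ (x_rls - x0)
print(np.linalg.norm(res))                       # ~ 0 (machine precision)
```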
Choice of λ is crucial: an example with D = I .
Choice of λ is crucial: different algorithms give different solutions.
Discrepancy Principle
Suppose the noise is white: Cb = σ_b² I.
Find λ such that the regularized residual satisfies
σ_b² = (1/m) ‖b − A x(λ)‖₂².
Can be implemented by a Newton root finding algorithm.
But discrepancy principle typically oversmooths.
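A sketch of the discrepancy principle on a synthetic SVD-based problem (spectrum, noise level, and bracket are invented): for brevity the root of (1/m)‖b − Ax(λ)‖² − σ_b² = 0 is found by bisection rather than Newton, using the fact that the residual increases monotonically with λ.

```python
import numpy as np

np.random.seed(3)
m = n = 50
U, _ = np.linalg.qr(np.random.randn(m, m))
V, _ = np.linalg.qr(np.random.randn(n, n))
s = 2.0 ** -np.arange(n, dtype=float)        # rapidly decaying singular values
A = (U * s) @ V.T
sigma_b = 1e-2
b = A @ (V @ np.random.randn(n)) + sigma_b * np.random.randn(m)
beta = U.T @ b

def residual2(lam):
    # Tikhonov residual (D = I) via SVD filter factors f_i = s_i^2/(s_i^2 + lam^2)
    f = s**2 / (s**2 + lam**2)
    return np.sum(((1 - f) * beta) ** 2)

# Bisect on log(lam): residual2 is monotone increasing in lam
lo, hi = 1e-12, 1e3
for _ in range(200):
    mid = np.sqrt(lo * hi)
    if residual2(mid) / m < sigma_b**2:
        lo = mid
    else:
        hi = mid
lam_dp = np.sqrt(lo * hi)
print(lam_dp, residual2(lam_dp) / m, sigma_b**2)
```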
Other methods [Vog02]:
L-curve
Generalized Cross Validation (GCV)
Unbiased Predictive Risk (UPRE)
Some standard approaches I: L-curve - Find the corner
Let r(λ) = (A(λ) − A) b, with influence matrix A(λ) = A(AᵀWbA + λ²DᵀD)⁻¹Aᵀ.
Plot
(log ‖Dx‖, log ‖r(λ)‖).
Trade off contributions.
Expensive - requires range of λ.
GSVD makes calculations efficient.
Not statistically based
Find corner
No corner
Generalized Cross-Validation (GCV)
Let A(λ) = A(AᵀWbA + λ²DᵀD)⁻¹Aᵀ.
Can pick Wb = I .
Minimize the GCV function
GCV(λ) = ‖b − A x(λ)‖²_{Wb} / [trace(I_m − A(λ))]²,
which estimates the predictive risk.
Expensive - requires range of λ.
GSVD makes calculations efficient.
Requires minimum
Multiple minima
Sometimes flat
Unbiased Predictive Risk Estimation (UPRE)
Minimize the expected value of the predictive risk: minimize the UPRE function
‖b − A x(λ)‖²_{Wb} + 2 trace(A(λ)) − m.
Expensive - requires range of λ.
GSVD makes calculations efficient.
Needs an estimate of the trace. A minimum is needed.
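Both GCV and UPRE are cheap once an SVD (or GSVD) is available, since trace(A(λ)) = Σ f_i for the Tikhonov filter factors f_i. A sketch on an invented SVD problem with D = I, Wb = I, and known noise level σ (the UPRE below carries the σ² scaling explicitly, ‖b − Ax(λ)‖² + 2σ² trace(A(λ)) − mσ², which reduces to the slide's expression for whitened data):

```python
import numpy as np

np.random.seed(4)
m = n = 60
U, _ = np.linalg.qr(np.random.randn(m, m))
V, _ = np.linalg.qr(np.random.randn(n, n))
s = 10.0 ** np.linspace(0, -8, n)             # ill-conditioned spectrum
A = (U * s) @ V.T
x_true = V @ np.random.randn(n)
sigma = 1e-2                                  # known noise level
b = A @ x_true + sigma * np.random.randn(m)
beta = U.T @ b

def filt(lam):
    return s**2 / (s**2 + lam**2)             # Tikhonov filter factors

def gcv(lam):
    f = filt(lam)
    return np.sum(((1 - f) * beta) ** 2) / (m - f.sum()) ** 2

def upre(lam):
    f = filt(lam)
    return np.sum(((1 - f) * beta) ** 2) + 2 * sigma**2 * f.sum() - m * sigma**2

lams = 10.0 ** np.linspace(-6, 1, 200)        # both need a range of lambda
lam_gcv = lams[np.argmin([gcv(l) for l in lams])]
lam_upre = lams[np.argmin([upre(l) for l in lams])]
print(lam_gcv, lam_upre)
```

Either selected λ gives a far better reconstruction than the essentially unregularized left end of the λ grid.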
Background: Statistics of the Least Squares Problem
Theorem (Rao73)
Let r be the rank of A and b ∼ N(Ax, σ_b² I) (errors in measurements are normally distributed with mean 0 and covariance σ_b² I); then
J = min_x ‖Ax − b‖² ∼ σ_b² χ²(m − r).
J follows a χ² distribution with m − r degrees of freedom: basically the Discrepancy Principle.
Corollary (Weighted Least Squares)
For b ∼ N(Ax, Cb) and Wb = Cb⁻¹,
J = min_x ‖Ax − b‖²_{Wb} ∼ χ²(m − r).
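The theorem is easy to check by Monte Carlo (dimensions and noise level invented): the minimum residual sum of squares has mean σ_b²(m − r), with r = n for a full-rank A.

```python
import numpy as np

np.random.seed(5)
m, n, trials = 30, 10, 4000
A = np.random.randn(m, n)            # full rank with probability 1, so r = n
x = np.random.randn(n)
sigma_b = 0.5

J = np.empty(trials)
for k in range(trials):
    b = A @ x + sigma_b * np.random.randn(m)
    xls = np.linalg.lstsq(A, b, rcond=None)[0]
    J[k] = np.sum((A @ xls - b) ** 2)

# Theorem: J ~ sigma_b^2 chi^2(m - r), so E(J) = sigma_b^2 (m - n) = 5 here
print(J.mean(), sigma_b**2 * (m - n))
```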
Extension: Statistics of the Regularized Least Squares Problem
Thm: χ² distribution of the regularized functional (Renaut/Mead 2008). (Note: weighting matrix on the regularization term.)
x̂ = argmin JD(x) = argmin { ‖Ax − b‖²_{Wb} + ‖x − x0‖²_{WD} }, WD = Dᵀ Wx D. (2)
Assume:
Wb and Wx are symmetric positive definite.
The problem is uniquely solvable: N(A) ∩ N(D) = {0}. The Moore-Penrose generalized inverse of WD is CD.
Statistics: errors in the right-hand side e ∼ N(0, Cb), and x0 is known so that x − x0 = f ∼ N(0, CD); x0 is the mean vector of the model parameters.
Then JD(x(WD)) ∼ χ²(m + p − n).
Significance of the χ2 result
JD ∼ χ²(m + p − n)
For sufficiently large m̃ = m + p − n,
E(J(x(WD))) = m̃ = m + p − n, Var(J(x(WD))) = 2(m + p − n).
Moreover
m̃ − √(2m̃) z_{α/2} < J(x(WD)) < m̃ + √(2m̃) z_{α/2}. (3)
z_{α/2} is the relevant z-value for a χ²-distribution with m̃ = m + p − n degrees of freedom.
General Result [MR09b], [RHM09], [MR09a]
The Cost Functional follows a χ2 Statistical Distribution
If x0 is not the mean value, then we introduce a non-central χ² distribution with centrality parameter c.
If the problem is rank deficient, the degrees of freedom are reduced.
Suppose degrees of freedom m̃ and centrality parameter c; then
E(JD) = m̃ + c, Var(JD) = 2m̃ + 4c.
Suggests: try to find WD so that E(J) = m̃ + c.
First find λ only:
Wx = λ²I.
What do we need to apply the Theory?
Requirements
Covariance Cb on the data parameters b (or on the model parameters x!).
A priori information x0, the mean x̄.
But x̄ (and hence x0) are not known.
If not known, use repeated data measurements to calculate Cb and the mean b̄.
Hence estimate the centrality parameter: E(b) = A E(x) implies b̄ = A x̄. Hence
c = ‖c‖₂² = ‖Q Uᵀ Wb^{1/2} (b̄ − A x0)‖₂²,
E(JD) = E(‖Q Uᵀ Wb^{1/2} (b − A x0)‖₂²) = m + p − n + ‖c‖₂².
Given the GSVD, estimate the degrees of freedom m̃.
Then we can use E(J) to find λ.
Assume x0 is the mean (experimentalists know something about model parameters)
Designing the Algorithm I
Recall: if Cb and Cx are good estimates of the covariances,
|JD(x̂) − (m + p − n)|
should be small. Thus, let m̃ = m + p − n; then we want
m̃ − √(2m̃) z_{α/2} < J(x(WD)) < m̃ + √(2m̃) z_{α/2}.
z_{α/2} is the relevant z-value for a χ²-distribution with m̃ degrees of freedom.
GOAL
Find Wx to make (3) tight. Single-variable case: find λ such that
JD(x(λ)) ≈ m̃.
A Newton line-search algorithm to find λ = 1/σ (basic algebra)
Newton to solve F(σ) = JD(σ) − m̃ = 0
We use σ = 1/λ, and y(σ⁽ᵏ⁾) is the current solution, for which
x(σ⁽ᵏ⁾) = y(σ⁽ᵏ⁾) + x0.
Then
∂J/∂σ = −(2/σ³) ‖D y(σ)‖² < 0.
Hence we have a basic Newton iteration
σ⁽ᵏ⁺¹⁾ = σ⁽ᵏ⁾ ( 1 + (1/2) (σ⁽ᵏ⁾/‖Dy‖)² (JD(σ⁽ᵏ⁾) − m̃) ).
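A sketch of this Newton iteration on an invented SVD test problem (D = I, x0 = 0, data whitened by σ_b so Wb = I, hence m̃ = m + p − n = m); a simple bracketing safeguard, not in the slides, keeps σ positive if a Newton step overshoots.

```python
import numpy as np

np.random.seed(6)
m = n = 50
U, _ = np.linalg.qr(np.random.randn(m, m))
V, _ = np.linalg.qr(np.random.randn(n, n))
s = 10.0 ** np.linspace(0, -5, n)
A = (U * s) @ V.T
sigma_b = 1e-2
b = A @ (V @ np.random.randn(n)) + sigma_b * np.random.randn(m)

sb, bb = s / sigma_b, (U.T @ b) / sigma_b      # whitened singular values and data
m_tilde = float(m)                              # degrees of freedom

def J_of(sigma):                                # J_D(sigma), lam = 1/sigma, via the SVD
    lam = 1.0 / sigma
    return float(np.sum(bb**2 * lam**2 / (sb**2 + lam**2)))

def y_norm2(sigma):                             # ||y(sigma)||^2 (D = I)
    lam = 1.0 / sigma
    return float(np.sum((sb * bb / (sb**2 + lam**2)) ** 2))

lo, hi = 1e-8, 1e8                              # J(lo) > m_tilde > J(hi): J decreases in sigma
sigma = 1.0
for k in range(200):
    F = J_of(sigma) - m_tilde
    if abs(F) < 1e-10 * m_tilde:
        break
    if F > 0:
        lo = sigma
    else:
        hi = sigma
    # Newton step sigma(1 + (1/2)(sigma^2/||y||^2) F); bisect if it leaves the bracket
    step = sigma * (1.0 + 0.5 * sigma**2 / y_norm2(sigma) * F)
    sigma = step if lo < step < hi else np.sqrt(lo * hi)

print(sigma, J_of(sigma), m_tilde)
```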
Practical Details of Algorithm: Large Scale problems
Algorithm
Initialization
Convert the generalized Tikhonov problem to standard form. (If L is not invertible you just need to know how to find Ax and Aᵀx, and the null space of L.)
Use the LSQR algorithm (Paige and Saunders) to find the bidiagonal matrix for the projected problem.
Obtain a solution of the bidiagonal problem for a given initial σ.
Subsequent Steps
Increase the dimension of the space if needed, reusing the existing bidiagonalization. May also use a smaller system if appropriate.
Each σ calculation of the algorithm reuses saved information from the Lanczos bidiagonalization.
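The Lanczos (Golub-Kahan) bidiagonalization that LSQR builds can be sketched directly (random test matrix; no reorthogonalization, which plain LSQR also omits): k steps give A V_k = U_{k+1} B_k with B_k lower bidiagonal, and every σ evaluation can then reuse the same small projected matrix B_k.

```python
import numpy as np

np.random.seed(7)
m, n, k = 100, 80, 10
A = np.random.randn(m, n)
b = np.random.randn(m)

U = np.zeros((m, k + 1)); V = np.zeros((n, k)); B = np.zeros((k + 1, k))
norm_b = np.linalg.norm(b)
U[:, 0] = b / norm_b
for j in range(k):
    # alpha_j v_j = A^T u_j - beta_j v_{j-1}
    v = A.T @ U[:, j] - (B[j, j - 1] * V[:, j - 1] if j > 0 else 0.0)
    alpha = np.linalg.norm(v); V[:, j] = v / alpha; B[j, j] = alpha
    # beta_{j+1} u_{j+1} = A v_j - alpha_j u_j
    u = A @ V[:, j] - alpha * U[:, j]
    beta = np.linalg.norm(u); U[:, j + 1] = u / beta; B[j + 1, j] = beta

# Projected problem: min || B y - ||b|| e_1 ||, giving the iterate x_k = V y
y = np.linalg.lstsq(B, norm_b * np.eye(k + 1)[:, 0], rcond=None)[0]
x_k = V @ y
print(np.linalg.norm(A @ V - U @ B))       # recurrence A V_k = U_{k+1} B_k holds, ~ 0
```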
Illustrating the Results for Problem Size 512: Two Standard Test Problems
Comparison for noise level 10%. On the left D = I; on the right D is a first-derivative operator.
Notice L-curve and χ2-LSQR perform well.
UPRE does not perform well.
Real Data: Seismic Signal Restoration
The Data Set and Goal
Real data set of 48 signals of length 3000.
The point spread function is derived from the signals.
Calculate the signal variance pointwise over all 48 signals.
Goal: restore the signal x from Ax = b, where A is the PSF matrix and b is a given blurred signal.
Method of comparison (no exact solution known): use convergence with respect to downsampling.
Comparison High Resolution White noise
Greater contrast with χ². UPRE is insufficiently regularized. The L-curve severely undersmooths (not shown). Parameters are not consistent across resolutions.
The UPRE Solution: x0 = 0, White Noise
Regularization parameters are consistent: σ = 0.01005 at all resolutions.
The LSQR Hybrid Solution: White Noise
Regularization is quite consistent, resolutions 2 to 10: σ = 0.0000029, 0.0000029, 0.0000029, 0.0000057, 0.0000057.
Illustrating the Deblurring Result: Problem Size 65536
Example taken from RestoreTools (Nagy et al., 2007-8): 15% noise.
Computational cost is minimal: projected problem size is 15, λ = 0.58.
Problem Grain, 15% noise added: validating against increasing subproblem size.
(a) Signal-to-noise ratio 10 log₁₀(1/e) for relative error e. (b) Regularization parameter against problem size.
Illustrating the progress of the Newton algorithm post LSQR
Illustrating the progress of the Newton algorithm with LSQR
Problem Grain, 15% noise added, for increasing subproblem size.
Figure: signal-to-noise ratio 10 log₁₀(1/e) for relative error e.
An Alternative Direction for Large Scale Problems: Domain Decomposition [Ren98]
Domain decomposition of x into several domains:
x = (x₁ᵀ, x₂ᵀ, …, x_pᵀ)ᵀ.
Corresponding to the splitting of the image x, the kernel operator A is split:
A = (A₁, A₂, …, A_p).
e.g., different splitting schemes.
Formulation - Regularized Least Squares [LRG09]
The linear system Ax ≈ b is replaced with the split systems
A_i y_i ≈ b_i(x), b_i(x) = b − Σ_{j≠i} A_j x_j = b − Ax + A_i x_i.
Locally solve Ax ≈ b:
min_{y_i ∈ ℝ^{n_i}} ‖A_i y_i − b_i(x)‖², 1 ≤ i ≤ p.
If the problem is ill-posed we have the regularized problem
min_x { ‖Ax − b‖₂² + λ²‖Dx‖₂² }.
Similarly, we have a splitting of the operator, assuming local regularization:
[A; ΛD] = ( [A₁; λ₁D₁], [A₂; λ₂D₂], …, [A_p; λ_pD_p] ).
Solve iteratively using novel updates for changing right-hand sides [Saa87], [CW97].
Update Scheme: the global solution is updated from the local solutions at step k to step k + 1:
x⁽ᵏ⁺¹⁾ = Σ_{i=1}^p τ_i⁽ᵏ⁺¹⁾ (x_local)_i⁽ᵏ⁺¹⁾,
where (x_local)_i⁽ᵏ⁺¹⁾ = ((x₁⁽ᵏ⁾)ᵀ, …, (x_{i−1}⁽ᵏ⁾)ᵀ, (y_i⁽ᵏ⁺¹⁾)ᵀ, (x_{i+1}⁽ᵏ⁾)ᵀ, …, (x_p⁽ᵏ⁾)ᵀ)ᵀ.
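A sketch of this multisplitting update with equal weights τ_i = 1/p and local regularization λ_i = λ, D_i = I (sizes and λ invented for the demo). A useful check: the fixed point of the sweep satisfies the global regularized normal equations, so plugging in the direct Tikhonov solution should reproduce it exactly.

```python
import numpy as np

np.random.seed(8)
m, n, p = 60, 40, 4
A = np.random.randn(m, n)
b = np.random.randn(m)
lam = 5.0
blocks = np.array_split(np.arange(n), p)           # split x into p column domains

def sweep(x):
    """One multisplitting step: p local regularized solves, combined with tau_i = 1/p."""
    x_new = x.copy()
    r = b - A @ x
    for blk in blocks:
        Ai = A[:, blk]
        # local solve: min_y ||Ai y - b_i(x)||^2 + lam^2 ||y||^2, b_i(x) = b - Ax + Ai x_i
        yi = np.linalg.solve(Ai.T @ Ai + lam**2 * np.eye(len(blk)),
                             Ai.T @ (r + Ai @ x[blk]))
        x_new[blk] = ((p - 1) * x[blk] + yi) / p    # convex combination of local solutions
    return x_new

x_star = np.linalg.solve(A.T @ A + lam**2 * np.eye(n), A.T @ b)  # global Tikhonov solution
x = np.zeros(n)
for _ in range(400):
    x = sweep(x)
print(np.linalg.norm(x - x_star) / np.linalg.norm(x_star))        # small: converged
```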
Feasibility of the Approach: 1-D Phillips, size 1024, noise level 6%, regularization 0.25.
(a) No splitting, relative error 0.0525. (b) 4 subproblems, relative error 0.0499.
2-D PET Reconstruction, size 64 × 64, noise level 1.5%, regularization 0.2.
(c) No splitting, SNR 11.73 dB. (d) 4 subproblems, SNR 12.24 dB.
A New Stopping Rule for the EM Method [GR09]
Quick Rationale (PET)
It is well known that the ML-EM method converges to overly noisy solutions; iterative methods have to stop before convergence [SV82], [HHL92], [HL94].
Detected counts in tube i are Poisson with mean b_i = Σ_j a_ij x_j. Hence the basic relationship is b ≈ Ax, where A is the projection matrix, b are the counts, and x is the density to be reconstructed.
EM iteration: x⁽ᵏ⁺¹⁾ = (Aᵀ(b ./ (Ax⁽ᵏ⁾))) .* x⁽ᵏ⁾.
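The EM update in array form (toy random projection matrix with unit column sums, an assumption that makes the update above exactly the standard ML-EM step; all sizes invented). Two well-known ML-EM properties serve as checks: iterates stay nonnegative, the total Σx matches the total counts Σb after one step, and the Poisson log-likelihood never decreases.

```python
import numpy as np

np.random.seed(9)
m, n = 100, 50
A = np.random.rand(m, n)
A /= A.sum(axis=0, keepdims=True)        # unit column sums: sum_i a_ij = 1
x_true = 10.0 * np.random.rand(n)
b = np.random.poisson(A @ x_true).astype(float)

def loglik(x):                            # Poisson log-likelihood (up to a constant)
    Ax = A @ x
    return float(np.sum(b * np.log(Ax) - Ax))

x = np.ones(n)                            # positive starting point
for k in range(200):
    x = x * (A.T @ (b / (A @ x)))        # x^{k+1} = (A^T (b ./ (A x^k))) .* x^k

x_next = x * (A.T @ (b / (A @ x)))       # one more step for the monotonicity check
print(loglik(x), loglik(x_next))
```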
(e) True (f) k = 95 (g) k = 500
A New Estimate of the Stopping Level
Algorithm: for k until converged, for m tubes:
Calculate a step of EM, x⁽ᵏ⁾.
Update the tube means b⁽ᵏ⁾ = Ax⁽ᵏ⁾. Bin all tubes to have b⁽ᵏ⁾ > 20.
Calculate y = (b − b⁽ᵏ⁾) ./ √(b⁽ᵏ⁾); then y ∼ N(0, 1).
Calculate the mean ȳ and sample standard deviation s for y_i, i = 1 : m.
Calculate α = √(m − 1) ȳ/s and p_t(α), α ∼ t(m − 1) (Student-t density with m − 1 degrees of freedom).
Calculate β = (m − 1)s² and p_N(β), β ∼ N(m − 1, √(2(m − 1))) (Gaussian density with mean m − 1).
Calculate the likelihood of sampling α and β from the two distributions: l(k) = p_t(α) p_N(β).
When l(k) passes its maximum, l(k) < l(k−1): STOP. The solution is x⁽ᵏ⁻¹⁾.
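A sketch of the statistic l(k) = p_t(α) p_N(β) in log form (scipy-free; the Student-t density is coded from its Gamma-function formula). The synthetic "tube" data are invented: when the current means b⁽ᵏ⁾ match the Poisson data, the statistic sits near its maximum, while mismatched means are heavily penalized.

```python
import numpy as np
from math import lgamma, log, pi, sqrt

def log_t_pdf(a, nu):        # log of the Student-t density with nu degrees of freedom
    return (lgamma((nu + 1) / 2) - lgamma(nu / 2)
            - 0.5 * log(nu * pi) - (nu + 1) / 2 * log(1 + a * a / nu))

def log_gauss_pdf(v, mu, sd):
    return -0.5 * log(2 * pi * sd * sd) - (v - mu) ** 2 / (2 * sd * sd)

def log_l(b, bk):
    """log of l = p_t(alpha) p_N(beta) for residuals y = (b - b^(k)) ./ sqrt(b^(k))."""
    y = (b - bk) / np.sqrt(bk)
    mt = len(y)
    ybar, s2 = y.mean(), y.var(ddof=1)
    alpha = sqrt(mt - 1) * ybar / sqrt(s2)
    beta = (mt - 1) * s2
    return (log_t_pdf(alpha, mt - 1)
            + log_gauss_pdf(beta, mt - 1, sqrt(2 * (mt - 1))))

np.random.seed(10)
means = np.full(500, 50.0)                      # true tube means, all > 20
b = np.random.poisson(means).astype(float)

ll_match = log_l(b, means)                      # b^(k) equal to the true means
ll_under = log_l(b, 0.5 * means)                # badly mismatched tube means
print(ll_match, ll_under)                       # matched means score far higher
```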
Simulations: Validation
Table: the best and the predicted stopping step for 11 simulations
Best: 95 90 89 90 90 95 90 92 89 94 91
Pred: 96 88 94 89 89 95 100 100 95 93 94
Figure: l(k) = pt(α)pN (β) for 500 steps. Maximized at k = 96.
Extension for Mammogram Denoising: Early Ideas
The Model
Assume blurring of the mammogram by a PSF kernel K; the measured image is b.
Deconvolve in the Fourier domain and invert to give a noisy estimate of the optical density d.
Each entry of d is a linear combination of x-ray energy with Poisson noise, and √d is close to normally distributed [ANS48].
To find the true optical density x, denoise the deconvolved d. Use total variation:
min_x Σ_{i=1}^m (√d_i − √x_i)² + λ²‖x‖_TV, s.t. x ≥ 0.
Given knowledge of the variance of the noise in the x-ray, automatically select λ using the statistical estimation approach.
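The square-root step is a variance-stabilizing transform [ANS48]: for Poisson data the raw variance equals the mean, while Var(√d) ≈ 1/4 independent of the mean, which is what makes the (√d_i − √x_i)² fidelity term statistically sensible. A quick check with invented count levels:

```python
import numpy as np

np.random.seed(11)
raw_vars, sqrt_vars = [], []
for mean in [10.0, 40.0, 160.0]:
    d = np.random.poisson(mean, 200000).astype(float)
    raw_vars.append(d.var())                 # Var(d) = mean: grows with the signal
    sqrt_vars.append(np.sqrt(d).var())       # Var(sqrt(d)) ~ 1/4, roughly constant

print(np.round(raw_vars, 2))
print(np.round(sqrt_vars, 3))                # all close to 0.25
```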
Trial Experiment: use the data set from the University of Florida (DDSM), cancer case 0001, left breast, CC scanning angle.
(a) Original Image (b) Restored Image
Figure: total yellow (calcification) reduced by deblurring. The rectangle at bottom right indicates deblurring.
PSF Estimation in Blurring Problems using Edge Detection (Cochran, Gelb, Viswanathan, Renaut, Stefan)
Given the blurring model (PSF convolution operator K) and x ∈ L²(−π, π) piecewise smooth, we estimate the PSF starting with 2N + 1 blurred Fourier coefficients b̂(j), j = −N, …, N:
b = K ∗ x + e
Principle: apply a linear edge detector, denoted by T. We shall assume that the edge detector can be written as a convolution with an appropriate kernel:
T ∗ (K ∗ x + e) = (K ∗ x + e) ∗ T = x ∗ K ∗ T + e ∗ T = (x ∗ T) ∗ K + e ∗ T ≈ [x] ∗ K + ẽ
Here [x](s) is a jump function. For a jump discontinuity in a function, the jump function at any point s depends only on the values of x at s⁺ and s⁻:
[x](s) := x(s⁺) − x(s⁻)
Hence, we observe shifted and scaled replicates of the PSF.
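A discrete sketch of the principle (grid, jump location, and blur width are invented, and the edge detector T here is plain differentiation, one admissible linear choice): differentiating the blurred signal yields [x](0) · K(t − 0), a scaled replicate of the PSF centered at the jump.

```python
import numpy as np

n = 512
t = np.linspace(-np.pi, np.pi, n, endpoint=False)
dt = t[1] - t[0]

x = np.where(t < 0, -1.0, 1.0)               # one jump at s0 = 0 with [x](0) = 2
# (as a periodic function x also has a second, negative jump at +/- pi)

# Area-normalized Gaussian PSF and periodic blur b = K * x via the FFT
K = np.exp(-t**2 / (2 * 0.1**2)); K /= K.sum() * dt
b = np.real(np.fft.ifft(np.fft.fft(np.fft.ifftshift(K)) * np.fft.fft(x))) * dt

# Edge detector T = d/dt: T*(K*x) = (T*x)*K ~ [x](0) K(t - 0) near the jump
db = np.gradient(b, dt)

mask = np.abs(t) < 1.0                       # window around the detected edge
peak = t[np.argmax(db)]                      # PSF replicate is centered at the jump
scale = dt * db[mask].sum()                  # its integral recovers the jump height 2
corr = np.corrcoef(db[mask], K[mask])[0, 1]  # and its shape matches the PSF
print(peak, scale, corr)
```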
Example (No Noise)
[Figure: (a) true function; (b) motion blur PSF; (c) blurred function; (d) after applying edge detection, comparing the blur after edge detection with the true blur (normalized). Function subjected to motion blur, N = 128.]
[Figure: (a) true function; (b) Gaussian blur PSF; (c) blurred function; (d) after applying edge detection, comparing the blur after edge detection with the true blur (normalized). Function subjected to Gaussian blur, N = 128.]
Representative Examples : Gaussian PSF
[Figure: (a) noisy blur estimation; (b) after low-pass filtering; each panel shows the function, the blurred noisy function, the blur after edge detection, and the true blur (normalized). Function subjected to Gaussian blur, N = 128.]
Complex noise distribution on the Fourier coefficients: e ∼ N(0, 1.5/(2N + 1)²).
The second picture is subjected to low-pass (Gaussian) filtering.
It is conceivable that parameter estimation for a Gaussian PSF can take into account the effect of Gaussian filtering.
Representative Examples: Motion Blur
[Figure: (a) noisy blur estimation; (b) after TV denoising; each panel shows the function, the blurred noisy function, the blur after edge detection, and the true blur (normalized). Function subjected to motion blur, N = 128.]
Cannot perform conventional low-pass filtering since the blur is piecewise smooth.
We compute the noisy blur estimate from the Fourier expansion of the blurred jump: S_N[b] ≈ [x] ∗ K + e.
Denoising problem formulation:
min_x ‖x − S_N[b]‖₂² + λ²‖Dx‖₁.
Future Work Combining Approaches
Extend the parameter selection methods to domain decomposition problems at large scale.
Use efficient schemes for large scale problems, e.g. right-hand-side updates.
Extend to edge detection approaches.
Use tensor products of the PSF for extension to 2-D: is it feasible?
Use parameter estimation techniques for the 2-D problem.
Further development of statistical techniques for estimating acceptable solutions.
Bibliography I
F. J. Anscombe. The transformation of Poisson, binomial and negative-binomial data. Biometrika, 35:246–254, 1948.
T. F. Chan and W. L. Wan. Analysis of projection methods for solving linear systems with multiple right-hand sides. 1997.
H. Guo and R. A. Renaut. Revisiting stopping rules for iterative methods used in emission tomography: analysis and developments. Physics in Medicine and Biology, submitted, 2009.
H. M. Hudson, B. F. Hutton, and R. Larkin. Accelerated EM reconstruction using ordered subsets. J. Nucl. Med., 33:960, 1992.
H. M. Hudson and R. Larkin. Accelerated image reconstruction using ordered subsets of projection data. IEEE Trans. Med. Imag., 13(4):601–609, 1994.
Bibliography II
Y. Lin, R. A. Renaut, and H. Guo. Multisplitting for regularized least squares. In preparation, 2009.
J. Mead and R. A. Renaut. Least squares problems with inequality constraints as quadratic constraints. Linear Algebra and its Applications, 2009.
J. Mead and R. A. Renaut. A Newton root-finding algorithm for estimating the regularization parameter for solving ill-conditioned least squares problems. Inverse Problems, 25, 2009.
R. A. Renaut. A parallel multisplitting solution of the least squares problem. BIT, 1998.
R. A. Renaut, I. Hnetynkova, and J. Mead. Regularization parameter estimation for large scale Tikhonov regularization using a priori information. Computational Statistics and Data Analysis, 54(1), 2009.
Bibliography III
Y. Saad. On the Lanczos method for solving symmetric linear systems with several right-hand sides. 1987.
L. A. Shepp and Y. Vardi. Maximum likelihood reconstruction for emission tomography. IEEE Trans. Med. Imag., MI-1(2):113-122, Oct. 1982.
Curtis R. Vogel. Computational Methods for Inverse Problems. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2002.
Future Work
Other Results and Future Work
Software Package!
Diagonal weighting schemes
Edge-preserving regularization - Total Variation
Better handling of colored noise
Residual periodogram for large scale problems
Algorithm Using the GSVD
GSVD
Use the GSVD of $[W_b^{1/2}A;\ D]$.
For $\gamma_i$ the generalized singular values, and $s = U^T W_b^{1/2} r$, let
$$\tilde m = m - n + p, \qquad \tilde s_i = s_i/(\gamma_i^2\sigma_x^2 + 1), \ i = 1, \dots, p, \qquad t_i = \tilde s_i \gamma_i.$$
Find the root of
$$F(\sigma_x) = \sum_{i=1}^{p} \frac{s_i^2}{\gamma_i^2\sigma_x^2 + 1} + \sum_{i=n+1}^{m} s_i^2 - \tilde m = 0$$
Equivalently: solve $F = 0$, where
$$F(\sigma_x) = \tilde s^T s - \tilde m \quad \text{and} \quad F'(\sigma_x) = -2\sigma_x\|t\|_2^2.$$
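As a numerical sketch (not the authors' implementation; the function and array names are illustrative), the Newton iteration on $F(\sigma_x)$ can be written directly in terms of the GSVD quantities $\gamma_i$ and $s = U^T W_b^{1/2} r$:

```python
import numpy as np

def chi2_newton(gamma, s, m, n, p, sigma=1.0, tol=1e-10, maxit=100):
    """Newton root-finding for F(sigma) = 0 using GSVD quantities.
    gamma : generalized singular values (length p)
    s     : U^T W_b^{1/2} r (length m)
    """
    m_tilde = m - n + p                  # degrees of freedom
    tail = np.sum(s[n:]**2)              # components n+1..m are unfiltered
    for _ in range(maxit):
        d = gamma**2 * sigma**2 + 1.0
        F = np.sum(s[:p]**2 / d) + tail - m_tilde
        t = gamma * s[:p] / d            # t_i = gamma_i * s~_i
        dF = -2.0 * sigma * np.sum(t**2)
        if abs(F) < tol:
            break
        step = F / dF
        while sigma - step <= 0:         # keep sigma positive
            step *= 0.5
        sigma -= step
    return sigma
```

The step is halved whenever the Newton update would leave the positive axis, since $\sigma_x$ must remain positive; the practical safeguards of the next slide (bracketing and line search) address the same issue more carefully.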
Practical Details of Algorithm: Find the parameter
Step 1: Bracket the root by logarithmic search on $\sigma$ to handle the asymptotes: yields $\sigma_{\max}$ and $\sigma_{\min}$.
Step 2: Calculate the step, with steepness controlled by $\mathrm{tol}_D$. Let $t = Dy/\sigma^{(k)}$, where $y$ is the current update; then
$$\mathrm{step} = \frac{1}{2}\left(\frac{1}{\max\{\|t\|,\ \mathrm{tol}_D\}}\right)^2 \left(J_D(\sigma^{(k)}) - \tilde m\right)$$
Step 3: Introduce a line search $\alpha^{(k)}$ in Newton:
$$\sigma_{\mathrm{new}} = \sigma^{(k)}(1 + \alpha^{(k)}\,\mathrm{step})$$
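The logarithmic bracketing of Step 1 can be sketched as follows (a minimal version, assuming F is a monotone decreasing root function as above; names and the growth factor are illustrative):

```python
def bracket_root(F, sigma0=1.0, factor=10.0, maxit=50):
    """Bracket the root of a monotone decreasing F by logarithmic search:
    grow or shrink sigma by a constant factor until F changes sign."""
    lo = hi = sigma0
    if F(sigma0) > 0:                 # root lies to the right
        for _ in range(maxit):
            hi *= factor
            if F(hi) < 0:
                return lo, hi
            lo = hi
    else:                             # root lies to the left
        for _ in range(maxit):
            lo /= factor
            if F(lo) > 0:
                return lo, hi
            hi = lo
    raise RuntimeError("no sign change found")
```

The returned interval $[\sigma_{\min}, \sigma_{\max}]$ then safeguards the Newton iteration against the asymptotes of $F$.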
Key Aspects of the Proof I: The Functional J
Algebraic Simplifications: Rewrite the functional as a quadratic form.
The regularized solution is given in terms of the regularization matrix $R(W_D)$:
$$x = x_0 + (A^TW_bA + D^TW_xD)^{-1}A^TW_b r, \quad r = b - Ax_0 \qquad (4)$$
$$\phantom{x} = x_0 + R(W_D)W_b^{1/2}r = x_0 + y(W_D) \qquad (5)$$
$$R(W_D) = (A^TW_bA + D^TW_xD)^{-1}A^TW_b^{1/2} \qquad (6)$$
The functional is given in terms of the influence matrix $A(W_D)$:
$$A(W_D) = W_b^{1/2}AR(W_D) \qquad (7)$$
$$J_D(x) = r^TW_b^{1/2}(I_m - A(W_D))W_b^{1/2}r, \quad \text{let } \tilde r = W_b^{1/2}r \qquad (8)$$
$$\phantom{J_D(x)} = \tilde r^T(I_m - A(W_D))\tilde r. \quad \text{A Quadratic Form} \qquad (9)$$
Key Aspects of the Proof II: Properties of a Quadratic Form
$\chi^2$ distribution of quadratic forms $x^TPx$ for normal variables (Fisher-Cochran Theorem):
Components $x_i$ are independent normal variables, $x_i \sim N(0,1)$, $i = 1:n$.
A necessary and sufficient condition that $x^TPx$ has a central $\chi^2$ distribution is that $P$ is idempotent, $P^2 = P$, in which case the degrees of freedom of the $\chi^2$ distribution are $\mathrm{rank}(P) = \mathrm{trace}(P)$.
When the means of the $x_i$ are $\mu_i \neq 0$, $x^TPx$ has a non-central $\chi^2$ distribution with non-centrality parameter $c = \mu^TP\mu$.
A $\chi^2$ random variable with $n$ degrees of freedom and non-centrality parameter $c$ has mean $n + c$ and variance $2(n + 2c)$.
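The theorem is easy to check numerically. A quick Monte Carlo sketch (sizes and the choice of projector are illustrative) using an orthogonal projector $P$, for which $x^TPx$ should behave as $\chi^2$ with $\mathrm{rank}(P) = \mathrm{trace}(P)$ degrees of freedom:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 10, 3
X = rng.standard_normal((n, k))
P = X @ np.linalg.inv(X.T @ X) @ X.T       # orthogonal projector, rank k
assert np.allclose(P @ P, P)               # idempotent: P^2 = P

samples = rng.standard_normal((20000, n))  # x_i ~ N(0, 1), independent
q = np.einsum('ij,jk,ik->i', samples, P, samples)  # x^T P x per sample

# chi^2(k): mean = k = trace(P), variance = 2k
print(q.mean(), np.trace(P))
```

The sample mean of $q$ matches $\mathrm{trace}(P)$ to within Monte Carlo error, as the theorem predicts.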
Key Aspects of the Proof III: Requires the GSVD
Lemma
Assume invertibility and $m \ge n \ge p$. There exist unitary matrices $U \in R^{m\times m}$, $V \in R^{p\times p}$, and a nonsingular matrix $X \in R^{n\times n}$ such that
$$A = U\begin{bmatrix}\Upsilon\\ 0_{(m-n)\times n}\end{bmatrix}X^T, \qquad D = V[M,\ 0_{p\times(n-p)}]X^T, \qquad (10)$$
$$\Upsilon = \mathrm{diag}(\upsilon_1,\dots,\upsilon_p,1,\dots,1) \in R^{n\times n}, \qquad M = \mathrm{diag}(\mu_1,\dots,\mu_p) \in R^{p\times p},$$
$$0 \le \upsilon_1 \le \dots \le \upsilon_p \le 1, \quad 1 \ge \mu_1 \ge \dots \ge \mu_p > 0, \quad \upsilon_i^2 + \mu_i^2 = 1, \ i = 1,\dots,p. \qquad (11)$$
The Functional with the GSVD: Let $Q = \mathrm{diag}(\mu_1,\dots,\mu_p, 0_{n-p}, I_{m-n})$;
then $J = \tilde r^T(I_m - A(W_D))\tilde r = \|QU^T\tilde r\|_2^2 = \|k\|_2^2$.
Proof IV: Statistical Distribution of the Weighted Residual
Covariance Structure
Errors in $b$ are $e \sim N(0, C_b)$. Since $b$ depends on $x$ through $b = Ax + e$, we can show $b \sim N(Ax_0, C_b + AC_DA^T)$ ($x_0$ is the mean of $x$).
Residual $r = b - Ax_0 \sim N(0, C_b + AC_DA^T)$.
$\tilde r = W_b^{1/2}r \sim N(0, I + \tilde A C_D \tilde A^T)$, $\tilde A = W_b^{1/2}A$.
Use the GSVD:
$$I + \tilde A C_D \tilde A^T = UQ^{-2}U^T, \qquad Q = \mathrm{diag}(\mu_1,\dots,\mu_p, I_{n-p}, I_{m-n})$$
Now let $k = QU^T\tilde r$; then $k \sim N(0, QU^T(UQ^{-2}U^T)UQ) \sim N(0, I_m)$.
But $J = \|QU^T\tilde r\|_2^2 = \|\tilde k\|_2^2$, where $\tilde k$ is the vector $k$ excluding components $p+1:n$. Thus
$$J_D \sim \chi^2(m + p - n).$$
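This distributional claim can also be verified by simulation. A minimal Monte Carlo sketch for the special case $D = I$ (so $p = n$ and the degrees of freedom are $m$), with unit noise covariance $W_b = I$ and illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, sigma_x = 50, 10, 2.0
A = rng.standard_normal((m, n))
x0 = rng.standard_normal(n)

J = []
for _ in range(3000):
    x_true = x0 + sigma_x * rng.standard_normal(n)   # x ~ N(x0, sigma_x^2 I)
    b = A @ x_true + rng.standard_normal(m)          # e ~ N(0, I), so W_b = I
    # Tikhonov solution with D = I, W_x = I / sigma_x^2
    xhat = np.linalg.solve(A.T @ A + np.eye(n) / sigma_x**2,
                           A.T @ b + x0 / sigma_x**2)
    # functional at the optimum
    J.append(np.sum((b - A @ xhat)**2) + np.sum((xhat - x0)**2) / sigma_x**2)

# J_D ~ chi^2(m + p - n) = chi^2(m): mean m, variance 2m
print(np.mean(J))   # close to m = 50
```

When the weighting matrices are the true inverse covariances, the sample mean of $J_D$ sits at the predicted $m + p - n$ degrees of freedom; this is exactly the property the Newton root-finding algorithm exploits to select the regularization parameter.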
When the mean of the parameters is not known, or $x_0 = 0$ is not the mean
Corollary: non-central $\chi^2$ distribution of the regularized functional. Recall
$$\hat x = \mathrm{argmin}\, J_D(x) = \mathrm{argmin}\{\|Ax - b\|^2_{W_b} + \|x - x_0\|^2_{W_D}\}, \qquad W_D = D^TW_xD.$$
Assume all assumptions as before, but $\bar x \neq x_0$ is the mean vector of the model parameters. Let
$$c = \|\mathbf{c}\|_2^2 = \|QU^TW_b^{1/2}A(\bar x - x_0)\|_2^2.$$
Then
$$J_D \sim \chi^2(m + p - n, c).$$
The functional at the optimum follows a non-central $\chi^2$ distribution.
A further result when A is not of full column rank
The Rank Deficient Solution: Suppose $A$ is not of full column rank. Then the filtered solution can be written in terms of the GSVD:
$$x_{\mathrm{FILT}}(\lambda) = \sum_{i=p+1-r}^{p} \frac{\gamma_i^2}{\upsilon_i(\gamma_i^2 + \lambda^2)}\, s_i x_i + \sum_{i=p+1}^{n} s_i x_i = \sum_{i=1}^{p} \frac{f_i}{\upsilon_i}\, s_i x_i + \sum_{i=p+1}^{n} s_i x_i.$$
Here $f_i = 0$, $i = 1:p-r$, and $f_i = \gamma_i^2/(\gamma_i^2 + \lambda^2)$, $i = p-r+1:p$. This yields
$$J(x_{\mathrm{FILT}}(\lambda)) \sim \chi^2(m - n + r, c);$$
notice the degrees of freedom are reduced.
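A small sketch of forming the filter factors $f_i$ for the rank-deficient case (illustrative names; $\gamma$, $\lambda$ and the numerical rank $r$ are assumed given):

```python
import numpy as np

def filter_factors(gamma, lam, r):
    """f_i = 0 for i = 1..p-r (truncated directions),
    f_i = gamma_i^2 / (gamma_i^2 + lambda^2) otherwise."""
    p = gamma.size
    f = np.zeros(p)
    f[p - r:] = gamma[p - r:]**2 / (gamma[p - r:]**2 + lam**2)
    return f
```

Only the $r$ retained directions contribute, which is why the degrees of freedom in the $\chi^2$ result drop from $m - n + p$ to $m - n + r$.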