bayesian analysis of the effects of lower precision
TRANSCRIPT
Bayesian Analysis of the Effects of Lower Precision Arithmetic in Inverse Problems
Daniela Calvetti
based on work with D Devathi and E Somersalo
Case Western Reserve University, Department of Mathematics, Applied Mathematics and Statistics
SIAM CSE 2019
Spokane, February 27, 2019
Daniela Calvetti (CWRU) Bayesian Inverse Problems Spokane, February 27, 2019 1 / 23
Introduction
Inverse Problems: Estimate a variable x ∈ Rn from noisy indirect observations,
b = f (x) + ε, f : Rn → Rm forward map.
Bayesian Formulation: x, b, and ε are realizations of random variables,
X ∼ πx, E ∼ πε, B ∼ πb|x(· | x).
Posterior distribution = solution of the inverse problem. Bayes’ formula:
πx|b ∝ πx(x)πb|x(b | x) = πx(x)πε(b − f (x)).
Introduction
Exploration of the posterior by sampling: Generate a representative sample
S = {x(1), x(2), . . . , x(N)}, x(j) ∼ πx|b(· | b),
using, e.g., Markov chain Monte Carlo (MCMC). A sample provides means to estimate the posterior mean and covariance (UQ):
x̄ ≈ (1/N) ∑j=1^N x(j),   C ≈ (1/N) ∑j=1^N (x(j) − x̄)(x(j) − x̄)T.
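These estimators are a couple of lines of NumPy; a minimal sketch with a synthetic stand-in for the MCMC sample (all names here are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for an MCMC sample: N draws x^(j) in R^n, stored row-wise.
N, n = 5000, 2
sample = rng.standard_normal((N, n))

# Posterior mean estimate: (1/N) * sum_j x^(j)
x_mean = sample.mean(axis=0)

# Posterior covariance estimate: (1/N) * sum_j (x^(j) - x_mean)(x^(j) - x_mean)^T
dev = sample - x_mean
C = (dev.T @ dev) / N
```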
Observations:
Sample generation can be time consuming for a costly forward model.
Lower precision arithmetic (LPA) can reduce the time for sample generation.
It is not immediately clear how LPA affects the UQ estimates.
We could use UQ tools to explore the effect of LPA.
Sampling from the posterior
Model problem: Consider an IP with linear forward map,
b = Ax + ε, A ∈ Rm×n,
Gaussian prior and Gaussian likelihood,
πx ∼ N (0, Γ), πε ∼ N (0,Σ).
Then the posterior is Gaussian,
πx|b ∼ N(x̄, C),
where
C = (ATΣ−1A + Γ−1)−1 = P−1, x̄ = CATΣ−1b,
with P = ATΣ−1A + Γ−1 the posterior precision matrix.
Sampling from the posterior
If a symmetric factorization of the posterior precision is available
P = KTK,
one can use the following sampling scheme. Let
x(j) = x̄ + y(j),
where y(j) solves
Ky(j) = w(j), w(j) ∼ N(0, In).
The approach may be of little use if
n is very large,
the matrix A is not known explicitly.
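When P is available and small enough to factor, the scheme is a few lines. A minimal sketch with a synthetic SPD precision matrix and a hypothetical posterior mean (not the talk's test problem), taking K = Lᵀ from the Cholesky factorization P = LLᵀ:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic SPD precision, standing in for A^T Sigma^{-1} A + Gamma^{-1}.
n = 4
M = rng.standard_normal((n, n))
P = M @ M.T + n * np.eye(n)

# Symmetric factorization P = K^T K: cholesky gives P = L L^T, take K = L^T.
L = np.linalg.cholesky(P)
K = L.T

xbar = rng.standard_normal(n)   # hypothetical posterior mean

# Draw samples x^(j) = xbar + y^(j), where K y^(j) = w^(j), w^(j) ~ N(0, I).
N = 20000
W = rng.standard_normal((n, N))
Y = np.linalg.solve(K, W)       # K is triangular, so this solve is cheap
X = xbar[:, None] + Y
```

The empirical covariance of the columns of X approximates C = P⁻¹, which is what makes the factorization-based sampler exact in the Gaussian case.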
Whitening
Consider an alternative sampler for large and matrix-free problems
requiring only evaluations of products with A or AT: (A, x) ↦ Ax,
reducible to solving a few least squares problems.
Assume that
Γ−1 = LTL, Σ−1 = STS.
Whitening (Mahalanobis transformation):
Ã = SAL−1, b̃ = Sb.
Whitened model: If z = Lx, we have
b̃ = Ãz + ε̃,
where
πz ∼ N(0, In), ε̃ ∼ N(0, Im).
In what follows, the tildes are dropped and A, b, ε denote the whitened quantities.
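The transformation is easy to sketch with dense factors (a toy stand-in for the actual problem; L and S are obtained here from Cholesky factors of the inverse covariances):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy problem b = A x + eps with prior N(0, Gamma) and noise N(0, Sigma).
m, n = 3, 5
A = rng.standard_normal((m, n))
Gamma = np.diag(rng.uniform(0.5, 2.0, size=n))   # prior covariance (SPD)
Sigma = 0.1 * np.eye(m)                          # noise covariance

# Factor the precisions as Gamma^{-1} = L^T L, Sigma^{-1} = S^T S:
# cholesky(M) returns lower-triangular R with M = R R^T, so transpose.
L = np.linalg.cholesky(np.linalg.inv(Gamma)).T
S = np.linalg.cholesky(np.linalg.inv(Sigma)).T

# Whitening: A_w = S A L^{-1}, b_w = S b; then z = L x has prior N(0, I_n).
x = rng.standard_normal(n)
b = A @ x + 0.1 * rng.standard_normal(m)
A_w = S @ A @ np.linalg.inv(L)
b_w = S @ b
```

After the transformation both the prior on z and the whitened noise have identity covariance, which is what makes the posterior precision take the simple form AᵀA + I on the next slide.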
Effective sampler
Precision and covariance of the whitened problem:
P = ATA + In, C = P−1
Lemma: Let z be a random variable solving the equation
Pz = ATb + w, (1)
where b ∈ Rm is an observed realization of the whitened data, and w is a Gaussian random variable,
w ∼ N(0, P).
Then
z ∼ N(z̄, C), z̄ = CATb,
which is the posterior distribution of the whitened problem.
Effective sampler
Sampling:
1 Generate wk ∼ N (0, P),
2 Solve Pzk = ATb + wk ,
3 Solve Lxk = zk .
Step 1: Generation of wk is straightforward:
wk = ATνk + ηk , ηk ∼ N (0, In), νk ∼ N (0, Im) independent.
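Steps 1 and 2 can be checked on a small dense whitened problem (synthetic data; in the matrix-free setting the solve in step 2 would be iterative, and step 3 would additionally apply L⁻¹):

```python
import numpy as np

rng = np.random.default_rng(3)

# Small dense stand-ins for the whitened forward map and data.
m, n = 6, 4
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

P = A.T @ A + np.eye(n)                 # whitened posterior precision
zbar = np.linalg.solve(P, A.T @ b)      # whitened posterior mean

def draw_z():
    # Step 1: w ~ N(0, P) without factorizing P:
    # w = A^T nu + eta with nu ~ N(0, I_m), eta ~ N(0, I_n) independent,
    # so Cov(w) = A^T A + I = P.
    nu = rng.standard_normal(m)
    eta = rng.standard_normal(n)
    w = A.T @ nu + eta
    # Step 2: solve P z = A^T b + w.
    return np.linalg.solve(P, A.T @ b + w)

Z = np.stack([draw_z() for _ in range(20000)])
```

Since z = P⁻¹(Aᵀb + w) with Cov(w) = P, the draws have mean z̄ and covariance P⁻¹ P P⁻¹ = P⁻¹, exactly the whitened posterior.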
Effective sampler
Step 2: The computation of zk reduces to solving a few least squares problems. In fact,
Pzk = ATb + wk
= ATb + ATνk + ηk ,
1 Write
ηk = ATδk + hk ∈ R(AT) ⊕ N(A),
where δk satisfies
AATδk = Aηk. (2)
2 Express
zk = zk1 + hk,
where zk1 satisfies
(ATA + I)zk1 = AT(b + νk + δk). (3)
Effective sampler
Because (2) and (3) are the normal equations for the least squares problems
ATδk = ηk , (4)
and
[A; In] zk1 = [b + νk + δk; 0]. (5)
δk and zk1 can be computed by Krylov subspace iterative solvers (e.g., CGLS), requiring only
(A, z) 7→ Az , (AT, y) 7→ ATy . (6)
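Under these assumptions the whole sampler needs only the two matvecs in (6). A sketch with SciPy's LSQR standing in for CGLS (a small dense matrix hides behind a LinearOperator; all names are illustrative):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

rng = np.random.default_rng(4)

# The sampler sees A only through the two matvecs in (6).
m, n = 6, 4
A_dense = rng.standard_normal((m, n))          # stand-in forward map
A_op = LinearOperator((m, n), matvec=lambda v: A_dense @ v,
                      rmatvec=lambda v: A_dense.T @ v)
AT_op = LinearOperator((n, m), matvec=lambda v: A_dense.T @ v,
                       rmatvec=lambda v: A_dense @ v)
b = rng.standard_normal(m)

def draw_z(nu, eta):
    # (4): delta = least squares solution of A^T delta = eta.
    delta = lsqr(AT_op, eta, atol=1e-14, btol=1e-14, iter_lim=200)[0]
    # The residual h = eta - A^T delta lies in N(A).
    h = eta - AT_op.matvec(delta)
    # (5): damped least squares min ||y - A z||^2 + ||z||^2, y = b + nu + delta.
    z1 = lsqr(A_op, b + nu + delta, damp=1.0,
              atol=1e-14, btol=1e-14, iter_lim=200)[0]
    return z1 + h

nu = rng.standard_normal(m)
eta = rng.standard_normal(n)
z = draw_z(nu, eta)
```

Since Ah = 0, we have Pz = P zk1 + h = Aᵀ(b + ν + δ) + η − Aᵀδ = Aᵀb + w, so the draw agrees with the direct solve of the sampling equation.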
Fast Sampling via CGLS
An alternative to iteratively solving (5) to full convergence is to approximate its solution with the iterate xk0, where
xk = argmin{‖y − Ax‖ | x ∈ Kk(A, y)},
y = b + νk + δk, and the stopping index k0 is chosen so that
G(xk0) = min{G(xk) | k = 1, 2, . . .},
with
G(x) = ‖y − Ax‖2 + ‖x‖2.
Calvetti D, Devathi D and Somersalo E (2019) Posterior Sampling with Priorconditioned CGLS for Underdetermined Ill-posed Problems. Manuscript.
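A crude way to mimic this stopping rule with off-the-shelf LSQR is to restart at increasing iteration caps and track G; the paper's CGLS implementation would instead evaluate G along the iterates of a single run. A sketch on a synthetic underdetermined system (names illustrative):

```python
import numpy as np
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(5)

# Underdetermined toy system standing in for the whitened problem.
m, n = 40, 60
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = rng.standard_normal(m)          # stands in for b + nu_k + delta_k

def G(x):
    # The damped objective from the slide: G(x) = ||y - A x||^2 + ||x||^2.
    return np.linalg.norm(y - A @ x) ** 2 + np.linalg.norm(x) ** 2

# Evaluate G along the (undamped) LSQR iterates and keep the minimizer.
iterates = [lsqr(A, y, iter_lim=k, atol=0.0, btol=0.0)[0]
            for k in range(1, 21)]
vals = [G(x) for x in iterates]
k0 = int(np.argmin(vals))
x_k0 = iterates[k0]
```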
Sampling with Lower Precision Arithmetic
Consider two possible scenarios:
1 Matrix-vector products (6) involving A are computed in LPA;
2 Matrix-vector products involving A and the Krylov least squares solvers are implemented in LPA.
Example: Deconvolution problem
Consider a deconvolution problem with an Airy kernel and Gaussian noise.
[Figure: two panels over [0, 1] showing the Airy convolution kernel, with the FWHM indicated in each panel.]
Sparse data
[Figure: posterior envelopes over [0, 1] for four cases: reference, double precision, single precision MatVec A, and all single precision.]
Normality: QQ-plots
[Figure: QQ-plots for the double precision and single precision samples.]
Note: The sample corresponding to ’all single’ is not Gaussian at all!
Eigenvalues of covariance matrices
[Figure: eigenvalues of the sample covariance matrices on a log scale, compared against the reference: reference vs. double, reference vs. single, and reference vs. single vs. all single.]
Scatter analysis
Given two samples
S1 = {x1^1, . . . , x1^N} and S2 = {x2^1, . . . , x2^N},
find a few orthogonal directions, v1, . . . , vk ∈ Rn, along which the difference in the spread of the samples is maximized.
Scatter analysis
Sample covariance matrices:
Ck = (1/N) ∑j=1^N (xk^j − x̄k)(xk^j − x̄k)T, k = 1, 2,
where
x̄k = (1/N) ∑j=1^N xk^j.
Problem: Find q ∈ Rn maximizing (or minimizing)
H(q) = (qTC1q) / (qTC2q).
Scatter analysis
Since
H(αq) = H(q) for all α ≠ 0,
it is possible to scale q so that
qTC2q = 1,
yielding the constrained optimization problem:
q = argmax{qTC1q} subject to qTC2q = 1.
Equivalently, we may solve the generalized eigenvalue problem:
C1q = λC2q.
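With SciPy, the generalized symmetric eigenproblem is a single call; a sketch on synthetic samples whose spreads differ most along the first coordinate (names illustrative):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(6)

# Two synthetic samples; S1 has three times the spread of S2 along e1.
N, n = 5000, 3
S1 = rng.standard_normal((N, n)) * np.array([3.0, 1.0, 1.0])
S2 = rng.standard_normal((N, n))
C1 = np.cov(S1.T)
C2 = np.cov(S2.T)

# Generalized symmetric eigenproblem C1 q = lambda C2 q; eigh returns
# eigenvalues in ascending order, eigenvectors scaled so q^T C2 q = 1.
lam, Q = eigh(C1, C2)
q_max = Q[:, -1]     # maximizer of H(q); the minimizer would be Q[:, 0]
```

Here the leading generalized eigenvalue recovers the squared spread ratio (about 9) and the maximizing direction aligns with e1.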
Scatter analysis
We remark that:
1 All generalized eigenvalues are real;
2 The eigenvectors are not orthogonal;
3 The maximizer is the eigenvector associated with the largest generalized eigenvalue.
To analyze the samples, we project them onto the subspaces determined by the generalized eigenvectors corresponding to the k largest eigenvalues.
Scatter analysis: Reference vs Double
The reference sample is the one computed in the canonical way, via the Cholesky factorization of the posterior precision, i.e., C2 = Cref.
[Figure: scatter plot of the double precision and reference samples projected onto the leading generalized eigendirections.]
Scatter analysis: Reference vs Single
[Figure: scatter plot of the single precision and reference samples projected onto the leading generalized eigendirections.]