Preconditioners for Structured Matrices Arising in
Subsurface Object Detection
Misha Kilmer† Eric Miller† Carey Rappaport†
Center for Electromagnetics Research
Electrical and Computer Engineering Department
Northeastern University
Boston, MA 02115
email: [email protected], [email protected], [email protected]
KEYWORDS: preconditioner, Helmholtz equation, iterative methods, QMR, sine transform,
scattering, parallel
AMS Subject Classifications: 65F10, 65F15, 65N22, 78-08, 78A45, 78A40
†This work was supported by the Army Research Office Demining MURI under Grant DAAG55-97-1-0013
RUNNING HEAD: Preconditioners for Matrices in Object Detection
CONTACT INFORMATION:
name: Dr. Misha E. Kilmer
mailing:
235 Forsyth Bldg.
Northeastern University
Boston, MA 02115
fax: 617-373-8627
email: [email protected]
Abstract
In this paper we focus on computationally efficient methods for solving the 2-D Helmholtz-type
equation with piecewise constant, complex wavenumber. To approximate the solution to
this equation, we replace the infinite boundaries with a perfectly matched layer (PML) and
discretize the continuous problem in space using finite differences. The corresponding matrix
is structured, large, sparse, and complex, and is neither Hermitian nor symmetric. We examine
preconditioners for use with iterative Krylov subspace methods in solving the corresponding
discrete systems. Our preconditioners are nearly Toeplitz block approximations to the original
matrices. The preconditioners can be applied efficiently, and in parallel, at each iteration using
1-D sine transforms and band solves. We show that the eigenvalues of the preconditioned
matrix can be computed once the eigenvalues of certain 1-D matrices are known; therefore we
are able to compute eigenvalues for rather large 2-D problems. We present results illustrating
the effectiveness of the preconditioners under various conditions.
1 Introduction
In this work we are interested in developing an accurate and highly efficient forward scattering
model; that is, a computational code which determines observed scattered fields from a hypothesized
distribution of subsurface, or underground, scatterers. Ultimately, one would like to use the forward
scattering data generated in this way to design inversion algorithms aimed at detecting and localizing
buried objects, such as landmines.
The mathematical formulation of the forward scattering problem of interest here is given by
a Helmholtz-type equation, derived from the time harmonic form of Maxwell's equations [13, pp.
6, 12]. In particular, if we assume a 2-D problem with the subsurface illuminated by a plane wave
impinging at an angle from the normal, then it is simple to show that the partial differential
equation describing the scattering problem is given by
$$(\nabla^2 + k^2(x,y))\,e(x,y) = m(x,y)\,e_0(x,y) \tag{1}$$
where $k^2(x,y) = \omega^2\mu_0\epsilon(x,y)$ is the square of the wavenumber, with $\omega$ representing angular frequency
and $\mu_0$ a constant denoting the magnetic permeability. Here, $e(x,y)$ denotes the scattered electric
field and $e_0(x,y)$ the field in the absence of a scatterer, which is known. The function $\epsilon(x,y)$,
called the electrical permittivity, is related to the electrical properties of the media while $m(x,y)$
describes the properties of the buried object and has support only over the region in which the
object is located. In our problem, we will assume that the wavenumber is piecewise constant; that
is,
$$k(x,y) = \begin{cases} k_0, & (x,y) \in \text{air} \\ k_s, & (x,y) \in \text{soil} \\ k_m, & (x,y) \in \text{subsurface object.} \end{cases}$$
Under this notation, $m(x,y) = (k_s^2 - k_m^2)$ if $(x,y)$ is in the object and 0 otherwise.
The computational problem of interest in this paper is the stable and rapid solution of a discrete
version of (1). Although there are also integral equation approaches to solving the forward scattering
problem, the approach of solving the finite difference discretized partial differential equation (PDE)
is more appealing to us for several reasons, not the least of which is its flexibility. For instance, we can
readily incorporate a rough air-soil interface, volume inhomogeneities, and complex target geometry
into the model, and we have the added ability to easily and effectively deal with a space-varying
wavenumber. Further, the PDE is easily discretized with finite differences, and the forward solver,
namely the preconditioned iterative method described herein, is straightforward to understand,
reasonably easy to implement, and fast.
Now (1) represents a scattering problem over an infinite domain, but we are interested in the
solution over the rectangular region $\Omega = [-a_1, a_1] \times [-b_1, b_1]$ for some predefined real values $a_1, b_1 > 0$.
Thus, we begin by replacing the infinite boundaries with a perfectly matched layer (PML) [1, 11]
whose mathematical formulation we describe in §2. We discretize the continuous problem in space
using finite differences. This leads to a matrix equation of the form
$$Af = g$$
where the matrix $A$ is an $n \times n$, sparse, complex, structured matrix that is neither symmetric nor
Hermitian. The entries in $A$ and the right hand side $g$ depend upon frequency, and both $A$ and $g$ are
sparse. The values of $g$ also depend on the incident angle.
Typically, the matrix has some eigenvalues clustered around 0 whose real parts are both positive
and negative. Hence, direct solution methods like Gaussian elimination will require partial pivoting
for stability. Even without pivoting, the number of flops and the amount of storage needed to
compute a solution directly can be prohibitive. In our applications, $n$ will be quite large. However,
since the matrix is structured and sparse, the system is a good candidate for solution via iterative
methods. Due to the eigenstructure of the matrix, an effective preconditioner (that is, one which
can be applied at each iteration for about the same cost as a matrix-vector product and which
clusters eigenvalues away from zero) must be used.
In this paper, we examine preconditioners for use with iterative Krylov subspace methods for
solving the resulting linear systems. Preconditioning techniques derived from fast direct methods
for the single medium case with Sommerfeld-like boundary condition have been explored for 2-D [4]
and 3-D [3] cases. Our preconditioner is somewhat similar to a domain decomposition based
preconditioning idea in [4]. However, the PML boundary condition we employ here is significantly different
from the Sommerfeld-like boundary condition assumed in [3, 4] and requires special treatment. Our
preconditioner approximates $A$ with a matrix that can be treated by fast direct methods, but unlike
most of the approaches in [3, 4], our development does not require that we replace the PML with
another boundary condition. Also, since $A$ is neither complex symmetric nor Hermitian, neither is
our preconditioner. A further difference from the methods in [3, 4] is that we consider piecewise
constant wavenumber.
Despite the additional difficulties presented by incorporating the PML and piecewise constant
wavenumber into our model, we are able to use fast direct methods to apply our preconditioner by
exploiting the special structure of $A$. The preconditioner is a nearly Toeplitz block approximation
to the original matrix and can be applied efficiently and in parallel at each iteration using 1-D sine
transforms and band solves. Moreover, we take a similar approach to that in [2] and show that the
eigenvalues of the 2-D preconditioned matrix can be obtained by solving a few 1-D eigenproblems. In
this way, we are able to develop intuition on the clustering of the eigenvalues of our preconditioned
matrix, and in turn, on the rate of convergence of the iterative method.
Our paper is organized as follows. In §2, we discuss the formulation and structure of the matrix.
In §3 we show how this structure can be exploited to form preconditioners which can be applied
efficiently using fast sine transforms and 1-D solves. The eigenanalysis of our preconditioned matrix
is given in §4 and we present numerical results in §5. Conclusions and future work are the subject
of §6.
2 Formulation and Structure of A
In this section, we describe how the matrix is formed using finite differences and how a particular
ordering of the unknowns gives rise to a matrix with special structure. We begin by introducing the
PML boundary conditions.
2.1 PML
As mentioned in the introduction, we are interested in the solution to (1) on a rectangular region
$\Omega = [-a_1, a_1] \times [-b_1, b_1]$. The PML is an absorbing boundary condition in the sense that inside
the PML, waves are forced to attenuate outward from $\partial\Omega$ with a minimum amount of
reflection from the $\Omega$-PML interface, so that the solution to the finite dimensional problem resembles
the solution to the original infinite space problem in the region $\Omega$.
In order to mathematically define the PML, we introduce some notation. The complex-valued
quantity $\epsilon$ is
$$\epsilon(x,y) = \epsilon_0\epsilon_{rel}(x,y) + i\,\frac{\sigma(x,y)}{\omega},$$
for some real $\sigma \ge 0$, $\epsilon_{rel} \ge 1$, with $i = \sqrt{-1}$ and $\epsilon_0$ a constant (the permittivity of free space). The
value $\sigma$ is the conductivity of the material. In air, $\sigma = 0$, while in soil, $\epsilon$ is usually complex valued.
High values of $\sigma$ mean that the wave is rapidly attenuated in that medium.
The basic idea behind the perfectly matched layer approach is to place an artificial layer around the
computational grid in which the conductivity is strictly increasing from $\partial\Omega$ outward to the edge of
the PML (see Figure 1 for an illustration). By gradually increasing $\sigma$ away from the inner boundary
$\partial\Omega$, the wave is forced to attenuate with a minimum amount of artificial reflection off the $\Omega$-PML
interface.
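Formally, the equation being solved inside the PML is as follows [11]:
$$c(x)\frac{\partial}{\partial x}\left(c(x)\frac{\partial e}{\partial x}\right) + c(y)\frac{\partial}{\partial y}\left(c(y)\frac{\partial e}{\partial y}\right) + k^2(x,y)\,e = 0 \tag{2}$$
where $k$ takes the value of the wavenumber of the medium which it surrounds and
$$c(t) = \begin{cases} \dfrac{\epsilon_0\omega}{\epsilon_0\omega + i\sigma(t)}, & t \text{ is the normal direction} \\ 1, & \text{otherwise.} \end{cases}$$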
The key to the success of the PML at attenuating the wave while introducing nominal reflection is
a prudent choice of $\sigma(t)$. We take
$$\sigma(t) = f\left(\frac{|t| - \ell}{hN}\right)^{p},$$
where $\ell$ is $a_1$ or $b_1$, depending on whether the normal direction $t$ is $x$ or $y$, $N$ is the number of
desired layers in the PML, $h$ is the mesh width, and $f$ and $p$ are user defined parameters. The
interested reader is referred to [10] for appropriate choices of these parameters. Note that the width
of the PML in meters, say $\delta$, is defined in terms of $h$ and $N$; that is, $\delta = hN$. Likewise, we have a
different function $c(t)$ depending on our choice of $h$ and $N$.
Because the wave has theoretically attenuated in the PML, the final outer boundary condition
is a Dirichlet condition:
$$e = 0 \ \text{ on } \partial\Omega_\delta,$$
where $\Omega_\delta$ is the rectangular region $\Omega_\delta = (-a_1 - \delta, a_1 + \delta) \times (-b_1 - \delta, b_1 + \delta)$.
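As a rough illustration of the conductivity profile and the coefficient $c(t)$ defined above, the following Python sketch evaluates both on the grid lines of one PML strip. The function names `pml_sigma` and `pml_c`, the offset argument `edge` (standing for $a_1$ or $b_1$), and all numerical parameter values are assumptions made for this example only; appropriate choices of $f$ and $p$ are discussed in [10].

```python
import numpy as np

EPS0 = 8.854e-12  # permittivity of free space in F/m


def pml_sigma(t, edge, h, N, f, p):
    """Conductivity profile f * ((|t| - edge) / (h N))^p, set to zero outside the PML."""
    depth = np.maximum(np.abs(np.asarray(t, dtype=float)) - edge, 0.0)
    return f * (depth / (h * N)) ** p


def pml_c(t, edge, h, N, f, p, omega):
    """Coefficient c(t) = eps0*omega / (eps0*omega + i*sigma(t)) in the normal direction."""
    sigma = pml_sigma(t, edge, h, N, f, p)
    return EPS0 * omega / (EPS0 * omega + 1j * sigma)


# Hypothetical parameter values, chosen only for illustration.
a1, h, N = 1.0, 0.0245, 8
f, p = 0.05, 2.0
omega = 2 * np.pi * 480e6

# Grid lines from the inner boundary (|t| = a1) to the outer edge (|t| = a1 + h N).
layers = a1 + h * np.arange(0, N + 1)
sigma_vals = pml_sigma(layers, a1, h, N, f, p)
c_vals = pml_c(layers, a1, h, N, f, p, omega)
```

In this sketch the conductivity is zero at the inner boundary and grows monotonically to $f$ at the outer edge, so $|c(t)|$ drops below 1 only inside the layer, while $c \equiv 1$ in the interior.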
2.2 Discretization
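The problem that we would like to discretize is
$$\begin{aligned}
(\nabla^2 + k^2(x,y))\,e(x,y) &= m(x,y)\,e_0(x,y), & (x,y) &\in \Omega \\
c^2(x)\frac{\partial^2 e}{\partial x^2} + c^2(y)\frac{\partial^2 e}{\partial y^2} + c(x)\frac{\partial c}{\partial x}\frac{\partial e}{\partial x} + c(y)\frac{\partial c}{\partial y}\frac{\partial e}{\partial y} + k^2(x,y)\,e &= 0, & (x,y) &\in \Omega_\delta \setminus \Omega \\
e &= 0, & (x,y) &\in \partial\Omega_\delta.
\end{aligned} \tag{3}$$
Here $\Omega_\delta$ denotes the region $\Omega$ together with the surrounding PML. We illustrate this in Figure 1.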
In the interior, we use the standard 5-point difference operator on a uniform mesh with grid
spacing $h$ in both the $x$ and $y$ directions. Over the PML, we use standard centered differences
for both the first and second order derivatives of $e$ in the $x$ and $y$ directions, and the coefficients
$c'(x), c(x), c'(y), c(y)$ are evaluated at the corresponding midpoints in the $x$ and $y$ directions,
respectively.
To obtain the problem in matrix form, we will order the unknowns lexicographically, one row
(left to right) at a time, bottom to top (refer to Figure 1). For ease of exposition, we will take
$\Omega$ (hence $\Omega_\delta$, the region including the PML) to be a square so that the number of grid points in
both the $x$ and $y$ directions is the same and equal to $m_w$ on $\Omega_\delta$. Hence, $n = m_w^2$. We will adjust
the PML width $\delta$ so that the number of grid points across the width of the PML is $N$. Note that if
$n_1$ denotes the number of grid points in each direction in $\Omega$, then $m_w = n_1 + 2N$. We will let $n_a$ and
$n_s$ denote the number of grid points in the $y$-direction in the air and in the soil, respectively, so that
for square $\Omega$ we have $n_a + n_s = n_1$. Finally, we use $n_r$ to denote the number of grid points on the
object in the $y$ direction and $n_c$ to denote the number of grid points on the object in the $x$ direction.
Therefore, the total number of unknowns corresponding to the object is $n_r n_c$. Throughout the
remainder of the paper, we assume $n_r, n_c \ll n_1$; such is the case, for instance, in buried landmine
detection. We summarize this notation in Table I.
Discretized in this way, we observe that $A$ can be written as a tensor sum:
$$A = ((I \otimes A_1 + H \otimes I) + \gamma E)$$
where $\gamma = (k_m^2 - k_s^2)$, $A_1, H$ are tridiagonal matrices we will describe shortly, $I$ denotes the identity
matrix of size $m_w$, and $E$ is a rank $n_r n_c$ diagonal matrix which contains a 1 in a given position on the
diagonal if that position corresponds to an unknown in the buried object and a 0 otherwise.
Now let $T$ denote the $n_1 \times n_1$ tridiagonal Toeplitz matrix $\mathrm{tridiag}[1, -2, 1]$, and let $e_j$ denote the
$j$th unit vector. We will use $P_r$ to denote the $r \times r$ matrix with ones on the main anti-diagonal and
zeros otherwise; notice that $P_r = P_r^T$. Let $x_i = a_1 + ih$, $i = 1:N$, and denote $c_i = c(x_i)$, $c'_i = c'(x_i)$.
Finally, set $s_i = c_i^2 - 0.5\,h c_i c'_i$ and $\hat{s}_i = c_i^2 + 0.5\,h c_i c'_i$. Then we can write the $m_w \times m_w$ matrix $A_1$ as
$$A_1 = \begin{bmatrix} A_{1,1} & s_1 R_1^T & 0 \\ R_1 & T & P_{n_1} R_1 P_N \\ 0 & s_1 P_N R_1^T P_{n_1} & P_N A_{1,1} P_N \end{bmatrix}$$
where the $n_1 \times N$ matrix $R_1$ is
$$R_1 = [0, 0, \ldots, 0, e_1]$$
and the $N \times N$ matrix $A_{1,1}$ is
$$A_{1,1} = \begin{bmatrix} -2c_N^2 & s_N & & & 0 \\ \hat{s}_{N-1} & -2c_{N-1}^2 & s_{N-1} & & \\ & \ddots & \ddots & \ddots & \\ & & \hat{s}_2 & -2c_2^2 & s_2 \\ 0 & & & \hat{s}_1 & -2c_1^2 \end{bmatrix}.$$
Observe that the entries of $P_N A_{1,1} P_N$ are just the entries of $A_{1,1}$ in reverse order.
Further, the matrix $H$ is given by
$$H = A_1 + D$$
where $D$ is a diagonal matrix whose first $N + n_s$ entries are $k_s^2 h^2$ and whose last $N + n_a$ entries are
$k_a^2 h^2$.
Note that because of the PML, the matrix will not be symmetric or Hermitian.
3 The Preconditioner
Observe that if $N = 0$, $A_1$ would be a symmetric, tridiagonal Toeplitz matrix. In that case, the
system could be solved directly in an efficient manner by using fast 1-D sine transforms to decouple
the system followed by solving $m_w$ 1-D tridiagonal systems. However, appropriate values of $N$ are
greater than or equal to 8 for the types of problems we are solving [10], so $A$ does not quite have
Toeplitz blocks. Fortunately, the near-Toeplitz structure of the blocks can be exploited to develop
a preconditioner, described below, which can be applied efficiently and in parallel at each iteration.
Initially, we tried various preconditioners derived algebraically by replacing $A_1$ with approximations
which could be diagonalized by fast orthogonal transforms; for example, the tridiagonal matrix
obtained by averaging along the diagonals of $A_1$ is symmetric and Toeplitz and can be diagonalized
by a fast discrete sine transform (DST). Unfortunately, we found that these types of preconditioners
were not nearly as effective as the one we now describe.
Our idea is similar to the domain decomposition approach described in Section 3.3.1 of [4] for
the single-medium Helmholtz problem, but it varies in that we must deal with the PML rather than
the Sommerfeld-like boundary as well as with a varying wavenumber $k$.
The key is to isolate the symmetric, Toeplitz block portion of the matrix so that we may use
fast transforms to decouple the system further. To do this we will reorder the matrix so that the
unknowns in the right and left portion of the PML are ordered last. Under this reordering scheme,
$A$ will have the following form:
$$A = \begin{bmatrix} \mathcal{T} + \gamma\hat{E} & B \\ s_1 B^T & G \end{bmatrix}$$
where $\gamma = (k_m^2 - k_s^2)$ as before, $\mathcal{T} = (I_{m_w} \otimes T + H \otimes I_{n_1})$, and $T = \mathrm{tridiag}(1, -2, 1)$. Also,
$$G = \begin{bmatrix} G_1 & 0 \\ 0 & G_2 \end{bmatrix}, \quad G_1 = I_{m_w} \otimes A_{1,1} + H \otimes I_N, \quad G_2 = I_{m_w} \otimes P_N A_{1,1} P_N + H \otimes I_N$$
and
$$B = [I_{m_w} \otimes R_1,\ I_{m_w} \otimes R_2], \quad \text{with } R_2 = P_{n_1} R_1 P_N.$$
Note that $\hat{E}$ is just the reordered version of $E$ with the last $2Nm_w$ zero rows truncated.
Thus $B$ is a rank $2m_w$ matrix that contains the connections between unknowns in the right and
left PML and the unknowns on the left and right boundary of $\Omega$.
We will be interested in solving the following right preconditioned system:
$$AM^{-1}y = g, \quad \text{with } y = Mf.$$
We choose to precondition on the right, rather than the left, simply because the residual for the right
preconditioned system is the same as the residual for the unpreconditioned system. We define our
preconditioner as
$$M = \begin{bmatrix} \mathcal{T} & B \\ 0 & G \end{bmatrix}.$$
Hence, solving systems involving the preconditioner can be accomplished according to the following
algorithm:
Algorithm 1: Solving $Mv = z$
1. Partition $v$ and $z$ each into two subvectors with lengths $m_w n_1$ and $2Nm_w$,
respectively:
$v = [v^{(1)}, v^{(2)}]^T$, $z = [z^{(1)}, z^{(2)}]^T$.
2. Solve $G v^{(2)} = z^{(2)}$.
3. Solve $\mathcal{T} v^{(1)} = z^{(1)} - B v^{(2)} \equiv y$.
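The steps of Algorithm 1 amount to block back-substitution with a block upper triangular matrix. A minimal dense NumPy sketch is given below; it uses small random stand-ins for the blocks (the names `T_blk`, `G`, `B` and the sizes are hypothetical) and generic dense solves, whereas the method described in this paper uses band solves for step 2 and sine transforms with tridiagonal solves for step 3.

```python
import numpy as np


def solve_with_M(T_blk, B, G, z):
    """Solve M v = z for M = [[T_blk, B], [0, G]] by block back-substitution."""
    n_top = T_blk.shape[0]
    z1, z2 = z[:n_top], z[n_top:]        # step 1: partition the right-hand side
    v2 = np.linalg.solve(G, z2)          # step 2: solve G v2 = z2
    y = z1 - B @ v2                      # form y = z1 - B v2
    v1 = np.linalg.solve(T_blk, y)       # step 3: solve T v1 = y
    return np.concatenate([v1, v2])


rng = np.random.default_rng(0)
n_top, n_bot = 6, 4   # toy block sizes, not tied to any grid
T_blk = 5 * np.eye(n_top) + 0.3 * rng.standard_normal((n_top, n_top))
G = 5 * np.eye(n_bot) + 0.3j * rng.standard_normal((n_bot, n_bot))
B = rng.standard_normal((n_top, n_bot))
z = rng.standard_normal(n_top + n_bot) + 1j * rng.standard_normal(n_top + n_bot)

M = np.block([[T_blk, B], [np.zeros((n_bot, n_top)), G]])
v = solve_with_M(T_blk, B, G, z)
residual = np.linalg.norm(M @ v - z)
```

Because the lower-left block of $M$ is zero, the second subvector can be computed first and independently, which is what makes the two solves in steps 2 and 3 decouple.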
Note that $T = SDS$ where $S$ is the normalized discrete sine transform matrix of size $n_1$,
$$S_{kj} = \sqrt{\frac{2}{n_1 + 1}}\,\sin\left(\frac{kj\pi}{n_1 + 1}\right),$$
and $D$ is a diagonal matrix with known entries
$$d_j = -2 + 2\cos(j\pi/(n_1 + 1)), \quad j = 1, \ldots, n_1.$$
Define $Q$ to be the permutation matrix of size $m_w n_1$ which reorders the unknowns as
$1, m_w + 1, 2m_w + 1, \ldots, (n_1 - 1)m_w + 1, 2, m_w + 2, \ldots, (n_1 - 1)m_w + 2, \ldots, n_1 m_w$.
Using the structure of $\mathcal{T}$, step 3 is therefore equivalent to solving the $n_1$ tridiagonal problems
$$(d_j I + H)\,\tilde{v}^{(1)}_j = \tilde{y}_j, \quad j = 1, \ldots, n_1,$$
where $\tilde{y} = Q(I \otimes S)y$, $v^{(1)} = (I \otimes S)Q^T \tilde{v}^{(1)}$, and the subscript on $\tilde{v}^{(1)}$ and $\tilde{y}$ denotes the $j$th subvector
of the respective vector when it is sequentially partitioned into $n_1$ subvectors of length $m_w$.
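To make the decoupling concrete, the sketch below solves $(I_{m_w} \otimes T + H \otimes I_{n_1})v = y$ by applying the sine transform along each grid row and then solving one system with $d_j I + H$ per transformed column. It is an illustration under simplifying assumptions: $S$ is formed explicitly as a dense matrix rather than applied with a fast transform, the stand-in for $H$ is a Toeplitz matrix plus a complex diagonal (no PML entries), the diagonal values are hypothetical, and dense solves replace the tridiagonal solves.

```python
import numpy as np


def second_difference(n):
    """The n x n matrix tridiag(1, -2, 1)."""
    return -2 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)


def sine_matrix(n):
    """Normalized discrete sine transform matrix S (symmetric, S @ S = I)."""
    k = np.arange(1, n + 1)
    return np.sqrt(2.0 / (n + 1)) * np.sin(np.outer(k, k) * np.pi / (n + 1))


def solve_tensor_sum(H, n1, y):
    """Solve (I_mw kron T + H kron I_n1) v = y, with T = tridiag(1, -2, 1) of size n1."""
    mw = H.shape[0]
    S = sine_matrix(n1)
    d = -2.0 + 2.0 * np.cos(np.arange(1, n1 + 1) * np.pi / (n1 + 1))
    Y = y.reshape(mw, n1) @ S                 # sine transform of every grid row
    V = np.empty_like(Y, dtype=complex)
    for j in range(n1):                       # independent systems, one per d_j
        V[:, j] = np.linalg.solve(d[j] * np.eye(mw) + H, Y[:, j])
    return (V @ S).reshape(-1)                # transform back


mw, n1 = 7, 5                                 # toy sizes
# Hypothetical values of k^2 h^2 for two layers (positive imaginary parts).
k2h2 = np.where(np.arange(mw) < 4, 0.30 + 0.05j, 0.10 + 0.01j)
H = second_difference(mw) + np.diag(k2h2)

rng = np.random.default_rng(0)
y = rng.standard_normal(mw * n1) + 1j * rng.standard_normal(mw * n1)
K = np.kron(np.eye(mw), second_difference(n1)) + np.kron(H, np.eye(n1))
v = solve_tensor_sum(H, n1, y)
```

The loop over `j` has no data dependence between iterations, which is the source of the parallelism mentioned in the text.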
Implemented in this way, the cost of solving a system with $M$ is the cost of solving two 1-D
problems in step 2, each of size $Nm_w$ and each having bandwidth $N$ (solving with $G$ means solving
two uncoupled systems involving $G_1$ and $G_2$ in step 2 above), in addition to the cost of $m_w$
1-D DSTs followed by $n_1$ tridiagonal solves of size $m_w$, for a total of $O(m_w n_1 \lg(n_1) + Nm_w)$ operations. To
this end, we can prefactor $G_1$ (hence, by definition, $G_2$) and the tridiagonal matrices $d_j I + H$ so that
the solution of the 1-D systems only requires forward and backward substitutions at each iteration.
Observe that our preconditioner $M$ and the matrix $A$ differ by a matrix of rank $2m_w + n_r n_c \ll n$.
Note also that since our matrix was neither Hermitian nor symmetric, the preconditioner need not
have that structure.
4 Eigenanalysis
The iterative method which we employ in our examples is the coupled two-term recurrence version
of QMR (quasi-minimal residual) [7] without look-ahead. Freund and Nachtigal [6] give the following
error bound on the $k$th residual when QMR is applied to the system $AM^{-1}y = g$:
Theorem 1 ([7]) Let $H_m$ be the $m \times m$ matrix generated by the unsymmetric Lanczos algorithm
after $m$ steps, and assume that $H_m$ is diagonalizable. Then for $k = 1, 2, \ldots, m - 1$ the residual
vectors of the QMR algorithm satisfy
$$\|r_k\|_2 \le \|r_0\|_2\,\kappa(H_k)\,\varepsilon_k\sqrt{k + 1} \tag{4}$$
where
$$\varepsilon_k = \min_{p(0)=1}\ \max_{\lambda \in \lambda(AM^{-1})} |p(\lambda)|$$
and $\lambda(Z)$ denotes the set of eigenvalues of a matrix $Z$, $\kappa(\cdot)$ denotes the condition number with respect
to the 2-norm, and $p$ is a polynomial of degree $k$.
In particular, since our preconditioner and matrix differ by a matrix of rank $2m_w + n_r n_c < n$, it
is easy to show that at least $n - 2m_w - n_r n_c$ eigenvalues of the preconditioned matrix are identically
one and therefore the theorem says that in exact arithmetic, QMR must terminate after at most
$2m_w + n_r n_c$ iterations. Therefore, to understand what is happening in the first few iterations, we
must focus on characterizing the non-unit eigenvalues of the preconditioned matrix.
The approach we use in this section is based on that of [2], where the authors are interested
in the case when $A$ comes from the discretization of the Helmholtz problem with Sommerfeld-like
boundary condition in a single layer medium.
Before proceeding, let us set up the notation. We will use $I_j$ to denote the identity matrix of
size $j$. As before, we use $\lambda(\cdot)$ to denote the set of eigenvalues of the argument.
Theorem 2 Let $p = 2m_w + n_r n_c$ and note $p < n$. Assume $H$ and $A_{1,1}$ are diagonalizable. Then
$$\mathcal{T}^{-1}(\gamma\hat{E} - s_1 B G^{-1} B^T) = XW,$$
where $X$ and $W^T$ are matrices of size $n_1 m_w \times p$ and rank $p$.
Because the proof of Theorem 2 is constructive and somewhat tedious, we defer the proof and
the precise definition of $X$ and $W$ until §4.1 and turn to the main result of this section.
Theorem 3 The matrix $AM^{-1}$ has at least $n - p$ eigenvalues which are identically 1. Further, the
$p$ non-unit eigenvalues are given by $1 + \lambda(WX)$ where $X$ and $W$ are the matrices from Theorem 2.
Proof: By a similarity transform, the eigenvalues of $AM^{-1}$ are the eigenvalues of $M^{-1}A$. Now
it can readily be checked that the matrix $M^{-1}$ is given by
$$M^{-1} = \begin{bmatrix} \mathcal{T}^{-1} & -\mathcal{T}^{-1} B G^{-1} \\ 0 & G^{-1} \end{bmatrix}.$$
Therefore
$$M^{-1}A = \begin{bmatrix} I_{n_1 m_w} + \mathcal{T}^{-1}(\gamma\hat{E} - s_1 B G^{-1} B^T) & 0 \\ s_1 G^{-1} B^T & I_{2Nm_w} \end{bmatrix}.$$
Since $M^{-1}A$ is block lower triangular, it follows that the eigenvalues of $M^{-1}A$ are the union of the
eigenvalues of the blocks on the diagonal [8, Lemma 7.1.1]. Therefore, the set of eigenvalues of $M^{-1}A$
must contain at least $2Nm_w$ ones plus the eigenvalues of $I_{n_1 m_w} + \mathcal{T}^{-1}(\gamma\hat{E} - s_1 B G^{-1} B^T)$. The
eigenvalues of the latter are just $1 + \lambda(\mathcal{T}^{-1}(\gamma\hat{E} - s_1 B G^{-1} B^T))$. But by Theorem 2,
$\lambda(\mathcal{T}^{-1}(\gamma\hat{E} - s_1 B G^{-1} B^T)) = \lambda(XW)$. Finally, by Lemma 1 of [2], the non-zero eigenvalues of $XW$ are
precisely the $p$ eigenvalues of $WX$. It follows that $M^{-1}A$ has $n - p$ unit eigenvalues and $p$ non-unit
eigenvalues given by $1 + \lambda(WX)$. $\Box$
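The reduction from an $n \times n$ eigenproblem to a $p \times p$ one can be checked numerically on a toy example. The sketch below builds small random stand-ins for the blocks of $A$ and $M$, forms low-rank factors $X$ and $W$ of $\mathcal{T}^{-1}(\gamma\hat{E} - s_1 B G^{-1} B^T)$ directly (not via the structured construction of §4.1), and compares the spectrum of $M^{-1}A$ with $1 + \lambda(WX)$. The block sizes, the scalars `gamma` and `s1`, the sign conventions, and the random blocks are all assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_top, n_bot, r = 12, 2, 3            # toy sizes; here p = r + n_bot
gamma, s1 = 0.7 + 0.2j, 0.9 - 0.1j    # hypothetical scalars

T_blk = 6 * np.eye(n_top) + 0.3 * rng.standard_normal((n_top, n_top))
G = 4 * np.eye(n_bot) + 0.3 * rng.standard_normal((n_bot, n_bot))
B = rng.standard_normal((n_top, n_bot))
Ec = np.eye(n_top)[:, 4:4 + r]        # columns selecting the "object" unknowns
E_hat = Ec @ Ec.T                     # diagonal 0/1 matrix of rank r

A = np.block([[T_blk + gamma * E_hat, B], [s1 * B.T, G]])
M = np.block([[T_blk, B], [np.zeros((n_bot, n_top)), G]])

# Low-rank factors with X @ W = T^{-1} (gamma * E_hat - s1 * B G^{-1} B^T).
X = np.linalg.solve(T_blk, np.hstack([gamma * Ec, -s1 * B]))
W = np.vstack([Ec.T, np.linalg.solve(G, B.T)])

full = np.linalg.eigvals(np.linalg.solve(M, A))   # all n eigenvalues of M^{-1} A
small = 1 + np.linalg.eigvals(W @ X)              # p eigenvalues from the small problem

n, p = n_top + n_bot, r + n_bot
num_unit = int(np.sum(np.abs(full - 1) < 1e-6))
max_gap = max(np.min(np.abs(full - mu)) for mu in small)
```

In this toy setting `num_unit` counts the eigenvalues of $M^{-1}A$ that equal one to within rounding, and `max_gap` measures how far each eigenvalue of the small problem is from the nearest eigenvalue of the full matrix.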
The import of Theorem 3 is that if $WX$ has a simple form, it becomes possible to compute
the eigenvalues of the fully 2-D preconditioned matrix in terms of a problem of size $p$, provided the
eigenvalues of the $N \times N$ matrix $A_{1,1}$ and the $m_w \times m_w$ matrix $H$ are known or can be computed.
As we show in the next section, $WX$ does indeed have a compact representation from which we
can compute those $p$ eigenvalues.
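4.1 Defining X and W
In order to define $X$ and $W$ we proceed in three steps. Let us assume that $H$ and $A_{1,1}$ are
diagonalizable with $H = U\Theta U^{-1}$ and $A_{1,1} = F\Phi F^{-1}$, where $\Theta$ and $\Phi$ are diagonal with entries $\theta_j$
and $\phi_i$, respectively. Further, let $f_N$ denote the $N$th row of $F$ and let $\tilde{f}_N$
denote the $N$th column of $F^{-1}$. Let the $j$th component of $f_N$ (resp. $\tilde{f}_N$) be given by $f_N(j)$ (resp.
$\tilde{f}_N(j)$). The first step is the proof of the following lemma:
Lemma 1 The matrix $BG^{-1}B^T$ can be written
$$(U \otimes I_{n_1})(\Psi \otimes [e_1, e_{n_1}])\left(I_{m_w} \otimes \begin{bmatrix} e_1^T \\ e_{n_1}^T \end{bmatrix}\right)(U^{-1} \otimes I_{n_1}), \tag{5}$$
where $\Psi$ is a diagonal matrix of size $m_w$ whose $j$th diagonal element is
$$\psi_j = \sum_{i=1}^{N} \frac{f_N(i)\,\tilde{f}_N(i)}{\phi_i + \theta_j}.$$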
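Proof: We will let $I$ denote $I_{m_w}$ unless otherwise specified, and we write $H = U\Theta U^{-1}$ and
$A_{1,1} = F\Phi F^{-1}$ for the eigendecompositions of $H$ and $A_{1,1}$. Recalling the definitions in §3,
$$BG^{-1}B^T = [I \otimes R_1,\ I \otimes R_2] \begin{bmatrix} G_1^{-1} & 0 \\ 0 & G_2^{-1} \end{bmatrix} \begin{bmatrix} I \otimes R_1^T \\ I \otimes R_2^T \end{bmatrix}
= (I \otimes R_1)G_1^{-1}(I \otimes R_1^T) + (I \otimes R_2)G_2^{-1}(I \otimes R_2^T). \tag{6}$$
Now using the eigendecomposition of $H$ and the relation of $G_1$ to $G_2$ specified in §3, it follows that
$$G_1^{-1} = (U \otimes I_N)(I_{m_w} \otimes A_{1,1} + \Theta \otimes I_N)^{-1}(U^{-1} \otimes I_N),$$
$$G_2^{-1} = (U \otimes P_N)(I \otimes A_{1,1} + \Theta \otimes I)^{-1}(U^{-1} \otimes P_N^T).$$
Substituting the above equations into (6) and using the relation $R_2 P_N = P_{n_1} R_1$ and the
eigendecomposition of $A_{1,1}$, we have, with $P = P_{n_1}$,
$$\begin{aligned} BG^{-1}B^T &= (U \otimes I_{n_1})\big[(I \otimes R_1 F)(I \otimes \Phi + \Theta \otimes I)^{-1}(I \otimes F^{-1}R_1^T) \\ &\qquad + (I \otimes PR_1 F)(I \otimes \Phi + \Theta \otimes I)^{-1}(I \otimes F^{-1}R_1^T P^T)\big](U^{-1} \otimes I_{n_1}) \\ &= (U \otimes I_{n_1})\,Z\,(U^{-1} \otimes I_{n_1}). \end{aligned} \tag{7}$$
Now clearly $R_1 = e_1 e_N^T$ where the first unit vector is understood to have length $n_1$ and $e_N$ is the
$N$-length unit vector with a 1 in the $N$th position. Thus, $R_1 F = e_1 f_N$ and $F^{-1} e_N e_1^T = \tilde{f}_N e_1^T$. By
brute force one can readily show that the first term in $Z$, $(I \otimes e_1 f_N)(I \otimes \Phi + \Theta \otimes I)^{-1}(I \otimes \tilde{f}_N e_1^T)$,
is a block diagonal matrix with $n_1 \times n_1$ blocks, and in the $j$th block, only the $(1,1)$ element, given
by $\sum_{i=1}^{N} \frac{f_N(i)\tilde{f}_N(i)}{\phi_i + \theta_j}$, is non-zero. Similarly, the second term in $Z$,
$(I \otimes Pe_1 f_N)(I \otimes \Phi + \Theta \otimes I)^{-1}(I \otimes \tilde{f}_N e_1^T P^T)$, is a block diagonal matrix with $n_1 \times n_1$ blocks
and in the $j$th block, only the $(n_1, n_1)$ element, also given by $\sum_{i=1}^{N} \frac{f_N(i)\tilde{f}_N(i)}{\phi_i + \theta_j}$, is non-zero.
Therefore, with $\Psi$ the diagonal matrix whose $j$th diagonal entry is this sum,
$$Z = \Psi \otimes \left([e_1, e_{n_1}] \begin{bmatrix} e_1^T \\ e_{n_1}^T \end{bmatrix}\right) = (\Psi \otimes [e_1, e_{n_1}])\left(I_{m_w} \otimes \begin{bmatrix} e_1^T \\ e_{n_1}^T \end{bmatrix}\right)$$
and substituting this expression into (7), the proof is complete. $\Box$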
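For the second step, we need to get an expression for $\gamma\hat{E} - s_1 BG^{-1}B^T$. To do this, we note that
$\hat{E}$ can be written in the following tensor form
$$\hat{E} = \left(\begin{bmatrix} 0 \\ I_{n_r} \\ 0 \end{bmatrix}[0, I_{n_r}, 0]\right) \otimes \left(\begin{bmatrix} 0 \\ I_{n_c} \\ 0 \end{bmatrix}[0, I_{n_c}, 0]\right).$$
Now using (5) and this expression for $\hat{E}$, we pull terms involving $U$ outside of the sum to obtain
$$\gamma\hat{E} - s_1 BG^{-1}B^T = (U \otimes I_{n_1})\,\tilde{X}\tilde{W}\,(U^{-1} \otimes I_{n_1})$$
where $\tilde{X}$ and $\tilde{W}^T$ are the $m_w n_1 \times (n_r n_c + 2m_w)$ matrices
$$\tilde{X} = \left[\gamma\, U^{-1}\begin{bmatrix} 0 \\ I_{n_r} \\ 0 \end{bmatrix} \otimes \begin{bmatrix} 0 \\ I_{n_c} \\ 0 \end{bmatrix},\ \ -s_1 \Psi \otimes [e_1, e_{n_1}]\right],$$
$$\tilde{W} = \begin{bmatrix} [0, I_{n_r}, 0]\,U \otimes [0, I_{n_c}, 0] \\[4pt] I \otimes \begin{bmatrix} e_1^T \\ e_{n_1}^T \end{bmatrix} \end{bmatrix},$$
with $\Psi$ the diagonal matrix of Lemma 1 and $H = U\Theta U^{-1}$.
For the third and final step, consider $\mathcal{T}^{-1}(U \otimes I_{n_1})\tilde{X}$. Recall $\mathcal{T} = (I \otimes T + H \otimes I)$, so that we
have
$$\mathcal{T}^{-1}(U \otimes I_{n_1})\tilde{X} = (U \otimes I)(I \otimes T + \Theta \otimes I)^{-1}\tilde{X}.$$
Thus, we may define
$$X \equiv (U \otimes I)(I \otimes T + \Theta \otimes I)^{-1}\tilde{X}$$
and
$$W \equiv \tilde{W}(U^{-1} \otimes I_{n_1})$$
and the proof of Theorem 2 is complete.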
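Now by Theorem 3 we are interested in the $p$ eigenvalues of $WX$. But these are the eigenvalues
of $\tilde{W}(I \otimes T + \Theta \otimes I)^{-1}\tilde{X}$, where $\Theta$ is the diagonal matrix of eigenvalues $\theta_j$ of $H$. Fortunately,
using the fact that $T = SDS$, the entries of this matrix are not difficult to compute; indeed, the matrix
$C \equiv \tilde{W}(I \otimes T + \Theta \otimes I)^{-1}\tilde{X}$ has the structure
$$C = \begin{bmatrix} C_1 & C_2 \\ C_3 & C_4 \end{bmatrix}$$
where $C_4$ has dimension $2m_w$ and is block diagonal with $2 \times 2$ blocks on the diagonal and $C_1$ has
dimension $n_r n_c$. Specifically, the sub-blocks are defined as follows. Let $\mathbf{s}_1, \mathbf{s}_{n_1}$ denote the first and
last columns of the size $n_1$ normalized discrete sine transform matrix $S$ (not to be confused with the
scalar $s_1$). Let $\tilde{U}$ represent the matrix $[0, I_{n_r}, 0]U$, let $\hat{U} = U^{-1}([0, I_{n_r}, 0]^T)$, and put $\tilde{S} = [0, I_{n_c}, 0]S$.
Let $\psi_j$ denote the $j$th diagonal entry of the matrix $\Psi$ of Lemma 1. Then
$$C_1 = \gamma\,(\tilde{U} \otimes \tilde{S})(I_{m_w} \otimes D + \Theta \otimes I_{n_1})^{-1}(\hat{U} \otimes \tilde{S}^T),$$
$$C_2 = -s_1\,(\tilde{U} \otimes \tilde{S})(I_{m_w} \otimes D + \Theta \otimes I_{n_1})^{-1}(\Psi \otimes [\mathbf{s}_1, \mathbf{s}_{n_1}]),$$
$$C_3 = \gamma \left(I_{m_w} \otimes \begin{bmatrix} \mathbf{s}_1^T \\ \mathbf{s}_{n_1}^T \end{bmatrix}\right)(I_{m_w} \otimes D + \Theta \otimes I_{n_1})^{-1}(\hat{U} \otimes \tilde{S}^T).$$
The $j$th block of $C_4$ is defined as
$$-s_1\psi_j \begin{bmatrix} \sum_{i=1}^{n_1} \dfrac{\mathbf{s}_1(i)^2}{d_i + \theta_j} & \sum_{i=1}^{n_1} \dfrac{\mathbf{s}_1(i)\,\mathbf{s}_{n_1}(i)}{d_i + \theta_j} \\[8pt] \sum_{i=1}^{n_1} \dfrac{\mathbf{s}_1(i)\,\mathbf{s}_{n_1}(i)}{d_i + \theta_j} & \sum_{i=1}^{n_1} \dfrac{\mathbf{s}_{n_1}(i)^2}{d_i + \theta_j} \end{bmatrix}.$$
Fortunately, upon close examination, it becomes clear that it is possible to construct the entries of
the sub-blocks in no more flops than it takes to find the eigenvalues of $H$ and the matrix $C$. Hence,
rather than directly calculating the eigenvalues of the $m_w^2 \times m_w^2$ matrix $AM^{-1}$ at a cost of $O((m_w^2)^3)$
flops for the 2-D preconditioned matrix, we can equivalently find the eigenvalues of $C$ in $O(m_w^3)$
flops, assuming $n_r, n_c$ are sufficiently small relative to $n_1$.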
5 Numerical Results
In this section, we give results for the performance of our preconditioner for two different examples.
All computations were run in Matlab using IEEE double precision floating point arithmetic. In both
examples, we will assume a rectangular landmine of dimension 5 cm by 6 cm filled with TNT is buried
so that the top of the landmine is 3 cm below the surface. The landmine was centered
in the grid and a plane wave was assumed to be incident at 0 degrees (refer to Figure 1); from this
information, $e_0$ at grid points over the mine could be calculated and thus used to form the right
hand side vectors $g$ (refer to (3)). For the landmine, at the frequencies at which we were working,
we had $\epsilon_{rel} = 2.9$, $\sigma = 0.0005$. We primarily took our initial guess $f_0$ to be the vector of all zeros;
however, for comparison purposes, average iteration counts obtained by setting $f_0$ to be a vector
with real and imaginary parts consisting of uniformly distributed random numbers in $[-\tfrac{1}{2}, \tfrac{1}{2}]$ are
given in the tables. We stopped iterating QMR when the relative residual norm, $\|g - Af_k\|_2/\|g\|_2$,
where $f_k = M^{-1}y_k$, was less than $10^{-7}$.
For a given soil type, the permittivity $\epsilon_{rel}$ and conductivity $\sigma$ of the soil vary with frequency in a
complicated manner beyond the scope of this paper; therefore, in this work we do not attempt to
judge the performance of the preconditioner as a function of frequency, but leave this as a topic for
future research. Moreover, we note that to be efficient, the PML conductivity profile must change
as a function of sampling rate; yet how this profile changes is not well understood. The values we
have used give good performance for $h$ at 10-50 points per wavelength, which is sufficient sampling
for the class of scattering problems in which we are interested.
5.1 Example 1
The soil we modeled in this example was Puerto Rican clay loam. Thus, for the soil $\epsilon_{rel} = 6.5$ and
$\sigma = 0.019$, which are the estimated values for this soil type when $\omega = (2\pi)\,480$ MHz [9]. One must sample all
the media at a rate of at least 10 points per wavelength; that is, we need $h \le \min_{i \in \{\text{air, soil, mine}\}} \lambda_i/10$.
We conducted two sets of tests, one with $h = \lambda_{soil}/10$ and one with $h = \lambda_{soil}/20$. These values for
$h$ ensured that soil, air and the mine were all sampled at a rate of at least 10 (resp. 20) points per
wavelength, or roughly at 2.45 cm and 1.23 cm increments, respectively. We used 8 layers of PML,
so that $N = 8$. We then calculated results for 4 different grid sizes $m_w$, $m_w = 2^q - 1 + 2N$ for
$q = 6, 7, 8, 9$. The convergence curves for QMR using our preconditioner are given in Figures 2 and
3. The non-unit eigenvalues of $AM^{-1}$ were calculated using the method described in §4, and are
displayed in Figures 4 and 5.
In Table II we summarize the convergence results for $f_0 = 0$ for both values of $h$ and all four mesh
sizes. For comparison, the average numbers of iterations for convergence when $f_0$ was initialized as
random are also given in the table. The averages were computed from iteration counts obtained for
five different initial guesses, and in most cases all counts were within two of the average. Note from
the table and the figures that with $f_0 = 0$, the convergence behavior does not appear to deteriorate
much with an increase in mesh size, even though the rank of $I - AM^{-1}$ (refer to §4) is rapidly
increasing. The convergence behavior for $f_0 = 0$ is also little affected by a decrease in $h$. The
numbers in the table indicate that if the initial guess is random, convergence behavior is somewhat
more sensitive to changes in $h$ and mesh size, yet the number of iterations is still much smaller
than the rank of $I - AM^{-1}$. Choosing $f_0 = 0$ consistently resulted in significantly fewer iterations:
we believe this behavior is related to the underlying smoothness and structure of the solution.
In light of Theorem 1, a look at the distribution of the eigenvalues as $q$ varies helps explain this
behavior. We observe that a majority of the non-unit eigenvalues displayed in the figures have real
part clustered between 0.6 and 1 and imaginary part nearly zero, while the relatively small number
of the remaining eigenvalues tend to lie on or near smooth curves in the plane. For example, for
$q = 6$ at 20 points per wavelength, there are 46 eigenvalues with real part $< 0.6$ out of 194 total
non-unit eigenvalues; for $q = 7, 8, 9$ there are 66, 111, and 199, respectively, out of 322, 578, and 1090
non-unit eigenvalues. Thus, a decreasing percentage of the non-unit eigenvalues fall outside this
range. Moreover, those outside the range also tend to cluster (e.g., in Figure 5 for large $q$, those with
imaginary part around 0.5 and real part around 0.35). Also, though it is not obvious from the pictures, many of
the non-unit eigenvalues appear to have algebraic multiplicity 2.
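The eigenvalue counts quoted above can be cross-checked against the formulas $m_w = 2^q - 1 + 2N$ and $p = 2m_w + n_r n_c$. The short sketch below does this arithmetic. The value $n_r n_c = 36$ is not stated explicitly in the text; it is inferred here from $194 - 2 \cdot 79$ for $q = 6$ and should be read as an assumption of this example.

```python
N = 8                     # number of PML layers used in Example 1
n_obj = 36                # assumed n_r * n_c, inferred from 194 - 2 * 79
outside = {6: 46, 7: 66, 8: 111, 9: 199}   # eigenvalues with real part < 0.6 (from the text)

rows = []
for q in (6, 7, 8, 9):
    n1 = 2 ** q - 1               # grid points per direction in the interior
    mw = n1 + 2 * N               # grid points per direction including the PML
    p = 2 * mw + n_obj            # number of non-unit eigenvalues
    n = mw ** 2                   # total number of unknowns
    rows.append((q, mw, n, p, outside[q] / p))

for q, mw, n, p, frac in rows:
    print(f"q={q}: m_w={mw}, n={n}, p={p}, fraction outside cluster={frac:.3f}")
```

Under this assumption the computed values of $p$ agree with the totals 194, 322, 578, and 1090 quoted above, and the fraction of non-unit eigenvalues with real part below 0.6 decreases as $q$ grows.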
From Theorem 1, one can obtain fast convergence if there is a low degree polynomial with
$p(0) = 1$ which has a small maximum absolute value when evaluated over all the eigenvalues of the
preconditioned matrix. By the preceding discussion, for each value of $q$, one can easily imagine a low
degree polynomial, say with roots taken as the distinct "outlying" eigenvalues plus a few of those
eigenvalues with near-zero imaginary part and real part in the range 0.6 to 1.1, which is small over
all the eigenvalues of $AM^{-1}$. Hence, the $\varepsilon_k$ term will be small in (4). As an example, consider the
eigenvalues for the case with $q = 9$ and sampling at 20 points per wavelength, given by the '+'s in
Figure 6, and let the circles in the figure be the roots of a degree $k = 56$ polynomial with $p(1) = 0$.
The maximum over all the eigenvalues of $AM^{-1}$ is then $1.22 \times 10^{-5}$, indicating from the theorem that we
expect the residual to be fairly small after 56 iterations even though there are more than a quarter of
a million unknowns. Clearly, by changing the roots, this upper bound could be decreased; however,
this particular choice illustrates our point.
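The polynomial argument can be illustrated on a synthetic spectrum. The sketch below builds a polynomial with $p(0) = 1$ whose roots are a few "outlying" points plus several points inside a real cluster, then evaluates its maximum modulus over the spectrum. The eigenvalues here are made up for illustration (they are not the values plotted in the figures), and placing the in-cluster roots at Chebyshev nodes is a choice made for this example, not the root selection used in the text.

```python
import numpy as np


def residual_polynomial(roots):
    """Return p with p(0) = 1 and the given roots: p(z) = prod_i (1 - z / r_i)."""
    roots = np.asarray(roots, dtype=complex)

    def p(z):
        z = np.asarray(z, dtype=complex)
        return np.prod(1.0 - z[..., None] / roots, axis=-1)

    return p


# Synthetic spectrum: a real cluster in [0.6, 1.1] plus three outliers.
a, b = 0.6, 1.1
cluster = np.linspace(a, b, 200)
outliers = np.array([0.35 + 0.5j, 0.2 + 0.3j, 0.4 - 0.2j])
eigs = np.concatenate([cluster, outliers])

# Roots: every outlier, plus m Chebyshev nodes on the cluster interval.
m = 6
nodes = np.cos((2 * np.arange(1, m + 1) - 1) * np.pi / (2 * m))
cheb_roots = 0.5 * (a + b) + 0.5 * (b - a) * nodes

p = residual_polynomial(np.concatenate([outliers, cheb_roots]))
degree = len(outliers) + m
bound = np.max(np.abs(p(eigs)))   # the max in the definition of epsilon_k, for this p
```

Since $\varepsilon_k$ is a minimum over all admissible polynomials of degree $k$, the value `bound` obtained from any one such polynomial is an upper bound on $\varepsilon_k$ for `k = degree`.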
5.2 Example 2
In this example, we use a different soil type (referred to as "Seabee" in the literature [12]) with
$\epsilon_{rel} = 21.3078$ and $\sigma = 0.2273$ at a slightly lower frequency, $\omega = (2\pi)\,475$ MHz. We calculated results
for two sets of experiments, one with $h = 1.34$ cm, or $\lambda_{soil}/10$, and the other with $h = 0.671$ cm, or
$\lambda_{soil}/20$. Our computations were done for the same 4 different grid sizes as in the previous example.
A comparison of the convergence curves for the four different grid sizes for the larger $h$ is given in
Figure 7 and the corresponding non-unit eigenvalues are displayed in Figure 9.
The convergence behavior for this example is summarized in Table III. Again, we observe that
increasing the grid size does not appear to degrade the performance of the preconditioner when
$f_0 = 0$; when $f_0$ was randomly chosen, there was some sensitivity. Additionally, decreasing h by
half seems to have only a very mild effect on the rate of convergence, with the effect being slightly
more pronounced in the case of a random starting guess. As before, this convergence behavior can
be analyzed by looking at the non-unit eigenvalues for this example. We again see
that a majority of the non-unit eigenvalues for each value of q and for both sampling rates are clustered
near the real axis, with real part between 0.6 and 1.1. There are other, smaller clusters
of eigenvalues away from (0, 0), and everything else lies on or near a continuous curve in the plane.
Therefore, as in Example 1, we deduce that a low-degree polynomial with a judicious choice of
roots taken from among those in the figure(s) can be found which has small maximum magnitude
over all the eigenvalues; hence the bound in Theorem 1 will be small in many fewer iterations than
$2m_w + n_r n_c$, the rank of $I - AM^{-1}$.
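The rank expression $2m_w + n_r n_c$ can be checked against the values reported in Tables II and III. The sketch below is only an arithmetic consistency check: it recovers $m_w$ as the square root of each unknown count and subtracts $2m_w$ from the reported rank. The remainder is read as $n_r n_c$ because that is what the formula implies; the object dimensions themselves are not taken from the paper. The names `reported` and `object_term` are illustrative.

```python
from math import isqrt

# (unknowns m_w^2, reported rank) pairs copied from Tables II and III.
reported = {
    ("Example 1", "h = 2.45 cm"): [(6241, 174), (20449, 302), (73441, 558), (277729, 1070)],
    ("Example 1", "h = 1.23 cm"): [(6241, 194), (20449, 322), (73441, 578), (277729, 1090)],
    ("Example 2", "h = 1.34 cm"): [(6241, 188), (20449, 316), (73441, 572), (277729, 1084)],
    ("Example 2", "h = 0.671 cm"): [(6241, 248), (20449, 376), (73441, 632), (277729, 1144)],
}


def object_term(unknowns, rank):
    """Return rank - 2*m_w, which the rank formula identifies with n_r * n_c."""
    m_w = isqrt(unknowns)
    if m_w * m_w != unknowns:
        raise ValueError("unknown count should be a perfect square m_w^2")
    return rank - 2 * m_w


# Collect the distinct remainders seen for each (example, h) case.
inferred = {
    case: sorted({object_term(n, r) for n, r in rows})
    for case, rows in reported.items()
}
for case, values in inferred.items():
    print(case, "-> rank - 2*m_w =", values)
```

Each case gives a single remainder (16, 36, 30, and 90, in the order listed), independent of the grid size. The rank therefore grows like $2m_w$, that is, like the square root of the number of unknowns, plus a constant fixed by the size of the object in gridpoints.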
6 Conclusions and Future Work
We developed an effective preconditioner for solving the Helmholtz-type problem with piecewise-
constant complex wavenumber and PML boundary on a rectangular grid. Additionally, we gave an
efficient method for applying the preconditioner based on 1-D discrete sine transforms and 1-D band
solvers. For the case of a square grid, the cost of applying our preconditioner is $O(m_w^2 \lg(m_w + 1))$
when $m_w + 1$ is a power of 2. We showed that our preconditioned matrix is a low-rank perturbation
of the identity, of rank k say, guaranteeing convergence in at most k iterations in exact arithmetic. We
illustrated a technique for determining the k non-unit eigenvalues of the preconditioned matrix
based on solving a few 1-D eigenproblems, thereby making it computationally feasible to
analyze the convergence behavior of the 2-D problem. We applied the technique to our examples
and found that a significant number of those eigenvalues are still clustered in such a way as to ensure
convergence in even fewer iterations than the theorem guarantees. Indeed, the behavior anticipated
by the eigenanalysis was illustrated in our convergence results.
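As a reminder of why 1-D sine transforms lead to fast solves, the sketch below solves a symmetric tridiagonal Toeplitz system tridiag(b, a, b) x = f by diagonalizing it with the orthonormal discrete sine transform, whose columns are the eigenvectors of such a matrix, with eigenvalues a + 2b cos(jπ/(n+1)). This is only the 1-D building block: the paper's preconditioner applies transforms of this kind blockwise together with band solves, and that block structure is not reproduced here. The transform is written as a naive O(n^2) sum for clarity where a fast O(n lg n) routine would be used in practice, and the coefficients `diag` and `off` are hypothetical values, not taken from the paper.

```python
import math


def sine_transform(v):
    """Naive orthonormal discrete sine transform (type I); it is its own inverse."""
    n = len(v)
    scale = math.sqrt(2.0 / (n + 1))
    return [
        scale * sum(v[k] * math.sin((j + 1) * (k + 1) * math.pi / (n + 1)) for k in range(n))
        for j in range(n)
    ]


def solve_tridiagonal_toeplitz(diag, off, rhs):
    """Solve tridiag(off, diag, off) x = rhs via the sine-transform diagonalization."""
    n = len(rhs)
    eigenvalues = [diag + 2.0 * off * math.cos((j + 1) * math.pi / (n + 1)) for j in range(n)]
    transformed = sine_transform(rhs)
    scaled = [t / lam for t, lam in zip(transformed, eigenvalues)]
    return sine_transform(scaled)


def apply_tridiagonal_toeplitz(diag, off, x):
    """Multiply tridiag(off, diag, off) by x, used here only to check the solve."""
    n = len(x)
    return [
        diag * x[i] + off * ((x[i - 1] if i > 0 else 0) + (x[i + 1] if i < n - 1 else 0))
        for i in range(n)
    ]


n = 15                       # n + 1 = 16 is a power of 2
diag, off = -1.7 + 0.1j, 1.0  # hypothetical complex diagonal, as for a lossy medium
rhs = [complex(i + 1, -i) for i in range(n)]
x = solve_tridiagonal_toeplitz(diag, off, rhs)
residual = max(abs(a - b) for a, b in zip(apply_tridiagonal_toeplitz(diag, off, x), rhs))
print(f"max residual = {residual:.1e}")
```

The solve costs two transforms and one diagonal scaling. With a fast transform, that is O(n lg n) per 1-D system, which is consistent with the $O(m_w^2 \lg(m_w + 1))$ total quoted above when on the order of $m_w$ such systems of length $m_w$ are solved.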
We note that our preconditioner is also efficient for iterative methods such as BLQMR (block
QMR) [5] for solving 2-D problems involving multiple right-hand sides, which arise when multiple
incidence angles are used. In the future, we plan to examine preconditioners for the 3-D forward
scattering problem, which is also derived from the time-harmonic form of Maxwell's equations but
has a more complicated mathematical formulation than a 3-D Helmholtz-type problem.
Acknowledgments. We wish to thank Howard Elman and Dianne O'Leary for their helpful
comments on an early draft of this paper.
References
[1] J. Berenger, A perfectly matched layer for the absorption of electromagnetic waves, J. Comp.
Phys., 114 (1994), pp. 185-200.
[2] H. Elman and D. O'Leary, Eigenanalysis of some preconditioned Helmholtz problems,
Numer. Math., to appear.
[3] H. Elman and D. O'Leary, Efficient iterative solution of the three-dimensional Helmholtz
equation, J. Comp. Phys., 142 (1998), pp. 163-181.
[4] O. Ernst and G. Golub, A domain decomposition approach to solving the Helmholtz equation
with a radiation boundary condition, Contemporary Mathematics, 157 (1994), pp. 177-192.
[5] R. Freund and M. Malhotra, A block-QMR algorithm for non-Hermitian linear systems
with multiple right-hand sides, Linear Algebra Appl., 254 (1997), pp. 197-257.
[6] R. Freund and N. Nachtigal, QMR: A quasi-minimal residual method for non-Hermitian
linear systems, Numer. Math., 60 (1991), pp. 315-339.
[7] R. Freund and N. Nachtigal, An implementation of the QMR method based on coupled
two-term recurrences, SIAM J. Sci. Comput., 15 (1994), p. 313.
[8] G. Golub and C. Van Loan, Matrix Computations, second ed., Johns Hopkins Press, 1989.
[9] J. Hipp, Soil electromagnetic parameters as functions of frequency, soil density, and soil
moisture, Proceedings of the IEEE, 62 (1974), pp. 98-103.
[10] E. Marengo, C. Rappaport, and E. Miller, Optimum PML ABC conductivity profile in
FDFD, IEEE Trans. Magn., (1999), to appear.
[11] C. Rappaport, Interpreting and improving the PML absorbing boundary condition using
anisotropic lossy mapping of space, IEEE Trans. Magn., 32 (1996), pp. 968-974.
[12] E. M. Rosen and T. W. Altshuler, Analysis of UXO and clutter signatures from the DARPA
background clutter experiment, in UXO Forum '98, May 1998.
[13] D. H. Staelin, A. Morgenthaler, and J. A. Kong, Electromagnetic Waves, Prentice
Hall, 1994.
[Figure 1 graphic not reproduced. Labels appearing in the drawing: axes -x, +x, -y, +y; regions Ω and Ο;
a1; b1; γ; air PML; soil PML; c(x) = 1; c(y) = 1; c(x), c(y) non-constant; e = 0.]
Figure 1: Illustration of problem setup. One region label denotes the whole of the larger square, while
the other denotes the region inside the inner square only.
Variable   Meaning
$n_a$      no. gridpoints in vertical direction in air
$n_s$      no. gridpoints in vertical direction in soil
$n_r$      no. gridpoints in vertical direction of buried object
$n_c$      no. gridpoints in horizontal direction of buried object
$n_1$      no. gridpoints on the horizontal interval $[-a_1, a_1]$
$N$        no. PML layers
$m_w$      total no. gridpoints in horizontal (vertical) direction, $m_w = n_1 + 2N$
$n$        total no. of unknowns, $n = m_w^2$
Table I: Summary of variables.
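The last two rows of Table I can be written as a short helper. This is a minimal sketch: the function name `grid_dimensions` and the sample values n1 = 63 and N = 8 are hypothetical, chosen only because they reproduce the smallest unknown count listed in Table II. The values of $n_1$ and $N$ actually used in the experiments are not stated in this part of the paper.

```python
def grid_dimensions(n1, n_pml):
    """Return (m_w, n) with m_w = n1 + 2*N and n = m_w**2, following Table I."""
    m_w = n1 + 2 * n_pml
    return m_w, m_w * m_w


# Hypothetical values, not taken from the paper.
m_w, n = grid_dimensions(n1=63, n_pml=8)
print(m_w, n)  # 79 6241
```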
[Plot "Example 1, 10 ppw": relative residual norm (log scale, $10^{-8}$ to $10^{0}$) versus iterations (0 to 45).
Solid: q = 6; dashed: q = 7; dash-dotted: q = 8; dotted: q = 9.]
Figure 2: Relative residual norm per iteration for case h = 2.45 cm and q = 6, 7, 8, 9, Example 1.
[Plot "Example 1, 20 ppw": relative residual norm (log scale, $10^{-8}$ to $10^{0}$) versus iteration (0 to 40).
Solid: q = 6; dashed: q = 7; dash-dotted: q = 8; dotted: q = 9.]
Figure 3: Relative residual norm per iteration for case h = 1.23 cm and q = 6, 7, 8, 9, Example 1.
                         h = 2.45 cm                         h = 1.23 cm
$m_w^2$ unknowns   $f_0 = 0$   $f_0$ random   rank     $f_0 = 0$   $f_0$ random   rank
6,241                 31          42.8         174        35          51.8         194
20,449                30          48           302        34          56           322
73,441                32          57.4         558        33          65.2         578
277,729               41          71          1070        39          76.2        1090
Table II: Number of iterations (with starting guess $f_0 = 0$) and average number of iterations (with
random starting guess) until convergence for varying number of unknowns and two different values
of h, Example 1. Columns headed by "rank" give the corresponding rank of $I - AM^{-1}$, the upper
bound on the number of iterations until convergence in exact arithmetic.
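To see how far the observed iteration counts fall below the exact-arithmetic bound, the sketch below divides the $f_0 = 0$ iteration counts of Table II by the corresponding ranks. The numbers are copied from the table; the names `table_ii` and `fraction_of_rank_bound` are illustrative.

```python
# (unknowns, iterations with f_0 = 0, rank) copied from Table II.
table_ii = {
    "h = 2.45 cm": [(6241, 31, 174), (20449, 30, 302), (73441, 32, 558), (277729, 41, 1070)],
    "h = 1.23 cm": [(6241, 35, 194), (20449, 34, 322), (73441, 33, 578), (277729, 39, 1090)],
}


def fraction_of_rank_bound(rows):
    """Iterations actually needed, as a fraction of the rank bound, per grid size."""
    return [iterations / rank for _, iterations, rank in rows]


fractions = {h: fraction_of_rank_bound(rows) for h, rows in table_ii.items()}
for h, values in fractions.items():
    print(h, [f"{100 * v:.1f}%" for v in values])
```

For both values of h the fraction decreases steadily with grid size, from roughly 18% of the rank bound on the smallest grid to under 4% on the largest. The iteration count stays nearly flat while the rank bound grows with $m_w$.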
[Four scatter plots in the complex plane, one panel each for q = 6, 7, 8, 9. Horizontal axis: real part
(0 to 2.5); vertical axis: imaginary part (-0.5 to 1).]
Figure 4: Non-unit eigenvalues of $AM^{-1}$, case h = 2.45 cm and q = 6, 7, 8, 9, Example 1.
[Four scatter plots in the complex plane, one panel each for q = 6, 7, 8, 9. Horizontal axis: real part
(0 to 2.5); vertical axis: imaginary part (-0.5 to 1).]
Figure 5: Non-unit eigenvalues of $AM^{-1}$, case h = 1.23 cm and q = 6, 7, 8, 9, Example 1.
[Scatter plot in the complex plane. Horizontal axis from -0.4 to 1.2; vertical axis from -0.4 to 0.8.]
Figure 6: x's represent non-unit eigenvalues of $AM^{-1}$, case h = 1.23 cm and q = 9, Example 1, and
o's represent those eigenvalues selected to serve as roots of a polynomial.
[Plot "Example 2, 10 ppw": relative residual norm (log scale, $10^{-8}$ to $10^{0}$) versus iteration (0 to 30).
Solid: q = 6; dashed: q = 7; dash-dotted: q = 8; dotted: q = 9.]
Figure 7: Relative residual norm per iteration for case h = 1.34 cm and q = 6, 7, 8, 9, Example 2.
[Plot "Example 2, 20 ppw": relative residual norm (log scale, $10^{-8}$ to $10^{0}$) versus iteration (0 to 35).
Solid: q = 6; dashed: q = 7; dash-dotted: q = 8; dotted: q = 9.]
Figure 8: Relative residual norm per iteration for case h = 0.671 cm and q = 6, 7, 8, 9, Example 2.
[Four scatter plots in the complex plane, one panel each for q = 6, 7, 8, 9. Horizontal axis: real part
(-0.5 to 1); vertical axis: imaginary part (-0.8 to 0.6).]
Figure 9: Non-unit eigenvalues of $AM^{-1}$, case h = 1.34 cm and q = 6, 7, 8, 9, Example 2.
[Four scatter plots in the complex plane, one panel each for q = 6, 7, 8, 9. Horizontal axis: real part
(-0.5 to 1); vertical axis: imaginary part (-0.8 to 0.6).]
Figure 10: Non-unit eigenvalues of $AM^{-1}$, case h = 0.671 cm and q = 6, 7, 8, 9, Example 2.
                         h = 1.34 cm                         h = 0.671 cm
$m_w^2$ unknowns   $f_0 = 0$   $f_0$ random   rank     $f_0 = 0$   $f_0$ random   rank
6,241                 26          55.6         188        32          61.6         248
20,449                27          61.2         316        30          71.4         376
73,441                27          74.8         572        33          82.6         632
277,729               26          88          1084        34         104.6        1144
Table III: Number of iterations (with starting guess $f_0 = 0$) and average number of iterations (with
random starting guess) until convergence for varying number of unknowns and two different values
of h, Example 2. Columns headed by "rank" give the corresponding rank of $I - AM^{-1}$, the upper
bound on the number of iterations until convergence in exact arithmetic.