BIT manuscript No. (will be inserted by the editor)

Multigrid methods for Toeplitz linear systems with different size reduction⋆

Marco Donatelli · Stefano Serra-Capizzano · Debora Sesana

Received: date / Accepted: date

Abstract Starting from the spectral analysis of $g$-circulant matrices, we study the convergence of a multigrid method for circulant and Toeplitz matrices with various size reductions. We assume that the size $n$ of the coefficient matrix is divisible by $g \ge 2$ such that at the lower level the system is reduced to one of size $n/g$, by employing $g$-circulant based projectors. We perform a rigorous two-grid convergence analysis in the circulant case and we extend experimentally the results to the Toeplitz setting, by employing structure preserving projectors. The optimality of the two-grid method and of the multigrid method is proved, when the number $\theta \in \mathbb{N}$ of recursive calls is such that $1 < \theta < g$. The previous analysis is used also to overcome some pathological cases, in which the generating function has zeros located at “mirror points” and the standard two-grid method with $g = 2$ is not optimal. The numerical experiments show the correctness and applicability of the proposed ideas, both for circulant and Toeplitz matrices.

Keywords circulant, $g$-circulant and Toeplitz matrices, two-grid and multigrid methods

Mathematics Subject Classification (2000) 65N55 · 65F10 · 65F15

⋆ This is a preprint of an article in press in BIT

M. Donatelli
Dipartimento di Fisica e Matematica, Università dell’Insubria - Sede di Como, Via Valleggio 11, 22100 Como, Italy
E-mail: [email protected]

S. Serra-Capizzano
Dipartimento di Fisica e Matematica, Università dell’Insubria - Sede di Como, Via Valleggio 11, 22100 Como, Italy
E-mail: [email protected]; [email protected]

D. Sesana
Dipartimento di Scienze Economiche e Metodi Quantitativi, Università degli Studi del Piemonte Orientale - Amedeo Avogadro, Via Perrone 18, 28100 Novara, Italy
E-mail: [email protected]

1 Introduction

In the last 20 years multigrid methods have gained a remarkable reputation as fast solvers for structured matrices associated with shift invariant operators, where the size $n$ is large and the system shows a conditioning growing polynomially with $n$ (see [10,24,11,5,15,16,29,2,27,14] and references therein). Under suitable mild assumptions, the considered techniques are optimal, showing linear or almost linear ($O(n \log n)$ arithmetic operations, as the celebrated fast Fourier transform (FFT)) complexity for reaching the solution within a preassigned accuracy and a convergence rate independent of the size $n$ of the involved system. These excellent features are identical in the multilevel setting, as already known for linear systems arising in the context of elliptic ordinary and partial differential equations (see [13,22,31,25] and references therein). In particular, if the underlying structures are also sparse as in the multilevel banded case, then the cost of solving the involved linear system is proportional to the order of the coefficient matrix, with constant depending linearly on the bandwidths at each level. We mention that the cost of direct methods is $O(n \log n)$ operations in the case of trigonometric matrix algebras (circulant, $\tau$, ...) and it is $O(n^{\frac{3d-1}{d}})$ for $d$-level Toeplitz matrices (see [17]). Concerning multilevel Toeplitz structures, superfast methods represent a good alternative, even if the algorithmic error has to be controlled, and the cost grows as $O(n^{\frac{3d-2}{d}} \log^2(n))$: in fact, the computational burden is really competitive for $d = 1$, while the deterioration is evident for $d > 1$, since it is nontrivial to exploit the structure at the inner levels (see [7] and references therein and refer to [32] for recent interesting findings on the subject). Moreover, in the last case the most popular preconditioning strategies, with preconditioners taken in matrix algebras with unitary transform, can be far from being optimal in the multidimensional case (see [28] and [26,21,32] for further results). On the other hand, multigrid methods are optimal also for polynomially ill-conditioned multidimensional problems. Furthermore, these techniques can be extended to the case of low rank corrections of the considered structured matrices, allowing one to deal also with the modified Strang preconditioner widely used in the literature (see [6] and references therein).

The main novelty contained in the literature treating structured matrices is the use of the symbol. Indeed, starting from the initial proposal in [10], we know that the convergence analysis of the two-grid and V-cycle can be handled in a compact and elegant manner by studying a few analytical properties of the symbol (so the study does not involve explicitly the entries of the matrix and, more importantly, the size $n$ of the system). Already in the two-grid convergence analysis, it is evident that the optimality can be reached only if the symbol $f$ has a finite number of zeros of finite order and not located at mirror points: more explicitly, if $x^0$ is a zero of $f$ then $f(x^0+\pi)$ must be greater than zero. Here we show that the second requirement is not essential since it depends on the choice of projecting the original matrix of size $n$ into a new one of size $n/2$. The latter is not compulsory so that, by choosing a different size reduction from $n$ to $n/g$ and $g > 2$, we can overcome the pathology induced by the mirror points. A different approach for dealing with such pathologies was proposed in [5] and further analyzed in [16], by exploiting a projection strategy based on matrix-valued symbols.

In this paper we analyze the convergence of a multigrid method where the fine problem of size $n$ is projected to a coarser problem of size $n/g$, $g = 2, 3, \dots$. Nonstandard projection matrices with $g > 2$ arise in a natural way when the aggregation projectors are applied to finite difference or regular finite elements approximations of partial differential equations (see [18] and references therein). We perform a two-grid analysis using the ideas in [27] for circulant structures and by exploiting the spectral analysis of $g$-circulant matrices already performed in [20]. As shown in [9], such two-grid analysis is an algebraic generalization of the classical local Fourier analysis, which can be extended from circulant matrices to the more challenging case of Toeplitz matrices. In the multigrid strategy that we propose the coarse problem has size $n/g$ with $g > 2$ and, as an interesting byproduct, we can design multigrid methods with optimal cost and characterized by $g-1$ recursive calls: in fact, it is enough to perform the analysis of the arithmetic computational cost related to the size reduction $n/g$ between two consecutive levels and to invoke the results in [31]. A further property of the proposed multigrid is that the pathologies induced by the mirror points are bypassed as previously mentioned. Our proposal can be plainly extended to the multilevel case by tensor product arguments, by taking into consideration the larger number of “mirror points” (which is exponential in the number $d$ of levels). Moreover a V-cycle convergence analysis could be performed by following the steps in [2,1] as a model. A rigorous study in this direction will be the subject of future research.

The paper is organized as follows. In Section 2 we set the problem by recalling the main features of the circulant, Toeplitz and $g$-circulant matrices. In Section 3 we report definitions and classical convergence results concerning two-grid and multigrid iterations from [22,31]. In Section 4 we define our grid-transfer operators and we study the properties of the coarse matrix obtained by the Galerkin approach. Section 5 is devoted to the proof of convergence of our multigrid when applied to circulant matrices and to a brief discussion of the pathologies that are eliminated by our algorithmic proposal. Section 6 is concerned with numerical experiments regarding circulant and Toeplitz matrices (e.g. ill-conditioned linear systems coming from approximated differential and integral problems). Section 7 is devoted to conclusions and to a sketch of future lines of research.

2 Circulant matrices and other related structures

A matrix $C_n$ of size $n$ is called circulant if its entries obey the rule $C_n = \left[b_{(r-s) \bmod n}\right]_{r,s=0}^{n-1}$.

Let $f$ be a trigonometric polynomial defined over the set $Q = [0,2\pi)$ and having degree $c \ge 0$, i.e., $f(u) = \sum_{k=-c}^{c} a_k e^{iku}$, $i^2 = -1$. From the Fourier coefficients of $f$, that is

$$ a_j = \frac{1}{2\pi}\int_Q f(u)\, e^{-iju}\, du, \qquad j \in \mathbb{Z}, \tag{2.1} $$

one can build the circulant matrix $C_n(f) = \left[a_{(r-s) \bmod n} + a_{(r-s) \bmod n - n}\right]_{r,s=0}^{n-1}$. For example, we take $f(u) = 3 - 2e^{iu} + e^{-2iu} + 4e^{3iu}$. The degree of $f$ is $c = 3$ and we have

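$a_0 = 3$, $a_1 = -2$, $a_3 = 4$ and $a_{-2} = 1$; if we take $n = 5$, the circulant matrix $C_5(f)$ is given by

$$ C_5(f) = \begin{bmatrix} 3 & 0 & 5 & 0 & -2 \\ -2 & 3 & 0 & 5 & 0 \\ 0 & -2 & 3 & 0 & 5 \\ 5 & 0 & -2 & 3 & 0 \\ 0 & 5 & 0 & -2 & 3 \end{bmatrix}. $$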
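The entry rule for $C_n(f)$ can be reproduced with a few lines of NumPy. The following is a minimal sketch, not code from the paper: the helper name `circulant_from_symbol` and the dictionary representation of the Fourier coefficients are assumptions made only for this illustration.

```python
import numpy as np

def circulant_from_symbol(coeffs, n):
    """Build C_n(f) = [a_{(r-s) mod n} + a_{(r-s) mod n - n}], r,s = 0..n-1.

    coeffs maps an index j to the Fourier coefficient a_j of f;
    indices that are not present are treated as zero.
    """
    a = lambda j: coeffs.get(j, 0)
    C = np.zeros((n, n), dtype=complex)
    for r in range(n):
        for s in range(n):
            j = (r - s) % n          # Python's % already returns a value in 0..n-1
            C[r, s] = a(j) + a(j - n)
    return C

# Coefficients of f(u) = 3 - 2 e^{iu} + e^{-2iu} + 4 e^{3iu}
coeffs = {0: 3, 1: -2, -2: 1, 3: 4}
C5 = circulant_from_symbol(coeffs, 5).real
print(C5.astype(int))
```

The entry 5 in the matrix comes from the wrap-around term: for $(r-s) \bmod 5 = 3$ the rule adds $a_3 = 4$ and $a_{3-5} = a_{-2} = 1$.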

It is clear that the Fourier coefficient $a_j$ equals zero if the condition $|j| \le c$ is violated. The matrix $C_n(f)$ is said to be the circulant matrix of order $n$ generated by $f$, and can be written as $C_n(f) = \sum_{|j| \le c} a_j Z_n^j$, where the matrix

$$ Z_n = \begin{bmatrix} 0 & \cdots & 0 & 1 \\ 1 & & & 0 \\ & \ddots & & \vdots \\ 0 & & 1 & 0 \end{bmatrix}, $$

called the circulant generator, is a cyclic permutation Toeplitz matrix. In addition, if $i$ is the imaginary unit, that is $i^2 = -1$, and if $F_n$ denotes the Fourier matrix of size $n$, i.e.

$$ F_n = \frac{1}{\sqrt{n}} \left[ e^{-\frac{2\pi i r s}{n}} \right]_{r,s=0}^{n-1}, \tag{2.2} $$

then it is well known (see e.g. [8]) that

$$ C_n(f) = F_n \Delta_n(f) F_n^H, \tag{2.3} $$

where

$$ \begin{aligned} \Delta_n(f) &= \operatorname*{diag}_{j=0,\dots,n-1} f\!\left(x_j^{(n)}\right), \qquad x_j^{(n)} = \frac{2\pi j}{n}, \\ &= \operatorname{diag}\!\left(\sqrt{n}\, F_n^H a\right), \qquad a = [a_0, a_1, \dots, a_{n-1}]^T, \end{aligned} \tag{2.4} $$

$a$ being the first column of the matrix $C_n(f)$.

Under the assumption that $c \le \lfloor (n-1)/2 \rfloor$, the matrix $C_n(f)$ is the Strang or natural circulant preconditioner of the corresponding Toeplitz matrix $T_n(f) = \left[a_{r-s}\right]_{r,s=0}^{n-1}$ (see [6] and references therein). We observe that the above-mentioned assumption $c \le \lfloor (n-1)/2 \rfloor$ is fulfilled for $n$ large enough, since $c$ is a fixed constant and $n$ is the matrix order: in real applications it is natural to suppose that $n$ is large, if we assume that $C_n(f)$ comes from an approximation process of an infinite-dimensional problem. Furthermore, if the symbol $f$ has a zero at zero (this happens in the case of approximation of differential operators), then $C_n(f)$ is singular and it is usually replaced by a rank-one correction that forces invertibility: in the relevant literature, the latter is called modified Strang preconditioner.
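The diagonalization (2.3)-(2.4) is easy to check numerically. The sketch below is only an illustration under stated assumptions: the size $n = 8$, the helper names `fourier_matrix` and `symbol`, and the reuse of the example symbol $f(u) = 3 - 2e^{iu} + e^{-2iu} + 4e^{3iu}$ are choices made here, not taken from the paper.

```python
import numpy as np

def fourier_matrix(n):
    """F_n = (1/sqrt(n)) [exp(-2*pi*i*r*s/n)], r,s = 0..n-1, as in (2.2)."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def symbol(u):
    """Example symbol f(u) = 3 - 2 e^{iu} + e^{-2iu} + 4 e^{3iu}."""
    return 3 - 2 * np.exp(1j * u) + np.exp(-2j * u) + 4 * np.exp(3j * u)

n = 8
F = fourier_matrix(n)
x = 2 * np.pi * np.arange(n) / n            # grid points x_j^{(n)} = 2*pi*j/n

# (2.3): C_n(f) = F_n diag(f(x_j)) F_n^H
C = F @ np.diag(symbol(x)) @ F.conj().T

# The first column collects a_j + a_{j-n}; here a_0, a_1, a_3 and a_{-2} = a_{6-8}.
first_col = C[:, 0]
print(np.round(first_col.real, 12))

# (2.4): the eigenvalues are also sqrt(n) * F_n^H applied to the first column.
eigs_from_column = np.sqrt(n) * F.conj().T @ first_col
print(np.allclose(eigs_from_column, symbol(x)))
```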

We end this section with the definition of $g$-circulant matrix. A matrix $C_{n,g}$ of size $n$ is called $g$-circulant if its entries obey the rule $C_{n,g} = \left[b_{(r-gs) \bmod n}\right]_{r,s=0}^{n-1}$: for an

introduction and for the algebraic properties of such matrices refer to Section 5.1 of the classical book by Davis [8], while new additional results can be found in [20,30] and references therein. For instance, if $n = 5$ and $g = 3$, then

$$ C_{n,g} = \begin{bmatrix} b_0 & b_2 & b_4 & b_1 & b_3 \\ b_1 & b_3 & b_0 & b_2 & b_4 \\ b_2 & b_4 & b_1 & b_3 & b_0 \\ b_3 & b_0 & b_2 & b_4 & b_1 \\ b_4 & b_1 & b_3 & b_0 & b_2 \end{bmatrix}. $$

Also in this case, as in the ordinary circulant setting, if the coefficients $a_j$, $j \in \mathbb{Z}$, arise from a given symbol $f$ (see (2.1)), we can build the $g$-circulant matrix generated by $f$ as $C_{n,g}(f) = \left[a_{(r-gs) \bmod n} + a_{(r-gs) \bmod n - n}\right]_{r,s=0}^{n-1}$. For instance, with $f(u) = 3 - 2e^{iu} + e^{-2iu} + 4e^{3iu}$, we find $a_0 = 3$, $a_1 = -2$, $a_3 = 4$ and $a_{-2} = 1$ so that

$$ C_{5,3}(f) = \begin{bmatrix} 3 & 0 & 0 & -2 & 5 \\ -2 & 5 & 3 & 0 & 0 \\ 0 & 0 & -2 & 5 & 3 \\ 5 & 3 & 0 & 0 & -2 \\ 0 & -2 & 5 & 3 & 0 \end{bmatrix}. $$
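The $g$-circulant rule differs from the circulant one only in the column stride. A minimal sketch follows; the helper name `g_circulant_from_symbol` and the dictionary of coefficients are hypothetical conveniences introduced for this example.

```python
import numpy as np

def g_circulant_from_symbol(coeffs, n, g):
    """Build C_{n,g}(f) = [a_{(r-g*s) mod n} + a_{(r-g*s) mod n - n}], r,s = 0..n-1."""
    a = lambda j: coeffs.get(j, 0)
    C = np.zeros((n, n))
    for r in range(n):
        for s in range(n):
            j = (r - g * s) % n      # each column is shifted by g positions, not by 1
            C[r, s] = a(j) + a(j - n)
    return C

# Same symbol as in the text: f(u) = 3 - 2 e^{iu} + e^{-2iu} + 4 e^{3iu}
C53 = g_circulant_from_symbol({0: 3, 1: -2, -2: 1, 3: 4}, n=5, g=3)
print(C53.astype(int))
```

With $g = 1$ the same function returns the ordinary circulant matrix $C_n(f)$.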

3 Two-grid and Multigrid methods

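Let $A_n \in \mathbb{C}^{n \times n}$, and $x_n, b_n \in \mathbb{C}^n$. Let $p_n^k \in \mathbb{C}^{n \times k}$, $k < n$, be a given full-rank matrix and let us consider a class of iterative methods of the form

$$ x_n^{(j+1)} = V_n x_n^{(j)} + \tilde{b}_n := \mathcal{V}(x_n^{(j)}, \tilde{b}_n), \tag{3.1} $$

where $A_n = W_n - N_n$, $W_n$ nonsingular matrix, $V_n := I_n - W_n^{-1} A_n \in \mathbb{C}^{n \times n}$, and $\tilde{b}_n := W_n^{-1} b_n \in \mathbb{C}^n$. A Two-Grid Method (TGM) is defined by the following algorithm:

$\mathrm{TGM}(V_{n,\mathrm{pre}}^{\nu_{\mathrm{pre}}}, V_{n,\mathrm{post}}^{\nu_{\mathrm{post}}}, p_n^k)(x_n^{(j)})$

0. $\tilde{x}_n = \mathcal{V}_{n,\mathrm{pre}}^{\nu_{\mathrm{pre}}}(x_n^{(j)}, \tilde{b}_{n,\mathrm{pre}})$

1. $d_n = A_n \tilde{x}_n - b_n$

2. $d_k = (p_n^k)^H d_n$

3. $A_k = (p_n^k)^H A_n p_n^k$

4. Solve $A_k y = d_k$

5. $\hat{x}_n = \tilde{x}_n - p_n^k y$

6. $x_n^{(j+1)} = \mathcal{V}_{n,\mathrm{post}}^{\nu_{\mathrm{post}}}(\hat{x}_n, \tilde{b}_{n,\mathrm{post}})$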

Steps 1. → 5. define the “coarse grid correction” that depends on the projecting operator $p_n^k$, while Step 0. and Step 6. consist, respectively, in applying $\nu_{\mathrm{pre}}$ times and $\nu_{\mathrm{post}}$ times a “pre-smoothing iteration” and a “post-smoothing iteration” of the generic form given in (3.1). The global iteration matrix of the TGM is then given by

$$ \mathrm{TGM}(V_{n,\mathrm{pre}}^{\nu_{\mathrm{pre}}}, V_{n,\mathrm{post}}^{\nu_{\mathrm{post}}}, p_n^k) = V_{n,\mathrm{post}}^{\nu_{\mathrm{post}}} \left[ I_n - p_n^k \left( (p_n^k)^H A_n p_n^k \right)^{-1} (p_n^k)^H A_n \right] V_{n,\mathrm{pre}}^{\nu_{\mathrm{pre}}}. $$
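The TGM steps 0. to 6. can be sketched in code as follows. This is a hedged illustration, not the authors' implementation: the smoother is assumed to be damped Richardson (that is, $W_n = \omega^{-1} I_n$ in (3.1)), and the test problem (the tridiagonal Toeplitz matrix with stencil $[-1, 2, -1]$ and classical linear interpolation with $g = 2$) is a textbook setting chosen here only to exercise the function.

```python
import numpy as np

def tgm_step(A, b, x, P, omega, nu_pre=1, nu_post=1):
    """One two-grid iteration; smoother: x <- x + omega*(b - A x) (assumed Richardson)."""
    for _ in range(nu_pre):                 # step 0: pre-smoothing
        x = x + omega * (b - A @ x)
    d = A @ x - b                           # step 1: residual, with the sign used in the text
    d_k = P.conj().T @ d                    # step 2: restrict the residual
    A_k = P.conj().T @ A @ P                # step 3: Galerkin coarse matrix
    y = np.linalg.solve(A_k, d_k)           # step 4: exact coarse solve
    x = x - P @ y                           # step 5: coarse grid correction
    for _ in range(nu_post):                # step 6: post-smoothing
        x = x + omega * (b - A @ x)
    return x

# Illustrative setup (assumption, not from the paper): 1D Laplacian, n = 15, k = 7.
n, k = 15, 7
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
P = np.zeros((n, k))
for j in range(k):
    P[2 * j, j], P[2 * j + 1, j], P[2 * j + 2, j] = 0.5, 1.0, 0.5

rng = np.random.default_rng(0)
x_true = rng.standard_normal(n)
b = A @ x_true
x = np.zeros(n)
for _ in range(30):
    x = tgm_step(A, b, x, P, omega=0.25)
rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print(rel_err)
```

Forming $A_k$ inside every call keeps the sketch close to the listing; in practice the coarse matrix would be computed once and reused.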

If $k$ is large, the numerical solution of the linear system at Step 4. could be computationally expensive. In such case a multigrid procedure is adopted. Consider $0 < m < n$, the sequence $0 < n_m < n_{m-1} < \cdots < n_1 < n_0 = n$ and the full-rank matrices $p_{n_{i-1}}^{n_i} \in \mathbb{C}^{n_{i-1} \times n_i}$, for $i = 1, \dots, m$. The multigrid method produces the sequence $\{x_n^{(k)}\}_{k \in \mathbb{N}}$ defined by $x_n^{(j+1)} = \mathrm{MGM}(V_{n,\mathrm{pre}}^{\nu_{\mathrm{pre}}}, V_{n,\mathrm{post}}^{\nu_{\mathrm{post}}}, p_n^{n_1}, A_n, b_n, \theta, 0)(x_n^{(j)})$ with the function MGM defined recursively as follows:

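$x_{n_i}^{(j+1)} = \mathrm{MGM}(V_{n_i,\mathrm{pre}}^{\nu_{\mathrm{pre}}}, V_{n_i,\mathrm{post}}^{\nu_{\mathrm{post}}}, p_{n_i}^{n_{i+1}}, A_{n_i}, b_{n_i}, \theta, i)(x_{n_i}^{(j)})$

If $i = m$ then Solve $A_{n_i} x_{n_i}^{(j+1)} = b_{n_i}$

Else

0. $\tilde{x}_{n_i} = \mathcal{V}_{n_i,\mathrm{pre}}^{\nu_{\mathrm{pre}}}(x_{n_i}^{(j)}, \tilde{b}_{n_i,\mathrm{pre}})$

1. $d_{n_i} = A_{n_i} \tilde{x}_{n_i} - b_{n_i}$

2. $d_{n_{i+1}} = (p_{n_i}^{n_{i+1}})^H d_{n_i}$

3. $A_{n_{i+1}} = (p_{n_i}^{n_{i+1}})^H A_{n_i} p_{n_i}^{n_{i+1}}$

4. $x_{n_{i+1}}^{(j+1)} = 0$

   for $s = 1$ to $\theta$: $x_{n_{i+1}}^{(j+1)} = \mathrm{MGM}(V_{n_{i+1},\mathrm{pre}}^{\nu_{\mathrm{pre}}}, V_{n_{i+1},\mathrm{post}}^{\nu_{\mathrm{post}}}, p_{n_{i+1}}^{n_{i+2}}, A_{n_{i+1}}, d_{n_{i+1}}, \theta, i+1)(x_{n_{i+1}}^{(j+1)})$

5. $\hat{x}_{n_i} = \tilde{x}_{n_i} - p_{n_i}^{n_{i+1}} x_{n_{i+1}}^{(j+1)}$

6. $x_{n_i}^{(j+1)} = \mathcal{V}_{n_i,\mathrm{post}}^{\nu_{\mathrm{post}}}(\hat{x}_{n_i}, \tilde{b}_{n_i,\mathrm{post}})$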

The choices $\theta = 1$ and $\theta = 2$ correspond to the well-known V-cycle and W-cycle, respectively.
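The recursive MGM procedure can be sketched along the same lines. The code below is an illustration under explicit assumptions: damped Richardson smoothing with the hypothetical choice $\omega = 1/\|A_{n_i}\|_\infty$ on every level, Galerkin coarse matrices built in advance, and a textbook test problem (1D Laplacian with linear interpolation, $g = 2$) rather than the projectors studied in this paper.

```python
import numpy as np

def linear_interpolation(n):
    """Classical linear interpolation from (n-1)//2 to n points, n odd (illustrative choice)."""
    k = (n - 1) // 2
    P = np.zeros((n, k))
    for j in range(k):
        P[2 * j, j], P[2 * j + 1, j], P[2 * j + 2, j] = 0.5, 1.0, 0.5
    return P

def mgm(A_levels, P_levels, b, x, i, theta, nu_pre=1, nu_post=1):
    """Recursive multigrid cycle: theta = 1 gives a V-cycle, theta = 2 a W-cycle."""
    A = A_levels[i]
    if i == len(A_levels) - 1:                    # coarsest level (i = m): direct solve
        return np.linalg.solve(A, b)
    omega = 1.0 / np.linalg.norm(A, np.inf)       # assumed damping, safe for SPD matrices
    for _ in range(nu_pre):                       # step 0
        x = x + omega * (b - A @ x)
    d = A @ x - b                                 # step 1
    P = P_levels[i]
    d_c = P.conj().T @ d                          # step 2 (step 3 is precomputed below)
    y = np.zeros_like(d_c)                        # step 4: zero initial guess
    for _ in range(theta):
        y = mgm(A_levels, P_levels, d_c, y, i + 1, theta, nu_pre, nu_post)
    x = x - P @ y                                 # step 5
    for _ in range(nu_post):                      # step 6
        x = x + omega * (b - A @ x)
    return x

# Level hierarchy 31 -> 15 -> 7 -> 3 with Galerkin coarse matrices (step 3).
n = 31
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A_levels, P_levels = [A], []
while A_levels[-1].shape[0] > 3:
    P = linear_interpolation(A_levels[-1].shape[0])
    P_levels.append(P)
    A_levels.append(P.T @ A_levels[-1] @ P)

rng = np.random.default_rng(0)
x_true = rng.standard_normal(n)
b = A @ x_true
errors = {}
for theta in (1, 2):
    x = np.zeros(n)
    for _ in range(40):
        x = mgm(A_levels, P_levels, b, x, 0, theta)
    errors[theta] = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print(errors)
```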

In the present paper, we are interested in proposing this kind of technique in the case where $A_n$ is a Toeplitz matrix. However, for a theoretical analysis, we consider circulant matrices according to the local Fourier analysis for classical multigrid methods (see [9]). Even if we treat in detail the circulant case, in the spirit of the paper [2], the same ideas can be plainly translated to other matrix algebras associated with (fast) trigonometric transforms. First we recall some convergence results from the theory of the algebraic multigrid method given in [22].

By $\|\cdot\|_2$ we denote the Euclidean norm on $\mathbb{C}^n$ and the associated induced matrix norm over $\mathbb{C}^{n \times n}$. If $X$ is positive definite, $\|\cdot\|_X = \|X^{1/2} \cdot\|_2$ denotes the Euclidean norm weighted by $X$ on $\mathbb{C}^n$ and the associated induced matrix norm. Finally, if $X$ and $Y$ are Hermitian matrices, then the notation $X \le Y$ means that $Y - X$ is nonnegative definite. In the following we use some functional norms: more precisely the usual $L^\infty$ norm $\|\cdot\|_\infty$ defined as $\|f\|_\infty = \sup_{x \in Q} |f(x)|$, and the weighted $L^1$ norm $\|\cdot\|_1$ defined as $\|f\|_1 = \frac{1}{2\pi} \int_Q |f(x)|\, dx$ (according to the Haar measure).

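Theorem 3.1 ([22]) Let $A_n$ be a positive definite matrix of size $n$ and let $V_n$ be defined as in the TGM algorithm. Suppose that there exists $\alpha_{\mathrm{post}} > 0$ independent of $n$ such that

$$ \|V_{n,\mathrm{post}}\, x_n\|_{A_n}^2 \le \|x_n\|_{A_n}^2 - \alpha_{\mathrm{post}} \|x_n\|_{A_n D_n^{-1} A_n}^2, \qquad \forall x_n \in \mathbb{C}^n, \tag{3.2} $$

where $D_n$ is the main diagonal of $A_n$. Assume that there exists $\gamma > 0$ independent of $n$ such that

$$ \min_{y \in \mathbb{C}^k} \|x_n - p_n^k y\|_{D_n}^2 \le \gamma \|x_n\|_{A_n}^2, \qquad \forall x_n \in \mathbb{C}^n. \tag{3.3} $$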

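Then $\gamma \ge \alpha_{\mathrm{post}}$ and

$$ \|\mathrm{TGM}(I, V_{n,\mathrm{post}}^{\nu_{\mathrm{post}}}, p_n^k)\|_{A_n} \le \sqrt{1 - \alpha_{\mathrm{post}}/\gamma}. $$

Conditions (3.2) and (3.3) are usually called “smoothing property” and “approximation property”, respectively.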

We note that $\alpha_{\mathrm{post}}$ and $\gamma$ are independent of $n$ and hence, if the assumptions of Theorem 3.1 are satisfied, then the resulting TGM is not only convergent but also optimal. In other words, the number of iterations needed to reach a given accuracy $\varepsilon$ can be bounded from above by a constant independent of $n$ (possibly depending on the parameter $\varepsilon$).
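To make the last statement concrete, here is a short worked bound. It is only a sketch: the symbols $e_n^{(j)} = x_n^{(j)} - A_n^{-1} b_n$ (the error at iteration $j$) and $\rho = \sqrt{1 - \alpha_{\mathrm{post}}/\gamma}$ are introduced here for illustration, and we assume $0 < \rho < 1$ and $0 < \varepsilon < 1$. Applying the bound of Theorem 3.1 at every iteration gives

$$ \|e_n^{(j)}\|_{A_n} \le \rho^{\,j}\, \|e_n^{(0)}\|_{A_n}, \qquad \text{so that} \qquad \frac{\|e_n^{(j)}\|_{A_n}}{\|e_n^{(0)}\|_{A_n}} \le \varepsilon \quad \text{whenever} \quad j \ge \left\lceil \frac{\log \varepsilon}{\log \rho} \right\rceil. $$

Since $\rho$ depends only on $\alpha_{\mathrm{post}}$ and $\gamma$, this iteration count depends on $\varepsilon$ but not on $n$.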

Of course, if the given method is complemented with a convergent pre-smoother, then by the same theorem we get a faster convergence. In fact, it is known that for square matrices $A$ and $B$ the spectra of $AB$ and $BA$ coincide. Therefore $\mathrm{TGM}(V_{n,\mathrm{pre}}^{\nu_{\mathrm{pre}}}, V_{n,\mathrm{post}}^{\nu_{\mathrm{post}}}, p_n^k)$ and $\mathrm{TGM}(I, V_{n,\mathrm{pre}}^{\nu_{\mathrm{pre}}} V_{n,\mathrm{post}}^{\nu_{\mathrm{post}}}, p_n^k)$ have the same eigenvalues so that

$$ \begin{aligned} \|\mathrm{TGM}(V_{n,\mathrm{pre}}^{\nu_{\mathrm{pre}}}, V_{n,\mathrm{post}}^{\nu_{\mathrm{post}}}, p_n^k)\|_{A_n} &= \|\mathrm{TGM}(I, V_{n,\mathrm{pre}}^{\nu_{\mathrm{pre}}} V_{n,\mathrm{post}}^{\nu_{\mathrm{post}}}, p_n^k)\|_{A_n} \\ &\le \sqrt{1 - \alpha_{\mathrm{post}}^{\mathrm{new}}/\gamma} \le \sqrt{1 - \alpha_{\mathrm{post}}/\gamma}, \end{aligned} $$

and hence the presence of a pre-smoother can only improve the convergence.

Concerning multigrid methods, in [22] the V-cycle convergence is considered with a result which could be seen as the analog of Theorem 3.1. For other bounds concerning the convergence rate of the V-cycle see [19] and references therein. Regarding the convergence of the W-cycle, we point out that a rigorous TGM analysis is sufficient for determining the optimality of the W-cycle (see [31]).

4 Projecting operators for circulant matrices

Let $A_n := C_n(f)$ be a circulant matrix generated by a trigonometric polynomial $f$. In order to provide a general method for obtaining a projecting operator (also called projector) from an arbitrary banded circulant matrix $P_n$, for some bandwidth independent of $n$, we introduce the operator $Z_{n,g}^k \in \mathbb{R}^{n \times k}$, $k \le n$, where

$$ Z_{n,g}^k = [\delta_{i-gj}]_{i,j}, \qquad \delta_r = \begin{cases} 1 & \text{if } r \equiv 0 \pmod{n}, \\ 0 & \text{otherwise}, \end{cases} \qquad i = 0, \dots, n-1, \quad j = 0, \dots, k-1. \tag{4.1} $$

The operator $Z_{n,g}^k$ represents a special link between the space of the frequencies of size $n$ and the corresponding space of frequencies of size $k$ (see also Section 6.1 and [10], where operators of this kind were called cutting matrices or cutting operators).

Lemma 4.1 Let $F_n$ be the Fourier matrix of size $n$ defined in (2.2) and let $Z_{n,g}^k \in \mathbb{R}^{n \times k}$ be the matrix defined in (4.1). If $k = \frac{n}{g} \in \mathbb{N}$ then

$$ F_n^H Z_{n,g}^k = \frac{1}{\sqrt{g}}\, I_{n,g} F_k^H, \tag{4.2} $$

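where $I_{n,g} \in \mathbb{R}^{n \times k}$ and

$$ I_{n,g} = \left. \begin{bmatrix} I_k \\ I_k \\ \vdots \\ I_k \end{bmatrix} \right\} g \text{ times}, $$

with $I_k$ being the identity matrix of size $k$.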
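Relation (4.2) can be verified numerically for small sizes. The following sketch assumes the illustrative values $n = 12$ and $g = 3$; the helper names `fourier_matrix` and `cutting_matrix` are introduced here and are not taken from the paper.

```python
import numpy as np

def fourier_matrix(n):
    """F_n = (1/sqrt(n)) [exp(-2*pi*i*r*s/n)], r,s = 0..n-1, as in (2.2)."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def cutting_matrix(n, g):
    """Z^k_{n,g} from (4.1) with k = n/g: column j has a single 1 in row g*j."""
    k = n // g
    Z = np.zeros((n, k))
    Z[g * np.arange(k), np.arange(k)] = 1.0
    return Z

n, g = 12, 3
k = n // g
Z = cutting_matrix(n, g)

lhs = fourier_matrix(n).conj().T @ Z              # F_n^H Z^k_{n,g}
I_ng = np.vstack([np.eye(k)] * g)                 # g stacked copies of I_k
rhs = I_ng @ fourier_matrix(k).conj().T / np.sqrt(g)
print(np.allclose(lhs, rhs))
```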

This simple relation (see [20, Lemma 3.3 and Remark 3.5] for the details of the proof and [30] for recent findings on these structures) is the key step in defining an algebraic multigrid method, since it allows us to obtain again a circulant matrix at the lower level. Indeed, denoting by $\Delta_n$ the diagonal matrix obtained from the eigenvalues of $A_n$ (see (2.4)), we infer that $\Delta_k := \frac{1}{g} I_{n,g}^T \Delta_n I_{n,g}$ is again a diagonal matrix. Therefore

$$ \begin{aligned} (Z_{n,g}^k)^H A_n Z_{n,g}^k &= (Z_{n,g}^k)^H F_n \Delta_n F_n^H Z_{n,g}^k \\ &= \frac{1}{g} F_k I_{n,g}^T \Delta_n I_{n,g} F_k^H \\ &= F_k \Delta_k F_k^H \\ &= A_k, \end{aligned} $$

where $A_k$ is a new circulant matrix. Consequently, starting from the matrix $Z_{n,g}^k$, it is possible to define a generic projector

$$ p_{n,g}^k = P_n Z_{n,g}^k, \tag{4.3} $$

where $P_n$ is a circulant matrix. Indeed $P_n^H A_n P_n$ is a circulant matrix and then $A_k = (p_{n,g}^k)^H A_n p_{n,g}^k$ is a circulant matrix of size $k$. We note that, since $k = \frac{n}{g} \in \mathbb{N}$, $n$ must be a multiple of $g$. We are left to determine the conditions to be satisfied by $P_n = C_n(p)$ (or better by its generating function $p$), in order to get a projector which is effective in terms of convergence.
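The claim that $(Z_{n,g}^k)^H A_n Z_{n,g}^k$ is again circulant, with eigenvalues $\Delta_k = \frac{1}{g} I_{n,g}^T \Delta_n I_{n,g}$, can be illustrated as follows. The symbol $f(x) = 6 - 4\cos x - 2\cos 3x$ and the sizes $n = 12$, $g = 3$ are assumptions made for this example only.

```python
import numpy as np

def fourier_matrix(n):
    """F_n = (1/sqrt(n)) [exp(-2*pi*i*r*s/n)], r,s = 0..n-1, as in (2.2)."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

n, g = 12, 3
k = n // g
f = lambda x: 6 - 4 * np.cos(x) - 2 * np.cos(3 * x)   # example symbol (assumption)

x_n = 2 * np.pi * np.arange(n) / n
F_n, F_k = fourier_matrix(n), fourier_matrix(k)
A_n = F_n @ np.diag(f(x_n)) @ F_n.conj().T             # A_n = C_n(f) via (2.3)

Z = np.zeros((n, k))
Z[g * np.arange(k), np.arange(k)] = 1.0                # cutting matrix (4.1)
A_k = Z.T @ A_n @ Z

# Delta_k = (1/g) I_{n,g}^T Delta_n I_{n,g}: the j-th entry averages the eigenvalues
# of A_n with indices j, k + j, ..., (g-1)k + j.
delta_k = f(x_n).reshape(g, k).sum(axis=0) / g
A_k_from_delta = F_k @ np.diag(delta_k) @ F_k.conj().T

is_circulant = all(np.isclose(A_k[r, s], A_k[(r - s) % k, 0])
                   for r in range(k) for s in range(k))
print(np.allclose(A_k, A_k_from_delta), is_circulant)
```

For this symbol the coarse matrix keeps only the coefficients $a_0 = 6$ and $a_{\pm 3} = -1$, which become the diagonal and the first off-diagonals of $A_k$.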

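Definition 4.1 Given $x \in [0,2\pi)$, $g \in \mathbb{N}$, $g \ge 2$, the set of $g$-corners of $x$ is $\Omega_g(x) = \left\{ y = x + \frac{2\pi j}{g} \pmod{2\pi} \;\middle|\; j = 0, \dots, g-1 \right\}$ and the set of $g$-mirror points is $M_g(x) = \Omega_g(x) \setminus \{x\}$.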

TGM conditions Let $A_n := C_n(f)$ with $f$ nonnegative trigonometric polynomial and let $p_{n,g}^k = C_n(p) Z_{n,g}^k$ with $p$ trigonometric polynomial. Assume that $f(x^0) = 0$ for $x^0 \in [0,2\pi)$; choose $p$ such that the following relations

$$ \lim_{x \to x^0} \frac{p^2(y)}{f(x)} < \infty, \qquad \forall y \in M_g(x), \tag{4.4} $$

$$ \sum_{y \in \Omega_g(x)} p^2(y) > 0, \qquad \forall x \in [0,2\pi), \tag{4.5} $$

are fulfilled.
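A numerical look at conditions (4.4) and (4.5) may help. The sketch below rests on several assumptions made only for illustration: $g = 3$, the symbol $f(x) = 2 - 2\cos x$ with its zero at $x^0 = 0$, and a candidate $p$ built to vanish at the two mirror points of $x^0$. The helper names `g_corners` and `mirror_points` are hypothetical. Sampling on a grid is of course not a proof of either condition.

```python
import numpy as np

g, x0 = 3, 0.0

def g_corners(x, g):
    """Omega_g(x): the points x + 2*pi*j/g (mod 2*pi), j = 0..g-1; x may be an array."""
    x = np.asarray(x, dtype=float)
    return np.mod(x[..., None] + 2 * np.pi * np.arange(g) / g, 2 * np.pi)

def mirror_points(x, g):
    """M_g(x): the g-corners of x without x itself."""
    return g_corners(x, g)[..., 1:]

f = lambda x: 2 - 2 * np.cos(x)

def p(x):
    """Candidate projector symbol: one factor 2 - 2cos(x - xh) per mirror point xh of x0."""
    x = np.asarray(x, dtype=float)
    out = np.ones_like(x)
    for xh in mirror_points(x0, g):
        out = out * (2 - 2 * np.cos(x - xh))
    return out

# (4.5): the sum of p^2 over the g-corners stays positive on a sampling grid.
xs = np.linspace(0, 2 * np.pi, 2000, endpoint=False)
corner_sums = (p(g_corners(xs, g)) ** 2).sum(axis=-1)
print(corner_sums.min() > 0)

# (4.4): p^2(y)/f(x) stays bounded (here it even tends to zero) as x -> x0,
# for y ranging over the mirror points of x.
eps = 10.0 ** -np.arange(1, 5)
ratios = p(mirror_points(x0 + eps, g)) ** 2 / f(x0 + eps)[:, None]
print(ratios.max(axis=1))
```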

If $f$ has a unique zero $x^0 \in [0,2\pi)$, then we set $P_n = C_n(p)$ where $p$ is a trigonometric polynomial defined as

$$ p(x) = \prod_{\hat{x} \in M_g(x^0)} \left(2 - 2\cos(x - \hat{x})\right)^{\lceil \beta/2 \rceil} \sim \prod_{\hat{x} \in M_g(x^0)} |x - \hat{x}|^{2\lceil \beta/2 \rceil}, \tag{4.6} $$

for $x \in [0,2\pi)$, with

$$ \beta \ge \beta_{\min} = \min\left\{ i \;\middle|\; \lim_{x \to x^0} \frac{|x - x^0|^{2i}}{f(x)} < +\infty \right\}, $$

thus conditions (4.4) and (4.5) are satisfied.

Before proving (in Section 5.1) that conditions (4.4) and (4.5) are sufficient to assure the TGM optimality, we consider a crucial result both from a theoretical and a practical point of view.

Proposition 4.1 Let $f$ be a nonnegative function, $k = \frac{n}{g} \in \mathbb{N}$, $p_{n,g}^k = C_n(p) Z_{n,g}^k \in \mathbb{C}^{n \times k}$, with $p$ trigonometric polynomial satisfying condition (4.4) for any zero of $f$ and globally the condition (4.5). Then the matrix $(p_{n,g}^k)^H C_n(f)\, p_{n,g}^k \in \mathbb{C}^{k \times k}$ coincides with $C_k(\hat{f})$ where $\hat{f}$ is nonnegative and

$$ \hat{f}(x) = \frac{1}{g} \sum_{y \in \Omega_g\left(\frac{x}{g}\right)} f(y)\, |p|^2(y), \tag{4.7} $$

for $x \in [0,2\pi)$, i.e., the projected matrix is obtained picking every $g$-th entry out of the symbol $f|p|^2$. In particular

1. if $f$ is a polynomial then $\hat{f}$ is a polynomial with a fixed degree $\left\lfloor \frac{q}{g} \right\rfloor$, where $q$ is the degree of $f|p|^2$;
2. if $x^0$ is a zero of $f$ then $\hat{f}$ has a corresponding zero $y^0$ where $y^0 = g x^0 \pmod{2\pi}$;
3. the order of the zero $y^0$ of $\hat{f}$ is exactly the same as the one of the zero $x^0$ of $f$, so that at the lower level the new projector can be easily defined in the same way.

Proof From (2.3) and (2.4), we find

$$ \begin{aligned} (p_{n,g}^k)^H C_n(f)\, p_{n,g}^k &= (Z_{n,g}^k)^H (C_n(p))^H C_n(f)\, C_n(p)\, Z_{n,g}^k \\ &= (Z_{n,g}^k)^H C_n(f|p|^2)\, Z_{n,g}^k. \end{aligned} $$

Thus the generating function of the circulant matrix $(C_n(p))^H C_n(f)\, C_n(p)$ is $f|p|^2$. Denoting by $a_j$ the $j$-th Fourier coefficient of $f|p|^2$, we have

$$ C_n(f|p|^2) = \left[a_{(r-s) \bmod n} + a_{(r-s) \bmod n - n}\right]_{r,s=0}^{n-1}, $$

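and hence, by (4.1), the entries of the matrix $(Z_{n,g}^k)^H C_n(f|p|^2)\, Z_{n,g}^k$ are given by

$$ \begin{aligned} \left[C_n(f|p|^2)\, Z_{n,g}^k\right]_{r,s} &= \sum_{\ell=0}^{n-1} [C_n(f|p|^2)]_{r,\ell}\, [Z_{n,g}^k]_{\ell,s} \\ &= \sum_{\ell=0}^{n-1} \left(a_{(r-\ell) \bmod n} + a_{(r-\ell) \bmod n - n}\right) \delta_{\ell - gs} \\ &\overset{(a)}{=} a_{(r-gs) \bmod n} + a_{(r-gs) \bmod n - n}, \qquad r = 0, \dots, n-1, \quad s = 0, \dots, k-1, \end{aligned} $$

$$ \begin{aligned} \left[(Z_{n,g}^k)^H C_n(f|p|^2)\, Z_{n,g}^k\right]_{r,s} &= \sum_{\ell=0}^{n-1} [(Z_{n,g}^k)^H]_{r,\ell}\, [C_n(f|p|^2)\, Z_{n,g}^k]_{\ell,s} \\ &= \sum_{\ell=0}^{n-1} \delta_{\ell - gr} \left(a_{(\ell-gs) \bmod n} + a_{(\ell-gs) \bmod n - n}\right) \\ &\overset{(b)}{=} a_{(gr-gs) \bmod n} + a_{(gr-gs) \bmod n - n}, \qquad r, s = 0, \dots, k-1, \end{aligned} $$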

where (a) is true because there exists a unique $\ell \in \{0, 1, \dots, n-1\}$ such that $\ell - gs \equiv 0 \pmod{n}$, that is, $\ell \equiv gs \pmod{n}$ and, since $0 \le gs \le n-1$, we obtain $\ell = gs$. The same argument is applicable for showing the validity of (b). Now if we denote by $b_j$ the Fourier coefficients of $\hat{f}$, it only remains to show that $[C_k(\hat{f})]_{r,c} = b_{(r-c) \bmod k} + b_{(r-c) \bmod k - k} = a_{(gr-gc) \bmod n} + a_{(gr-gc) \bmod n - n}$, $r, c = 0, \dots, k-1$. Since $f|p|^2$ is a polynomial, we can always write

$$ f|p|^2(x) = \sum_{\ell=-\infty}^{\infty} a_\ell e^{i\ell x}, \qquad \hat{f}(x) = \sum_{\ell=-\infty}^{\infty} b_\ell e^{i\ell x}. \tag{4.8} $$

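From (2.1), (4.7) and (4.8), we have

$$ \begin{aligned} b_{r-c} &= \frac{1}{2\pi} \int_0^{2\pi} \frac{1}{g} \sum_{j=0}^{g-1} \sum_{\ell=-\infty}^{+\infty} a_\ell\, e^{i\ell \left(\frac{x + 2\pi j}{g}\right)} e^{-i(r-c)x}\, dx \\ &= \frac{1}{2\pi g} \int_0^{2\pi} \sum_{\ell=-\infty}^{+\infty} a_\ell \left( \sum_{j=0}^{g-1} e^{\frac{i 2\pi \ell j}{g}} \right) e^{\frac{i\ell x}{g}} e^{-i(r-c)x}\, dx. \end{aligned} $$

Taking into account

$$ \frac{1}{g} \sum_{j=0}^{g-1} e^{\frac{i 2\pi \ell j}{g}} = \begin{cases} 1 & \text{if } \ell = gt, \\ 0 & \text{otherwise}, \end{cases} \qquad \text{and} \qquad \frac{1}{2\pi} \int_0^{2\pi} e^{i\ell x}\, dx = \begin{cases} 1 & \text{if } \ell = 0, \\ 0 & \text{otherwise}, \end{cases} $$

we obtain

$$ \begin{aligned} b_{r-c} &= \frac{1}{2\pi g} \int_0^{2\pi} \sum_{t=-\infty}^{+\infty} a_{gt}\, g\, e^{\frac{i g t x}{g}} e^{-i(r-c)x}\, dx \\ &= \sum_{t=-\infty}^{+\infty} a_{gt}\, \frac{1}{2\pi} \int_0^{2\pi} e^{ix(t-(r-c))}\, dx \\ &= a_{g(r-c)}. \end{aligned} \tag{4.9} $$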

So, from (4.9), since $gk = n$ and $g((r-c) \bmod k) = (gr - gc) \bmod n$, we have $b_{(r-c) \bmod k} = a_{(gr-gc) \bmod n}$ and, similarly, $b_{(r-c) \bmod k - k} = a_{(gr-gc) \bmod n - n}$.

From the expression of $\hat{f}$, since $f(x^0) = 0$ we deduce $p(y) = 0$ $\forall y \in M_g(x^0)$, whose validity is necessary in order to satisfy relationships (4.4). Thus $y^0 = g x^0 \pmod{2\pi}$ is a zero of $\hat{f}$ (i.e. item 2. is proved).

Moreover, by (4.5), we deduce that $p^2(x^0) > 0$ since $p^2(y) = 0$, $\forall y \in M_g(x^0)$, and the order of the zero $y^0$ of $f\!\left(\frac{x}{g}\right) |p|^2\!\left(\frac{x}{g}\right)$ is the same as the order of $f(x)$ at $x^0$. Furthermore, by (4.4) we see that $|p|^2\!\left(\frac{x + 2\pi k}{g}\right)$ has at $y^0$ a zero of order at least equal to the one of $f(x)$ at $x^0$, for any $k = 1, \dots, g-1$. Since all the contributions in $\hat{f}$ are nonnegative, the thesis in item 3. is proved.

Finally we have to prove item 1. Since $b_j$ are the Fourier coefficients of $\hat{f}$ and $a_j$ are the Fourier coefficients of the polynomial $f|p|^2$ (see (4.8)), from (4.9) we deduce that

$$ \hat{f}(x) = \sum_j b_j e^{ijx} = \sum_j a_{gj} e^{ijx}. $$

Hence, if the polynomial $f|p|^2$ has degree $q$, then $\hat{f}$ has degree at most $\left\lfloor \frac{q}{g} \right\rfloor$.
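Formula (4.7) and item 2. of Proposition 4.1 can be checked on a small example. The following sketch assumes $g = 3$, $n = 27$, the symbol $f(x) = 2 - 2\cos x$ (zero at $x^0 = 0$) and the projector symbol $p(x) = (2 - 2\cos(x - 2\pi/3))(2 - 2\cos(x - 4\pi/3))$, which is (4.6) with exponent 1; these choices and the helper names are made here for illustration only.

```python
import numpy as np

def fourier_matrix(n):
    """F_n = (1/sqrt(n)) [exp(-2*pi*i*r*s/n)], r,s = 0..n-1, as in (2.2)."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def circulant(symbol, n):
    """C_n(symbol) assembled through the diagonalization (2.3)-(2.4)."""
    F = fourier_matrix(n)
    x = 2 * np.pi * np.arange(n) / n
    return F @ np.diag(symbol(x)) @ F.conj().T

g = 3
f = lambda x: 2 - 2 * np.cos(x)
p = lambda x: (2 - 2 * np.cos(x - 2 * np.pi / 3)) * (2 - 2 * np.cos(x - 4 * np.pi / 3))

def f_hat(x):
    """Coarse symbol (4.7): average of f*|p|^2 over the g-corners of x/g."""
    y = (np.asarray(x, dtype=float)[..., None] + 2 * np.pi * np.arange(g)) / g
    return (f(y) * p(y) ** 2).sum(axis=-1) / g

n = 27
k = n // g
Z = np.zeros((n, k))
Z[g * np.arange(k), np.arange(k)] = 1.0           # cutting matrix (4.1)
P = circulant(p, n) @ Z                           # projector (4.3)

A_k = P.conj().T @ circulant(f, n) @ P            # Galerkin coarse matrix
C_k_hat = circulant(f_hat, k)                     # C_k(f_hat) predicted by Proposition 4.1
print(np.allclose(A_k, C_k_hat))
print(abs(f_hat(0.0)))                            # zero of f_hat at y0 = g*x0 (mod 2*pi) = 0
```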

5 Proof of convergence

Using the results in Section 4, we prove the optimality of the TGM and of the W-cycle (for the W-cycle the condition $g > 2$ is required).

5.1 TGM convergence

The smoothing property for $g = 2$ was proved in [27] and the proof does not change for $g > 2$.

Lemma 5.1 ([27]) Let $A_n := C_n(f)$ with $f$ being a nonnegative trigonometric polynomial (not identically zero) and let $V_n := I_n - \omega A_n$, $0 < \omega < 2/\|f\|_\infty$. If we choose $\alpha_{\mathrm{post}}$ so that $\alpha_{\mathrm{post}} \le a_0\, \omega (2 - \omega \|f\|_\infty)$, then relation (3.2) holds true.

If in the previous Lemma we choose $\omega = \|f\|_\infty^{-1}$, then $\alpha_{\mathrm{post}} \le \|f\|_1 / \|f\|_\infty$ and the best value of $\alpha_{\mathrm{post}}$ is $\alpha_{\mathrm{post,best}} = \|f\|_1 / \|f\|_\infty$. Moreover, the result of Lemma 5.1 can be easily generalized when considering both pre-smoothing and post-smoothing as in [1].
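The bound on $\alpha_{\mathrm{post}}$ can be traced eigenvalue by eigenvalue. The following is a sketch of the reasoning rather than the proof given in [27]; it assumes $n > c$, so that the main diagonal of $A_n$ is $D_n = a_0 I_n$, and it writes $\lambda_j = f(x_j^{(n)})$ for the eigenvalues of $A_n$. Since $A_n$, $V_n$ and $D_n$ are all diagonalized by $F_n$, relation (3.2) holds if and only if, for every $j$,

$$ \lambda_j (1 - \omega \lambda_j)^2 \le \lambda_j - \frac{\alpha_{\mathrm{post}}}{a_0} \lambda_j^2. $$

For $\lambda_j = 0$ this is trivial; for $\lambda_j > 0$, dividing by $\lambda_j$ and expanding the square gives

$$ 1 - 2\omega\lambda_j + \omega^2\lambda_j^2 \le 1 - \frac{\alpha_{\mathrm{post}}}{a_0}\lambda_j \quad \Longleftrightarrow \quad \alpha_{\mathrm{post}} \le a_0\, \omega\, (2 - \omega \lambda_j). $$

Because $0 \le \lambda_j \le \|f\|_\infty$, the right-hand side is smallest when $\lambda_j$ is replaced by $\|f\|_\infty$, which yields the condition of Lemma 5.1. With $\omega = \|f\|_\infty^{-1}$ and $a_0 = \|f\|_1$ (valid since $f \ge 0$) the bound becomes

$$ \alpha_{\mathrm{post}} \le \frac{a_0}{\|f\|_\infty}\,(2 - 1) = \frac{\|f\|_1}{\|f\|_\infty}. $$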

The following result shows that TGM conditions (4.4) and (4.5) are sufficient in order to satisfy the approximation property.

Theorem 5.1 Let $A_n := C_n(f)$, with $f$ being a nonnegative trigonometric polynomial (not identically zero), and let $p_{n,g}^k = C_n(p) Z_{n,g}^k$ be the projecting operator, with $Z_{n,g}^k$ defined in (4.1) and with $p$ trigonometric polynomial, satisfying condition (4.4), for any zero of $f$, and satisfying globally condition (4.5). Then, there exists a positive value $\gamma$ independent of $n$ such that inequality (3.3) is satisfied.

Proof The proof is similar to that of Lemma 8.2 in [25], but we report it here for completeness. First, we recall that the main diagonal of $A_n$ is given by $D_n = a_0 I_n$ with $a_0 = (2\pi)^{-1} \int_Q f = \|f\|_1 > 0$, so that $\|\cdot\|_{D_n}^2 = a_0 \|\cdot\|_2^2$.

In order to prove that there exists $\gamma > 0$ independent of $n$ such that for any $x_n \in \mathbb{C}^n$

$$ \min_{y \in \mathbb{C}^k} \|x_n - p_{n,g}^k y\|_{D_n}^2 = a_0 \min_{y \in \mathbb{C}^k} \|x_n - p_{n,g}^k y\|_2^2 \le \gamma \|x_n\|_{A_n}^2, $$

we choose a special instance of $y$ in such a way that the previous inequality is reduced to a matrix inequality in the sense of the partial ordering of the real space of the Hermitian matrices. For any $x_n \in \mathbb{C}^n$, let $y \equiv y(x_n) \in \mathbb{C}^k$ be defined as

$$ y = \left[(p_{n,g}^k)^H p_{n,g}^k\right]^{-1} (p_{n,g}^k)^H x_n. $$

Therefore, (3.3) is implied by

$$ \|x_n - p_{n,g}^k y\|_2^2 \le (\gamma/a_0) \|x_n\|_{A_n}^2, \qquad \forall x_n \in \mathbb{C}^n, $$

where the latter is equivalent to the matrix inequality

$$ W_n(p)^H W_n(p) \le (\gamma/a_0)\, C_n(f), \tag{5.1} $$

with $W_n(p) = I - p_{n,g}^k \left[(p_{n,g}^k)^H p_{n,g}^k\right]^{-1} (p_{n,g}^k)^H$. Since, by construction, $W_n(p)$ is a Hermitian (orthogonal) projector, it holds that $W_n(p)^H W_n(p) = W_n^2(p) = W_n(p)$. As a consequence, inequality (5.1) can be rewritten as

$$ W_n(p) \le (\gamma/a_0)\, C_n(f). \tag{5.2} $$

When $k = \frac{n}{g} \in \mathbb{N}$, following the decomposition in (4.2), $p_{n,g}^k = C_n(p) Z_{n,g}^k$ can be expressed according to

$$ (p_{n,g}^k)^H = \frac{1}{\sqrt{g}}\, F_k \left[ \Delta_p^{(0)} \,\middle|\, \Delta_p^{(1)} \,\middle|\, \cdots \,\middle|\, \Delta_p^{(g-1)} \right] F_n^H, $$

where

$$ \Delta_p^{(r)} = \operatorname*{diag}_{j=0,\dots,k-1} \left( p\!\left(x_{rk+j}^{(n)}\right) \right), \qquad r = 0, \dots, g-1, $$

with $x_j^{(n)} = 2\pi j / n$.

Let $p[\mu] \in \mathbb{C}^g$ be the vector whose entries are given by the evaluations of $p$ over the points of $\Omega_g(x_\mu^{(n)})$, for $\mu = 0, \dots, k-1$. There exists a suitable permutation by rows and columns of $F_n^H W_n(p) F_n$, such that we can obtain a $g \times g$ block diagonal matrix whose $\mu$th diagonal block is given by $I_g - p[\mu](p[\mu])^T / \|p[\mu]\|_2^2$. Therefore, using the same notation for $f[\mu]$ and denoting by $\operatorname{diag}(f[\mu])$ the diagonal matrix having the vector $f[\mu]$ on the main diagonal, the condition (5.2) is equivalent to

$$ I_g - \frac{p[\mu](p[\mu])^T}{\|p[\mu]\|_2^2} \le (\gamma/a_0) \operatorname{diag}(f[\mu]), \tag{5.3} $$

for $\mu = 0, \dots, k-1$. By the Sylvester inertia law [12], the relation (5.3) is satisfied if every entry of

$$ \operatorname{diag}(f[\mu])^{-1/2} \left( I_g - \frac{p[\mu](p[\mu])^T}{\|p[\mu]\|_2^2} \right) \operatorname{diag}(f[\mu])^{-1/2} $$

is bounded in modulus by a constant, which follows from the TGM conditions (4.4) and (4.5).

Furthermore, if we put

$$ z = \max_{y \in \Omega_g(x)} \left\| \frac{p^2(y)}{f(x)} \right\|_\infty, \qquad h = \left\| \frac{1}{\sum_{y \in \Omega_g(x)} p^2(y)} \right\|_\infty, $$

then condition (3.3) is satisfied by choosing a value of $\gamma$ such that $\gamma \ge g(g-1)\, a_0\, h\, z$.

Combining Lemma 5.1 and Theorem 5.1 with Theorem 3.1, it follows that the TGM convergence speed does not depend on the size of the linear system.
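The size independence can be observed on a small experiment. The sketch below is an illustration under stated assumptions, not one of the experiments of Section 6: it takes $g = 3$, the symbol $f(x) = 2 + 2\cos x$ with zero at $x^0 = \pi$ (so that $C_n(f)$ is positive definite for odd $n$), the projector symbol (4.6) with exponent 1 built on the mirror points $\pi/3$ and $5\pi/3$, no pre-smoothing, and one post-smoothing step with $\omega = 1/\|f\|_\infty = 1/4$. It prints the spectral radius of the TGM iteration matrix for three sizes.

```python
import numpy as np

def fourier_matrix(n):
    """F_n = (1/sqrt(n)) [exp(-2*pi*i*r*s/n)], r,s = 0..n-1, as in (2.2)."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def circulant(symbol, n):
    """C_n(symbol) assembled through the diagonalization (2.3)-(2.4)."""
    F = fourier_matrix(n)
    x = 2 * np.pi * np.arange(n) / n
    return F @ np.diag(symbol(x)) @ F.conj().T

f = lambda x: 2 + 2 * np.cos(x)                    # zero of order 2 at x0 = pi
p = lambda x: (2 - 2 * np.cos(x - np.pi / 3)) * (2 - 2 * np.cos(x - 5 * np.pi / 3))

def tgm_spectral_radius(n, g=3, omega=0.25):
    """Spectral radius of V_post [I - P (P^H A P)^{-1} P^H A] for A = C_n(f)."""
    k = n // g
    A = circulant(f, n)
    Z = np.zeros((n, k))
    Z[g * np.arange(k), np.arange(k)] = 1.0        # cutting matrix (4.1)
    P = circulant(p, n) @ Z                        # projector (4.3)
    A_k = P.conj().T @ A @ P
    cgc = np.eye(n) - P @ np.linalg.solve(A_k, P.conj().T @ A)
    V_post = np.eye(n) - omega * A                 # Richardson smoother of Lemma 5.1
    return max(abs(np.linalg.eigvals(V_post @ cgc)))

rhos = {n: tgm_spectral_radius(n) for n in (27, 81, 243)}
for n, rho in rhos.items():
    print(n, rho)
```

Only odd multiples of 3 are used, so that neither the zero of $f$ nor the zeros of $p$ fall on the grid $x_j^{(n)}$ and all matrices involved are invertible.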

5.2 Multigrid convergence

The optimal TGM convergence rate proved in Theorem 5.1 can be extended to a generic recursion level of the multigrid procedure, obtaining the so-called “level independency” property. The key tools are Proposition 4.1 and an explicit choice of the projector: for instance, a good choice is obtained by considering the symbol $p$ reported in (4.6). Indeed, the “level independency” was already proved in the literature for $g = 2$ (see [5,4,2]) and the proof can be extended to $g > 2$, as in Theorem 5.1.

The “level independency” implies that the W-cycle has a constant convergence rate independent of the problem size [31]. However, the fact that the convergence speed does not depend on the size of the linear system does not imply the optimality of the method, because the computational work at each iteration is not taken into account.

To estimate the computational work at each iteration of a multigrid method, we have to consider the size of the coarse problem and the number $\theta$ of recursive calls. In our case the size of the problem at the level $i$ is $n_i = n_{i-1}/g$, with $n_0 = n$. According to the analysis in [31], we assume that the multigrid components (smoothing, projection, ...) require a number of arithmetic operations which is bounded by $c\,n_i$, with $c$ constant independent of $n_i$. From equation (2.4.14) in [31], the total computational work $C_m(n)$ of one complete multigrid cycle is

$$C_m(n) \doteq \begin{cases} \dfrac{g}{g-\theta}\, c\,n & \text{for } \theta < g, \\[2mm] O(n \log n) & \text{for } \theta = g, \end{cases} \qquad (5.4)$$

where the symbol $\doteq$ means equality up to lower order terms. It follows that for $g = 2$ the W-cycle cannot be optimal even in the presence of “level independency”, because


each multigrid iteration requires a computational cost of $O(n \log n)$ while the matrix-vector product requires $O(n)$ arithmetic operations. On the other hand, for $g = 3$ the W-cycle shows a computational cost growing as $C_m(n) \doteq 3cn$ and hence it is optimal if the “level independency” is satisfied. More generally, the proposed multigrid will be optimal for a number $\theta \in \mathbb{N}$ of recursive calls such that $1 < \theta < g$.
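The estimate (5.4) can be recovered with a short geometric series argument. The sketch below assumes, as in the text, that level $i$ has size $n/g^i$ and costs at most $c\,n/g^i$ operations per visit, and that level $i$ is visited $\theta^i$ times in one cycle; the number of levels $m$ and the neglected lower order terms (coarsest solve, truncation of the series) are part of this simplified model.

```latex
C_m(n) \doteq \sum_{i=0}^{m-1} \theta^{i}\, c\, \frac{n}{g^{i}}
       = c\,n \sum_{i=0}^{m-1} \left(\frac{\theta}{g}\right)^{i}
\doteq
\begin{cases}
\dfrac{1}{1-\theta/g}\, c\,n = \dfrac{g}{g-\theta}\, c\,n, & \theta < g,\\[3mm]
c\,n\,m = O(n \log n), & \theta = g \quad (m \approx \log_g n).
\end{cases}
```

For $g = 3$ and $\theta = 2$ the ratio $\theta/g = 2/3$ gives the factor $3$ quoted for the W-cycle, while for $g = \theta = 2$ every level contributes the same work $c\,n$.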

In this paper we do not provide a convergence analysis for multilevel circulant matrices. Nevertheless, the computational cost of a multigrid iteration can be easily computed, up to lower order terms, for multidimensional problems as well. For a $d$-dimensional problem, for simplicity and without loss of generality, we assume the same reduction $g$ along each direction, so that the size of the problem at the level $i$ is $n_i = n_{i-1}/g^d$, where $n_i$ is the algebraic size of the problem at the level $i$ and $n_0 = n$. Therefore equation (5.4) generalizes into

$$C_m(n) \doteq \begin{cases} \dfrac{g^d}{g^d-\theta}\, c\,n & \text{for } \theta < g^d, \\[2mm] O(n \log n) & \text{for } \theta = g^d. \end{cases}$$

It follows that each multigrid iteration has a linear computational cost in $n$ only if $1 \le \theta < g^d$. In particular, for two-dimensional problems ($d = 2$), when $g \ge 2$ the computational cost of a W-cycle iteration ($\theta = 2$) is linear in $n$ but, for instance, the first order coefficient is equal to $2c$ if $g = 2$, while it is equal to $9c/7$ if $g = 3$.
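The leading coefficients quoted above follow directly from the factor $g^d/(g^d-\theta)$. A minimal sketch that evaluates it exactly with rational arithmetic is given below; the function name `leading_coefficient` is hypothetical and the result is expressed in units of the constant $c$.

```python
from fractions import Fraction

def leading_coefficient(g, d, theta):
    """Factor g^d / (g^d - theta) multiplying c*n in the cost of one cycle.

    Only valid in the linear regime theta < g^d; for theta = g^d the cost
    is O(n log n) and no finite coefficient exists.
    """
    size_ratio = g ** d
    if not 1 <= theta < size_ratio:
        raise ValueError("linear cost requires 1 <= theta < g**d")
    return Fraction(size_ratio, size_ratio - theta)

# W-cycle (theta = 2) in two dimensions, for g = 2 and g = 3.
w_cycle_g2 = leading_coefficient(g=2, d=2, theta=2)
w_cycle_g3 = leading_coefficient(g=3, d=2, theta=2)
# W-cycle in one dimension with g = 3.
w_cycle_1d = leading_coefficient(g=3, d=1, theta=2)
print(w_cycle_g2, w_cycle_g3, w_cycle_1d)
```

The one-dimensional call reproduces the factor $3$ of the W-cycle with $g = 3$, and the guard makes explicit that $\theta = g^d$ falls outside the linear regime.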

5.3 Some pathologies eliminated when using g > 2

From conditions (3.4) and (3.5) in [27], we know that, for $g = 2$, if $x_0$ is a zero of $f$, then $f(x_0 + \pi)$ must be positive: otherwise relationship (3.5) in [27] cannot be satisfied no matter which polynomial $p$ we choose. On the other hand, if we consider $g = 3$ then the presence of two zeros at $x_0$ and $x_0 + \pi$ is no longer a problem, because conditions (4.4) and (4.5) impose that, if $x_0$ is a zero of $f$, then $f\left(x_0 + \frac{2}{3}\pi\right)$ and $f\left(x_0 + \frac{4}{3}\pi\right)$ must be positive, while we do not have constraints on the value of $f$ at $x_0 + \pi$.

For $g = 3$, if $f$ has a unique zero $x_0 \in [0, 2\pi)$ of finite order, then we consider $\hat{x} = x_0 + \frac{2}{3}\pi \pmod{2\pi}$ and $\tilde{x} = x_0 + \frac{4}{3}\pi \pmod{2\pi}$ and we set $P_n = C_n(p)$, where $p$ is a trigonometric polynomial defined as

$$p(x) = (2 - 2\cos(x - \hat{x}))^{\lceil \beta/2 \rceil} (2 - 2\cos(x - \tilde{x}))^{\lceil \beta/2 \rceil} \sim |x - \hat{x}|^{2\lceil \beta/2 \rceil} |x - \tilde{x}|^{2\lceil \beta/2 \rceil}, \qquad (5.5)$$

for $x \in [0, 2\pi)$. In this case the parameter $\beta$ has to be chosen as

$$\beta \ge \beta_{\min} = \min\left\{ i \;\middle|\; \lim_{x \to x_0} \frac{|x - x_0|^{2i}}{f(x)} < +\infty \right\},$$

so that conditions (4.4) and (4.5) are satisfied. If $f$ shows more than one zero in $[0, 2\pi)$, then we consider a polynomial $p$ which is the product of the basic polynomials of the same type reported in (5.5), satisfying condition (4.4) for any single zero and satisfying globally condition (4.5).
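The construction in (5.5) can be written as a small helper that returns the projector symbol for a given zero. This is only an illustrative sketch: the names `mirror_points` and `projector_symbol` are hypothetical, and the sample values $x_0 = \pi/3$, $\beta = 1$ are chosen to match a zero of order 2.

```python
import math
import numpy as np

def mirror_points(x0, g=3):
    """Points x0 + 2*pi*r/g (mod 2*pi), r = 1..g-1, where p must vanish."""
    return [(x0 + 2 * math.pi * r / g) % (2 * math.pi) for r in range(1, g)]

def projector_symbol(x0, beta, g=3):
    """Symbol of type (5.5): one factor (2 - 2cos(x - y))^ceil(beta/2) per mirror point y."""
    power = math.ceil(beta / 2)
    points = mirror_points(x0, g)

    def p(x):
        x = np.asarray(x, dtype=float)
        value = np.ones_like(x)
        for y in points:
            value = value * (2 - 2 * np.cos(x - y)) ** power
        return value

    return p

x0 = math.pi / 3
points = mirror_points(x0)            # expected near pi and 5*pi/3
p = projector_symbol(x0, beta=1)
values_at_mirror = [float(p(y)) for y in points]
value_at_zero = float(p(x0))          # p must stay positive at the zero of f
print(points, values_at_mirror, value_at_zero)
```

For several zeros, the same helper can be called once per zero and the resulting symbols multiplied, which mirrors the product construction described in the text.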


A different strategy was proposed in [5], where symmetric positive definite Toeplitz matrices with entries $a_1 = \cdots = a_l = 0$, $l < n$, are considered (like, for example, Toeplitz matrices generated by $f(x) = 1 - \cos((l+1)x)$). A block symbol analysis and a related generalization were discussed in [16], mainly based on the interpretation of $T_n(f)$ with $f(x) = 1 - \cos((l+1)x)$ as $T_m(F)$ with $m = n/l$ and with $F(x) = (1 - \cos(x)) I_l$ being a matrix-valued function. The projector is defined as an $l \times l$ block matrix. The basic projector (4.1) is replaced by its $l \times l$ block version, where 1 and 0 become the identity and the zero matrix of order $l$, respectively. The symbol of the projector is $p(x) = 1 + \cos((l+1)x)$ or, equivalently, the block symbol $P(x) = (1 - \cos(x)) I_l$. With this approach the size of the coarse problem is one half of the size of the finer problem. A Galerkin strategy for $l > 1$ leads to the loss of the Toeplitz structure and so in [16] a natural coarse grid operator, previously proposed in [15], was employed.

Comparing the two strategies, usually the coarser matrices have about the same sparsity. Therefore, our proposal with $g = 3$ has a lower computational cost per iteration, since the size reduction factor is 3 instead of 2, but usually requires more iterations to converge than the proposal in [5,16].

Example 5.1 The symbol

$$f(x) = (2 - 2\cos(x))(2 + 2\cos(x)) = 2 - 2\cos(2x) \qquad (5.6)$$

vanishes at $x = 0$ and at $x = \pi$ with order two. For $g = 3$, we have $M_3(0) = \left\{\frac{2\pi}{3}, \frac{4\pi}{3}\right\}$ and $M_3(\pi) = \left\{\frac{5\pi}{3}, \frac{7\pi}{3}\right\}$, thus the trigonometric polynomial

$$p(x) = \frac{1}{\sqrt{3}} \prod_{\hat{x} \in M_3(0) \cup M_3(\pi)} (2 - 2\cos(x - \hat{x})) \qquad (5.7)$$

satisfies the TGM conditions (4.4) and (4.5) and defines an optimal TGM. The only nonzero Fourier coefficients of $p$ are $a_0 = \sqrt{3}$, $a_{\pm 2} = 2/\sqrt{3}$, and $a_{\pm 4} = 1/\sqrt{3}$. The coarse function $\hat{f}$ in (4.7) is equal to $f$. The block projector proposed in [5] is

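$$\frac{1}{2} \begin{bmatrix} I_2 & 2I_2 & I_2 & 0 & & \\ 0 & I_2 & 2I_2 & I_2 & 0 & \\ & & \ddots & \ddots & \ddots & \end{bmatrix}_{n \times n/2}$$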

and the coarse matrix has the same symbol $f$, up to a scaling factor (see [16]). Therefore both strategies preserve the same symbol $f$ on the coarser levels.
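The two claims of Example 5.1 about the symbol (5.7) can be checked numerically. The sketch below is illustrative only. The Fourier coefficients are computed with an FFT on a uniform grid. The coarse symbol is evaluated with the formula $\hat{f}(x) = \frac{1}{g}\sum_{r=0}^{g-1} (f p^2)\big((x + 2\pi r)/g\big)$, which is an assumption about the form of (4.7), since that equation is not restated in this section; the helper names are hypothetical.

```python
import numpy as np

G = 3
MIRROR_POINTS = [2 * np.pi / 3, 4 * np.pi / 3, 5 * np.pi / 3, 7 * np.pi / 3]

def f(x):
    """Symbol (5.6)."""
    return 2 - 2 * np.cos(2 * np.asarray(x, dtype=float))

def p(x):
    """Symbol (5.7): product over M_3(0) and M_3(pi), scaled by 1/sqrt(3)."""
    x = np.asarray(x, dtype=float)
    value = np.full_like(x, 1 / np.sqrt(3))
    for x_hat in MIRROR_POINTS:
        value = value * (2 - 2 * np.cos(x - x_hat))
    return value

def fourier_coefficients(symbol, max_degree, samples=64):
    """Coefficients a_0..a_max_degree of a real even trigonometric polynomial."""
    grid = 2 * np.pi * np.arange(samples) / samples
    return np.fft.fft(symbol(grid)).real[: max_degree + 1] / samples

def coarse_symbol(x):
    """ASSUMED form of (4.7): average of f*p^2 over the g points (x + 2*pi*r)/g."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for r in range(G):
        y = (x + 2 * np.pi * r) / G
        total += f(y) * p(y) ** 2
    return total / G

coefficients = fourier_coefficients(p, 4)
xs = np.linspace(0.0, 2 * np.pi, 50)
same_symbol = np.allclose(coarse_symbol(xs), f(xs))
print(np.round(coefficients, 4), same_symbol)
```

With this reading of (4.7), the coarse symbol coincides with $f$ on the sampled points, which is consistent with the statement that the symbol is preserved on the coarser levels.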

6 Numerical experiments

In this section, we apply the proposed multigrid method to symmetric positive definite circulant and Toeplitz systems $A_n x = b$. We choose as solution the vector $x$ such that $x_i = i/n$, $i = 1, \dots, n$. The right-hand side vector $b$ is obtained accordingly. As smoother, we use Richardson with $\omega_j = 1/\|f_j\|_\infty$, for $j = 0, \dots, m-1$ ($m$


is the number of subgrids in the algorithm, $m = 1$ for the TGM), for pre-smoother and the conjugate gradient for post-smoother. In the V-cycle and W-cycle procedures, when the coarse grid size is less than or equal to 27, we solve the coarse grid system exactly. The zero vector is used as the initial guess and the stopping criterion is $\|r_s\|_2/\|r_0\|_2 \le 10^{-7}$, where $r_s$ is the residual vector after $s$ iterations and $10^{-7}$ is the given tolerance.
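The test setup described above can be sketched as follows for the Toeplitz matrix generated by $f(x) = 2 - 2\cos(2x)$. This is not the authors' code: it only builds the system, applies the Richardson smoother on its own (no coarse grid correction and no conjugate gradient post-smoother), and records the residual norms used in the stopping criterion. The helper names `toeplitz` and `richardson` are hypothetical.

```python
import numpy as np

def toeplitz(coeffs, n):
    """Symmetric Toeplitz matrix T_n(f) from Fourier coefficients a_k = a_{-k}, k >= 0."""
    T = np.zeros((n, n))
    for k, a in enumerate(coeffs):
        if k == 0:
            T += a * np.eye(n)
        elif k < n:
            T += a * (np.eye(n, k=k) + np.eye(n, k=-k))
    return T

def richardson(A, b, x, omega, sweeps):
    """Richardson iteration x <- x + omega*(b - A x); returns x and residual norms."""
    history = []
    for _ in range(sweeps):
        r = b - A @ x
        history.append(np.linalg.norm(r))
        x = x + omega * r
    history.append(np.linalg.norm(b - A @ x))
    return x, history

n = 78
A = toeplitz([2.0, 0.0, -1.0], n)       # f(x) = 2 - 2cos(2x), so ||f||_inf = 4
x_true = np.arange(1, n + 1) / n        # x_i = i/n
b = A @ x_true                          # right-hand side obtained accordingly
omega = 1.0 / 4.0                       # omega = 1/||f||_inf
x, history = richardson(A, b, np.zeros(n), omega, sweeps=50)
relative_residual = history[-1] / history[0]
print(relative_residual)
```

The residual decreases monotonically but remains far from the tolerance $10^{-7}$ after 50 sweeps, which illustrates why the smoother needs to be combined with the coarse grid correction.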

6.1 Cutting operators for Toeplitz matrices

When dealing with circulant matrices, using the projector defined in (4.3), the matrix at the lower level is still a circulant matrix, while for Toeplitz matrices, if we consider $A_n := T_n(f)$ and $p_{n,3}^k = T_n(p) Z_{n,3}^k$, where $p$ is defined in accordance with the formula (5.5) and $k = \frac{n}{3} \in \mathbb{N}$, we find that

$$T_n(p)\, T_n(f)\, T_n(p) = T_n(f p^2) + G_n(f, p).$$

Furthermore, if $2\beta + 1$ is the bandwidth of $T_n(p)$, the matrix $G_n(f, p)$ has rank at most $2\beta$ and is formed by a matrix of rank $\beta$ in the upper left corner and a matrix of the same rank in the bottom right corner. According to the proposal in [2], we take a cutting matrix that will completely erase the contribution of $G_n(f, p)$, so that, at the lower level, the restriction of the matrix $T_n(p) T_n(f) T_n(p)$ is still a Toeplitz matrix and thus we can recursively apply the algorithm. The proposed cutting matrix is as follows:

$$Z_{n,3}^k = \begin{bmatrix} 0_\beta^{k'} \\ Z_{n-2\beta,3}^{k'} \\ 0_\beta^{k'} \end{bmatrix}_{n \times k'} \qquad (k' \text{ defined below}),$$

where $0_\beta^{k'}$ is the zero matrix of size $\beta \times k'$; $Z_{n,3}^k$ has the first and the last $\beta$ rows equal to zero and therefore it is able to remove corner corrections of rank less than or equal to $2\beta$. Since $G_n(f, p)$ has rank $2\beta$, we deduce that $A_{k'} = (Z_{n,3}^k)^H T_n(p) T_n(f) T_n(p) Z_{n,3}^k$ is Toeplitz and, in the generic case, we cannot obtain a Toeplitz matrix of size greater than this. As a consequence, for Toeplitz matrices, the projector is then defined as

$$p_{n,3}^k = T_n(p) Z_{n,3}^k.$$
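The effect of the cutting matrix can be observed on the symbols of Example 5.1. The following sketch assumes that the inner down-sampling block keeps rows $0, 3, 6, \dots$ of the $(n - 2\beta)$ central rows (the definition (4.1) is not restated here), and compares it with a naive down-sampling that does not discard the first and last $\beta$ rows. All helper names are hypothetical.

```python
import numpy as np

def toeplitz(coeffs, n):
    """Symmetric Toeplitz matrix from Fourier coefficients a_k = a_{-k}, k >= 0."""
    T = np.zeros((n, n))
    for k, a in enumerate(coeffs):
        if k == 0:
            T += a * np.eye(n)
        elif k < n:
            T += a * (np.eye(n, k=k) + np.eye(n, k=-k))
    return T

def cutting_matrix(n, beta, g=3):
    """n x k' matrix: beta zero rows, down-sampling of the central rows, beta zero rows."""
    inner = n - 2 * beta
    k_prime = (inner + g - 1) // g          # sampled rows 0, g, 2g, ... of the inner block
    Z = np.zeros((n, k_prime))
    Z[beta + g * np.arange(k_prime), np.arange(k_prime)] = 1.0
    return Z

def is_toeplitz(M, tol=1e-10):
    """True when every diagonal of M is constant up to tol."""
    k = M.shape[0]
    return all(np.diag(M, d).max() - np.diag(M, d).min() < tol
               for d in range(-k + 1, k))

n, beta = 78, 4                              # n = 3^4 - 3, bandwidth 2*beta + 1 = 9
s3 = np.sqrt(3)
Tp = toeplitz([s3, 0.0, 2 / s3, 0.0, 1 / s3], n)   # p from (5.7)
Tf = toeplitz([2.0, 0.0, -1.0], n)                 # f from (5.6)
M = Tp @ Tf @ Tp

Z = cutting_matrix(n, beta)
A_coarse = Z.T @ M @ Z                       # restriction with the cutting matrix
Z_naive = np.eye(n)[:, ::3]                  # down-sampling without zero rows
A_naive = Z_naive.T @ M @ Z_naive
print(A_coarse.shape, is_toeplitz(A_coarse), is_toeplitz(A_naive))
```

In this experiment the cut restriction is a Toeplitz matrix of size $24 = 3^3 - 3$ and coincides with $T_{24}(f)$, while the naive restriction carries the corner corrections and is not Toeplitz.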

Also the size $n$ of the problem should be chosen in such a way that a recursive application of the algorithm is possible; in our case, if we choose $n = 3^\alpha - \xi$ with $\xi = \beta - 1$, $k = \frac{n}{3}$ and $k' = k - \frac{2(\beta-1)}{3}$, then the size of the problem at the lower level becomes

$$k' = \frac{n - 2(\beta-1)}{3} = \frac{3^\alpha - (\beta-1) - 2(\beta-1)}{3} = 3^{\alpha-1} - (\beta-1) = 3^{\alpha-1} - \xi.$$


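Table 6.1 Circulant case: $f_0(x) = (2-2\cos(x))(2+2\cos(x))$. Number of iterations; $\nu$ denotes $\nu_{pre} = \nu_{post}$.

| n | Two-grid, ν = 1 | Two-grid, ν = 2 | V-cycle, ν = 1 | V-cycle, ν = 2 | W-cycle, ν = 1 | W-cycle, ν = 2 |
|---|---|---|---|---|---|---|
| 3^4 = 81 | 11 | 6 | 11 | 6 | 11 | 6 |
| 3^5 = 243 | 11 | 6 | 11 | 7 | 11 | 6 |
| 3^6 = 729 | 11 | 6 | 11 | 7 | 11 | 6 |
| 3^7 = 2187 | 11 | 6 | 11 | 7 | 11 | 6 |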

6.2 Zero at the origin and at π

We present some examples where the generating functions $f_0$ vanish at the origin ($x = 0$) and at $x = \pi$. First, we consider Example 5.1, where the symbol

$$f_0(x) = (2-2\cos(x))(2+2\cos(x))$$

vanishes at $x = 0$ and $x = \pi$ with order 2. According to (5.5), we choose the projector $p_{n,3}^k = C_n(p_0) Z_{n,3}^k$ if $A_n$ is a circulant matrix and $p_{n,3}^k = T_n(p_0) Z_{n,3}^k$ if $A_n$ is a Toeplitz matrix, where $p_0 = p$ defined in (5.7). Setting $x_0^{(1)} = 0$ and $x_0^{(2)} = \pi$, the position of the new zeros $x_j^{(i)}$, for $j = 1, 2, \dots, m$ with $i = 1, 2$, moves according to Proposition 4.1 and, in this case, the functions $p_j$ are equal to $p$ for every level $j = 0, 1, \dots, m-1$. Furthermore, the particular choice of $p_j$ implies that the coarse matrices have the same symbol as the finer matrix, i.e., $f_j = f_0$ for $j = 1, \dots, m$. Tables 6.1 and 6.2 report the number of iterations required for convergence in the case of circulant and Toeplitz systems, respectively. According to the convergence analysis in Section 5, Table 6.1 shows an optimal behavior of the TGM and W-cycle. Moreover, Table 6.1 also shows an optimal behavior of the V-cycle, which is not covered by our convergence analysis; this is not surprising, since the symbol of all coarse matrices does not change ($f_j = f_0$ for $j = 1, \dots, m$). Table 6.2 shows that for Toeplitz matrices the optimality is preserved only for the TGM and W-cycle, while for the V-cycle the number of iterations slightly grows with the size $n$. This is probably due to the properties of the projectors described in Subsection 6.1, which are employed in order to preserve the Toeplitz structure at the coarser levels.

In the second example we increase the order of the zero at $x = \pi$ by choosing the function

$$f_0(x) = (2-2\cos(x))(2+2\cos(x))^2,$$

which has a zero at $x = 0$ of order 2 and one at $x = \pi$ of order 4. The polynomial $p_0 = p$ defined in (5.7) still satisfies the TGM conditions (4.4) and (4.5). As in the previous example, the functions $p_j$ and $f_j$ do not change at the lower levels, i.e., $p_j = p$ for $j = 0, \dots, m-1$ while $f_j = f_0$ for $j = 1, \dots, m$. In Tables 6.3 and 6.4 we report the number of iterations required for convergence in the case of circulant and Toeplitz systems, respectively. Since $f_0$ has a zero of order 4, the condition number of $A_n$ is of order $n^4$ and therefore for $n = 3^7$ we have a conditioning of magnitude $10^{13}$.


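Table 6.2 Toeplitz case: $f_0(x) = (2-2\cos(x))(2+2\cos(x))$. Number of iterations; $\nu$ denotes $\nu_{pre} = \nu_{post}$.

| n | Two-grid, ν = 1 | Two-grid, ν = 2 | V-cycle, ν = 1 | V-cycle, ν = 2 | W-cycle, ν = 1 | W-cycle, ν = 2 |
|---|---|---|---|---|---|---|
| 3^4 − 3 = 78 | 24 | 14 | 24 | 14 | 24 | 14 |
| 3^5 − 3 = 240 | 24 | 15 | 35 | 20 | 28 | 16 |
| 3^6 − 3 = 726 | 24 | 15 | 43 | 24 | 29 | 16 |
| 3^7 − 3 = 2184 | 24 | 15 | 49 | 27 | 29 | 16 |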

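Table 6.3 Circulant case: $f_0(x) = (2-2\cos(x))(2+2\cos(x))^2$, tolerance $= 10^{-3}$. Number of iterations; $\nu$ denotes $\nu_{pre} = \nu_{post}$.

| n | Two-grid, ν = 1 | Two-grid, ν = 2 | V-cycle, ν = 1 | V-cycle, ν = 2 | W-cycle, ν = 1 | W-cycle, ν = 2 |
|---|---|---|---|---|---|---|
| 3^4 = 81 | 20 | 9 | 20 | 9 | 20 | 9 |
| 3^5 = 243 | 20 | 9 | 18 | 9 | 20 | 9 |
| 3^6 = 729 | 20 | 9 | 18 | 9 | 20 | 9 |
| 3^7 = 2187 | 20 | 9 | 18 | 9 | 20 | 9 |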

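Table 6.4 Toeplitz case: $f_0(x) = (2-2\cos(x))(2+2\cos(x))^2$, tolerance $= 10^{-3}$. Number of iterations; $\nu$ denotes $\nu_{pre} = \nu_{post}$.

| n | Two-grid, ν = 1 | Two-grid, ν = 2 | V-cycle, ν = 1 | V-cycle, ν = 2 | W-cycle, ν = 1 | W-cycle, ν = 2 |
|---|---|---|---|---|---|---|
| 3^4 − 3 = 78 | 50 | 31 | 50 | 31 | 50 | 31 |
| 3^5 − 3 = 240 | 48 | 31 | 93 | 35 | 72 | 32 |
| 3^6 − 3 = 726 | 47 | 31 | 74 | 34 | 68 | 31 |
| 3^7 − 3 = 2184 | 47 | 31 | 76 | 34 | 68 | 31 |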

Therefore, using double precision, for this example we choose a tolerance equal to $10^{-3}$. This choice also agrees with the plots in Figures 6.1 and 6.2, where we note an optimal reduction of the residual norm only until about $10^{-3}$.
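The choice of the tolerance can be motivated by a rough order-of-magnitude estimate. The computation below is only a heuristic sketch: it assumes that the condition number behaves like $n^4$ with a constant of order one, and it uses the unit roundoff of IEEE double precision, $u = 2^{-53}$, as the measure of working accuracy.

```latex
\kappa(A_n) \sim n^4 = (3^7)^4 = 3^{28} \approx 2.3 \cdot 10^{13},
\qquad
u = 2^{-53} \approx 1.1 \cdot 10^{-16},
\qquad
\kappa(A_n)\, u \approx 2.5 \cdot 10^{-3}.
```

Under these assumptions, a relative accuracy much below $10^{-3}$ cannot be expected for the largest size.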

The last example of this subsection is taken from [5]. The generating function

$$f_0(x) = 6 - 4\cos(2x) - 2\cos(4x)$$

vanishes at $x = 0$ and at $x = \pi$ with order 2. The symbol of the projector is again $p_0 = p$ defined in (5.7). The initial guess is a random vector $u$ such that $0 \le u_j \le 1$, the pre-smoother is a step of damped Jacobi with parameter $\omega_j = [A_j]_{1,1}/\|f(x)\|_\infty$, while the post-smoother is a step of damped Jacobi with parameter $\omega_j = 2[A_j]_{1,1}/\|f(x)\|_\infty$, for $j = 0, \dots, m-1$. The coarsest problem is fixed such that it has size lower than 6. Table 6.5 shows that the number of iterations required to achieve the tolerance $10^{-7}$ remains constant when increasing the size $n$ of the system, as for the multigrid technique


Fig. 6.1 Circulant: graph of the residual in logarithmic scale for the V-cycle (left) and the W-cycle (right) with different sizes $n = 3^4, 3^5, 3^6, 3^7$, with $\nu_{pre} = \nu_{post} = 1$ and a fixed number of iterations iter = 400; $f_0(x) = (2-2\cos(x))(2+2\cos(x))^2$.

Fig. 6.2 Toeplitz: graph of the residual in logarithmic scale for the V-cycle (left) and the W-cycle (right) with different sizes $n = 3^4, 3^5, 3^6, 3^7$, with $\nu_{pre} = \nu_{post} = 1$ and a fixed number of iterations iter = 400; $f_0(x) = (2-2\cos(x))(2+2\cos(x))^2$.

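Table 6.5 Toeplitz case: $f_0(x) = 6 - 4\cos(2x) - 2\cos(4x)$, $\nu_{pre} = \nu_{post} = 1$, tolerance $= 10^{-7}$. Number of iterations.

| n | Two-grid | W-cycle | V-cycle |
|---|---|---|---|
| 3^4 − 3 = 78 | 15 | 19 | 28 |
| 3^5 − 3 = 240 | 15 | 20 | 39 |
| 3^6 − 3 = 726 | 14 | 20 | 45 |
| 3^7 − 3 = 2184 | 13 | 20 | 47 |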

proposed in [5], even if the number of iterations is slightly higher. In any case, a direct comparison cannot easily be made because of the different cost of each iteration, due to the different size reduction, as discussed in Subsection 5.3.


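Table 6.6 Toeplitz case: $f_0(x) = 2 - 2\cos\left(x - \frac{\pi}{3}\right)$, tolerance $= 10^{-7}$. Number of iterations; $\nu$ denotes $\nu_{pre} = \nu_{post}$.

| n | V-cycle, ν = 1 | V-cycle, ν = 2 | W-cycle, ν = 1 | W-cycle, ν = 2 |
|---|---|---|---|---|
| 3^4 − 1 = 80 | 33 | 37 | 33 | 37 |
| 3^5 − 1 = 242 | 30 | 31 | 30 | 31 |
| 3^6 − 1 = 728 | 30 | 31 | 30 | 31 |
| 3^7 − 1 = 2186 | 30 | 31 | 30 | 31 |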

6.3 Some Toeplitz examples

In this subsection we consider only the more interesting case for practical applications: Toeplitz matrices with a multigrid strategy.

The first example is a function with a zero not at the origin or at $\pi$:

$$f_0(x) = 2 - 2\cos\left(x - \frac{\pi}{3}\right),$$

which vanishes at $x = \pi/3$ with order 2. Moreover, we choose as true solution a random vector instead of a smooth solution. The tolerance is again $10^{-7}$. The symbol of the projector at the first level is

$$p_0(x) = (2 - 2\cos(x - \pi))\left(2 - 2\cos\left(x - \frac{5}{3}\pi\right)\right).$$

At the lower levels, the symbols of the projectors will change according to the position of the zero of $f_j$, which remains unique and which moves as described in Proposition 4.1. Table 6.6 shows an optimal convergence both for the V-cycle and for the W-cycle.

In the second example, we consider the dense Toeplitz matrix generated by the function $f(x) = x^2$, which has the Fourier series expansion

$$f(x) = \frac{\pi^2}{3} - 4\left(\frac{\cos(x)}{1^2} - \frac{\cos(2x)}{2^2} + \frac{\cos(3x)}{3^2} - \cdots\right).$$

This function shows a unique zero at $x = 0$ of order 2 and hence we use the projector with symbol

$$p_0(x) = \left(2 - 2\cos\left(x - \frac{2}{3}\pi\right)\right)\left(2 - 2\cos\left(x - \frac{4}{3}\pi\right)\right).$$

In Table 6.7 we report the number of iterations required for convergence with a preassigned accuracy, and we note again an optimal behavior.
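The dense Toeplitz matrix of this example can be assembled directly from the series above, whose Fourier coefficients are $a_0 = \pi^2/3$ and $a_{\pm k} = 2(-1)^k/k^2$ for $k \ge 1$. The following is a minimal sketch with hypothetical helper names; it also evaluates a truncated series at one point as a sanity check on the coefficients.

```python
import numpy as np

def x2_coefficient(k):
    """Fourier coefficient a_k of f(x) = x^2 on [-pi, pi]; a_{-k} = a_k."""
    if k == 0:
        return np.pi ** 2 / 3
    return 2.0 * (-1) ** k / k ** 2

def toeplitz_x2(n):
    """Dense symmetric Toeplitz matrix T_n(f) with entries a_{|i-j|}."""
    first_column = np.array([x2_coefficient(k) for k in range(n)])
    index = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return first_column[index]

def partial_sum(x, terms):
    """Truncated series a_0 + sum_{k=1}^{terms} 2*a_k*cos(k*x)."""
    k = np.arange(1, terms + 1)
    a_k = 2.0 * (-1.0) ** k / k ** 2
    return x2_coefficient(0) + np.sum(2 * a_k * np.cos(k * x))

n = 80
T = toeplitz_x2(n)
smallest_eigenvalue = np.linalg.eigvalsh(T).min()
approximation = partial_sum(1.0, 2000)     # should be close to 1.0**2
print(T[0, :3], smallest_eigenvalue, approximation)
```

Since $f$ is nonnegative and not identically zero, the matrix is expected to be symmetric positive definite, with a smallest eigenvalue that shrinks as $n$ grows because of the zero at the origin.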


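Table 6.7 Toeplitz case: $f(x) = x^2$. Number of iterations; $\nu$ denotes $\nu_{pre} = \nu_{post}$.

| n | V-cycle, ν = 1 | V-cycle, ν = 2 | W-cycle, ν = 1 | W-cycle, ν = 2 |
|---|---|---|---|---|
| 3^4 − 1 = 80 | 21 | 11 | 21 | 11 |
| 3^5 − 1 = 242 | 18 | 11 | 21 | 11 |
| 3^6 − 1 = 728 | 18 | 11 | 21 | 11 |
| 3^7 − 1 = 2186 | 18 | 11 | 21 | 11 |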

7 Conclusions and future work

In this paper we have extended the rigorous two-grid analysis for circulant matrices to the case where the size reduction is performed by a factor $g$ with $g > 2$. An interesting feature of a size reduction by a factor $g > 2$ is that it eliminates some pathologies which occur when $g = 2$. In particular, if the considered matrices come from the approximation of certain integro-differential equations, then we have two sources of ill-conditioning and the zeros of the underlying symbol are located at $x = 0$ and at $x = \pi$: this situation is a special case of mirror point zeros and, when $g = 2$, it is possible to prove that the resulting two-grid iteration cannot be optimal (see [10,25]). Such difficulty can be overcome using a block projector [5,16] or choosing a larger $g$. Moreover, when increasing $g$, the size of the coarse problems decreases: as a consequence, more multigrid recursive calls could be considered, like the W-cycle, which is proved to be optimal for $g \ge 3$.

We stress that the numerical experiments are encouraging, not only for circulant matrices but also regarding Toeplitz matrices and concerning the use of the V-cycle algorithm. A future line of research must include the multilevel setting, following the approach in [25,1], and a rigorous proof of convergence for the whole V-cycle procedure in accordance with the proof technique introduced in [2]. The proposed convergence analysis could then be applied to smoothing aggregation projection techniques [18].

Acknowledgements We thank the anonymous reviewers for their constructive criticism, which greatly contributed to improving both the contents and the presentation of the paper. The work of all the authors has been partly supported by MIUR grant no. 20083KLJEZ.

References

1. Aricò, A., Donatelli, M.: A V-cycle multigrid for multilevel matrix algebras: proof of optimality. Numer. Math. 105-4, 511–547 (2007)

2. Aricò, A., Donatelli, M., Serra-Capizzano, S.: V-cycle optimal convergence for certain (multilevel) structured linear systems. SIAM J. Matrix Anal. Appl. 26, 186–214 (2004)

3. Böttcher, A., Grudsky, S.: On the condition numbers of large semi-definite Toeplitz matrices. Linear Algebra Appl. 279, 285–301 (1998)

4. Chang, Q., Jin, X., Sun, H.: Convergence of the multigrid method for ill-conditioned block Toeplitz systems. BIT 41-1, 179–190 (2001)

5. Chan, R.H., Chang, Q.S., Sun, H.W.: Multigrid method for ill-conditioned symmetric Toeplitz systems. SIAM J. Sci. Comput. 19, 516–529 (1998)

6. Chan, R.H., Ng, M.: Conjugate gradient methods for Toeplitz systems. SIAM Rev. 38, 427–482 (1996)

7. Codevico, G., Heinig, G., Van Barel, M.: A superfast solver for real symmetric Toeplitz systems using real trigonometric transformations. Numer. Linear Algebra Appl. 12, 699–713 (2005)

8. Davis, P.: Circulant Matrices. J. Wiley and Sons, New York (1979)

9. Donatelli, M.: An algebraic generalization of local Fourier analysis for grid transfer operators in multigrid based on Toeplitz matrices. Numer. Linear Algebra Appl. 17-2/3, 179–197 (2010)

10. Fiorentino, G., Serra-Capizzano, S.: Multigrid methods for Toeplitz matrices. Calcolo 28-3/4, 283–305 (1991)

11. Fiorentino, G., Serra-Capizzano, S.: Multigrid methods for symmetric positive definite block Toeplitz matrices with nonnegative generating functions. SIAM J. Sci. Comput. 17, 1068–1081 (1996)

12. Golub, G.H., Van Loan, C.F.: Matrix Computations. The Johns Hopkins University Press, Baltimore (1983)

13. Hackbusch, W.: Multi-grid Methods and Applications. Springer-Verlag, New York (1979)

14. Huckle, T.: Compact Fourier analysis for designing multigrid methods. SIAM J. Sci. Comput. 31-1, 644–666 (2008)

15. Huckle, T., Staudacher, J.: Multigrid preconditioning and Toeplitz matrices. Electr. Trans. Numer. Anal. 13, 81–105 (2002)

16. Huckle, T., Staudacher, J.: Multigrid methods for block Toeplitz matrices with small blocks. BIT 46-1, 61–83 (2006)

17. Kalouptsidis, N., Carayannis, G., Manolakis, D.: Fast algorithms for block Toeplitz matrices with Toeplitz entries. Signal Process. 6, 77–81 (1984)

18. Napov, A., Notay, Y.: Algebraic analysis of aggregation-based multigrid. Numer. Lin. Alg. Appl. To appear (2011)

19. Napov, A., Notay, Y.: Comparison of bounds for V-cycle multigrid. Appl. Numer. Math. 60, 176–192 (2010)

20. Ngondiep, E., Serra-Capizzano, S., Sesana, D.: Spectral features and asymptotic properties for g-circulants and g-Toeplitz sequences. SIAM J. Matrix Anal. Appl. 31-4, 1663–1687 (2010)

21. Noutsos, D., Serra-Capizzano, S., Vassalos, P.: Matrix algebra preconditioners for multilevel Toeplitz systems do not insure optimal convergence rate. Theoret. Computer Sci. 315, 557–579 (2004)

22. Ruge, J.W., Stüben, K.: Algebraic multigrid. In Multigrid Methods, S. McCormick, ed., Frontiers Appl. Math. 3, SIAM, Philadelphia, 73–130 (1987)

23. Serra-Capizzano, S.: On the extreme eigenvalues of Hermitian (block) Toeplitz matrices. Linear Algebra Appl. 270, 109–129 (1998)

24. Serra-Capizzano, S.: Multi-iterative methods. Comput. Math. Appl. 26, 65–87 (1993)

25. Serra-Capizzano, S.: Convergence analysis of two-grid methods for elliptic Toeplitz and PDEs matrix-sequences. Numer. Math. 92, 433–465 (2002)

26. Serra-Capizzano, S.: Matrix algebra preconditioners for multilevel Toeplitz matrices are not superlinear. Linear Algebra Appl. 343/344, 303–319 (2002)

27. Serra-Capizzano, S., Tablino Possio, C.: Multigrid methods for multilevel circulant matrices. SIAM J. Sci. Comput. 26-1, 55–85 (2004)

28. Serra-Capizzano, S., Tyrtyshnikov, E.: Any circulant-like preconditioner for multilevel matrices is not superlinear. SIAM J. Matrix Anal. Appl. 21-2, 431–439 (1999)

29. Sun, H., Chan, R.H., Chang, Q.: A note on the convergence of the two-grid method for Toeplitz matrices. Comput. Math. Appl. 34, 11–18 (1997)

30. Trench, W.: Properties of unilevel block circulants. Linear Algebra Appl. 430, 2012–2025 (2009)

31. Trottenberg, U., Oosterlee, C.W., Schüller, A.: Multigrid. Academic Press, London (2001)

32. Woerdeman, H.: Estimates of inverses of multivariable Toeplitz matrices. Operators and Matrices 2, 507–515 (2008)