
CHAPTER 17

PSEUDOSPECTRAL METHODS APPLIED TO THE ELECTRON CORRELATION PROBLEM

Todd J. Martinez and Emily A. Carter

Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90024-1569, USA

Contents

1. Introduction
2. Conventional Correlation Methods - Overview
3. Pseudospectral Method - Motivation and Theory
4. Conventional Møller-Plesset Perturbation Theory
5. Pseudospectral Møller-Plesset Perturbation Theory
6. Conclusion
Acknowledgments
References

1. Introduction

The Hartree-Fock approximation is of central importance to quantum chemistry. The assumption of electrons interacting with each other through a mean field simplifies the electronic structure problem considerably, and provides the starting point for almost all approaches. However, it is curious to note that even this simplifying assumption was not sufficient to make electronic structure calculations on polyatomic molecules a viable proposition. The Hartree-Fock equations are complicated integro-differential equations, and the finite basis set approximation must be introduced to proceed comfortably past diatomic molecules. The use of this approximation has been shown empirically to be quite well justified, particularly if one is not interested in computing accurate total energies, but rather energy differences and molecular properties. Unfortunately, the Hartree-Fock approximation, while providing a good zeroth-order description of equilibrium properties, is incapable of quantitative accuracy. Since the Hartree-Fock wave function is exact only in the limit of non-interacting electrons, this should not be surprising. Many failures can be cited here, with perhaps the most well-known being the failure of the Hartree-Fock wave function to allow proper dissociation of a closed-shell species into radical fragments. The remedy is to relax the restriction that the wave function be expressible as a single Slater determinant, and instead allow a superposition of several Slater determinants. However, even in cases where the single-determinant wave function is not grossly inadequate, one finds that the mean field description of the electrons is a serious obstacle to quantitative accuracy. Of course, the motion of electrons is correlated, and they tend to stay as far apart as possible to minimize the repulsion between them. Such behavior is by definition not incorporated in the Hartree-Fock picture, and hence any process that involves orbital deformation will lead to a non-constancy of the correlation energy and thus errors in energy differences computed using Hartree-Fock techniques. Therefore, correlation effects are less important in computing energy differences involving slight configuration deformations, such as low energy vibrations, than they are when bonds are broken. This is not to say that correlation effects are unimportant in the former case; this all depends on the accuracy one seeks to achieve and the particular system being studied. However, it is a good rule of thumb that correlation effects will always be important when computing energy differences involving species with significantly different electronic structure. As chemists, we are perhaps most interested in the bond breaking and formation that occurs in chemical reactions, which by its very nature implies significant changes in electronic structure. Therefore we cannot ignore correlation effects.

Nevertheless, the computation of such effects is not at all trivial. The barrier lies not so much in the development of the formal theory or its implementation, but rather in the sheer quantity of computer time required to obtain results. Even the cheapest of all correlation methods scales with the quintic power of the number of basis functions used, which is in turn roughly proportional to the number of atoms in the molecule being studied.

Modern Electronic Structure Theory

What is more, it has become quite apparent in the last ten years that it is unwise to proceed beyond the Hartree-Fock method with a poor basis set, since correlation effects require a flexible basis for reliable description. Thus the dilemma facing the modern quantum chemist is how to compute correlation corrections in a reasonable amount of time without sacrificing accuracy. The problem is particularly pressing as quantum chemists struggle to describe large systems, for example cluster models of surfaces, polymers, or biologically significant molecules. Certainly it is desirable to be able to calculate correlated quantum mechanical wave functions for systems containing more than tens of atoms, but this is almost impossible today. Even with the cheapest of all correlation methods, the largest molecule that has been treated with a reasonably sized basis set is $C_{60}$, a molecule whose high symmetry provides tremendous simplification.¹

In this review, we will concentrate on our efforts to solve the correlation problem by using the pseudospectral method, originally introduced in fluid dynamics,² to compute correlation energies. We will not review the details of pseudospectral Hartree-Fock at all, and refer the readers to the work of Friesner and coworkers for details.³ We will spend a considerable amount of time analyzing the pseudospectral method, and we hope the reader will find our exposition sheds new light on the subject. In particular, we attempt to put the pseudospectral method in historical perspective by comparing it to various other approaches which are somewhat similar in structure or philosophy. We will explain why the pseudospectral method was successful where other methods failed, and we will use our recent treatment of second- and third-order Møller-Plesset perturbation theory as a case study.

2. Conventional Correlation Methods - Overview

Conventional ab initio treatments of electron correlation almost invariably expand the wave function in terms of a linear combination of determinants excited with respect to some self-consistent field reference state. The differences between the various methods lie in the means of solving for the coefficients in the expansion. Except for the lowest order of perturbation theory, all efficient methods require storage of the wave function coefficients in high-speed memory. This requirement constitutes the first of the two basic bottlenecks in correlation treatments. Optimization of the coefficients always involves computing the result of operation on a trial wave function with a Hamiltonian operator. This can be expressed in terms of a contraction of the wave function coefficients with integrals of basis functions. Generally, there is a one-electron part which is easy to handle since the integrals involve only two basis functions and a two-electron part which dominates the computation. The required two-electron integrals represent the electron-electron repulsion term in the Hamiltonian, and are given in terms of spectral basis functions $\varphi_j$ as

$$ (ab|cd) = \int\!\!\int \frac{\varphi_a(r_1)\,\varphi_b(r_1)\,\varphi_c(r_2)\,\varphi_d(r_2)}{|r_1 - r_2|}\, dr_1\, dr_2 . \tag{1} $$

A typical contraction of these integrals with wave function coefficients will look similar to

$$ \sum_{cd} (ab|cd)\, C_{cd} \tag{2a} $$

or

$$ \sum_{cd} (ac|bd)\, C_{cd} . \tag{2b} $$

The first of these represents a generalized Coulombic interaction of charge densities, and the second represents a generalized exchange interaction, arising from the antisymmetry of the wave function. The computation of these contractions forms the second major bottleneck in correlation treatments, and it is this bottleneck which the pseudospectral method addresses. It is important to keep in mind that the first bottleneck will eventually become problematic in any of the conventional correlation treatments, and one must turn to other than pseudospectral techniques to alleviate such problems.

Configuration interaction treatments optimize the wave function coefficients according to a variational principle, and modern treatments use iterative diagonalization methods which do not require the storage of the Hamiltonian matrix. We have applied the pseudospectral approximation to full configuration interaction⁴ (FCI) and double-excitation configuration interaction⁵ (DCI). Full CI is the method obtained when all possible configurations arising from excitations within the chosen basis set are used, and it is of special importance in quantum chemistry because it provides the exact answer within the finite basis set and Born-Oppenheimer approximations. The fundamental difficulty in FCI is the first bottleneck mentioned above. The pseudospectral method allows for a formal scaling advantage, but at the expense of a larger prefactor. The exponential growth of the number of wave function coefficients for a given basis set size makes it impossible to perform FCI calculations with basis sets large enough to overcome the larger prefactor and thus the advantage is not practically attainable. On the other hand, the pseudospectral formulation of the DCI method is quite successful because the growth of the number of coefficients is only quartic with respect to basis set size.

Perturbation theoretical approaches are quite attractive because they do not require iteration. Since one generally requires about ten iterations in a variational procedure, this gives a significant advantage to perturbation theory. In principle, storage of the wave function is not required, but efficiency dictates that any treatment which progresses beyond second order must do so. Unfortunately, any treatment which does not use canonical orbitals (that is, orbitals that are eigenfunctions of the zeroth-order Hamiltonian) must use iteration in some form. Such treatments offer their own advantages, particularly for addressing the problem of storing the wave function, and further discussion of this point is deferred until later. A further advantage in principle of perturbation theory is the fact that it is size-consistent, meaning that it predicts the energy of $n$ non-interacting subsystems to be $n$ times the energy computed for one of the subsystems. Truncated configuration interaction approaches behave incorrectly in this regard, but they furnish a rigorous upper bound to the true energy, while perturbation theory does not. We have formulated and implemented⁶ a pseudospectral Møller-Plesset perturbation theory, and the results are quite promising.

3. Pseudospectral Method - Motivation and Theory

The pseudospectral method provides an alternative way of dealing with the two-electron integrals. As discussed previously, these integrals form one of the basic obstacles to the calculation of correlated wave functions for large molecules. In order to make this clear from the outset, we focus attention on the representative term given in Eq. (2a). Evaluation of this term will require $O(N^4)$ floating-point operations, that is, $N^2$ operations (the sum over $c$ and $d$) for each of $N^2$ terms (the remaining indices $a$ and $b$). It is the fact that the two-electron integral is a four-index quantity which causes problems. If it were possible to factorize the integral into a product form, the effort could be reduced through the formation of intermediate quantities. In fact, this is precisely the line of attack used by the family of "three-center integral approximation" methods. Thus, the aim is to write a factorization like

$$ (ab|cd) = \sum_{\kappa=1}^{N_\kappa} B_{ab\kappa}\, A_{cd\kappa} . \tag{3} $$

Given such a factorization, Eq. (2a) can be rewritten as

$$ \sum_{\kappa} B_{ab\kappa} \sum_{cd} A_{cd\kappa}\, C_{cd} . \tag{4} $$

The operation count now becomes $2N^2 N_\kappa$, which is much better if $N_\kappa$ can be chosen to be significantly smaller than $N^2$. For the benefit of the uninitiated, we provide code fragments which implement Eq. (2a) with and without the factorization of Eq. (3) in Figs. 1 and 2. These are not meant to be efficient implementations, but rather are provided to clarify the arguments concerning scaling and operation counts.
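As a runnable counterpart to the pseudocode of Figs. 1 and 2, the following Python sketch evaluates Eq. (2a) both directly and through the factorization of Eq. (3), counting multiplications in each case. The random arrays, the sizes `n = 4` and `nkappa = 3`, and all function names are assumptions made purely for illustration; the integrals are constructed so that the factorization is exact.

```python
import random


def direct_contraction(twoe, c, n):
    """Eq. (2a) without factorization; returns (result, multiply count)."""
    total = [[0.0] * n for _ in range(n)]
    mults = 0
    for ia in range(n):
        for ib in range(n):
            for ic in range(n):
                for idx in range(n):
                    total[ia][ib] += twoe[ia][ib][ic][idx] * c[ic][idx]
                    mults += 1
    return total, mults


def factorized_contraction(b, a, c, n, nkappa):
    """Eq. (4): contract A with C first, then scatter through B."""
    total = [[0.0] * n for _ in range(n)]
    mults = 0
    for kappa in range(nkappa):
        temp = 0.0
        for ic in range(n):
            for idx in range(n):
                temp += a[ic][idx][kappa] * c[ic][idx]
                mults += 1
        for ia in range(n):
            for ib in range(n):
                total[ia][ib] += b[ia][ib][kappa] * temp
                mults += 1
    return total, mults


def demo(n=4, nkappa=3, seed=0):
    """Build hypothetical B, A, C, form (ab|cd) via Eq. (3), compare both routes."""
    rng = random.Random(seed)
    b = [[[rng.uniform(-1, 1) for _ in range(nkappa)] for _ in range(n)] for _ in range(n)]
    a = [[[rng.uniform(-1, 1) for _ in range(nkappa)] for _ in range(n)] for _ in range(n)]
    c = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    twoe = [[[[sum(b[ia][ib][k] * a[ic][idx][k] for k in range(nkappa))
               for idx in range(n)] for ic in range(n)]
             for ib in range(n)] for ia in range(n)]
    s_direct, m_direct = direct_contraction(twoe, c, n)
    s_fact, m_fact = factorized_contraction(b, a, c, n, nkappa)
    err = max(abs(s_direct[i][j] - s_fact[i][j]) for i in range(n) for j in range(n))
    return err, m_direct, m_fact
```

Calling `demo()` returns the largest difference between the two results together with the two multiplication counts, which equal $N^4$ and $2N^2 N_\kappa$ respectively for the chosen sizes.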

At this point, the question is how such a factorization can be introduced formally. One way to do this is to use the Cholesky decomposition, as originally suggested by Beebe and Linderberg.⁷ The Cholesky decomposition is well-known among numerical analysts, and provides a means of decomposing a positive-definite matrix into the product of two triangular matrices as follows

$$ \mathbf{I} = \mathbf{L}\mathbf{L}^T . \tag{5} $$

The matrix $\mathbf{L}$ is lower-triangular of the same dimensions as $\mathbf{I}$, and the superscript $T$ denotes matrix transposition. If we associate $\mathbf{I}$ with the two-electron integral matrix as $I_{pq,rs} = (pq|rs)$, we can apply the technique used in Eq. (3). This is not useful as an identity, since $N_\kappa$ (the inner dimension of the matrix $\mathbf{L}$) is exactly equal to $N^2$. However, if there exists approximate numerical linear dependence among the columns of the matrix $\mathbf{I}$, then one can delete some of the columns of the matrix $\mathbf{L}$, and thus reduce $N_\kappa$. In practice, reasonably large basis sets will give rise to a significant amount of approximate linear dependence in $\mathbf{I}$, and thus the savings. Furthermore, the approximation is well controlled in the sense that the maximum error in any given integral can be deduced from the deleted columns of $\mathbf{L}$. The primary problem with the method is that the number of operations required to compute the Cholesky decomposition is $N^6$.

      ! Direct evaluation of Eq. (2a): N**2 multiplies for each of N**2 (ia,ib) pairs
      do ia=1,n
        do ib=1,n
          sum(ia,ib)=0.0
          do ic=1,n
            do id=1,n
              sum(ia,ib)=sum(ia,ib)+twoe(ia,ib,ic,id)*c(ic,id)
            enddo
          enddo
        enddo
      enddo

Fig. 1. Pseudocode implementation of Eq. (2a) without the factorization of Eq. (3). The operation count is determined by the number of multiplies required to complete the code fragment. The array twoe(ia,ib,ic,id) corresponds to the integral (ab|cd) in Eq. (1).

      ! Factorized evaluation of Eq. (2a) using Eq. (3)
      do ia=1,n
        do ib=1,n
          sum(ia,ib)=0.0
        enddo
      enddo
      do kappa=1,nkappa
        ! Intermediate: contract A with C for this kappa (N**2 multiplies)
        temp=0.0
        do ic=1,n
          do id=1,n
            temp=temp+a(ic,id,kappa)*c(ic,id)
          enddo
        enddo
        ! Scatter the intermediate through B (N**2 multiplies)
        do ia=1,n
          do ib=1,n
            sum(ia,ib)=sum(ia,ib)+b(ia,ib,kappa)*temp
          enddo
        enddo
      enddo

Fig. 2. Pseudocode implementation of Eq. (2a) using the factorization of Eq. (3). Notice that the number of multiplications is reduced from $N^4$ to $2N^2 N_\kappa$.

This can be reduced to $O(N^2 N_\kappa^2)$ at the cost of some overhead for rearranging the elements of $\mathbf{I}$. Unfortunately, $N_\kappa$ is not known until the decomposition is complete, since it is specified by a predetermined tolerance for the accuracy of the two-electron integrals. The method was originally advocated for use in the integral transformation, which requires $O(N^5)$ operations. Thus, the decomposition could cost more than the conventional transformation, and it is this uncertainty which is likely responsible for the neglect of the method. We have examined the possibility of using the method in FCI,⁸ where the cost of the decomposition is negligible compared to the subsequent CI calculation. This does not work well because the approximate linear dependence of the integrals is only significant for basis sets which are too large for FCI calculations. In other words, the first bottleneck described above is again far too constraining to allow any advantage.
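The truncated decomposition described above can be illustrated with a small pivoted Cholesky routine. This is a minimal sketch under stated assumptions, not the Beebe-Linderberg implementation: the 6 x 6 test matrix (an exactly rank-2 matrix plus a tiny diagonal shift, standing in for an integral matrix with approximate linear dependence), the tolerance, and all names are hypothetical. It relies only on the standard fact that the residual of a positive semidefinite matrix after partial Cholesky elimination is itself positive semidefinite, so no element of the residual exceeds its largest diagonal entry.

```python
import math


def pivoted_cholesky(mat, tol):
    """Return (columns of L, largest residual diagonal) with mat ~ L L^T.

    Columns are generated one at a time, always pivoting on the largest
    remaining diagonal element, and generation stops once that element
    falls to tol or below. The number of columns kept plays the role of
    N_kappa and is only known when the loop ends.
    """
    n = len(mat)
    diag = [mat[i][i] for i in range(n)]  # diagonal of the residual matrix
    cols = []
    while len(cols) < n:
        p = max(range(n), key=lambda i: diag[i])
        if diag[p] <= tol:
            break
        scale = math.sqrt(diag[p])
        col = [(mat[i][p] - sum(lc[i] * lc[p] for lc in cols)) / scale
               for i in range(n)]
        cols.append(col)
        for i in range(n):
            diag[i] -= col[i] ** 2
    return cols, max(diag)


def demo(tol=1e-6, eps=1e-8):
    """Decompose a hypothetical nearly rank-2 matrix and measure the error."""
    g = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0), (2.0, 1.0), (1.0, 2.0)]
    n = len(g)
    mat = [[g[i][0] * g[j][0] + g[i][1] * g[j][1] + (eps if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    cols, resid = pivoted_cholesky(mat, tol)
    max_err = max(abs(mat[i][j] - sum(lc[i] * lc[j] for lc in cols))
                  for i in range(n) for j in range(n))
    return cols, resid, max_err
```

In this example only two columns survive the tolerance, and the largest element-wise error of the reconstructed matrix is bounded by the largest residual diagonal returned by the routine.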

A second way to achieve the factorization of Eq. (3) is to introduce a truncated resolution of the identity (RI). Thus, we identify $B$ and $A$ as follows

$$ B_{ab\kappa} = \int \varphi_a(r)\,\varphi_b(r)\,\kappa(r)\, dr , \tag{6a} $$

$$ A_{cd\kappa} = \int\!\!\int \frac{\kappa(r_1)\,\varphi_c(r_2)\,\varphi_d(r_2)}{|r_1 - r_2|}\, dr_1\, dr_2 . \tag{6b} $$

Feyereisen et al. have suggested using such a resolution for second-order Møller-Plesset perturbation theory, where the functions $\kappa$ are taken to be orthonormalized combinations of the primitive Gaussian functions used to form the atomic orbital basis $\varphi$.⁹ This method has been christened RI-MP2. Surprisingly good results are obtained provided the underlying basis is of high quality. For example, Feyereisen et al. report an error of slightly less than a millihartree in the calculation of the MP2 energy for water using a correlation-consistent triple-zeta¹⁰ (CCTZ) basis set.
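The step from the resolution of the identity to the factorization of Eq. (3) can be spelled out as a short worked derivation. The sketch below assumes real, orthonormal expansion functions $\kappa$, as stated in the text; the approximation sign reflects the truncation of the expansion to $N_\kappa$ functions.

```latex
% Expand the charge distribution \varphi_a \varphi_b in the truncated orthonormal set {\kappa}:
\varphi_a(r_1)\,\varphi_b(r_1) \;\approx\; \sum_{\kappa=1}^{N_\kappa} \kappa(r_1)
    \int \varphi_a(r)\,\varphi_b(r)\,\kappa(r)\,dr
  \;=\; \sum_{\kappa=1}^{N_\kappa} B_{ab\kappa}\,\kappa(r_1)

% Substitute into Eq. (1) and identify A_{cd\kappa} from Eq. (6b):
(ab|cd) \;\approx\; \sum_{\kappa=1}^{N_\kappa} B_{ab\kappa}
    \int\!\!\int \frac{\kappa(r_1)\,\varphi_c(r_2)\,\varphi_d(r_2)}{|r_1 - r_2|}\,dr_1\,dr_2
  \;=\; \sum_{\kappa=1}^{N_\kappa} B_{ab\kappa}\,A_{cd\kappa}
```

The $ab$ pair and the $cd$ pair each remain bound inside a single factor, which is why this form helps with the Coulomb-type sum of Eq. (2a).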

We have only mentioned the calculation of Eq. (2a) with the Beebe-Linderberg or RI methods. If a sum such as that of Eq. (2b) is required, both of these methods fail to produce any advantage, since the decompositions are both inseparable in the charge distributions. An alternative method is to use a truncated resolution of the identity with delta functions in physical space as the expansion basis. This is simply a fancy way of stating that we wish to consider a quadrature method, where the Coulomb potential is to be evaluated analytically on a physical space grid. This last point is of the utmost importance in practice, as numerical integration over the Coulomb singularity would require a tremendous density of grid points to achieve accurate results. The quadrature approximation is written as

$$ (ab|cd) = \int\!\!\int \frac{\varphi_a(r_1)\,\varphi_b(r_1)\,\varphi_c(r_2)\,\varphi_d(r_2)}{|r_1 - r_2|}\, dr_1\, dr_2 \approx \sum_g \varphi_a(r_g)\,\varphi_b(r_g) \int \frac{\varphi_c(r)\,\varphi_d(r)}{|r - r_g|}\, dr , \tag{7} $$

where $g$ indexes grid points whose locations are given by $r_g$. The summation over the grid points results from the integration over the expansion functions, $\delta(r - r_g)$. This allows separation of the $ab$ charge density, and therefore we can apply the factorization fruitfully to Eq. (2b) as follows:

$$ \sum_{cd} (ac|bd)\, C_{cd} \approx \sum_g \varphi_a(r_g) \left( \sum_d A_{bd}(r_g) \left( \sum_c \varphi_c(r_g)\, C_{cd} \right) \right) , \tag{8a} $$

$$ A_{bd}(r_g) = \int \frac{\varphi_b(r)\,\varphi_d(r)}{|r - r_g|}\, dr . \tag{8b} $$

By storing the results of the summations in parentheses into intermediate arrays, this costs $O(MN^2)$ floating-point operations, where $M$ denotes the number of grid points used. Similar approaches have been tried in the past, primarily in connection with the Hartree-Fock procedure.¹¹,¹² Quadrature methods have also been used to evaluate the integrals directly when the basis functions are Slater-type orbitals.¹³

At this point it should be quite clear to the reader how a scaling advantage can be obtained through the use of an integral factorization. An obvious improvement of Eq. (8) is the use of a different weight for each grid point. The problem which remains is that obtaining chemical accuracy using Eq. (8) requires an inordinate number of grid points. This is exactly the procedure commonly used in density-functional theory codes, where the complicated nature of the exchange-correlation functional precludes the analytic evaluation of the required integrals. However, the number of grid points required is large: typical density-functional codes use $O(10^4)$ points per atom. If the pseudospectral method required such dense grids, the scaling advantage would be seen¹⁴ only in calculations involving more than 40 heavy atoms (assuming the use of a valence double-zeta plus polarization basis set). Such calculations are possible using conventional methods, but are by no means routine. Therefore the applicability of the method would be quite limited.

Clearly, an efficient quadrature-based approach to ab initio methods will need to be able to use rather sparse grids. The pseudospectral method allows this by providing a new perspective on the quadrature procedure. To see this, it is instructive to rewrite Eq. (8) in matrix notation as follows

$$ I_{ab,cd} = \mathbf{R}^T \mathbf{A}^{cd}\, \mathbf{R} , \tag{9} $$

$$ R_{ga} = \varphi_a(r_g) , \tag{10} $$

$$ A^{cd}_{gg'} = \delta_{g,g'}\, A_{cd}(r_g) . \tag{11} $$

The matrix $\mathbf{R}$ clearly describes a transformation from spectral space, that is, function space, to physical space, i.e., "grid space." It is thus plausible to interpret Eq. (9) as an operator transformation, which would imply that the matrix $\mathbf{R}^T$ should describe the transformation from physical space to spectral space. The pseudospectral prescription is therefore that $\mathbf{R}^T$ be replaced by $\mathbf{R}^{-1}$. Note that if we choose $\mathbf{R}$ to be orthogonal, $\mathbf{R}^T = \mathbf{R}^{-1}$, and the pseudospectral method yields the same equations as the quadrature prescription. However, there is already an important difference because the restriction that $\mathbf{R}$ be orthogonal removes the complete freedom in grid point placement which is present in the general quadrature scheme. This reminds one of Gaussian quadrature, where higher accuracy (relative to, e.g., Newton-Cotes quadrature¹⁵) is obtained by judicious placement of grid points.¹⁶,¹⁷

At this point it is appropriate to make a connection with the discrete variable representation (DVR) method of Light and coworkers,¹⁸ which is a type of pseudospectral method. They choose the grid points as eigenvalues of the spectral representation, which they call the finite basis representation (FBR), of the coordinate operator. The transformation matrix $\mathbf{R}$ is then given by the matrix of eigenvectors. The fundamental equations of the DVR are therefore

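$$ \mathbf{X}^{DVR} = \mathbf{R}\, \mathbf{X}^{FBR}\, \mathbf{R}^T , \tag{12} $$

$$ X^{FBR}_{ij} = \int \varphi_i(x)\, \varphi_j(x)\, x\, dx , \tag{13} $$

$$ (\mathbf{V}^{DVR})_{ij} = \delta_{ij}\, V(X^{DVR}_{ii}) , \tag{14} $$

$$ (\mathbf{T}^{FBR})_{ij} = -\frac{\hbar^2}{2m} \int \varphi_i(r)\, \nabla^2 \varphi_j(r)\, dr , \tag{15} $$

$$ \mathbf{H}^{DVR} = \mathbf{R}\, \mathbf{T}^{FBR}\, \mathbf{R}^T + \mathbf{V}^{DVR} . \tag{16} $$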

Here it is understood that $\mathbf{X}^{DVR}$ is a diagonal matrix, rendering $\mathbf{R}$ unitary in general and orthogonal if the spectral basis functions are real. Equations (14)-(16) make reference to the decomposition of the Hamiltonian matrix $\mathbf{H}$ in terms of kinetic ($\mathbf{T}$) and potential ($\mathbf{V}$) energy components. Of particular interest in our development is the fact that the kinetic energy operator is represented in a spectral basis while the potential energy operator is represented in physical space. This is natural because the kinetic energy operator involves differentiation which is most conveniently and accurately carried out on a manifold of continuous functions, while the potential energy operator is typically local in physical space and hence most efficiently evaluated in physical space. The strength of pseudospectral methods is precisely the recognition of this fact, that is, neither spectral nor physical space alone can provide an efficient method for solving the electronic structure problem.
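Equations (12) and (14) can be checked on a small one-dimensional example. The sketch below assumes a spectral basis of three orthonormal Legendre functions on $[-1, 1]$, which is an illustrative choice and not a basis used in this chapter. For this basis the eigenvalues of $\mathbf{X}^{FBR}$ are the three-point Gauss-Legendre nodes, and the eigenvector matrix has elements $\sqrt{w_g}\,\varphi_i(x_g)$, so the square root of the quadrature weight is absorbed into $\mathbf{R}$. The nodes and weights are entered directly instead of being obtained by diagonalization, and all names are hypothetical.

```python
import math

# Three-point Gauss-Legendre nodes and weights (the DVR grid for this basis).
NODES = [-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0)]
WEIGHTS = [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]

# Coordinate operator in the spectral (FBR) basis, Eq. (13), evaluated analytically.
X_FBR = [
    [0.0, 1.0 / math.sqrt(3.0), 0.0],
    [1.0 / math.sqrt(3.0), 0.0, 2.0 / math.sqrt(15.0)],
    [0.0, 2.0 / math.sqrt(15.0), 0.0],
]


def basis(i, x):
    """Orthonormal Legendre functions of degree 0, 1, 2 on [-1, 1]."""
    if i == 0:
        return math.sqrt(0.5)
    if i == 1:
        return math.sqrt(1.5) * x
    return math.sqrt(2.5) * 0.5 * (3.0 * x * x - 1.0)


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def transformation_matrix():
    """R[g][i] = sqrt(w_g) * phi_i(x_g); orthogonal for this grid and basis."""
    return [[math.sqrt(w) * basis(i, x) for i in range(3)]
            for x, w in zip(NODES, WEIGHTS)]


def dvr_coordinate_matrix():
    """Eq. (12): X_DVR = R X_FBR R^T, which should come out diagonal."""
    r = transformation_matrix()
    return matmul(matmul(r, X_FBR), transpose(r))


def dvr_potential(v):
    """Eq. (14): the potential is diagonal, evaluated at the grid points."""
    x = dvr_coordinate_matrix()
    n = len(x)
    return [[v(x[i][i]) if i == j else 0.0 for j in range(n)] for i in range(n)]
```

The matrix returned by `dvr_coordinate_matrix()` is diagonal to rounding error with the grid points on the diagonal, and `dvr_potential(lambda x: x * x)` gives the corresponding diagonal potential matrix for a harmonic-type potential.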

In order to obtain chemical accuracy in solving the electronic structure problem using pseudospectral methods, we must proceed a step further and ask where the error in the method comes from. Perhaps it is clear that, if the grid is sufficiently dense, the individual integrals as expressed by Eq. (9) with $\mathbf{R}^{-1}$ in place of $\mathbf{R}^T$ will be arbitrarily close to the values obtained using Eq. (1). However, as stated up to this point, the pseudospectral method is defined only when $\mathbf{R}^{-1}$ exists, mandating the use of equal numbers of basis functions and grid points. Indeed, we could simply increase both the numbers of basis functions and grid points in order to garner more accuracy. However, we wish to introduce the pseudospectral method as a means to approximate conventional spectral results. Such a desire might seem rather peculiar, as it is tantamount to saying we desire the pseudospectral method to be a good approximation to another approximation. (Recall that conventional spectral methods are not exact.) Yet it is crucial that this be the case for two reasons. First, there is an abundance of experience and literature concerning the accuracy which can be expected using various basis sets in conventional methods. Secondly, spectral methods, as used in electronic structure theory, have an uncanny ability to provide accurate values for energy differences which have errors that are much smaller than those of the absolute total energies. This is due to a balanced treatment of molecules at different geometries, which is in turn a consequence of the global nature of the basis functions used. Numerical methods generally find it difficult to reproduce this property. A method which is designed to mimic conventional methods allows all the accumulated experience with basis sets to be carried over from the literature, as well as approximating the cancellation of error which has made quantum chemistry a viable proposition.

In order to increase the size of the physical space grid without altering the number of basis functions used, we turn to the idea of a generalized inverse, $\mathbf{Q}$.¹⁹,²⁰ Thus,

$$ \mathbf{Q} = \mathbf{S}\, (\mathbf{R}^T \mathbf{w} \mathbf{R})^{-1}\, \mathbf{R}^T \mathbf{w} , \tag{17} $$

where $\mathbf{S}$ is the analytic overlap matrix between spectral basis functions, and $\mathbf{w}$ is the diagonal matrix of weights mentioned earlier. Note that $\mathbf{w}$ must be such as not to render $\mathbf{R}^T \mathbf{w} \mathbf{R}$ singular. This expression is applicable for rectangular $\mathbf{R}$, and provides the best transformation from physical space to spectral space in the least-squares sense. The overlap matrix is in the expression because it provides the proper metric for the spectral space, but it is most easily understood by realizing that the analogous quadrature prescription for the overlap matrix element is

$$ S_{ij} \approx \sum_g R_{gi}\, R_{gj} = (\mathbf{R}^T \mathbf{R})_{ij} , \tag{18} $$

which is given as

$$ S_{ij} \approx (\mathbf{Q}\mathbf{R})_{ij} , \tag{19} $$

after replacement of $\mathbf{R}^T$ with the generalized inverse matrix $\mathbf{Q}$. Use of Eq. (17) for $\mathbf{Q}$ clearly satisfies Eq. (19), which should be so given our development. From a quadrature perspective, the use of Eq. (17) is equivalent to allowing the weighting function to be different for each basis function.

The final fundamental aspect of the pseudospectral method which remains to be discussed is the use of dealiasing functions. Since the pseudospectral method involves the transformation between two incomplete spaces, errors analogous to those called aliasing in discrete Fourier transform (DFT) theory²¹ will appear. In particular, if a function which is not bandwidth-limited is subject to a DFT on a grid which does not support sufficiently high frequencies, the high frequency components will project onto low frequency components and contaminate the spectrum. The problem arises because the high frequency functions are indistinguishable from certain low frequency functions when represented on the grid. In the context of signal processing,²² this is solved by oversampling and filtering. The same problem will arise when using pseudospectral methods to solve the electronic structure problem, but the solution is not as clear since the basis functions used are more complicated (atom-centered Cartesian Gaussians) than those used in the DFT (sines and cosines). Nevertheless, the general approach is the same: oversample and filter prior to the back-transformation.

Perhaps we can make this more clear by writing the following:

$$ A_{ab}(r)\, \varphi_c(r) = \sum_{i=1}^{N} c_i\, \varphi_i(r) + \sum_{i=N+1}^{\infty} c_i\, \varphi_i^{alias}(r) , \tag{20} $$

where $\varphi_i^{alias}$ comprise a set of functions orthogonal to the basis which bring it to completeness. Thus, Eq. (20) is formally exact, since the basis is complete. Introducing the grid, we obtain

$$ A_{ab}(r_g)\, \varphi_c(r_g) = \sum_{i=1}^{N} c_i\, \varphi_i(r_g) + \sum_{i=N+1}^{\infty} c_i\, \varphi_i^{alias}(r_g) . \tag{21} $$

If we now proceed to transform to spectral space using $\mathbf{Q}$, the problem becomes clear:

$$ (\mathbf{Q} \mathbf{A}^{ab} \mathbf{R}^c)_d = \sum_g Q_{dg} \sum_{i=1}^{N} c_i\, \varphi_i(r_g) + \sum_{i=N+1}^{\infty} c_i \sum_g Q_{dg}\, \varphi_i^{alias}(r_g) , \tag{22} $$

where $\mathbf{R}^c$ represents the $c$th column of $\mathbf{R}$. The first term is the exact integral $(ab|cd)$, while the second term is completely spurious. Notice that the innermost sum in the second term is an overlap integral represented on the grid. Since we have defined the alias as being orthogonal to the spectral basis functions, the entire second term should vanish. In the limit of an infinitely dense grid, the grid represents these overlaps perfectly and the adverse effects of aliasing disappear. However, if the grid is sparse the second term will not vanish. There are two solutions. First, we can expand the grid such that it better represents the overlap between basis functions and the dominant components of the alias. Alternatively, we can include the dominant components of the alias in the formation of $\mathbf{Q}$, projecting them exactly. This is equivalent to the oversampling and subsequent filtering mentioned above and is accomplished by using functions orthogonal to the basis as dealiasing functions and interpreting $\mathbf{R}$ and $\mathbf{R}^T$ in Eq. (17) as matrices of dimension $(M \times (N_B + N_D))$ and $((N_B + N_D) \times M)$ respectively, where $M$, $N_B$ and $N_D$ are the numbers of grid points, basis functions and dealiasing functions. The overlap matrix is formally $((N_B + N_D) \times (N_B + N_D))$, but since we will never want integrals involving dealiasing functions, we only need the upper $(N_B \times (N_B + N_D))$ block. In the limit of exact dealiasing, the pseudospectral representation of the integrals will be exact, but one should keep in mind that the number of grid points must be at least as large as $N_B + N_D$. Note that the number of dealiasing functions used only affects the cost of forming $\mathbf{Q}$ provided one does not use so many as to require increasing the grid size.

The issue of which functions to include as dealiasing functions is most easily addressed by realizing that they should bring the basis as close to completeness as possible. Therefore, one can restrict the dealiasing functions to be atom-centered Cartesian Gaussians and use standard quantum chemical methods for extending basis sets. This has the advantage of allowing the dealiasing functions to be treated on the same footing as the basis functions themselves in the formation of $\mathbf{R}$ and $\mathbf{S}$ needed for $\mathbf{Q}$, and is the approach currently used. For most correlated methods, the effort to form $\mathbf{Q}$ is negligible compared to the remainder of the calculation, so there is incentive to use better dealiasing sets. We are investigating the possibility of using more and/or better dealiasing functions at this time.²³

We conclude this section by summarizing the relationships between the various approaches presented. The Beebe-Linderberg and RI methods have the same goals as the pseudospectral method, but suffer from the inability to provide any advantage when treating exchange-type sums, as typified by Eq. (2b). The DVR is a pseudospectral method, but differs in implementation. The DVR prescription for $\mathbf{R}$ does not allow the freedom to add extra grid points in order to increase accuracy, and also fails to recognize the importance of the aliasing problem. The DVR is usually implemented within a direct-product basis, which would be a serious disadvantage for electronic structure methods. The pseudospectral method as used by Friesner and ourselves views the spectral representation as primary, while implementations of the DVR tend to view the physical (grid) space representation as primary. This is a matter of taste, but is important given that the pseudospectral electronic structure theory methods are trying to mimic spectral results.
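To make the role of the generalized inverse of Eq. (17) concrete, the following sketch builds $\mathbf{Q}$ for a rectangular $\mathbf{R}$ and checks that $\mathbf{Q}\mathbf{R}$ reproduces the analytic overlap matrix, as in Eq. (19). The setting is a deliberately simple assumption: a non-orthogonal basis of monomials $1, x, x^2$ on $[-1, 1]$, five equally spaced grid points, and unit weights. None of these choices, nor the helper names, come from the chapter.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inverse(m):
    """Gauss-Jordan inversion with partial pivoting (small dense matrices only)."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def generalized_inverse(r, w, s):
    """Eq. (17): Q = S (R^T w R)^{-1} R^T w, with w given as a list of weights."""
    rt = transpose(r)
    rtw = [[rt[i][g] * w[g] for g in range(len(w))] for i in range(len(rt))]
    return matmul(s, matmul(inverse(matmul(rtw, r)), rtw))


def build_example():
    """Hypothetical 1D setup: monomial basis, 5 grid points, unit weights."""
    grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
    weights = [1.0] * len(grid)
    nbasis = 3
    r = [[x ** i for i in range(nbasis)] for x in grid]          # R[g][i]
    s = [[2.0 / (i + j + 1) if (i + j) % 2 == 0 else 0.0         # analytic overlaps
          for j in range(nbasis)] for i in range(nbasis)]
    q = generalized_inverse(r, weights, s)
    return grid, r, s, q


def project(q, values):
    """Transform grid values of a function to spectral space with Q."""
    return [sum(qi[g] * values[g] for g in range(len(values))) for qi in q]
```

For any function lying entirely within the span of the basis, `project` returns the exact integrals of that function against each basis function, even though the grid has more points than there are basis functions. Functions with components outside the basis are where the aliasing error of Eq. (22) enters.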

4. Conventional Møller-Plesset Perturbation Theory

Møller-Plesset (MP) perturbation theory is the result of applying the traditional Rayleigh-Schrödinger perturbation formulae to electronic structure theory, using the sum of Fock operators as the zeroth-order Hamiltonian. We will restrict our discussion to closed-shell molecules for several reasons. First, there is reason to believe the perturbation expansion is only reliable when the zeroth-order reference is a pure eigenstate of spin.²⁴ The use of unrestricted Hartree-Fock (UHF) followed by MP is therefore questionable, particularly far from the equilibrium geometry where spin contamination of the UHF determinant is most severe. Secondly, although there do exist restricted open-shell Hartree-Fock²⁵⁻²⁷ (ROHF) and multi-reference²⁸,²⁹ generalizations of MP, they are still quite new and not thoroughly tested. Finally, the formulae are much simpler for the closed-shell case.

The usual Fock operator for a closed-shell system is

$$ F(i) = h(i) + \sum_{l=1}^{n} \left[ 2 J_l(i) - K_l(i) \right] , \tag{23} $$

where $i$ and $l$ refer to electrons and occupied orbitals, respectively. As usual, $h$ is the one-electron part of the Hamiltonian, and $J$ and $K$ are the Coulomb and exchange operators, respectively. The Møller-Plesset partitioning is then

$$ H_0 = \sum_i F(i) , \tag{24a} $$

$$ V = H - H_0 . \tag{24b} $$

The Hartree-Fock energy is recovered as the sum of the zeroth- and first-order energies obtained by a Rayleigh-Schrödinger perturbation series. Provided $H_0$ is diagonal, the general expression for the second-order energy in Rayleigh-Schrödinger perturbation theory is given by

$$ E^{(2)} = \sum_{n \neq 0} \frac{ |\langle \psi_0 | V | n \rangle|^2 }{ \langle \psi_0 | H_0 | \psi_0 \rangle - \langle n | H_0 | n \rangle } , \tag{25} $$

where $n$ runs over all configuration state functions (CSF)³⁰ except the reference CSF $\psi_0$. Since the Hamiltonian operator involves no higher than two-particle terms, the matrix element in the numerator will be zero if the CSF $n$ differs from the reference CSF by more than a double excitation. We can restrict the expansion to doubly-excited CSF's in the usual case that the reference CSF was determined in a manner consistent with Brillouin's theorem.

The precise form of the CSF's is still open, and we introduce the generator-state choice as it will be convenient later in the development. The generator-states are CSF's obtained from the reference through the action of the individual terms in the Hamiltonian. These CSF's are not the usual ones used in correlation treatments, but they have certain computational advantages, as demonstrated by Pulay, Saebø and Meyer.³¹ We can write the Hamiltonian operator in second-quantized form as

$$ H = \sum_{i,j} a_i^\dagger a_j\, (i|h|j) + \sum_{\substack{ij \\ k \ge l}} (1 + \delta_{kl})^{-1}\, a_i^\dagger a_j^\dagger a_k a_l\, (ij|kl) , \tag{26} $$

where $a_i^\dagger$ and $a_j$ are spin-orbital creation and annihilation operators, respectively. From this point on in our discussion, we need to distinguish between occupied (internal) and virtual (external) molecular orbitals. We adopt the convention that $i$, $j$, $k$, $l$ denote occupied orbitals, $a$, $b$, $c$, $d$ denote virtual orbitals, and $p$, $q$, $r$, $s$ denote atomic orbitals. For the purposes of operation counts and scaling analyses, $n$ will denote the number of occupied orbitals, $N_v$ the number of virtual orbitals, and $N$ the number of atomic orbitals. We will express a determinant excited with respect to the reference determinant as $\Phi_{i\alpha}^{a\alpha}$, meaning an electron in occupied spin-orbital $i\alpha$ has been excited to the spin-orbital $a\alpha$. Applying the terms in the Hamiltonian to the reference CSF and making spin explicit, we obtain

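$$ \psi_i^a = \sum_{\sigma=\alpha,\beta} a_{a\sigma}^\dagger a_{i\sigma} \psi_0 = \Phi_{i\alpha}^{a\alpha} + \Phi_{i\beta}^{a\beta}, \tag{27a} $$

$$ \psi_{ij}^{ab} = (1+\delta_{ij})^{-1} \sum_{\sigma,\sigma'=\alpha,\beta} a_{a\sigma}^\dagger a_{b\sigma'}^\dagger a_{i\sigma} a_{j\sigma'} \psi_0 = (1+\delta_{ij})^{-1} \left( \Phi_{i\alpha j\alpha}^{a\alpha b\alpha} + \Phi_{i\alpha j\beta}^{a\alpha b\beta} + \Phi_{i\beta j\alpha}^{a\beta b\alpha} + \Phi_{i\beta j\beta}^{a\beta b\beta} \right). \tag{27b} $$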

Notice that the internal orbital indices are restricted such that $i \geq j$, while there is no such restriction on the external orbital indices. This is consistent with the form of the Hamiltonian in Eq. (26), and leads to two CSF's for the case where two electrons are excited from a doubly-occupied orbital into different virtual orbitals. In fact, there is only one linearly independent CSF for this case, and this points out the fact that the generator-state basis can have linear dependencies among the CSF's. As long as one is careful to treat the two linearly dependent CSF's on an equal footing when optimizing their coefficients, this poses no problem for the closed-shell single determinant reference. This linear dependency has its advantages in leading to square matrices instead of triangular matrices. The former is well suited to modern computer architectures, even though it represents an increase in the number of operations by almost a factor of two.

From this point on, we will ignore the singly-excited CSF's, as these are not necessary for MP theory before fourth order. Apart from exhibiting linear dependency, the generator-state CSF's are also not orthonormal. The deviation from orthonormality is not very severe, and is given by

$$ \langle \psi_{ij}^{ab} | \psi_{ij}^{ab} \rangle = 2(2 - \delta_{ab}), \tag{28a} $$

$$ \langle \psi_{ij}^{ab} | \psi_{ij}^{ba} \rangle = 2(2\delta_{ab} - 1). \tag{28b} $$

In order to compensate for the partial non-orthogonality of the CSF's, Pulay, Saebø, and Meyer used a bi-orthogonal set, that is, they introduced a reciprocal basis $\langle \tilde\psi |$ such that

$$ \langle \tilde\psi_{ij}^{ab} | \psi_{kl}^{cd} \rangle = \delta_{ik}\delta_{jl}\delta_{ac}\delta_{bd}. \tag{29} $$

The explicit expansion for the reciprocal basis CSF's is then

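$$ \tilde\psi_{ij}^{ab} = \frac{1}{6}\left( \Phi_{i\alpha j\alpha}^{a\alpha b\alpha} + \Phi_{i\beta j\beta}^{a\beta b\beta} + 2\Phi_{i\alpha j\beta}^{a\alpha b\beta} + 2\Phi_{i\beta j\alpha}^{a\beta b\alpha} + \Phi_{i\alpha j\beta}^{b\alpha a\beta} + \Phi_{i\beta j\alpha}^{b\beta a\alpha} \right), \tag{30} $$

and the reader can easily verify that this satisfies the bi-orthogonality requirement. Now we can use the generator-state CSF's to reduce the sum in Eq. (25) down to the spatial orbital level. The final expression is

$$ E^{(2)} = \sum_{ijab} \frac{2(ia|jb)(ia|jb) - (ia|jb)(ib|ja)}{\varepsilon_i + \varepsilon_j - \varepsilon_a - \varepsilon_b}. \tag{31} $$

As expected, this is the standard recipe for the MP2 energy found in many textbooks.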
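As a minimal sketch of how Eq. (31) might be evaluated once the integrals $(ia|jb)$ and orbital energies are in hand, the following Python function loops over all occupied and virtual indices. The function name, the nested-list storage layout `ovov[i][a][j][b]`, and the numerical values are illustrative assumptions and are not taken from the chapter.

```python
def mp2_energy(ovov, eps_occ, eps_virt):
    """Closed-shell second-order energy in the form of Eq. (31).

    ovov[i][a][j][b] is assumed to hold the spatial-orbital integral (ia|jb).
    """
    n, nv = len(eps_occ), len(eps_virt)
    e2 = 0.0
    for i in range(n):
        for j in range(n):
            for a in range(nv):
                for b in range(nv):
                    iajb = ovov[i][a][j][b]
                    ibja = ovov[i][b][j][a]
                    denom = eps_occ[i] + eps_occ[j] - eps_virt[a] - eps_virt[b]
                    # numerator 2(ia|jb)(ia|jb) - (ia|jb)(ib|ja)
                    e2 += iajb * (2.0 * iajb - ibja) / denom
    return e2


# Hypothetical two-orbital model: one occupied and one virtual orbital.
K = 0.18                      # made-up value of the single integral (ia|ia)
eps_occ = [-0.58]             # made-up orbital energies
eps_virt = [0.67]
ovov = [[[[K]]]]
e2 = mp2_energy(ovov, eps_occ, eps_virt)
```

For this one-occupied, one-virtual model the sum collapses to $K^2 / [2(\varepsilon_i - \varepsilon_a)]$, which gives a quick consistency check on the loop structure.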

We have alluded previously to the possibility of formulating MP in a basis where $\hat H_0$ is not diagonal. In order to show this, we present the equations for Rayleigh-Schrödinger perturbation theory through third order in a somewhat obscure form, which is nevertheless quite easy to derive$^{32}$:

$$ \psi^{(1)} = -(\hat H_0 - E_0)^{-1} \hat V \psi_0, \tag{32} $$

$$ E^{(2)} = \langle \psi^{(1)} | \hat V | \psi_0 \rangle, \tag{33} $$

$$ E^{(3)} = \langle \psi^{(1)} | \hat H - E_0 - E_1 | \psi_0 + \psi^{(1)} \rangle. \tag{34} $$

These equations do not require that $\hat H_0$ be diagonal, although it should be clear that the matrix inverse in Eq. (32) is trivial if it is diagonal. Since the dimension of the matrix $(\hat H_0 - E_0)$ is equal to the number of CSF's, explicit formation of the matrix inverse would take a prohibitive amount of computer time. Recall that matrix inversion is cubic in the dimension of the matrix, and the number of CSF's is roughly $n^2N^2$, making explicit inversion an $O(n^6N^6)$ process.$^{33}$
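To make Eqs. (32)-(34) concrete, here is a small sketch for the trivial case in which $\hat H_0$ is diagonal, so that the inverse in Eq. (32) reduces to a division. The 3 x 3 model matrices are invented purely for illustration, and the first basis function plays the role of $\psi_0$; none of this is taken from the chapter.

```python
def rs_energies(h0_diag, v):
    """Zeroth- through third-order energies from Eqs. (32)-(34), diagonal H0."""
    dim = len(h0_diag)
    e0 = h0_diag[0]                  # <psi0|H0|psi0>
    e1 = v[0][0]                     # <psi0|V|psi0>
    # Eq. (32): psi1 = -(H0 - E0)^{-1} V psi0, restricted to states n != 0
    psi1 = [0.0] + [-v[n][0] / (h0_diag[n] - e0) for n in range(1, dim)]
    # Eq. (33): E2 = <psi1|V|psi0>
    e2 = sum(psi1[n] * v[n][0] for n in range(dim))
    # Eq. (34): E3 = <psi1| H - E0 - E1 |psi0 + psi1>
    w = [1.0] + psi1[1:]             # psi0 + psi1
    hw = [sum(v[m][n] * w[n] for n in range(dim)) + (h0_diag[m] - e0 - e1) * w[m]
          for m in range(dim)]
    e3 = sum(psi1[m] * hw[m] for m in range(dim))
    return e0, e1, e2, e3, psi1


# Hypothetical model problem
h0_diag = [0.0, 1.0, 2.5]
v = [[0.05, 0.10, -0.20],
     [0.10, -0.03, 0.15],
     [-0.20, 0.15, 0.02]]
e0, e1, e2, e3, psi1 = rs_energies(h0_diag, v)
```

For a diagonal $\hat H_0$ the value of `e2` coincides with the sum-over-states expression of Eq. (25), and `e3` coincides with the more familiar $\langle \psi^{(1)} | \hat V - E_1 | \psi^{(1)} \rangle$.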

Fortunately, it is only the action of the inverse on a known vector which we require. Therefore, we can use iterative techniques based on the fundamental matrix-vector multiplication and solve the problem without ever solving for the inverse explicitly. In particular, we left-multiply both sides of Eq. (32) by $(\hat H_0 - E_0)$ to obtain

$$ (\hat H_0 - E_0)\psi^{(1)} = -\hat V \psi_0, \tag{35} $$

which has the form $Ax = b$ when given a matrix realization by expanding in a basis. This is a standard problem in linear algebra, and iterative methods can be used. Such methods are invariably based on the definition of a residual

$$ r = A x^{(i)} - b, \tag{36} $$

which is zero when $x^{(i)}$, the $i$th guessed solution, solves $Ax = b$ exactly. Given an approximate inverse of $A$, $\tilde A^{-1}$, we can improve a guess at $x$ by

$$ x^{(i+1)} = x^{(i)} - \tilde A^{-1} r. \tag{37} $$

The origin of this updating formula is evident when one realizes that it provides the exact solution $x$ in one iteration given $\tilde A = A$. Further iteration is required because $\tilde A^{-1}$ is only an approximate inverse of $A$. The procedure is very efficient provided convergence is achieved in a few iterations and $Ax = b$ needs to be solved for only a few right-hand sides $b$. There exist a number of schemes for accelerating convergence which improve the updating prescription of Eq. (37).$^{34}$

In order to implement the iterative inversion scheme, we need an expression for the residual, given the generator-state CSF basis we have chosen. This has been derived by Pulay and Saebø,$^{35}$ who originally presented the approach we detail in terms of Hylleraas variational perturbation theory.$^{36}$ The expression derived for the residual is

$$ T_{ij} = K_{ij} + F C_{ij} S + S C_{ij} F - \sum_k S \{ F_{ik} C_{kj} + F_{jk} C_{ki} \} S. \tag{38} $$

The residual and wave function coefficient vectors have been collected into $n(n+1)/2$ matrices of dimension $N_v \times N_v$ in order to achieve a compact notation. The following property of the wave function coefficient matrices (which can be proven from the transformation properties of the CSF basis) is assumed in evaluating the last sum

$$ C_{ij} = C_{ji}^{T}. \tag{39} $$

The matrices $S$ and $F$ are the orbital overlap and Fock matrices, and the generalized exchange operator has been used

$$ (K_{ij})_{ab} = (ia|jb). \tag{40a} $$

For future use, we introduce the analogous generalized Coulomb operator

$$ (J_{ij})_{ab} = (ij|ab). \tag{40b} $$

The only requirement on the virtual orbitals in Eq. (38) is that they span a space which is orthogonal to the occupied orbitals. If the method is to be used without further approximation, this can be achieved by using atomic orbitals in place of virtual orbitals and subsequently left- and right-multiplying the residual by a projection operator which removes the contribution of the occupied orbitals as

$$ T_{ij} = \left(1 - \sum_l |l\rangle\langle l|\right) T_{ij}^{\mathrm{AO}} \left(1 - \sum_l |l\rangle\langle l|\right)^{T}. \tag{41} $$

When the locality of the atomic orbitals provides computational advantage, this form is preferred.

Using the residual matrices given above, the second- and third-order energies as given by Eqs. (33) and (34) are evaluated as sums of matrix traces

$$ E^{(2)} = \sum_{i \geq j} \langle K_{ij} \tilde C_{ji} \rangle, \tag{42a} $$

$$ E^{(3)} = \sum_{i \geq j} \langle T_{ij}^{(2)} \tilde C_{ji} \rangle, \tag{42b} $$

where $\tilde C_{ji}$ represents the wave function coefficient matrix $C_{ji}$ in the reciprocal space, and the angle brackets denote the matrix trace operation. The third-order energy involves a residual including the interactions between double substitutions

$$ T_{ij}^{(2)} = K_{ij} + K[C_{ij}] + F C_{ij} S + S C_{ij} F + B_{ij} S + S B_{ji}^{\dagger} + S (G_{ij} + G_{ji}^{\dagger}) S, \tag{43} $$

$$ B_{ij} = \sum_k \left\{ \left(K_{ik} - \tfrac{1}{2} J_{ik}\right)(2C_{kj} - C_{jk}) - \tfrac{1}{2} J_{ik} C_{jk} - J_{jk} C_{ik} \right\}, \tag{44} $$

$$ G_{ij} = \sum_{k \geq l} (1 + \delta_{kl})^{-1} \left\{ (ij|kl) - \delta_{lj} F_{ik} - \delta_{ki} F_{jl} \right\} C_{kl}, \tag{45} $$

$$ (K[C_{ij}])_{ab} = \sum_{cd} (ac|bd)\, C_{ij}^{cd}. \tag{46} $$

The new matrices just defined are best interpreted as representing interactions between various classes of doubly-substituted CSF's. The external exchange matrix, $K[C_{ij}]$, arises from the interaction of double-substitutions which differ only in the virtual orbital occupation pattern. Similarly, the matrix $G_{ij}$ describes the interaction between doubly-substituted CSF's which only differ in the occupied orbital occupation pattern. Finally, the matrix $B_{ij}$ represents mixed interactions, where both the virtual and occupied occupation patterns of the CSF's differ.

The stage is now set for the computation of first-order wave functions and second- and third-order energies using either a canonical or an arbitrary orbital basis. The determination of the second-order energy is dominated by the integral transformation, scaling as $O(nN^4)$. This is the cost of the first step of the transformation, recognizing that only integrals with the occupied-virtual structure of $(ia|jb)$ are required. In particular,

$$ (iq|rs) = \sum_p U_{ip} (pq|rs), \tag{47} $$

where $U$ is the matrix of Fock operator eigenvectors. The remaining effort scales as $O(n^2N^3)$, regardless of whether Eq. (31) or Eqs. (38) and (42a) are used. However, the iterative inversion procedure has a larger prefactor due to the iterative cycles.

The computation of the third-order energy scales as $O(n^2N^4 + n^3N^3)$, corresponding to the assembly of $n^2$ external exchange ($K[C_{ij}]$) and $B_{ij}$ operators respectively. Again, the scaling is the same for the canonical orbital approach (which we have not explicitly shown) and the orbital-invariant approach. Furthermore, no iteration is required for the MP3 energy once the first-order wave function has been determined, so the two approaches will not differ significantly in actual operation counts.

5. Pseudospectral Møller-Plesset Perturbation Theory

As we have mentioned previously, the pseudospectral method applied to electronic structure theory uses the spectral space as a primary space. This means that all one-electron integrals are evaluated analytically, and the physical space representation is used only for the two-electron repulsion integrals of Eq. (1). Thus, the Hamiltonian is not split into kinetic and potential energy parts, but rather into one- and two-electron parts. Traditionally, pseudospectral methods have used the former partitioning, since the spectral basis is well-suited to the representation of kinetic energy while the physical space basis is appropriate for the representation of the potential energy. Unlike pseudospectral methods applied to quantum dynamics,$^{37,38}$ the kinetic energy operator is not diagonal in the chosen spectral basis and there is therefore no compelling reason to use the traditional partitioning. As Friesner has noted,$^{3}$ the high accuracy expected in electronic structure theory makes it crucial to use analytic integrals wherever possible. Since the one-electron integrals are all very inexpensive to evaluate, there is no reason to use numerical approximations for them.

The application of the pseudospectral approximation to any electronic structure theory method is therefore reduced to making the substitution

$$ (pq|rs) \rightarrow \sum_g Q_{pg} A_{rs}(g) R_{gq}, \tag{48} $$

followed by algebraic rearrangement of the resulting equations to gain any computational advantage. For MP2, we apply this procedure to the dominant step, the formation of the internal exchange matrix. Conventional methods would use Eq. (47), but with the pseudospectral approximation

$$ (K_{ij})_{pq} = \sum_g \left( \sum_r U_{ir} Q_{rg} \right) R_{gp} \left( \sum_s U_{js} A_{qs}(g) \right), \tag{49} $$

where we assume the use of the atomic orbital basis in the formation of the residual. The two terms in parentheses are evaluated and stored at a cost of $O(nNM)$ and $O(nMN^2)$, respectively. The final summation over the grid takes $O(n^2N^2M)$ operations. Recall that the dominant step of Eq. (47) took $O(nN^4)$ operations. Use of Eq. (49) is therefore faster than Eq. (47) by a factor of $N^2/nM$. Since the number of gridpoints is proportional to the number of basis functions, this is not a true scaling advantage. However, it does lead to a reduced operation count and is easily cast in terms of matrix multiplications, making it very efficient on modern computer architectures. More importantly, the dominant step is the final summation over the grid. Recall that $R$ describes the basis function values at the gridpoints and we are using atom-centered Cartesian Gaussian functions for the spectral basis. These basis functions are localized in physical space. Therefore, many terms in the final summation will have a negligible effect on the final outcome, since $R_{gp}$ will be very small at many gridpoints. We can take advantage of this by omitting terms for which $R_{gp}$ is below some predetermined tolerance, leading to an asymptotic (for large molecules) scaling of $O(n^2N^2)$, and concomitant scaling advantage of $N$. Friesner and coworkers$^{39}$ have coined the phrase "using cutoffs" to describe the procedure of limiting the sum to terms of non-negligible magnitude.
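The contraction order of Eq. (49), together with a cutoff on $R_{gp}$, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the one-dimensional grid, the Gaussian-like `rmat`, and especially the placeholder `qmat` and `amat` arrays are invented here and do not represent the actual least-squares operator $Q$ or the potential integrals $A_{rs}(g)$ of a real pseudospectral code.

```python
import math

def significant_points(rmat, tol):
    """For each basis function p, list the grid points g with |R_gp| > tol."""
    m, nbf = len(rmat), len(rmat[0])
    return [[g for g in range(m) if abs(rmat[g][p]) > tol] for p in range(nbf)]

def ps_exchange(u_occ, qmat, rmat, amat, tol=0.0):
    """Assemble (K_ij)_pq for i >= j in the contraction order of Eq. (49)."""
    n, nbf, m = len(u_occ), len(qmat), len(rmat)
    # X[i][g] = sum_r U_ir Q_rg, cost O(n N M)
    x = [[sum(u_occ[i][t] * qmat[t][g] for t in range(nbf)) for g in range(m)]
         for i in range(n)]
    # Y[j][g][q] = sum_s U_js A_qs(g), cost O(n M N^2)
    y = [[[sum(u_occ[j][s] * amat[g][q][s] for s in range(nbf)) for q in range(nbf)]
          for g in range(m)] for j in range(n)]
    keep = significant_points(rmat, tol)
    k = {}
    for i in range(n):
        for j in range(i + 1):
            mat = [[0.0] * nbf for _ in range(nbf)]
            for p in range(nbf):
                for g in keep[p]:      # cutoff: skip grid points where R_gp is negligible
                    xr = x[i][g] * rmat[g][p]
                    for q in range(nbf):
                        mat[p][q] += xr * y[j][g][q]
            k[(i, j)] = mat
    return k

# Hypothetical toy data: three localized functions on a 17-point line grid.
centers = [0.0, 4.0, 8.0]
grid = [0.5 * g for g in range(17)]
rmat = [[math.exp(-(xg - c) ** 2) for c in centers] for xg in grid]        # R[g][p]
qmat = [[0.1 * rmat[g][p] for g in range(len(grid))]                       # Q[p][g], placeholder
        for p in range(len(centers))]
amat = [[[1.0 / (1.0 + abs(xg - 0.5 * (cq + cs))) for cs in centers]       # A[g][q][s], placeholder
         for cq in centers] for xg in grid]
u_occ = [[0.6, 0.5, 0.1], [0.2, -0.4, 0.7]]                                # two "occupied" orbitals

k_full = ps_exchange(u_occ, qmat, rmat, amat, tol=0.0)
k_cut = ps_exchange(u_occ, qmat, rmat, amat, tol=1e-6)
max_diff = max(abs(k_full[ij][p][q] - k_cut[ij][p][q])
               for ij in k_full for p in range(3) for q in range(3))
kept = sum(len(gs) for gs in significant_points(rmat, 1e-6))
```

Comparing `k_full` with `k_cut` shows the effect of the tolerance directly: with localized functions a sizeable fraction of the (grid point, basis function) pairs is skipped, while the resulting matrices change only at the level of the tolerance.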

Unfortunately, the scaling of the iterative procedure embodied by Eqs. (37) and (38) is unchanged within the pseudospectral approximation, remaining $O(n^2N^3)$. This makes it a possible bottleneck in the computation when Eq. (49) is used with cutoffs. For the cases we have examined, it is apparent that this will be true for large molecules. Therefore, there is no real overall scaling advantage, but rather the pseudospectral method is faster by roughly a factor of $N/n$. A reasonable estimate for this factor with organic molecules and polarized valence-double-zeta basis sets is approximately five. The pseudospectral advantage will increase as higher quality basis sets are used.
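The residual-based update referred to here, Eqs. (36) and (37), can be illustrated on a small dense linear system. In this sketch the approximate inverse $\tilde A^{-1}$ is taken to be the inverse of the diagonal of $A$ (a Jacobi-type choice); that choice, the function name, and the test matrices are assumptions made for illustration, and the code does not implement the generator-state residual of Eq. (38).

```python
def solve_iteratively(a, b, tol=1e-10, max_iter=200):
    """Solve A x = b with the update x <- x - Atilde^{-1} r of Eq. (37).

    Atilde is taken to be the diagonal of A, so its inverse is a division.
    Returns the solution estimate and the number of updates performed.
    """
    dim = len(b)
    x = [0.0] * dim
    for iteration in range(max_iter):
        # Eq. (36): residual r = A x - b
        residual = [sum(a[m][n] * x[n] for n in range(dim)) - b[m] for m in range(dim)]
        if max(abs(r) for r in residual) < tol:
            return x, iteration
        # Eq. (37) with the diagonal approximate inverse
        x = [x[m] - residual[m] / a[m][m] for m in range(dim)]
    return x, max_iter


# Hypothetical diagonally dominant system
a = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 5.0]]
b = [1.0, 2.0, 3.0]
x, n_iter = solve_iteratively(a, b)

# When A is itself diagonal, Atilde = A and a single update is exact.
x_diag, n_diag = solve_iteratively([[2.0, 0.0], [0.0, 5.0]], [1.0, 1.0])
```

Only matrix-vector products with $A$ are needed, which is the property that makes this kind of scheme attractive when $A$ is far too large to invert explicitly.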

In order to point out the utility of the cutoff procedure, we present timings for pseudospectral MP2 on several selected molecules in Table 1. These calculations were all performed at gas-phase equilibrium geometries, as determined by Gaussian 92.$^{40}$ The basis sets used are the 6-31G** family of Pople and coworkers,$^{41}$ and the grids are obtained from the "medium" grid parameters coded in PSGVB v0.00 (Ref. 42) based on the algorithm detailed by Friesner.$^{43}$ These grids have about 150 points per atom, but the exact number varies from case to case due to addition of grid points in the bonding region and merging of grid points which are closely spaced. Symmetry was used in finding the equilibrium geometry, but not in the MP2 calculations. We include conventional timings derived from both Gaussian 92 and a code we have written (SCEPTER)$^{44}$ for comparison. Notice that the pseudospectral method is slightly slower than our conventional code for the smallest molecule, acetylene, but is faster for the larger molecules. This behavior is normal, and arises from the fact that the physical space grid must cover the space of the molecule well. Therefore, hydrogen requires more gridpoints per basis function than a heavy atom. Furthermore, molecules which are nearly linear or planar require more gridpoints per basis function than globular molecules.

[Table 1: CPU timings for conventional and pseudospectral MP2 calculations; table body not recoverable.]


It is perhaps surprising to see that cutoffs are already quite efficient for acetylene, speeding up the integral transformation by more than a factor of two. This points out the extreme efficiency of using cutoffs based on $R$. The cutoff procedure is much less efficient when based on $A_{kl}(g)$, since this function decays as $1/r$.

Also note that the iterative procedure becomes more expensive than the integral transformation when cutoffs are used. This is a manifestation of the first basic bottleneck in correlation methods we have discussed earlier: the wave function itself. There are $O(n^2N^2)$ coefficients for the wave function we are optimizing; therefore, the scaling of the iterative procedure can never be less than $O(n^2N^3)$, since we must perform some sort of contraction of the wave function coefficients with the basic integrals. The only way to solve this problem is to find an alternate representation of the wave function, for example the localized representation of Saebø and Pulay.$^{45}$ Their method basically restricts the virtual space for each internal pair $ij$, reducing the number of coefficients to $O(n^2L)$, where $L$ is the average number of basis functions used to represent the correlating space for an internal pair. Future work on pseudospectral MP2 will involve combining the pseudospectral approximation with a wave function approximation like this one.

We could also formulate a pseudospectral MP2 in terms of canonical orbitals based on Eq. (31). The pseudospectral formula is similar to Eq. (49), except that one should fully transform all quantities before the final summation on the grid

$$ (ia|jb) = \sum_g \left( \sum_p U_{ip} Q_{pg} \right) \left( \sum_q U_{aq} R_{gq} \right) \left( \sum_s U_{bs} \left( \sum_r U_{jr} A_{rs}(g) \right) \right). \tag{50} $$

The final summation scales as $O(n^2N_v^2M)$, which can be significantly smaller than $O(n^2N^2M)$, since $N_v$ does not include occupied or frozen orbitals. However, the canonical orbitals are invariably delocalized, making the cutoff procedure useless. This is a common tradeoff in the application of the pseudospectral approximation to correlated methods which are naturally expressed in terms of molecular orbitals. Use of atomic orbital formulations leads to efficient cutoffs while precluding the efficient use of the reduced size of the virtual space. Formulation directly in the molecular orbital basis leads to the opposite situation. For small- and medium-sized molecules the molecular orbital approach is usually preferred, and one can see this from the timings presented in Table 1 under the heading Canonical. The pseudospectral canonical formulation is up to ten times faster than Gaussian 92, which uses a conventional (spectral) canonical orbital formulation. Notice that the cases shown correlate all electrons, and the pseudospectral advantage would be even greater in the common situation where the core orbitals are left uncorrelated.

The pseudospectral MP2 method is clearly worthwhile from the perspective of computational efficiency. However, if it is not sufficiently accurate, this is not useful. Thus, we turn to the question of accuracy. To date, we have only examined the accuracy of the pseudospectral energies. When the methods are established as reliable for energy determination, it will be desirable to test their reliability for other properties. Properties dependent on derivatives of the wave function with respect to nuclear coordinates are of particular interest, given the importance of ab initio geometry optimization and vibrational frequency determination. Friesner and coworkers have performed geometry optimizations at the Hartree-Fock level with great success. This bodes well for the correlated methods, but actual testing awaits the development of codes which calculate the required derivatives pseudospectrally.

We stress that when we speak of accuracy, we mean relative to the conventional answer; that is, we define the error in the pseudospectral method to be the difference between the conventional and pseudospectral answers, not the difference between the pseudospectral and exact answers. This is consistent with the previous discussion concerning the desire to mimic the properties of spectral methods. Furthermore, we judge the pseudospectral method by an absolute rather than relative error. Such a harsh criterion is required for chemical applications of electronic structure theory, and is largely the reason for the neglect of many numerical algorithms which sought to challenge spectral methods. The problem of course is that chemistry is concerned with energy differences on the order of tens or hundreds of millihartrees, while total energies are usually tens or hundreds of hartrees. The situation is somewhat brighter for correlated methods, since correlation energies for medium-sized organic molecules are generally less than a hartree. Thus the relative accuracy required for the pseudospectral correlated methods is much less than that required of pseudospectral Hartree-Fock. We have adopted one millihartree as our target absolute accuracy, since this is generally considered to be chemical accuracy.
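As a rough illustration of this point, one can compare the relative accuracy implied by a one millihartree target for two round, purely illustrative magnitudes: a correlation energy of about one hartree and a total energy of about one hundred hartrees (both chosen here only to match the orders of magnitude quoted above):

$$ \frac{10^{-3}\ E_h}{1\ E_h} = 10^{-3}, \qquad \frac{10^{-3}\ E_h}{100\ E_h} = 10^{-5}. $$

Under these assumed magnitudes, the correlated calculation can tolerate a relative error roughly two orders of magnitude larger than the Hartree-Fock calculation for the same absolute accuracy.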

In Table 2, we show the spectral and pseudospectral MP2 energies for the same set of molecules from Table 1. We also include the water molecule, which was omitted from Table 1 because the CPU time required spectrally and pseudospectrally is so little that comparisons would be meaningless. There is no error in the SCF energy because we have used the Fock matrix and its eigenvectors determined from a spectral SCF calculation. We have also tested the use of a Fock matrix from the PSGVB 1.00 code,$^{46}$ which uses certain classes of spectral integrals in order to ensure high accuracy in the SCF energy. The difference between these two approaches has always been less than 50 microhartrees, and they are therefore equivalent for our purposes. Thus accuracy is not affected if one starts with a spectral or pseudospectral SCF wave function and Fock matrix. Of course, if one has a choice, it is computationally advantageous to do the entire calculation pseudospectrally and save the time required to generate the spectral integrals.

We see that the pseudospectral MP2 correction is within 2.2 millihartrees of the spectral correction. This is representative of the accuracy we can easily obtain. In fact, we are showing here the worst case out of ten test molecules (glycine), whereas the error for other cases is typically less than 1.5 millihartrees. Although this is not within the target accuracy specified above, it is close, and we are confident that the error can be reduced by using a better dealiasing set or a slightly larger grid. We remind the reader that the only spectral two-electron integrals which we have used are those implicit in the Fock matrix. The actual MP2 calculation uses only pseudospectral integrals. The relative error and number of grid points used are also quoted in Table 2 to satisfy the reader's curiosity. We achieve a relative error of less than 0.5 percent in the correlation energy using a grid which is incredibly sparse by traditional quadrature standards. Remember that these are fully three-dimensional grids and do not incorporate symmetry in any way.

[Table 2: spectral and pseudospectral MP2 energies, relative errors and numbers of grid points; table body not recoverable.]

Unlike MP2, application of the pseudospectral approximation to MP3 does yield a true scaling advantage. We use the mapping of Eq. (48) in the most cumbersome steps of Eq. (43), the formation of the matrices $K[C_{ij}]$ and $B_{ij}$. The first of these is a simple contraction of the generalized exchange type represented by Eq. (2b). Thus, the factorization technique of Eq. (8) is used

$$ (K[C_{ij}])_{ab} = \sum_g Q_{ag} \left( \sum_d A_{bd}(g) \left( \sum_c R_{gc} C_{ij}^{cd} \right) \right). \tag{51} $$

The two parenthesized terms and the final sum over the grid take $O(n^2N^2M)$ operations each, compared with the spectral scaling of $O(n^2N^4)$. We examine only one of the terms in the formation of $B_{ij}$, as the rest are similar. The spectral form of the term we focus on is

$$ \left( \sum_k J_{jk} C_{ik} \right)_{ab} = \sum_{k,c} (jk|ac)\, C_{ik}^{cb}, \tag{52} $$

and the pseudospectral equivalent is

$$ \left( \sum_k J_{jk} C_{ik} \right)_{ab} = \sum_g Q_{ag} \left( \sum_k A_{jk}(g) \left( \sum_c R_{gc} C_{ik}^{cb} \right) \right). \tag{53} $$

The innermost parenthesized term and the final summation over the grid scale as $O(n^2N^2M)$, while the outermost parenthesized term scales as $O(n^3NM)$. This represents a scaling advantage of $nN/M$ over the spectral formulation. However, one must be careful to form partial contributions to all $n^2$ $B_{ij}$ matrices at once, rather than forming each one completely in turn. The latter procedure actually leads to a scaling of $O(n^3N^2M)$ or worse, depending on programming details. This leads to some complications in the program logic if the residual is not held entirely in core, but is still quite manageable. In a spectral implementation of MP3, it is advantageous to precompute the matrices $(K_{ik} - \tfrac{1}{2}J_{ik})$, which appear in the first term of Eq. (44), prior to multiplication with the wave function coefficients. However, this would destroy the pseudospectral advantage. In general, an efficient pseudospectral formulation can never pre-combine integrals, and therefore a penalty is paid in the form of extra matrix multiplications (a larger prefactor in the operation count).

In Table 3, we present timings for our spectral and pseudospectral MP3 codes and Gaussian 92. The pseudospectral timings are obtained without taking advantage of cutoffs. Our spectral and pseudospectral implementations of MP3 are still quite immature, as is evident from a comparison of the timings with Gaussian 92. Nevertheless, the results show quite clearly that an efficient pseudospectral formulation will be significantly faster than spectral formulations.
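Returning to the factorized external exchange of Eq. (51), the following sketch contrasts a direct four-index contraction in the spirit of Eq. (46) with the three staged contractions over the grid. All array contents are invented placeholders (they are not real $Q$, $A$ or $R$ quantities), and the function names are hypothetical; the point is only that both routes give the same matrix when the integrals have the factorized form of Eq. (48).

```python
import math

def external_exchange_direct(qmat, amat, rmat, cmat):
    """Contract explicit integrals (ac|bd) = sum_g Q_ag A_bd(g) R_gc with C^{cd}."""
    nbf, m = len(qmat), len(rmat)
    k = [[0.0] * nbf for _ in range(nbf)]
    for a in range(nbf):
        for b in range(nbf):
            for c in range(nbf):
                for d in range(nbf):
                    acbd = sum(qmat[a][g] * amat[g][b][d] * rmat[g][c] for g in range(m))
                    k[a][b] += acbd * cmat[c][d]   # four-index loop, as in Eq. (46)
    return k

def external_exchange_staged(qmat, amat, rmat, cmat):
    """Same quantity in the contraction order of Eq. (51)."""
    nbf, m = len(qmat), len(rmat)
    # innermost: Z[g][d] = sum_c R_gc C^{cd}
    z = [[sum(rmat[g][c] * cmat[c][d] for c in range(nbf)) for d in range(nbf)]
         for g in range(m)]
    # middle: W[g][b] = sum_d A_bd(g) Z[g][d]
    w = [[sum(amat[g][b][d] * z[g][d] for d in range(nbf)) for b in range(nbf)]
         for g in range(m)]
    # final sum over the grid: K_ab = sum_g Q_ag W[g][b]
    return [[sum(qmat[a][g] * w[g][b] for g in range(m)) for b in range(nbf)]
            for a in range(nbf)]

# Hypothetical toy data for a single pair ij
centers = [0.0, 1.5, 3.0]
grid = [0.25 * g for g in range(13)]
rmat = [[math.exp(-(xg - c) ** 2) for c in centers] for xg in grid]          # R[g][c]
qmat = [[0.1 * rmat[g][a] for g in range(len(grid))] for a in range(3)]     # Q[a][g], placeholder
amat = [[[1.0 / (1.0 + abs(xg - 0.5 * (cb + cd))) for cd in centers]        # A[g][b][d], placeholder
         for cb in centers] for xg in grid]
cmat = [[0.30, -0.10, 0.05],
        [0.20, 0.40, -0.25],
        [-0.15, 0.10, 0.35]]

k_direct = external_exchange_direct(qmat, amat, rmat, cmat)
k_staged = external_exchange_staged(qmat, amat, rmat, cmat)
```

Each of the three stages costs on the order of $N^2M$ operations per pair $ij$, which is the origin of the $O(n^2N^2M)$ count quoted for Eq. (51), whereas the direct route carries a four-index loop per pair.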

We feel that if one desires the accuracy of the third order correction,it makes sense to compute the first-order wave function spectrally. Thisrequires the computation of the spectral integrals in addition to the spectralMP2 procedure, but both of these combined will comprise less than 10%of the time required to evaluate the MP3 energy. The error in the MP3correction is decreased by an order of magnitude in this way, but moreimportantly, the error in the MP2 energy is reduced to zero. In Table 4,we present the spectral and pseudospectral MP3 energies for the same setof molecules used for MP2. The errors using a spectral first-order wavefunction are always less than 0.05 millihartree, easily falling within ourtarget threshold. If a pseudospectral first-order wave function is used, theerror in the MP3 correction is still less than 0.5 millihartree, but the error inthe total energy incorporates the error from the pseudospectral MP2. Wefind these errors to be of the same sign in general, but even where there is acancellation of error, it must be regarded as completely fortuitous. Again,this points out the need to use spectral quantities wherever it can be doneinexpensively.

Table 3. CPU timings in seconds on an FPS 521-EA for spectral and pseudospectral MP3 methods.

Molecule     Gaussian 92    SCEPTER    Pseudospectral
Acetylene            132        203               103
Butadiene          10936      16668              4603
Glycine            34302      45765             11812

Table 4. Accuracy of the pseudospectral MP3 method. All energies are in millihartrees. Values in parentheses are the error with respect to the spectral calculation.

                                Pseudospectral E(3)             Pseudospectral E(3)
Molecule     Spectral E(3)      (spectral psi(1))               (pseudospectral psi(1))
Water             -6.2086        -6.3010 (-0.0924)               -6.3227 (-0.1141)
Acetylene        -11.0787       -11.0966 (-0.0179)              -11.1853 (-0.1066)
Butadiene        -36.2909       -36.3355 (-0.0446)              -36.3931 (-0.1022)
Glycine          -17.0829       -17.0554 (+0.0275)              -17.1648 (-0.0819)
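The parenthesized entries of Table 4 are simply differences between each pseudospectral value and the spectral reference. A short sketch that recomputes them from the tabulated energies is given below; the dictionary layout and names are our own, and the numbers are copied from Table 4.

```python
# (spectral E(3), pseudospectral E(3) with spectral psi(1),
#  pseudospectral E(3) with pseudospectral psi(1)), all in millihartrees
table4 = {
    "Water":     (-6.2086, -6.3010, -6.3227),
    "Acetylene": (-11.0787, -11.0966, -11.1853),
    "Butadiene": (-36.2909, -36.3355, -36.3931),
    "Glycine":   (-17.0829, -17.0554, -17.1648),
}

def errors(row):
    """Error of each pseudospectral value relative to the spectral one."""
    spectral, ps_spectral_psi1, ps_ps_psi1 = row
    return (round(ps_spectral_psi1 - spectral, 4), round(ps_ps_psi1 - spectral, 4))

table4_errors = {name: errors(row) for name, row in table4.items()}

for name, (err_a, err_b) in table4_errors.items():
    print(f"{name:10s} {err_a:+.4f} {err_b:+.4f}")
```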

6. Conclusion

We have attempted to make the pseudospectral method as transparent as possible. We hope that by now the reader will agree with us when we say that the future for pseudospectral electronic structure methods in general, and correlated methods in particular, is quite bright. Reductions in computational time by a factor of five or ten are achieved without much effort, even though pseudospectral technology is still quite immature. The pseudospectral method is particularly efficient in the calculation of correlation energies, primarily because the pseudospectral error is well-described as a relative error and correlation energies are rather small. Therefore, it is easier to get the absolute accuracy desired using sparse grids in correlation methods than in SCF methods.

The pseudospectral method provides an easy way to take advantage of the locality of the basis sets in the form of cutoffs, which we have shown to be quite efficient even for small molecules in the MP2 method. Ultimately, we feel that correlation methods will use a combination of pseudospectral techniques and approximate representations of the wave function. Such an approach will address the two basic bottlenecks in correlation treatments and make calculation of correlated wave functions for large molecules routine, truly ushering in a new age of quantum chemistry.

Although there is reason to be very optimistic, we admit there is much more work to be done. Many practical calculations and calibrations need to be performed before one can state the error bars on the pseudospectral energies with confidence. Comparison of properties other than the energy from spectral and pseudospectral correlated wave functions must also be performed. Although the procedures developed by Friesner and coworkers for grid and dealiasing set design represent a monumental step forward, we are still far from knowing as much about these aspects of the pseudospectral method as we know about basis set optimization, for example. The rewards for thorough dealiasing are much larger in correlated methods than in SCF because the correlated methods are so much more expensive. Thus, using larger or better dealiasing sets will not affect the overall computational effort very much.

At this point we can say with confidence that the pseudospectral MP3 method is ready to be widely used. There is much program optimization to be done, but our preliminary implementation is already quite fast and the energies are interchangeable with spectral ones as far as most chemical applications are concerned. The pseudospectral MP2 method will achieve the target accuracy threshold soon, and we are excited at the prospect of using this method to study correlation effects in molecules of unprecedented size.

Acknowledgments

This work was supported by the Air Force Office of Scientific Research. E. A. C. is also grateful to the National Science Foundation, the Camille and Henry Dreyfus Foundation, and the Alfred P. Sloan Foundation for the Presidential Young Investigator, Teacher-Scholar, and Research Fellow Awards, respectively. T. J. M. also thanks the National Science Foundation for a graduate assistantship. We also thank R. A. Friesner for supplying us with a pseudospectral Hartree-Fock code, and M. N. Ringnalda and R. A. Friesner for many helpful discussions.

