accurate localized resolution of identity approach for...

15
Accurate localized resolution of identity approach for linear-scaling hybrid density functionals and for many-body perturbation theory Arvid Conrad Ihrig, 1, * urgen Wieferink, 1 Igor Ying Zhang, 1, Matti Ropo, 1, 2, 3 Xinguo Ren, 1, 4 Patrick Rinke, 1, 3 Matthias Scheffler, 1 and Volker Blum 1, 5, 1 Fritz-Haber-Institut der Max-Planck-Gesellschaft, 12149 Berlin-Dahlem, Germany 2 Tampere University of Technology, Department of Physics, FI-33101 Tampere, Finland 3 COMP/Department of Applied Physics, Aalto University School of Science, P.O. Box 11100, FI-00076 Aalto, Finland 4 Key Laboratory of Quantum Information, University of Science and Technology of China, Hefei, 230026, Anhui, China 5 Department of Mechanical Engineering and Materials Science, Duke University, Durham, NC 27708, USA (Dated: May 19, 2015) A key component in calculations of the exchange and correlation energies is the Coulomb operator, which requires the evaluation of two-electron integrals. For localized basis sets, these four-center integrals are most efficiently evaluated with the resolution of identity (RI) technique, which expands basis-function products in an auxiliary basis. In this work we show the practical applicability of a localized RI-variant (RI-LVL), which expands products of basis functions only in the subset of those auxiliary basis functions which are located at the same atoms as the basis functions. We demonstrate the accuracy of RI-LVL for Hartree-Fock calculations, for the PBE0 hybrid density functional, as well as for RPA and MP2 perturbation theory. Molecular test sets used include the S22 set of weakly interacting molecules, the G3 test set, as well as the G2-1 and BH76 test sets, and heavy elements including titanium dioxide, copper and gold clusters. Our RI-LVL implementation paves the way for linear-scaling RI-based hybrid functional calculations for large systems and for all-electron many-body perturbation theory with significantly reduced computational and memory cost. PACS numbers: I. INTRODUCTION With advances in high-performance computing and improved numerical algorithms, the predictive power of first-principles electronic-structure theory is increasing in terms of accuracy and tractable system size. Density- functional theory (DFT) using (semi-)local functionals has excelled because of a very favorable combination of computational efficiency and reasonable accuracy in solid-state theory and quantum chemistry. However, it is well documented that conventional DFT methods are not sufficient to properly describe certain types of bind- ing interactions. 16 It is often necessary to use hybrid functionals 7,8 or even higher-level methods like second- order Møller-Plesset perturbation theory 9 (MP2), GW methods 10,11 or the random-phase approximation and beyond 1214 to accurately treat e.g. cohesive energies, van der Waals interactions or reaction-barrier heights. For hybrid functionals and more sophisticated treat- ments of the electronic exchange and correlation a major computational bottleneck is the evaluation of the two- electron repulsion integrals for orbitals {ψ s (r)} (st|uv)= ZZ ψ * s (r)ψ * u (r 0 )ψ t (r)ψ v (r 0 ) |r - r 0 | drdr 0 . (1) The notation (a|b) refers to the Coulomb interaction between the functions indexed by a and b, respec- tively, throughout the paper. In particular, we have (ij |b)= RR ψ * i (r)ψ j (r)v(r - r 0 )ψ b (r 0 )drdr 0 and (a|b)= RR ψ * a (r)v(r - r 0 )ψ b (r 0 )drdr 0 where v(r)=1/ |r|. Upon replacing the orbitals by their expansions in a chosen set of basis functions {ϕ i (r)|i =1,...,N basis }, Equation 1 transforms into a sum over four different ba- sis function indices. The approach for evaluating these integrals largely depends on the basis set. For Gaussian or plane wave basis sets these integrals are given ana- lytically. For numeric atom-centered orbital 1523 (NAO) basis sets {ϕ i (r - R I )} on the other hand, these integrals with four different atomic positions R I , often called four- center integrals, have to be evaluated by explicit integra- tion. A straightforward approach for basis sets with analyt- ical solutions is possible, but due to the O(N 4 basis ) inte- grals this leads to unfavorable execution times for large systems. For NAOs, on the other hand, no analytical solution exists. Computing these integrals on the fly be- comes intractable as the system size grows. Precomput- ing them is also out of the question due to the memory and communication requirements for storing and reading this matrix with O(N 4 basis ) entries. A widely used approach to alleviate the computational time requirements of the four-center integral evaluation is the “resolution of identity” (RI), also known as “den- sity fitting”. 2430 The central idea is to expand orbital products into an auxiliary basis {P μ (r)}: ϕ i (r)ϕ j (r) := ρ ij (r) X μ C μ ij P μ (r) (ij |kl) X μ,ν C μ ij (P μ |P ν ) C ν ij . This approach represents the four-index matrix in terms of three- and two-index ones and is therefore better suited

Upload: others

Post on 25-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

Accurate localized resolution of identity approach for linear-scaling hybrid densityfunctionals and for many-body perturbation theory

Arvid Conrad Ihrig,1, ∗ Jurgen Wieferink,1 Igor Ying Zhang,1, † Matti Ropo,1, 2, 3

Xinguo Ren,1, 4 Patrick Rinke,1, 3 Matthias Scheffler,1 and Volker Blum1, 5, ‡

1Fritz-Haber-Institut der Max-Planck-Gesellschaft, 12149 Berlin-Dahlem, Germany2Tampere University of Technology, Department of Physics, FI-33101 Tampere, Finland

3COMP/Department of Applied Physics, Aalto University School of Science, P.O. Box 11100, FI-00076 Aalto, Finland4Key Laboratory of Quantum Information, University of Science and Technology of China, Hefei, 230026, Anhui, China

5Department of Mechanical Engineering and Materials Science, Duke University, Durham, NC 27708, USA(Dated: May 19, 2015)

A key component in calculations of the exchange and correlation energies is the Coulomb operator,which requires the evaluation of two-electron integrals. For localized basis sets, these four-centerintegrals are most efficiently evaluated with the resolution of identity (RI) technique, which expandsbasis-function products in an auxiliary basis. In this work we show the practical applicability ofa localized RI-variant (RI-LVL), which expands products of basis functions only in the subset ofthose auxiliary basis functions which are located at the same atoms as the basis functions. Wedemonstrate the accuracy of RI-LVL for Hartree-Fock calculations, for the PBE0 hybrid densityfunctional, as well as for RPA and MP2 perturbation theory. Molecular test sets used include theS22 set of weakly interacting molecules, the G3 test set, as well as the G2-1 and BH76 test sets, andheavy elements including titanium dioxide, copper and gold clusters. Our RI-LVL implementationpaves the way for linear-scaling RI-based hybrid functional calculations for large systems and forall-electron many-body perturbation theory with significantly reduced computational and memorycost.

PACS numbers:

I. INTRODUCTION

With advances in high-performance computing andimproved numerical algorithms, the predictive power offirst-principles electronic-structure theory is increasing interms of accuracy and tractable system size. Density-functional theory (DFT) using (semi-)local functionalshas excelled because of a very favorable combinationof computational efficiency and reasonable accuracy insolid-state theory and quantum chemistry. However, itis well documented that conventional DFT methods arenot sufficient to properly describe certain types of bind-ing interactions.1–6 It is often necessary to use hybridfunctionals7,8 or even higher-level methods like second-order Møller-Plesset perturbation theory9 (MP2), GWmethods10,11 or the random-phase approximation andbeyond12–14 to accurately treat e.g. cohesive energies,van der Waals interactions or reaction-barrier heights.

For hybrid functionals and more sophisticated treat-ments of the electronic exchange and correlation a majorcomputational bottleneck is the evaluation of the two-electron repulsion integrals for orbitals {ψs(r)}

(st|uv) =

∫∫ψ∗s (r)ψ∗u(r′)ψt(r)ψv(r

′)

|r− r′| drdr′. (1)

The notation (a|b) refers to the Coulomb interactionbetween the functions indexed by a and b, respec-tively, throughout the paper. In particular, we have(ij|b) =

∫∫ψ∗i (r)ψj(r)v(r − r′)ψb(r

′)drdr′ and (a|b) =∫∫ψ∗a(r)v(r− r′)ψb(r

′)drdr′ where v(r) = 1/ |r|.Upon replacing the orbitals by their expansions in

a chosen set of basis functions {ϕi(r)|i = 1, . . . , Nbasis},Equation 1 transforms into a sum over four different ba-sis function indices. The approach for evaluating theseintegrals largely depends on the basis set. For Gaussianor plane wave basis sets these integrals are given ana-lytically. For numeric atom-centered orbital15–23 (NAO)basis sets {ϕi(r−RI)} on the other hand, these integralswith four different atomic positions RI, often called four-center integrals, have to be evaluated by explicit integra-tion.

A straightforward approach for basis sets with analyt-ical solutions is possible, but due to the O(N4

basis) inte-grals this leads to unfavorable execution times for largesystems. For NAOs, on the other hand, no analyticalsolution exists. Computing these integrals on the fly be-comes intractable as the system size grows. Precomput-ing them is also out of the question due to the memoryand communication requirements for storing and readingthis matrix with O(N4

basis) entries.A widely used approach to alleviate the computational

time requirements of the four-center integral evaluationis the “resolution of identity” (RI), also known as “den-sity fitting”.24–30 The central idea is to expand orbitalproducts into an auxiliary basis {Pµ(r)}:

ϕi(r)ϕj(r) := ρij(r) ≈∑

µ

CµijPµ(r)

⇒ (ij|kl) ≈∑

µ,ν

Cµij (Pµ|Pν)Cνij .

This approach represents the four-index matrix in termsof three- and two-index ones and is therefore better suited

Page 2: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

2

to be precomputed and stored. The auxiliary basis func-tions {Pµ(r)} can either be predefined or calculated atruntime. The most commonly used variant is the “RI-V” scheme,24,29–31 which determines the expansion co-efficients by directly minimizing the errors in the four-center integrals. This scheme has the advantage that theerror in the four-center integrals is quadratic in the er-ror of the actual RI expansion.24,25 This approximation,together with the fact that the total energy is station-ary with respect to the expansion coefficients, makes theRI-V a “robust variational” scheme25.

Our group has successfully applied the RI to NAOs andshown it to be a reliable and robust approach.32 Addi-tionally, a general procedure for the construction of theauxiliary basis was presented that is particularly accu-rate for on-site products of atomic orbitals, which carrythe bulk of the total energy. While RI expansions forany atomic basis set are particularly efficient for smallsystems, the scaling behavior, both in terms of computa-tional time and memory usage, is problematic because ofthe nonlocal nature of expansions based on the Coulombmetric.33 (See Equation 6 and 7 below.) In principle, thisnonlocality is artificial as a local quantity like the prod-uct of two localized orbitals should not need expansionfunctions arbitrarily far away, and it makes the applica-tion to large or even periodic systems34 infeasible or evenimpossible.

There are several previous studies exploring ways tolimit the number of nonvanishing expansion coefficients.For example Reine et al. replaced the global Coulombmetric with a localized one.35 Sodt et al. followed a morepragmatic approach by restricting the expansion of a ba-sis function product to the subset of those auxiliary basisfunctions which are centered within a sphere around oneof the two basis function centers.36,37 Pisani et al. fol-lowed a similar approach with a more general fitting do-main for local MP2-calculations in periodic systems.38,39

Recently, Merlot et al. presented a localized RI, whichuses only auxiliary basis functions centered on the twocenters of the expanded basis functions as a startingpoint.40 To reach a reasonable accuracy, they then ap-plied a Cholesky decomposition scheme to converge theintegration accuracy by adding further auxiliary func-tions along the connection line between the two centers.Even more localized density fitting approaches were al-ready known in the early days of computational quantumchemistry. Of particular interest for the present workis the limited expansion of diatomic overlap (LEDO) ofBillingsley and Bloor24,41. In this work, the authors ex-panded products of Slater type orbitals (STOs) on dif-ferent atoms in the on-site products of these two atomiccenters.

In addition to using localized variants of the RI-approach, several other techniques for the evaluation offour-center integrals have been investigated in the litera-ture. Delley evaluated the products of molecular orbitalson an atom-centered numerical grid and used multipoletechniques to solve the Poisson equation.42 Already in

the early 1980s,43 Talman showed how to use Fourierspace techniques44,45 to spherically expand NAO prod-ucts around a point between the two orbital centers, andhow to efficiently calculate the Coulomb interactions be-tween such expansions.43,46 Recently, a more efficient wayof spherically expanding orbital products based on Legen-dre integration was published by the same author.47 Foer-ster showed how to construct a compact auxiliary basis aposteriori from Talman’s expansion43,46 for a given atompair by means of a singular value decomposition.48,49

In contrast to our strictly atom-centered approach, thiswork employs bond-centered auxiliary basis sets. An-other technique is the expansion of the NAOs in termsof a Gaussian function set, which is implemented in theSIESTA-code50,51. While this strategy has the advan-tage of offering an analytic solution, each NAO must befitted by several Gaussians, which partially undoes thebenefits of the compactness of NAO basis sets. Most ofthese studies aim at computing the four-center integralsdirectly, while the aim of the RI-approach is to computea set of intermediate expansion coefficients that can bestored in memory. Some of the integration techniquesdeveloped by Talman are employed in our work as well.

In the present paper we demonstrate an approach thatis similar to LEDO, but gives accurately converged to-tal energies for NAOs, for Hartree-Fock and hybrid DFTfunctionals, as well as for correlation methods like MP2and RPA. A similar atomic RI strategy has recently beenimplemented for Gaussian-type orbitals by Manzer etal.52. In contrast to our work, they only apply their newalgorithm to the exchange integrals of the SCF-cycle atpresent.

Our method has recently been applied in a linear-scaling implementation of hybrid density functionals forperiodic systems34 and allows for simple derivatives,used, e.g., in a corresponding implementation of thestress tensor for hybrid density functionals53.

We will introduce our method, which is implemented inthe Fritz Haber Institute ab initio molecular simulationspackage (FHI-aims16,54), as a localized modification ofour global fitting approach reported in Ref. 32. In Sec-tion II we briefly review our RI-scheme and introduceour new localized variant. Section III shows an exten-sive study on the convergence and accuracy for differentmethods, and in Section IV we compare the computa-tional scaling with our RI-V implementation. Finally,we summarize our results and draw conclusions in Sec-tion V.

II. THEORY

The basic RI implementation in FHI-aims is describedextensively in Ref. 32. We here briefly recapitulate thekey equations before proceeding to the localized expan-sion. Finally, we explain how the localized variant canbe adapted in a controlled way to further increase theaccuracy which is required to minimize the linear error

Page 3: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

3

terms arising in our localized approach (see below).

A. Global expansion

The nonlocal exchange operator as part of the gener-alized Kohn-Sham or Fock matrix is defined by

Kil :=∑

jk

Djk (ij|kl) (2)

where Djk is the one-particle density matrix and (ij|kl)are four-center integrals of basis functions ϕi(r), analo-gous to the four-center integrals defined in Equation 1for the orbitals ψs(r).

In the RI, orbital products ϕi(r)ϕj(r) := ρij(r) areexpanded in a set of auxiliary basis functions Pµ(r):

ρij(r) ≈∑

µ

CµijPµ(r) =: ρij(r) (3)

to give the approximated product ρij(r). The two-electron integrals (Equation 1) are then reduced to:

(ij|kl) =∑

µ,ν

CµijVµνCνkl (4)

Vµν = (Pµ|Pν)

Several strategies have been proposed to determine theexpansion coefficients Cµij . The most common variant, isknown as “RI-V”. This approach directly minimizes theerror of the four-center integrals. Whitten has shown24

that this can be achieved by minimizing the Coulombintegral of the fitting error (δρij |δρij):

δρij(r) = ρij(r)− ρij(r)

= ϕi(r)ϕj(r)−∑

µ

CµijPµ(r).

The minimization problem then leads to a system of lin-ear equations:

0!=

1

2

∂ (δρij |δρij)∂Cµij

=∑

ν

(µ|ν)Cνij − (µ|ij) (5)

yielding the expansion coefficients and the approximatedfour-center integrals:

Cµij =∑

ν

(ij|ν)V −1νµ (6)

⇒ (ij|kl) ≈∑

µ,ν

(ij|µ)V −1µν (ν|kl) . (7)

The four-index matrix we started with is now repre-sented by only two- and three-index objects, implyinga significant advantage in terms of memory requirementsand computational time. Furthermore, the error of theapproximated four-center integrals is quadratic in the

residual δρij(r) = ρij(r) − ρij(r) of the fitted orbitalproduct.25

Unfortunately, the Coulomb kernel v(r − r′) is long-ranged, so that (µ|ν) and (µ|ij) are neither sparse norshort-ranged except when the overlap between ϕi and ϕjvanishes. The consequence is a canonical O(N3) com-putational and memory scaling with system size for theprecomputed expansion coefficients Cµij . It can be re-

duced to O(N2) memory scaling by taking the vanishingoverlaps into account, but in both cases there is a largeprefactor in actual computations. The formal O(N4)scaling for the exchange matrix construction is not re-duced by the RI approach, unless density matrix screen-ing techniques55 are used when the exchange matrix iscomputed. Since the Coulomb metric distributes the ex-pansion coefficients Cµij for any given (localized) density

ρij(r) over all auxiliary basis functions Pµ(r) in the sys-tem, O(N) scaling can no longer be achieved for the ex-change operator.

B. Construction of the standard auxiliary basis

In our implementation16,32, all atom-centered basisfunctions ϕi(r) are defined as:

ϕi(r) = uskl(r)Ylm(φ, θ) (8)

where uskl(r) is a radial function with index k and an-gular momentum l for species s and Ylm(φ, θ) denotes aspherical harmonic. In the non-periodic cases we investi-gate in this work, both the radial part and the sphericalharmonics are real-valued functions. The basis functionsare constructed numerically and are subject to a cut-offpotential Vcut(r), which enforces a smooth decay to zeroof the basis functions outside the relevant range.16 Theconstruction of the confined basis functions is describedin detail in equations 8 and 9 of Ref 16. In short, allradial functions in FHI-aims are solutions to:

[−1

2

d2

dr2+l(l + 1)

r2+ Vi(r) + Vcut(r)

]ui(r) = εiui(r).

(9)

Our auxiliary basis functions Pµ(r) are not predefined,but rather derived from the basis set {ϕi(r)} used to ex-pand the orbitals ψ(r). Our construction scheme for thePµ(r) ensures an exact reproduction of those integrals(ij|kl) with basis functions i, j, k, l centered at the sameatom, which account for the major part of the total en-ergy. We therefore construct our auxiliary basis set suchthat it contains all on-site products of basis functionsϕi(r) · ϕj(r) explicitly. Specifically, the auxiliary basisset can be derived from two different underlying basissets:

OBS: the standard orbital basis set used to expand theKohn-Sham orbitals of the DFT calculation

Page 4: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

4

additional functionsfrom the enhancedorbital basis (OBS+)

radial basis functionsuskl(r) of orbitalbasis set (OBS)

on-site radial productsusk1l1(r) · usk2l2(r)

Gram-Schmidt orthonor-malization to removelinear dependencies

product functions with|l1 − l2| ≤ l ≤ |l1 + l2|

maximal angular mo-menta lmax

s per species

angular momentumchannels 0 . . . 2lmax

s

for each species

distribute auxiliary basisfunctions for species toall atoms of this kind

auxiliary basis set forsystem (Pµ(r) functions)

Figure 1: The construction of the auxiliary basis in FHI-aims:Starting from the orbital basis set, the on-site products oftwo radial functions (with subscripts s, k, l for species, indexand angular momentum) are orthonormalized. The productof the resulting function with spherical harmonics representsthe auxiliary basis functions. One important feature of thisapproach is that we can add further radial functions (greenbox) to the orbital basis set, which are only used during thisprocedure and do not influence the performance of the orbitalbasis in any way.

OBS+: an enhanced OBS with additional functions,which are used only for the construction of thePµ(r) functions

In Figure 1 the auxiliary basis construction is illus-trated as a flowchart. For each atom species s in thesystem we compute all on-site products usk1l1(r)usk2l2(r)of the radial basis functions uskl(r). These products arethen orthogonalized with a Gram-Schmidt scheme. Aradial product is accepted as a new radial auxiliary ba-sis function in a given angular momentum channel if thenorm of its projection orthogonal to the already acceptedfunctions is larger than a small value εorth (for RI-V: 10−2

for atoms with Z ≤ 10, 10−3 for atoms with 10 ≤ Z ≤ 18and 10−4 for all other elements).

The choice of this parameter controls the size ofthe auxiliary basis {Pµ(r)} and thus the expansionaccuracy.32 The accepted products are then multipliedwith each spherical harmonic Ylm(φ, θ) which satisfies theClebsch-Gordon condition |l1 − l2| ≤ l ≤ |l1 + l2| to ob-tain the Pµ(r). The maximum angular momentum of anybasis function in the OBS or OBS+ of a given species,lmaxs , thus also controls the maximum angular momen-tum of any basis function in the auxiliary basis set, 2lmaxs .For each species s the maximal angular momentum of theproduct functions lcaps can be capped to reduce the sizeof the auxiliary basis, but this comes at the price of a

reduced accuracy.

C. Construction of the enhanced auxiliary basis

Our standard procedure for the construction of the{Pµ(r)} has the advantage of producing an atom-centered auxiliary basis that automatically adapts to theorbital basis set and that yields a particular high accu-racy for on-site products. The off-site products on theother hand are not included in the auxiliary basis setand therefore will not be expanded exactly. To obtainaccurate results for these integrals too, we must use asufficiently large auxiliary basis set. This can be readilyachieved with our construction scheme by enhancing theOBS with additional functions, which are used only forthe construction of the auxiliary basis set, as shown bythe green box in Figure 1. This enhanced OBS is labeledOBS+ and yields a larger set of {Pµ(r)} functions. Thechoice of additional basis functions will be discussed indetail in Section III A.

D. Localized expansion

As our aim in this work is to reduce the number of ex-pansion coefficients substantially, we restrict the expan-sion coefficients Cµij to be non-zero only if the auxiliaryfunction µ is centered at one of the two atoms at whichthe basis functions i and j are centered. This restrictedexpansion of the product ϕi(r)ϕj(r) (centered on atomsI and J) can be written as

ρij(r) =∑

µ∈P(IJ)

CµijPµ(r) (10)

where P(IJ) := P(I)∪P(J) denotes the set of all auxil-iary functions centered at atom I or atom J , as depictedin Figure 2.

Since the RI-V prescription has proven to be a robustscheme, we use the same minimization target for our lo-calized variant. In practice, this means that we mini-mize the Coulomb self-repulsion (δρij |δρij) of the resid-ual δρij(r) from the restricted expansion:

δρij(r) = ρij(r)− ρij(r)

= ϕi(r)ϕj(r)−∑

µ∈P(IJ)

CµijPµ(r)

with respect to variations in Cµij .The minimization condition then leads to a set of linearequations similar to the one for RI-V (Equation 5):

0!=

1

2

∂ (δρij |δρij)∂Cµij

=

P(IJ)∑

ν

(µ|ν)Cνij − (µ|ij) (11)

for the coefficients Cνij . Both indices µ and ν are re-stricted to the subset P(IJ). As a consequence, the sys-tem of linear equations can be decomposed into a set of

Page 5: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

5

subproblems, one for each atom pair. Finally, the expan-sion coefficients and four-center integrals can be writtenas

Cνij =

µ∈P(IJ)

LIJνµ (µ|ij) ν ∈ P(IJ)

0 else

(12)

with LIJ = (VIJ)−1 where VIJ is a square matrix withthe entries (µ|ν) for µ, ν ∈ P(IJ). The dimension of LIJ

is therefore N IJaux×N IJ

aux with N IJaux being the number of

{Pµ(r)} centered on atoms I and J, which is only a smallfraction of the total number of auxiliary basis functionsNaux. If we now insert Equation 12 into Equation 4, weobtain

⇒ (ij|kl) ≈∑

µλ∈P(IJ)νσ∈P(KL)

(ij|λ)LIJλµVµνLKLνσ (σ|kl) (13)

where the sums are restricted to the subset P(IJ) be-cause of Equation 12. We name this expansion “RI-LVL”after the matrices occurring in Equation 13.

One advantage of this scheme lies in the fact that thesize of the local Coulomb matrices VIJ required for eachatom pair is independent of the system size. Their num-ber initially increases quadratically, but with increasingsystem size the number of relevant pairs scales linearly.Due to the cubic scaling of matrix inversions, the inver-sion of several local Coulomb matrices becomes favorablefor large systems compared to the single inversion of aglobal Coulomb matrix.

Furthermore, we no longer need the full three indexobject (ij|µ), but only the parts relevant for Equation 12.Additionally, the calculation of the remaining three-indexintegrals (ij|µ) is simplified within RI-LVL because the

JI J

K

ϕi(r) ∈ I ϕj(r) ∈ J∣∣∣∣∣ϕi(r) · ϕj(r)

∣∣∣∣∣ ≥ 0

Pµ(r) ∈ P(I) Pν(r) ∈ P(J)

Pλ(r) ∈ P(K) (not used in local expansion)

Figure 2: The localized nature of the RI-approach: The func-tion product ϕi(r) · ϕj(r) is to be expanded in terms of theauxiliary basis set {Pσ} centered at atom I or J . Auxiliarybasis functions centered on other atoms, like Pλ on atom K,are not used.

center of µ is either I or J . The product of the Coulombpotential of Pµ with, say, ϕi is a simple superposition ofatom-centered functions, whose overlap with ϕj can becalculated in Fourier space using the methods suggestedby Talman43,46 as described in Ref. 32. Together withthe strong reduction in the number of these integrals thecomputational time can be reduced to an extent where itis small compared to the other parts of the calculation.

E. Error analysis

The following exact relation of four-center integralsand their RI approximations

(ij|kl) = (ρik|ρjl) + (ρik|δρjl)+ (δρik|ρjl) + (δρik|δρjl) (14)

is important for assessing the accuracy of RI schemes.Here, ρik(r) = ϕi(r)ϕk(r) are the exact products, ρik(r)their RI approximations and δρik(r) = ρik(r)− ρik(r) arethe residuals. The first term is the straightforward RIexpansion used so far. The next two terms are linear inthe residual, whereas the last is quadratic in the residual.

In RI-V, the minimization criterion for the residual(Equation 5) can be rewritten as 0 = (µ|δρik). Since allapproximated products ρ(r) are expanded in the {Pµ(r)},the linear terms must vanish in RI-V.

The localized version, on the other hand, only enforcesthis condition for those {Pµ(r)} which are centered on thesame atoms as the basis function composing the product.Therefore, the linear terms will only vanish for those inte-grals for which {I,K} = {J, L}, where the capital lettersdenote the atom index of the corresponding orbital basisfunction. For all other integrals, the error is linear in theresiduals.

Some localized RI-schemes alleviate this error by us-ing the robust Dunlap correction25. Localized schemescan be extended to yield quadratic errors in the residualby including the linear error terms from Equation 14 inEquation 13:

˜(ij|kl) = (ρik|ρjl) + (ρik|ρjl)− (ρik|ρjl) (15)

In RI-V, all of these three terms are identical and there-fore reduce to the previously used formula (Equation 7)for the four-center integral. This expression has a fewdrawbacks: To calculate the correction terms, all of the3 index overlap integrals (µ|ρjl) are required. (This alsoincludes those for which Pµ(r) is not centered on the sameatom as ϕj(r) or ϕl(r). ) Since the sparsity in the three-index overlap integrals is the key to our memory andcomputational time savings, this correction would undomost of our achievements. Furthermore, it also has beenrecently demonstrated40 that this expression can lead toserious convergence problems in rare cases, because thematrix for the approximated four-center integrals is notnecessarily positive semi-definite. Since negative eigen-values correspond to attractive electron-electron interac-tions, the usual safeguards to ensure SCF convergence

Page 6: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

6

will lead to a high ratio of rejected steps, slowing SCFconvergence down or preventing convergence at all.

Due to these drawbacks, we do not employ the Dun-lap correction for our localized scheme. As we will showin Section III, the combination of a localized RI (Equa-tion 13) with localized numeric atom-centered orbitalsnevertheless reproduces the results obtained from RI-V with essentially exact accuracy while significantly de-creasing the computational requirements. The key step isan appropriate enhancement of the auxiliary basis itself.

III. RESULTS

In this section we demonstrate that our RI-V imple-mentation is a reliable benchmark for RI-LVL by com-paring it to the Gaussian-type orbital (GTO) basedNWChem56 code that does not employ the RI approx-imation. In this step, we use Gaussian-type functionsin FHI-aims so that we can perform the comparison forexactly the same basis functions. We then evaluate theaccuracy for RI-LVL for Hartree-Fock(HF)57 and second-order Møller-Plesset perturbation theory9 (MP2), as wellas for the PBE08 hybrid functional and the random phaseapproximation (RPA).12–14

As explained in the previous Section, we enhance thenormal OBS with additional functions to improve theaccuracy of the resulting {Pµ(r)}. Throughout this arti-cle we only add hydrogen-like functions with the minimalprincipal quantum number for the given angular momen-tum, i.e. nodeless radial functions. For these functions,we have Vi(r) = −z/r in Equation 9. In the follow-ing, spd(z=x) will denote the enhancement of the OBS,i.e. which additional functions OBS+ contains, as de-scribed in Section II B. spd indicates the added angularmomentum channels (s, p, d, f etc.) and z=x denotesthe effective charge parameter of the added hydrogenicbasis functions. Neither for RI-V nor RI-LVL we did im-pose any constraints on the maximal angular momentum(lcaps ) of the {Pµ(r)}.

Individual total energy values for all structures in allbenchmark sets used for each method can be found inthe supporting information (SI).

A. Hartree-Fock and MP2 calculations

To demonstrate the general applicability of our ap-proach, we first discuss Hartree-Fock and MP2 total en-ergies for RI-V and RI-LVL in FHI-aims using Gaussianbasis sets and compare them to the results obtained withNWChem. We then show that the same degree of accu-racy can be obtained using the NAO basis functions ofFHI-aims.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22system index [S22 testset]

−103

−102

−101

−100

0

100

101

102

103

∆Etot[m

eV]

HF with cc-pVTZ basis set

RI-VRI-LVL

RI-LVL+auxlog scale

linear scale

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22system index [S22 testset]

−103

−102

−101

−100

0

100

101

102

103

∆Etot[m

eV]

MP2 with cc-pVTZ basis set

RI-VRI-LVL

RI-LVL+auxlog scale

linear scale

Figure 3: Total errors in the total energy computed with FHI-aims using RI-V and RI-LVL, compared to NWChem. Theupper panel shows the errors of Hartree-Fock calculations andthe lower one for the MP2-correction in the S22-testset usingthe cc-pVTZ Gaussian basis set. “RI-LVL+aux” uses an ad-ditional g-function for constructing the auxiliary basis. SeeSI for details.

1. Gaussian Type Orbitals

To establish a valid reference for our comparison be-tween RI-V and RI-LVL, we first calculate total energiesfor the molecular geometries compiled in the S22 testset of Jurecka and coworkers58 with the Gaussian basisset cc-pVTZ59. This test set contains 22 weakly bondeddimers with 6 to 30 atoms per dimer, ranging from thewater dimer to adenine-thymine complexes. This testset allows us to test total energies of molecular systemsas well as weak binding energies, which usually requireRPA or MP2 for an accurate description. Figure 3 showsthe total energy errors for all molecular dimers and Ta-ble I shows the root mean square deviation (RMSD)and maximum absolute value (MAX) for the error peratom (not counting hydrogens) and the systemwide er-

Page 7: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

7

∆Etot HF MP2(meV) RMSD MAX RMSD MAX

RI-Vtotal 0.383 0.716 0.182 0.445

per atom 0.038 0.082 0.028 0.085

RI-LVLtotal 9.129 18.423 168.995 440.565

per atom 0.743 1.062 11.432 29.371RI-LVL+aux

total 0.527 0.936 0.581 1.144per atom 0.051 0.087 0.055 0.111

Table I: Root Mean Square Deviation and maximum (abso-lute) value of the total energy error (total and per atom, notcounting hydrogens) of RI-V in the S22-test set compared toNWChem using the cc-PVTZ basis set. (See also Figure 3)

ror. The results show that the RI-V implementation inFHI-aims is reliable and accurate,32 yielding deviationsin total energy of less than 1 meV per system for Hartree-Fock calculations. Also the MP2 corrections, which aremore sensitive to the accuracy of the four-center inte-grals, are of similar quality, with a RMSD below 0.2 meV.Our RI-V implementation is therefore a suitable refer-ence choice to benchmark RI-LVL and will be employedas such throughout the rest of this work.

The straightforward localized resolution of identityon the other hands yields significantly higher errors ofup to 29.5 meV/atom (up to 440 meV total error) forMP2 calculations and also gives less accurate results forHartree-Fock, although the discrepancies there are lessdramatic. However, the plot also shows an additional setof curves, labeled “RI-LVL+aux”, which corresponds to

0

100

101

102

103

∆Etot[m

eV]

RI-V RI-LVL RI-LVL+aux RI-V RI-LVL RI-LVL+aux

HF MP2

HF and MP2 in the G3-test set (cc-pVTZ basis)

log scalelinear scale

RMSD MAX

Figure 4: RMSD and MAX errors in the total energy com-puted with FHI-aims using RI-V and RI-LVL, compared toNWChem. The left side shows the errors of Hartree-Fockcalculations and the right one for the MP2-correction in theG3-test set using the cc-pVTZ Gaussian basis set. ”RI-LVL+aux” uses an additional g-function for constructing theauxiliary basis. The dimer-systems from the test set wereexcluded because they are of little interest for the accuracyanalysis of the RI-LVL. See SI for details.

RI-LVL with an enhanced auxiliary basis (using a sin-gle hydrogenic g function with z = 6). This data setshows that RI-LVL yields an accuracy similar to RI-Vprovided a suitable chosen OBS+ and thus auxiliary ba-sis set {Pµ(r)} is chosen. We will discuss the choice ofOBS+ in detail below.

In addition to the S22 test set we also investigatedthe performance of our new method in the larger G3-testset60. This test set contains 223 molecules composed offirst and second row elements. As in the S22-case we usedNWChem as a reference for our RI-implementations. Theresults for HF and MP2 total energies are shown in Fig-ure 4. We have not included the 28 dimer systems in thestatistical analysis, because RI-V and RI-LVL are thesame procedures for dimer systems.

Overall, RI-V achieves a very good accuracy with anRMSD of the total energy of about 1 meV. As was al-ready the case in the S22-test set, RI-LVL itself performsworse and yields errors with tens or even hundreds ofmeV magnitude. However, enhancing the auxiliary basiswith the same functions we used in the S22 test set, wecan again recover the same accuracy as RI-V. A few testcases still exhibit total energy errors up to 5.5 meV. Forall these cases the RI-LVL results lie exactly on top ofthe RI-V results, which is a strong indication that theseerrors are not specific to RI-LVL. It should also be notedthat almost all the outliers contain chlorine atoms.

2. Numeric atom-centered orbitals

For the remainder of the article, we will return to theNAOs basis sets and focus on improving the accuracy of

0 1 2 3 4 5 6Radius in A

0.00

0.25

0.50

0.75

1.00

func

tion

ampl

itude

1s 2p 3d 4f 5g

0 1 2 3 4 5 6Radius in A

0.00

0.25

0.50

0.75

1.00

func

tion

ampl

itude

z=1 z=2 z=3 z=4 z=5 z=6 z=7

Figure 5: Radial shape of the employed basis functions. Theupper panel shows basis functions with different angular mo-mentum and constant effective charge z = 1. The lower panelshows 5g functions with different effective charges z. Thedashed line in both panels shows the onset of the basis con-fining potential.

Page 8: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

8

HF RI-V ∆Etot (meV)aux basis per atom systemwideadditions RMSD MAX RMSD MAXs(z=1) 0.005 0.016 0.040 0.124sp(z=1) 0.009 0.024 0.090 0.201spd(z=1) 0.011 0.020 0.119 0.318spdf(z=1) 0.023 0.064 0.341 1.219spdfg(z=1) 0.154 0.367 2.320 6.976

Table II: RMSD and maximum absolute errors for Hartree-Fock calculations in the S22 test set using the RI-V methodwith tier 2 basis sets. The functions listed under ”aux basisadditions” are added to those from the bare basis set whenthe product basis is constructed.

RI-LVL. For HF calculations we use the standard tier 2basis sets16 with tight settings. For MP2 calculations,on the other hand, we use the recently developed valencecorrelation consistent NAO basis sets.61

As seen in Figure 3, our standard procedure to con-struct the {Pµ(r)} yields sufficiently accurate total ener-gies at the Hartree-Fock level. However, it is does notsuffice to converge MP2 total energies with our localizedRI. A straightforward way to deal with this error is toincrease the size of the auxiliary basis. This is achievedby using OBS+ instead of OBS for the construction ofthe auxiliary basis set, as discussed in Section II C.

We first show that our reference is well converged withrespect to the auxiliary basis by enhancing the originalOBS with functions of increasing angular momentum (seeupper panel in Figure 5). Then we apply the same sys-tematic approach to RI-LVL and finally we demonstrate

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22system index [S22 testset]

−103

−102

−101

−100

0

100

101

102

103

∆Etot[m

eV]

HF (RI-LVL) using tight tier2 basis setstier2()tier2(s)tier2(sp)

tier2(spd)tier2(spdf)tier2(spdfg)

tier2(g)log scalelinear scale

Figure 6: Convergence of the HF total energy error of the RI-LVL approximation with increasing auxiliary basis set size.The difference between the RI-LVL calculations and the RI-Vreference using the pure tight (tier 2) basis set is shown. Theblack line shows the performance of the OBS+ with a singleg-function(z=1), which we consider sufficient for productioncalculations. (see also Table III)

HF RI-LVL ∆Etot (meV)aux basis per atom systemwideadditions RMSD MAX RMSD MAX

none 0.594 1.891 3.959 6.432spdfg(z=1) 0.052 0.124 0.493 1.466g(z=1) 0.066 0.153 0.633 1.799g(z=2) 0.059 0.137 0.567 1.663g(z=3) 0.054 0.127 0.533 1.590g(z=4) 0.053 0.125 0.514 1.544g(z=5) 0.053 0.123 0.491 1.454g(z=6) 0.047 0.107 0.465 1.305g(z=7) 0.046 0.098 0.462 1.226

Table III: RMSD and maximum absolute errors for Hartree-Fock calculations in the S22 test set using the RI-LVL methodwith tier 2 basis sets. The functions listed under “aux basisadditions” are added to those from the bare basis set whenthe product basis is constructed.

that an OBS+ with only a single extra g function (seelower panel in Figure 5) is sufficient.

Table II shows the impact of this enhanced {Pµ(r)} onthe HF total energies calculated with RI-V. Even if theauxiliary basis set is significantly enlarged with extendedfunctions having high angular momenta, the change in to-tal energy is below 0.16 meV per atom (max 0.37 meV).Figure 6 shows that the error of RI-LVL systematically

converges with increasing auxiliary basis set size. By in-cluding additional radial functions with angular momen-tum s, p, d, f and g in the auxiliary basis constructionthe root mean square deviation of the total energy peratom can be reduced from 1.9 meV to 0.13 meV. There-fore, the error introduced in a Hartree-Fock calculationby using the local RI approximation can be controlledsystematically by adding higher angular momenta to the{Pµ(r)} and decreases to a negligible value.

In Figure 6 it can be seen that the addition of f andg functions has the largest impact. It turns out that asingle g function is already sufficient. As shown in Ta-ble III, the accuracy of the hierarchical approach can evenbe surpassed if we alter the effective charge that definesthe added g function, but the general quality of the resultis unaffected by the choice of the effective charge. Thereason why the results do not depend strongly on the

MP2 RI-V ∆Etot (meV)aux basis per atom systemwideadditions RMSD MAX RMSD MAXs(z=1) 0.003 0.008 0.031 0.076sp(z=1) 0.013 0.045 0.050 0.123spd(z=1) 0.012 0.036 0.066 0.126spdf(z=1) 0.011 0.027 0.076 0.144spdfg(z=1) 0.031 0.098 0.494 1.856

Table IV: RMSD and maximum absolute errors for MP2@HFin the S22 test set using the RI-V method with the NAO-VCC-3Z basis set. The functions listed under “aux basis ad-ditions” are added to those of the bare basis set when theproduct basis is constructed.

Page 9: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

9

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22system index [S22 testset]

−103

−102

−101

−100

0

100

101

102

103

∆Etot[m

eV]

MP2 (RI-LVL) using NAO-VCC-3Z3Z()3Z(s)3Z(sp)

3Z(spd)3Z(spdf)3Z(spdfg)

3Z(g)log scalelinear scale

Figure 7: MP2 total energy errors of the RI-LVL calculationsusing the NAO-VCC-3Z basis set with additional auxiliarybasis functions, compared to the RI-V calculation with thepure basis set. The black line shows the performance of theOBS+ with a single g-function(z=1), which we recommendfor production calculations.

shape (see lower panel of Figure 5) of the added functionis the basis-confining potential (see Equation 9). As canbe seen in Figure 5, the onset and peak position of theradial function depends on the effective charge, but theirbehavior in the outer regions is similar. Without theconfinement potential, the radial functions would extendfurther and the results would probably have a significantdependence on the effective charge.

We also investigated the use of RI-LVL in MP2 calcu-lations. For these calculations we use the recently devel-oped compact NAO based valence correlation consistentbasis sets61 that facilitate an extrapolation to the com-plete basis set limit.

As can be seen from Table IV, RI-V with the 3Z-version of these basis sets is essentially insensitive to an

MP2@HF RI-LVL ∆Etot (meV)aux basis per atom systemwideadditions RMSD MAX RMSD MAX

none 13.919 32.970 195.751 494.545spdfg(z=1) 0.047 0.104 0.548 1.141g(z=1) 0.063 0.111 0.859 2.006g(z=2) 0.064 0.114 0.881 2.003g(z=3) 0.059 0.104 0.814 1.879g(z=4) 0.054 0.097 0.737 1.780g(z=5) 0.036 0.070 0.483 1.324g(z=6) 0.028 0.074 0.364 1.035g(z=7) 0.046 0.082 0.610 1.384

Table V: RMSD and maximum absolute errors for MP2@HFin the S22 test set using the RI-LVL method with the NAO-VCC-3Z basis set. The functions listed under “aux basis ad-ditions” are added to those of the bare basis set when theproduct basis is constructed.

enlarged auxiliary basis set, yielding at most a changeof 0.1 meV per atom compared to the pure basis set. Inthe following analysis, RI-V with the NAO-VCC-3Z basisset with the standard auxiliary basis set will therefore beused as the reference to benchmark the MP2 results.

Figure 7 shows the accuracy for MP2 using the samerepresentation as Figure 6 did for HF. In essence, we findthe same results as for Hartree-Fock: If functions up toangular momentum channel f are included in OBS+, theerror can be reduced to 0.43 meV per atom (6.52 meVfor the system) at most, while an additional g functionreduces the error even further to at most 0.11 meV peratom (1.15 meV for the system). As for Hartree-Fock andthe tier basis, the accuracy of the hierarchical approachcan also be achieved with fewer auxiliary basis functions.Table V shows that a single g-function reduces the totalenergy error to the same order of magnitude as the ad-dition of s, p, d, f and g functions. As for HF we findthat the general quality of the results is insensitive to thevariation of the effective charge, but can be fine-tuned toyield errors slightly below those of the systematic ap-proach.

B. PBE0 and RPA

To demonstrate RI-LVL’s broad range of applicability,we also performed calculations with the PBE0 exchange-correlation functional8 and the RPA.12–14

For PBE0 we use tier 2 basis sets. To converge theRPA total energy, larger NAO basis sets are required.62

As in the cases of HF or MP2, for PBE0 the RI-V showsno significant dependence on the added auxiliary basisfunctions (data not shown). When the product basis isenhanced with additional auxiliary functions, the RI-LVLresults can be controlled and converged in a systematicway. In contrast to HF, we find that no OBS+ is neededto obtain accurate results with PBE0.

Table VI shows the relevant errors for the hierarchi-cal enhancement of the auxiliary basis. As for Hartree-Fock, we can significantly reduce the RMSD and maxi-mal value of the total energy error per atom, in this caseby more than one order of magnitude from 0.19 meV to

PBE0 RI-LVL ∆Etot (meV)aux basis per atom systemwideadditions RMSD MAX RMSD MAX

none 0.068 0.185 0.543 0.941s(z=1) 0.070 0.188 0.557 0.964sp(z=1) 0.060 0.159 0.464 0.792spd(z=1) 0.039 0.091 0.334 0.588spdf(z=1) 0.012 0.028 0.115 0.237spdfg(z=1) 0.007 0.017 0.065 0.168

Table VI: RMSD and maximum absolute errors for PBE0 inthe S22 test set using the RI-LVL method with the tier 2basis set. The functions listed under “aux basis additions”are added to those of the bare basis set when the productbasis is constructed.

Page 10: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

10

3Z() 3Z(s) 3Z(sp) 3Z(spd) 3Z(spdf) 3Z(spdfg)Basis Set [S22 test set]

0

100

101

102

103

∆Etot[m

eV]

RPA@PBE0 (RI-LVL) using NAO-VCC-3Z basis sets

log scalelinear scale

RMSD (total)MAX (total)

RMSD (atom)MAX (atom)

Figure 8: Root mean square deviation and maximal (abso-lute) errors for RPA@PBE0 in the S22 test set using RI-LVLwith the NAO-VCC-3Z basis set and different auxiliary basissets.

0.02 meV. If the auxiliary basis is only enhanced by asingle g-function, we find results similar to those fromthe Hartree-Fock analysis: The effective charge of extrafunction in the OBS+ has no significant impact on theachieved accuracy.

For RPA@PBE0 calculations using the NAO-VCC-3Zbasis set we find almost the same behavior as for MP2:The RI-V reference is again well converged and shows nosensitivity to the added auxiliary basis functions (datanot shown) and the RI-LVL error can be well controlledby the hierarchical auxiliary basis set additions, as shownin Figure 8 for the RMSD and maximal absolute error.The only difference we found is that the reduction of theadditional auxiliary basis functions to a single g-functionis not as effective as for the HF/MP2 case. In Table VIIwe listed the RMSD and maximal absolute errors for dif-ferent effective charges of the added g-function. As can

RPA RI-LVL ∆Etot (meV)aux basis per atom systemwideadditions RMSD MAX RMSD MAX

none 9.119 17.611 120.283 266.504spdfg(z=1) 0.041 0.130 0.221 0.597g(z=1) 0.059 0.385 0.792 7.324g(z=2) 0.060 0.383 0.795 7.286g(z=3) 0.059 0.387 0.791 7.353g(z=4) 0.058 0.390 0.785 7.416g(z=5) 0.059 0.406 0.796 7.712g(z=6) 0.058 0.417 0.804 7.932g(z=7) 0.057 0.402 0.788 7.642

Table VII: Root mean square deviation and maximum ab-solute errors for RPA@PBE0 in the S22 test set using theRI-LVL method with the NAO-VCC-3Z basis set. The func-tions listed under “aux basis additions” are added to those ofthe bare basis set when the product basis is constructed.

G2-1∗ (meV) BH76 (meV)RMSD MAX RMSD MAX

HF 0.060 0.184 0.077 0.181PBE0 0.023 0.093 0.019 0.045MP2 0.064 0.188 0.079 0.232RPA 0.105 0.429 0.079 0.200

Table VIII: Absolute differences between RI-V and RI-LVLfor the root mean square deviation and maximal absolute er-ror for the G2-1 test set (restricted to non-dimers) and thereaction barrier height test set BH76. Computational detailsare given in the text. See SI for more details.

be seen, the values are significantly larger than for thelargest analyzed hierarchical auxiliary basis set, but stillsmall on an absolute scale. The error per atom is in allcases below 0.5 meV and the RMSD for the total erroris approximately 0.8 meV.

C. Atomization energies and reaction barrierheights

In addition to total energies we also computed the at-omization energies in the G2-1 test set66,67 and reactionbarrier heights in the BH76 test set6,68 with RI-LVL.The G2-1 testset is based on experimental reference dataand the BH76 reference values are obtained from com-plete basis set extrapolations of coupled cluster calcula-tions including singles, doubles and perturbative triples.The employed geometries are taken from the original pa-pers. For the G2-1 test set we have only included thosemolecules in the error analysis which have at least threeatoms, because RI-V and RI-LVL are identical in thedimer-case. In the RI-LVL calculations we use an auxil-iary basis set constructed from an OBS+ with a g func-tion with z = 6. All RI-V calculations are performedwith the standard OBS and {Pµ(r)}. All calculations usethe NAO-VCC-3Z OBS. As shown in Table VIII, RI-LVLperforms equally well for binding energies as for total en-ergies.

D. RI-LVL for heavy elements

In addition to the properties of light elements from thefirst and second row of the periodic table, we also inves-tigated the performance of RI-LVL for a few representa-tive heavier elements, namely copper, gold and titaniumdioxide. The cluster geometries we used are displayed inFigure 9.

Figure 10 shows the RMSD and maximum absoluteerror per atom for titanium dioxide, copper and goldclusters. For the heavy elements we used either tier 3(titanium, oxygen and gold) or tier 4 (copper) basis setsto demonstrate the applicability of our new method to atypical RPA-calculation setup. We know from the previ-ous results that light elements like oxygen need enhanced

Page 11: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

11

Figure 9: The clusters we used to benchmark the performance of RI-LVL for heavy elements. The geometries of all clusterscan be found in the supporting information. The gold63 and titanium-dioxide clusters64 are the results from Basin Hoppingand Genetic Algorithm-based structure optimizations. The copper clusters65 are cut-out from bulk geometries.

auxiliary basis set for an accurate treatment with RI-LVL. Therefore we added an enhanced auxiliary basisset with an additional g-function (z = 1) for oxygen intitanium dioxide. For the explicit three-center integrals(ij|µ) of RI-V, dense multicenter integration grids areneeded for heavy elements. We note that the (ij|µ) in-tegrals are two-center in RI-LVL and can be carried outusing Talman’s method, therefore significantly lighter in-tegration grids are sufficient for our RI-LVL implementa-tion than for RI-V. With very tight integration settings(see Appendix A for the details) particularly for RI-V,

0

100

101

102

103

∆Etot[m

eV]

titanium dioxide copper gold

PBE0

RPA

PBE0

RPA PBE

RPA

RI-LVL convergence for heavy element clusters

log scalelinear scale

RMSD (per atom)MAX (per atom)

MAX (total)

Figure 10: Total energy error of the RI-LVL RPA@PBE0calculations for TiO2 (2-8 units), Cu (2-6,9 and 12 atoms)and Au clusters(2-7 atoms). The reference point is the RI-Vcalculation with the standard tier3/4 basis sets. In the TiO2

case the oxygen atoms had an OBS+ with a g-function forthe RI-LVL calculation. See SI for details.

RPA-RMSD per atom RMSD peratom (meV)OBS+ with g [z=6]) vs. standard OBS

gold RI-V 0.438OBS: tier 3 RI-LVL 0.187

copper RI-V 0.193OBS: tier 4 RI-LVL 0.043

titanium dioxide RI-V 0.101OBS: tier 3∗ RI-LVL 0.022

Table IX: Influence of the enhanced auxiliary basis set onthe accuracy of RPA@PBE0-calculations using the RI-V andRI-LVL strategies. More details about the employed compu-tational settings can be found in the text. For the titaniumdioxide testcases we used a OBS+ with an additional g func-tion on oxygen. Only the OBS+ on the titanium atoms waschanged.

our RI-LVL results match the total energies of RI-V forour chosen systems with an RMSD below 1.3 meV peratom even for gold clusters. In contrast to lighter el-ements, our results suggest that the auxiliary basis iswell saturated and has enough flexibility to compute thefour-center integrals accurately without adding furtherelements to the OBS.

To substantiate this observation, we computed the sys-tems again with an g-function in the OBS+ and other-wise identical parameters. The results are shown in Ta-ble IX.

Enhancing the RI-LVL calculations with additionalauxiliary basis functions on the heavy elements changesthe results only slightly. The RI-V calculations on theother hand show a larger, but overall still insignificantsusceptibility to the additional functions in the {Pµ(r)}.For example the RMSD changes by 0.44 meV for goldwhen computed with RI-V, while RI-LVL only exhibitsa change of 0.19 meV.

Page 12: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

12

101 102

system size [number of atoms]

100

101

102

103

104

105

time

[sec

onds

]RI-V, total timeRI-LVL, total time

RI-V, time for exchange matrixRI-LVL, time for exchange matrix

(a)Total time for calculation and Fock matrix evaluation (b)Alanine-8

0 20 40 60 80 100 120 140system size [number of atoms]

0

500

1000

1500

2000

2500

3000

mem

ory

usag

epe

rCP

U[M

B]

RI-V, averageRI-LVL, average

RI-V, min/maxRI-LVL, min/max

(c)memory consumption per core

0 20 40 60 80 100 120 140system size [number of atoms]

0.000

0.005

0.010

0.015

0.020

∆Eatom

tot

[mev

]

EVtot − ELV L

tot

(d)Energy difference per atom between RI-V and RI-LVL

Figure 11: Scaling analysis of self-consistent Hartree-Fock calculations for fully extended oligoalanine chains. The figures showthe RI-V implementation and the RI-LVL version that uses sparsity as well as integral screening. For all calculations we used180 cores, if not stated otherwise. As computational settings we used the tight defaults settings and the tier 2 basis set. For theRI-LVL calculations we furthermore added one more radial g-function with z = 6 for the construction of the auxiliary basis.

From the presented RMSD and maximum errors (Fig-ure 10) we can see that RI-LVL is very accurate for heavyelements and can reproduce the RI-V results with no sig-nificant error. The deviation for the error per atom isconsistently below 1.5 meV and even the total error is afew meV at most for large basis sets. We have addition-ally verified that these conclusions also hold for smallerFHI-aims tier1 basis sets as the OBS for Ti, Cu, and Au,with maximum overall errors for RPA of 3 meV (Au7)and errors for the PBE0 hybrid functional consistentlybelow 1 meV. The good performance for all three ele-ments shows that the RI-LVL has no problems dealingwith d- and f-electrons.

IV. COMPUTATIONAL SCALING ANALYSIS

In addition to the accuracy of the localized resolu-tion of identity, we also investigated the scaling withsystem size, i.e. the number of atoms, both in termsof memory consumption and the required computationaltime. To obtain a superior scaling with our localizedvariant, specialized routines are required that operateon the resulting sparse tensors instead of on the full 3-function overlap tensor. At present, such routines existfor the evaluation of the Fock matrix within FHI-aims.This implementation is discussed in detail in a separatepublication.34 In particular this implementation employsintegral screening as used in standard quantum chem-istry linear scaling exchange algorithms based on densitymatrix screening.55,69

Figure 11 shows the scaling of the RI-LVL (usingsparsity and screening) in comparison to RI-V for the

Page 13: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

13

Hartree-Fock level of theory for extended polyalaninechains. One of them, namely Alanine-8 (Ala8) is depictedin Subfigure 11(b). All calculations were performed us-ing the tier 2 basis sets16 and tight integrations settings(see table 1 in Ref. 70 for details). The auxiliary basis inRI-LVL was enhanced with one extra g-function (z = 6).

Subfigure 11(a) shows the total time required for theself-consistent HF calculation, as well as the portion thatis spent on the Fock matrix evaluations. In small sys-tems, the screened RI-LVL runs slower compared to sim-ple BLAS3 dense matrix multiplications used in RI-V.The break-even point occurs at about 70 atoms. For RI-LVL, in contrast to RI-V, the Fock matrix evaluation isthe single dominant contribution to the total time, re-gardless of the system size. This shows that the localizedscheme significantly speeds up the construction of theauxiliary basis.

Subfigure 11(c) shows how the memory consumptionfor RI-V and RI-LVL scales with system size. While RI-LVL has a large constant offset, it scales linearly andbreaks even at only 50 atoms. The favorable memoryscaling is due to the fact that we no longer need to storethe complete three-center integrals (ij|µ). Each core phas a peak memory requirement Mp. For both methodsan increase in the spread between minimal (minp(Mp))and maximal (maxp(Mp)) memory consumption per corecan be observed with increasing system size. This isa consequence of a load-balancing algorithm34. Subfig-ure 11(d) shows the difference in the total energy peratom between RI-V and RI-LVL. The error per atom is

0 50 100 150 200 250 300 350system size [number of atoms]

0

1000

2000

3000

4000

5000

time

[sec

onds

]

RI-V, time for second SCF-step (180 cores)RI-LVL, time for second SCF-step (24 cores)

cubic fitlinear fit

Figure 12: Time for the Fock matrix evaluation in 2nd SCFstep of various polyalanines. Compared are the RI-V imple-mentation and the RI-LVL version that uses sparsity as wellas integral screening. For all calculations we used 180 cores,if not stated otherwise. As computational settings we usedthe tight defaults settings and the tier 2 basis set. For theRI-LVL calculations we furthermore added one more radialg-function with z = 6 for the construction of the auxiliarybasis.

insignificant and does not increase with system size, butinstead levels out at a very low value.

Finally, we extrapolated the scaling of the computa-tional time for the algorithm with system size by analyz-ing the scaling of its dominant step, namely the evalu-ation of the exchange matrix. In Figure 12 we presentthe scaling of the time necessary for the exchange matrixevaluation as a function of the system size approximatedby the second SCF-iteration. Shown are the results forRI-V using the standard auxiliary basis and 180 cores(to accommodate the memory needs of the largest sys-tem tested) and for RI-LVL with the enhanced auxiliarybasis on 24 cores. Our localized resolution of identityexhibits a superior scaling (linear vs. cubic) and despiteusing significantly fewer cores, the break-even point oc-curs at around 110 atoms.

V. CONCLUSION AND OUTLOOK

In this work, we present a way to improve the scalingof the resolution of identity technique with the numberof atoms by restricting the expansion to a subset of theauxiliary basis functions. “RI-LVL” only includes auxil-iary basis functions which are located at one of the twoatoms where the basis functions forming the product arecentered. Importantly, the Coulomb metric of RI-V isretained.

We show for Hartree-Fock, DFT-PBE0, MP2, andRPA that the error of RI-LVL can be controlled in asystematic way by enhancing the set of orbital basis func-tions, upon which the auxiliary basis for RI is based, withspecific, structure- and method-independent additionalradial functions before the auxiliary basis set {Pµ(r)} it-self is built. For light elements, by using functions upto angular momentum channel g we were able to reducethe root mean square deviation of the total energy (com-pared to an RI-V calculation with the pure basis set)below 1 meV for the S22-testset. Even for sub-meV totalenergy accuracy, we found that a single g function in theOBS+ is sufficient. For heavy elements, no enhancementof the orbital basis set for the purpose of constructingthe RI auxiliary basis set is needed. We also showed thatRI-LVL has a similar performance for reaction barrierheights and atomization energies in molecular systems.The RI-LVL method also requires less dense integrationgrids for the same accuracy as RI-V for heavy elementswith very large basis sets.

The RI-LVL method together with these auxiliary ba-sis sets paves the way towards low-memory, linear-scalingimplementations of Hartree-Fock exchange, MP2, RPAand even higher-level methods.

Acknowledgements

This work was partially supported by the DFG Clus-ter of Excellence UNICAT. Patrick Rinke acknowledges

Page 14: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

14

funding from the Academy of Finland through its Cen-tres of Excellence Program (#251748).

Appendix A: Integration Parameters for heavyelement tests

FHI-aims uses an atom-centered grid for comput-ing most integrals.16,54 The grid is composed of radial”shells”71 (see also equation 18 in Ref. 16 and the ap-pendix in 61) and each shell is populated with a givennumber of angular integration points that form a Lebe-dev grid72–75. The key controlling parameters for theaccuracy of the integration routines are the number ofradial shells and the number of angular grid points as afunction of the distance to the center. For our investi-gation of heavy elements, we use dense integration grids,because our reference point, the RI-V, is sensitive to thequality of the grid. The details are listed in Table X.For the sake of consistency we decided to use the sameintegration settings also in RI-LVL. In a production cal-

culation the settings can be reduced significantly, becauseRI-LVL does not use the atom-centered grid for the four-center integral evaluation and the normal DFT part hasmuch lower demands on the grid settings.

quantity Au Cu Ti ONr 443 323 293 221

Nbaser 73 53 48 36

rmult 6 6 6 6router 7 A 7 A 7 A 7 A

min(Nang) 110 110 110 110max(Nang) 974 974 974 974

Table X: Highly converged integration settings for the differ-ent atoms. Nr denotes the final number of radial shells. Nbase

r

is the initial number of shells before the enhancement factorrmult, also known as the radial multiplier, is applied and routeris the outermost radius of the base integration grid. (see ap-pendix of Ref 61 for details) Nang is the number of grid pointson a integration shell.

∗ Electronic address: [email protected]† Electronic address: [email protected]‡ Electronic address: [email protected] A. J. Cohen, P. Mori-Sanchez, and W. Yang, Chemical

Reviews 112 (2011).2 L. Curtiss, P. C. Redfern, and K. Raghavachari, The Jour-

nal of Chemical Physics 123 (2005).3 X. Ren, A. Tkatchenko, P. Rinke, and M. Scheffler, Phys-

ical Review Letters 106 (2011).4 I. Y. Zhang and X. Xu, The Journal of Physical Chemistry

Letters 4 (2013).5 Y. Zhang, X. Xu, and W. A. Goddard, Proceedings of

the National Academy of Sciences of the United States ofAmerica 106 (2009).

6 Y. Zhao, B. J. Lynch, and D. G. Truhlar, The Journal ofPhysical Chemistry A 108 (2004).

7 A. D. Becke, The Journal of Chemical Physics 98 (1993).8 J. P. Perdew, M. Ernzerhof, and K. Burke, The Journal of

Chemical Physics 105 (1996).9 C. Møller and M. Plesset, Physical Review 46 (1934).

10 L. Hedin, Physical Review 139 (1965).11 F. Caruso, P. Rinke, X. Ren, A. Rubio, and M. Scheffler,

Physical Review B 88 (2013).12 D. Bohm and D. Pines, Physical Review 92 (1953).13 F. Furche, Physical Review B 64 (2001).14 X. Ren, P. Rinke, C. Joas, and M. Scheffler, Journal of

Materials Science 47 (2012).15 B. Delley, The Journal of Chemical Physics 92 (1990).16 V. Blum, R. Gehrke, F. Hanke, P. Havu, V. Havu, X. Ren,

K. Reuter, and M. Scheffler, Computer Physics Communi-cations 180 (2009).

17 F. Averill and D. Ellis, The Journal of Chemical Physics59 (1973).

18 A. Zunger and A. Freeman, Physical Review B 15 (1977).19 B. Delley and D. Ellis, The Journal of Chemical Physics

76 (1982).

20 K. Koepernik and H. Eschrig, Physical Review B 59(1999).

21 O. Sankey and D. Niklewski, Physical Review B 40 (1989).22 J. M. Soler, E. Artacho, J. D. Gale, A. Garcıa, J. Junquera,

P. Ordejon, and D. Sanchez-Portal, Journal of Physics:Condensed Matter 14 (2002).

23 S. D. Kenny and a. P. Horsfield, Computer Physics Com-munications 180 (2009).

24 J. L. Whitten, The Journal of Chemical Physics 58 (1973).25 B. Dunlap, Journal of Molecular Structure: THEOCHEM

501-502 (2000).26 B. I. Dunlap, N. Rosch, and S. B. Trickey, Molecular

Physics 108 (2010).27 S. F. Boys and I. Shavitt, Tech. Rep. (1959).28 F. Weigend, M. Haser, H. Patzelt, and R. Ahlrichs, Chem-

ical Physics Letters 294 (1998).29 O. Vahtras, J. Almlof, and M. Feyereisen, Chemical

Physics Letters 213 (1993).30 B. Dunlap, J. Connolly, and J. Sabin, The Journal of

Chemical Physics 71 (1979).31 K. Eichkorn, O. Treutler, H. Ohm, M. Haser, and

R. Ahlrichs, Chemical Physics Letters 240 (1995).32 X. Ren, P. Rinke, V. Blum, J. Wieferink, A. Tkatchenko,

A. Sanfilippo, K. Reuter, and M. Scheffler, New Journal ofPhysics 14 (2012).

33 Y. Jung, A. Sodt, P. M. W. Gill, and M. Head-Gordon,Proceedings of the National Academy of Sciences of theUnited States of America 102 (2005).

34 S. Levchenko, X. Ren, J. Wieferink, P. Rinke, V. Blum,M. Scheffler, and R. Johanni, Computer Physics Commu-nications 192 (2015).

35 S. Reine, E. Tellgren, A. Krapp, T. Kjærgaard, T. Hel-gaker, B. Jansik, S. Høst, and P. Salek, The Journal ofChemical Physics 129 (2008).

36 A. Sodt, J. E. Subotnik, and M. Head-Gordon, The Jour-nal of Chemical Physics 125 (2006).

Page 15: Accurate localized resolution of identity approach for ...th.fhi-berlin.mpg.de/site/uploads/Publications/RI-LVL.pdf · ready known in the early days of computational quantum chemistry

15

37 A. Sodt and M. Head-Gordon, The Journal of ChemicalPhysics 128 (2008).

38 C. Pisani, M. Busso, G. Capecchi, S. Casassa, R. Dovesi,L. Maschio, C. Zicovich-Wilson, and M. Schutz, The Jour-nal of Chemical Physics 122 (2005).

39 C. Pisani, L. Maschio, S. Casassa, M. Halo, M. Schutz,and D. Usvyat, Journal of Computational Chemistry 29(2008).

40 P. Merlot, T. Kjærgaard, T. Helgaker, R. Lindh,F. Aquilante, S. Reine, and T. B. Pedersen, Journal ofComputational Chemistry 34 (2013).

41 F. P. Billingsley II and J. E. Bloor, J. Chem. Phys. 55(1971).

42 B. Delley, Journal of Physics: Condensed Matter 22(2010).

43 J. D. Talman, The Journal of Chemical Physics 80 (1984).44 J. D. J. Talman, Journal of Computational Physics 29

(1978).45 J. D. Talman, Computer Physics Communications 180

(2009).46 J. D. Talman, International Journal of Quantum Chem-

istry 93 (2003).47 J. D. J. Talman, International Journal of Quantum Chem-

istry 107 (2007).48 D. Foerster, The Journal of Chemical Physics 128 (2008).49 D. Foerster, P. Koval, and D. Sanchez-Portal, The Journal

of Chemical Physics 135 (2011).50 H. Shang, Z. Li, and J. Yang, The Journal of Chemical

Physics 135 (2011).51 X. Qin, H. Shang, H. Xiang, Z. Li, and J. Yang, Interna-

tional Journal of Quantum Chemistry 115 (2014).52 S. F. Manzer, E. Epifanovsky, and M. Head-Gordon, Jour-

nal of Chemical Theory and Computation 11 (2015).53 F. Knuth, C. Carbogno, V. Atalla, V. Blum, and M. Schef-

fler, Computer Physics Communications 190 (2015).54 V. Havu, V. Blum, P. Havu, and M. Scheffler, Journal of

Computational Physics 228 (2009).55 C. Ochsenfeld, C. a. White, and M. Head-Gordon, The

Journal of Chemical Physics 109 (1998).56 M. Valiev, E. J. Bylaska, N. Govind, K. Kowalski, T. P.

Straatsma, H. J. J. Van Dam, D. Wang, J. Nieplocha,E. Apra, T. L. Windus, et al., Comp. Phys. Comm. 181(2010).

57 A. Szabo and N. Ostlund, Modern quantum chemistry: in-troduction to advanced electronic structure theory (1989).

58 P. Jurecka, J. Sponer, J. Cerny, and P. Hobza, PhysicalChemistry Chemical Physics 8 (2006).

59 T. H. Dunning, The Journal of Chemical Physics 90(1989).

60 L. Curtiss, K. Raghavachari, P. C. Redfern, and J. a. Pople,The Journal of Chemical Physics 112 (2000).

61 I. Y. Zhang, X. Ren, P. Rinke, V. Blum, and M. Scheffler,New Journal of Physics 15 (2013).

62 J. Paier, X. Ren, P. Rinke, G. E. Scuseria, A. Gruneis,G. Kresse, and M. Scheffler, New Journal of Physics 14(2012).

63 L. M. Ghiringhelli, P. Gruene, J. T. Lyon, D. M. Rayner,G. Meijer, A. Fielicke, and M. Scheffler, New Journal ofPhysics 15 (2013).

64 S. Bhattacharya, B. H. Sonin, C. J. Jumonville, L. M. Ghir-inghelli, and N. Marom, submitted to Physical Review Let-ters (2015).

65 X. Ren, P. Rinke, and M. Scheffler, Physical Review B 80(2009).

66 L. Curtiss and K. Raghavachari, The Journal of ChemicalPhysics 94 (1991).

67 L. Curtiss, P. C. Redfern, K. Raghavachari, and J. a. Pople,The Journal of Chemical Physics 106 (1997).

68 Y. Zhao, N. Gonzalez-Garcıa, and D. G. Truhlar, The jour-nal of physical chemistry A 109 (2005).

69 E. Schwegler, M. Challacombe, and M. Head-Gordon, TheJournal of Chemical Physics 106 (1997).

70 F. Schubert, M. Rossi, C. Baldauf, K. Pagel, S. Warnke,G. von Helden, F. Filsinger, P. Kupser, G. Meijer, M. Sal-wiczek, et al., Physical Chemistry Chemical Physics 17(2015).

71 J. Baker, J. Andzelm, A. Scheiner, and B. Delley, TheJournal of Chemical Physics 101 (1994).

72 V. Lebedev, USSR Computational Mathematics andMathematical Physics 15 (1975).

73 V. Lebedev, USSR Computational Mathematics andMathematical Physics 16 (1976).

74 V. Lebedev and D. Laikov, Doklady. Mathematics 59(1999).

75 B. Delley, Journal of Computational Chemistry 17 (1996).