BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Linear Scaling DFT based on DaubechiesWavelets for Massively Parallel Archiecture
L Ratcliff1, Stephan Mohr1,2, Paul Boulanger1,3,Luigi Genovese1, Damien Caliste1, Stefan Goedecker2,
Thierry Deutsch
1Laboratoire de Simulation Atomistique (L Sim), CEA Grenoble2Institut fur Physik, Universitat Basel, Switzerland
3Institut Neel, CNRS, Grenoble
Centre Blaise Pascal, Lyon
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Outline
1 The machinery of BigDFT (cubic scaling version)Mathematics of the waveletsMain OperationsMPI/OpenMP and GPU performances
2 O(N ) (linear scaling) BigDFT approachMinimal basis set approachPerformance and AccuracyFragment approachApplicationsPerspectives
3 Conclusions
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
A basis for nanosciences: the BigDFT project
STREP European project: BigDFT(2005-2008)Four partners, 15 contributors:CEA-INAC Grenoble (T.Deutsch), U. Basel (S.Goedecker),U. Louvain-la-Neuve (X.Gonze), U. Kiel (R.Schneider)
Aim: To develop an ab-initio DFT code basedon Daubechies Wavelets, to be integrated inABINIT, distributed under GNU-GPL license
After six years, BigDFT project still alive and wellBigDFT formalism from HPC perspective
Wavelets for O(N) DFT
Resonant states of open quantum systems
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Why do we use wavelets in BigDFT?
AdaptivityOne grid, two resolution levels in BigDFT:
• 1 scaling function (“coarse region”)
• 1 scaling function and 7 wavelets(“fine region”)
Ideal for big inhomogeneous systemsEfficient Poisson solver, capable ofhandling different boundary conditions –free, wire, surface, periodicExplicit treatment of charged systemsEstablished code with many capabilites
-1.5
-1
-0.5
0
0.5
1
1.5
-6 -4 -2 0 2 4 6 8
x
φ(x)
ψ(x)
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
A brief description of wavelet theory
Two kind of basis functions
A Multi-Resolution real space basisThe functions can be classified following the resolution level theyspan.
Scaling FunctionsThe functions of low resolution level are a linear combination ofhigh-resolution functions
= +
φ(x) =m
∑j=−m
hjφ(2x− j)
Centered on a resolution-dependent grid: φj = φ0(x− j).
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
A brief description of wavelet theory
WaveletsThey contain the DoF needed to complete the information which islacking due to the coarseness of the resolution.
= 12 + 1
2
φ(2x) =m
∑j=−m
hjφ(x− j) +m
∑j=−m
gjψ(x− j)
Increase the resolution without modifying grid spaceSF + W = same DoF of SF of higher resolution
ψ(x) =m
∑j=−m
gjφ(2x− j)
All functions have compact support, centered on grid points.
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Adaptivity of the mesh
Atomic positions (H2O)
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Adaptivity of the mesh
Fine grid (high resolution)
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Adaptivity of the mesh
Coarse grid (low resolution)
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Basis set features
Orthogonality, scaling relation∫dx φk (x)φj (x) = δkj φ(x) =
1√2
m
∑j=−m
hjφ(2x− j)
The hamiltonian-related quantities can be calculated up to machineprecision in the given basis.
The accuracy is only limited by the basis set (O(
h14grid
))
Exact evaluation of kinetic energyObtained by convolution with filters:
f (x) = ∑`
c`φ`(x) , ∇2f (x) = ∑
`
c` φ`(x) ,
c` = ∑j
cj a`−j , a` ≡∫
φ0(x)∂2x φ`(x) ,
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
http://www.bigdft.org version 1.7.x
• Isolated, surfaces and 3D-periodic boundary conditions(k-points, symmetries)
• All XC functionals of the ABINIT package (libXC library)
• Hybrid functionals, Fock exchange operator
• Direct Minimisation and Mixing routines (metals)
• Local geometry optimizations (with constraints)
• External electric fields (surfaces BC)
• Born-Oppenheimer MD
• Vibrations
• Unoccupied states
• Empirical van der Waals interactions
• Saddle point searches (NEB, Granot & Bear)
• All these functionalities are GPU-compatible
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Optimal for isolated systems
Test case: cinchonidine molecule (44 atoms)
10-4
10-3
10-2
10-1
100
101
105 106
Abs
olut
e en
ergy
pre
cisi
on (
Ha)
Number of degrees of freedom
Ec = 125 Ha
Ec = 90 Ha
Ec = 40 Ha
h = 0.3bohr
h = 0.4bohr
Plane waves
Wavelets
Allows a systematic approach for moleculesConsiderably faster than Plane Waves codes.10 (5) times faster than ABINIT (CPMD)Charged systems can be treated explicitly with the same time
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Systematic basis set
Two parameters for tuning the basisThe grid spacing hgrid
The extension of the low resolution points crmult
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Massively parallel
Two kinds of parallelisationBy orbitals (Hamiltonian application, preconditioning)
By components (overlap matrices, orthogonalisation)
A few (but big) packets of dataMore demanding in bandwidth than in latency
No need of fast network
Optimal speedup (eff. ∼ 85%), also for big systems
Cubic scaling codeFor systems bigger than 500 atomes (1500 orbitals) :orthonormalisation operation is predominant (N3)
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Orbital distribution scheme
Used for the application of the hamiltonianThe hamiltonian (convolutions) is applied separately onto eachwavefunction
ψ5
ψ4
ψ3
ψ2
ψ1
MPI 0
MPI 1
MPI 2
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Coefficient distribution scheme
Used for scalar product & orthonormalisationBLAS routines (level 3) are called, then result is reduced
ψ5
ψ4
ψ3
ψ2
ψ1
MPI 0 MPI 1 MPI 2
Communications are performed via MPI ALLTOALLV
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
OpenMP parallelisation
Innermost parallelisation level(Almost) Any BigDFT operation is parallelised via OpenMP
4 Useful for memory demanding calculations
4 Allows further increase of speedups
4 Saves MPI processes and intra-node Message Passing
6 Less efficient thanMPI
6 Compiler and systemdependent
6 OpenMP sectionsshould be regularlymaintained 1.5
2
2.5
3
3.5
4
4.5
0 20 40 60 80 100 120
OM
P S
peed
up
No. of MPI procs
2OMP3OMP6OMP
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
High Performance Computing
Localisation & Orthogonality→ Data localityPrincipal code operations can be intensively optimised
Optimal for application on supercomputers
Little communication, bigpackets of data
No need of fast network
Optimal speedup
Efficiency of the order of 90%,up to thousands of processors
88
90
92
94
96
98
100
1 10 100 1000
Effi
cien
cy (
%)
Number of processors
(with orbitals/proc)
32 at
8
4
2
65 at
2
1
173 at
44
22
11
5
3257 at
16
4
11025 at
4
2
1
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Task repartition for a small system (ZnO, 128 atoms)
0
10
20
30
40
50
60
70
80
90
100
16 24 32 48 64 96 144 192 288 576 1
10
100
1000P
erce
nt
Sec
onds
(lo
g. s
cale
)
No. of cores
1 Th. OMP per core
CommsLinAlgConvCPUOtherTime (sec)Efficiency (%)
Data repartition optimal for material accelerators (GPU)Graphic Processing Units can be used to speed up the computation
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Using GPUs in a Big systematic DFT code
Nature of the operationsOperators approach via convolutions
Linear Algebra due to orthogonality of the basis
Communications and calculations do not interfere
* A number of operations which can be accelerated
Evaluating GPU convenienceThree levels of evaluation
1 Bare speedups: GPU kernels vs. CPU routinesDoes the operations are suitable for GPU?
2 Full code speedup on one processAmdahl’s law: are there hot-spot operations?
3 Speedup in a (massively?) parallel environmentThe MPI layer adds an extra level of complexity
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Hybrid and Heterogeneous runs with OpenCL
NVidia S2070 Connected each toa NehalemWorkstation
BigDFT may run onboth
ATI HD 6970
Sample BigDFT run: Graphene, 4 C atoms, 52 kpts
No. of Flop: 8.053 · 1012
MPI 1 1 4 1 4 8GPU NO NV NV ATI ATI NV + ATITime (s) 6020 300 160 347 197 109Speedup 1 20.07 37.62 17.35 30.55 55.23GFlop/s 1.34 26.84 50.33 23.2 40.87 73.87
Next Step: handling of Load (un)balancing
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Configuration space of the cage-like boron clusters
Stabilize the buckyball configuration of B80 systemsPRB 83 081403(R), 2011
-2
-1.5
-1
-0.5
0
0.5
-2
-1.5
-1
-0.5
0
0.5
B @B12
B 80
14:6
68
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Configuration space of the cage-like boron clusters
Stabilize the buckyball configuration of B80 systemsPRB 83 081403(R), 2011
-2
-1.5
-1
-0.5
0
0.5
-2
-1.5
-1
-0.5
0
0.5
-12.76 eV
-13.55 eV on B @B
12 68
@ B 80
8:12
Sc 3N
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Outline
1 The machinery of BigDFT (cubic scaling version)Mathematics of the waveletsMain OperationsMPI/OpenMP and GPU performances
2 O(N ) (linear scaling) BigDFT approachMinimal basis set approachPerformance and AccuracyFragment approachApplicationsPerspectives
3 Conclusions
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Scaling of BigDFT (I)
We can reach systems containing up to a few hundred electronsthanks to wavelet properties and efficient parallelization:
• MPI • OpenMP • GPU ported
But the various operations scale differently:
• O(N logN ): Poisson solver
• O(N 2): convolutions
• O(N 3): linear algebra
and have different prefactors:
• cO(N 3)� cO(N 2)� cO(N logN ) 0
10
20
30
40
50
60
0 1000 2000
Tim
e / i
tera
tion
(s)
Number of atoms
ax3 + bx2 + cx:2.8e-09; 8.4e-06; 4.1e-031.3e-10; 6.1e-07; 3.9e-035.7e-11; 6.1e-07; 4.9e-047.1e-11; 1.7e-11; 3.4e-032.1e-11; 1.2e-11; 3.8e-03
−→ for bigger systems the O(N 3) will dominate and so we need anew approach
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Scaling of BigDFT (II)
To improve the scaling and thus the scope of BigDFT, we must takeadvantage of the nearsightedness principle:
• the behaviour of large systems is short-ranged, ornear-sighted
• the density matrix decays exponentially in insulating systems
• can take advantage of this locality to create linear-scalingO(N ) DFT formalisms by having localized orbitals e.g. as inONETEP, Conquest, SIESTA...
• we want to adapt this formalism for BigDFT
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Minimal basis and density kernel
Write the KS orbitals as linearcombinations of minimal basisset φα(r):
Ψi (r) = ∑α
cαi φα(r)
• localized
• atom-centred
• expanded in wavelets
Define the density matrix ρ(r, r′)and kernel K αβ:
ρ(r, r′) = ∑i
∣∣Ψi (r)〉〈Ψi (r′)∣∣
= ∑α,β
∣∣∣φα(r)〉K αβ〈φβ(r′)∣∣∣
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
The algorithm
Initial Guess:Atomic Orbitals
Basis optimization:
Kernel Optimization:
Update potentialDiagonalize Hamiltonian
Forces
or
Minimal basis set• Initialized to atomic orbitals, optimized
using a quartic confining potential subjectto an orthogonality constraint
• Minimize the ‘trace’, band structureenergy or a combination at fixed potential
• SD or DIIS with k.e. preconditioning
Density kernelThree methods for self-consistent optimization:
• Diagonalization
• Direct minimization
• Fermi operator expansion (linear-scaling)
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Scaling
0
10
20
30
40
50
60
0 1000 2000 3000 4000 5000 6000 7000 8000
Tim
e/itera
tion
(s)
Number of atoms
ax3
+ bx2
+ cx:1.0e+00; 3.0e+03; 1.5e+06
4.6e-02; 2.2e+02; 1.4e+06
7.6e-03; 4.3e-03; 1.4e+06
CubicDminFOE
2
4
6
8
10
12
0 4000 8000
Mem
ory
(GB
)
Number of atoms
Improved time and memory scaling• 301 MPI, 8 OpenMP for above results
• ∼ O (N ) for FOE
• need sparse matrix algebra for directminimisation→ O (N )
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Parallelization – MPI
0
5
10
15
20
0 500 1000 1500 2000 2500 3000 3500 4000 0 10 20 30 40 50 60 70 80 90 100
Spe
edup
wrt
160
cor
es
Effi
cien
cy (
%)
Number of cores (MPI tasks x 4 OMP threads)
SpeedupIdeal speedup
Amdahl’s law, P=0.97Efficiency
Efficient parallelization for 1000s of CPUs• ∼ 97% of code parallelized with MPI
• ∼ 93% of code parallelized with OpenMP
960 atomsLaboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Accuracy: Total energies and forces
Several systems 44 – 450 atoms• compared with standard BigDFT
• total energies accuracy: ∼ 1 meV/atom
• forces accuracy: ∼ a few meV/bohr
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Geometry optimization
1e−03
1e−02
1e−01
1e+00
RM
S F
(eV
/Å)
0
5
10
Tim
e (m
in.)
0
20
40
60
80
100
0 5 10 15
Cum
. tim
e (m
in.)
CubicReset
Reformat 0.00
0.05
0.10
0 5 10 15
∆ R
(Å
)
288 atoms
Further savings through support function reuse
• Pulay forces not needed • lower prefactor than single point
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Energy differences
Silicon cluster with a vacancy defect• 291 atoms
• need 9 support functions per Si
• accuracy in defect energy of 12 meV
pristine vacancy ∆ ∆-∆cubic
eV eV eV meVcubic −20674.2 −20563.1 111.167 –4/1 (sp-like) −20667.6 −20556.5 111.038 1299/1 (spd-like) −20672.9 −20561.7 111.155 12
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Fragment approach• By dividing a system into fragments,
we can avoid optimizing the supportfunctions entirely for large systems
• Substantially reduces the cost(need an efficient reformatting)
• Many useful applications, including theexplicit treatment of solvents
• A necessary first step towards atight-binding like approach, with eachatom as a fragment
Reformatting the minimal basis set in the same grid
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Support function reuse
As we have seen, the reuse of support functions lends itself tosome interesting applications, e.g. geometry optimizations,charged systemsFor this, we need an accurate and efficient reformatting scheme
0.001
0.01
0.1
1
10
100
E-
Ecubic
(meV
/ato
m)
Cubic eggboxLinear eggbox
Linear template
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Transfer integrals and site energies
0.00
0.20
0.40
0.60
0.80
1.00
0 30 60 90
Ene
rgy
(eV
)
Rotation angle
DZ J12 LUMOTZP J12 LUMOTMB J12 LUMO
0.00
0.20
0.40
0.60
0.80
1.00
0 30 60 90
Ene
rgy
(eV
)
Rotation angle
DZ J12 HOMOTZP J12 HOMOTMB J12 HOMO
−4.00
−3.50
−3.00
−2.50
−2.00
−1.50
0 30 60 90
Ene
rgy
(eV
)
Rotation angle
DZ e1,2 LUMOTZP e1,2 LUMOTMB e1,2 LUMO
−6.00
−5.50
−5.00
−4.50
−4.00
0 30 60 90E
nerg
y (e
V)
Rotation angle
DZ e1,2 HOMOTZP e1,2 HOMOTMB e1,2 HOMO
BigDFT compared to ADF fragment approach• support functions from molecules reused in dimers
• Application to OLED (Organic Light-Emitting Diode)
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Transfer integrals for OLEDs (I)
We want to consider environment effects in realistic ‘host-guest’morphologies: 6192 atoms, 100 molecules
Host molecule
Guest molecule
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Transfer integrals for OLEDs (II)
Transfer integrals (left) and site energies (right) calculated with(bottom) and without (top) constrained DFT
-0.06 -0.04 -0.02 0 0.02 0.04 0.06Energy (eV)
all
host-host
guest-guest
host-guest-5.5 -5.4 -5.3 -5.2 -5.1 -5 -4.9 -4.8 -4.7 -4.6 -4.5
-7.3 -7.2 -7.1 -7 -6.9 -6.8 -6.7 -6.6 -6.5 -6.4 -6.3
Energy (eV)
hostguest
Generate good statistics.Then use of Markus model to calculate the efficiency of OLEDS.
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Perspectives of linear scaling
Combination of wavelets and localized basis functions• Highly accurate (energies and forces) and efficient
• Linear scaling – Thousands of atoms
• Ideal for massively parallel calculations
• Flexible e.g. reuse of support functions to accelerategeometry optimizations and for charged systems
Fragment approach• Accurate reformatting scheme – can accelerate calculations
on large systems e.g. explicit treatment of solvents
• Wide range of applications: constrained DFT, transfer integrals
• Crucial for future developments e.g. tight-binding, embeddedsystems (QM/QM)
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Outline
1 The machinery of BigDFT (cubic scaling version)Mathematics of the waveletsMain OperationsMPI/OpenMP and GPU performances
2 O(N ) (linear scaling) BigDFT approachMinimal basis set approachPerformance and AccuracyFragment approachApplicationsPerspectives
3 Conclusions
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Conclusions
Wavelets: Powerful formalism• Linear scaling – Thousands of atoms
• Accurate reformatting scheme – can accelerate calculationson large systems e.g. explicit treatment of solvents
• Wide range of applications: constrained DFT, transfer integrals
• Towards multi-scale approach (embedded systems)
Beyond DFT• Resonant states are the way to describe fully a system
• Give a compact description (few states) of the systems
• Linear-response, TD-DFT, ...
• Many-body perturbation theory (GW, ...)
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch
BigDFT
BigDFT
Wavelets
Operations
Performances
O(N )
Minimal basis
Performance
Fragment
Applications
Perspectives
Conclusions
Acknowledgments
Main maintainer Luigi Genovese
Group of Stefan Goedecker A. Ghazemi, A. Willand, S. De,A. Sadeghi, N. Dugan (Basel University)
Link with ABINIT and bindings Damien Caliste (CEA-Grenoble)
Order N methods S. Mohr, L. Ratcliff, P. Boulanger (Neel Institute,CNRS)
Exploring the PES N. Mousseau (Montreal University)
Resonant states A. Cerioni, A. Mirone (ESRF, Grenoble),I. Duchemin (CEA) M. Moriniere (CEA)
Optimized convolutions B. Videau (LIG, computer scientist,Grenoble)
Applications P. Pochet, E. Machado, S. Krishnan, D. Timerkaeva(CEA-Grenoble)
Laboratoire de Simulation Atomistique http://inac.cea.fr/L Sim Thierry Deutsch