Masanori Nunami (NIFS, JST/CREST)
EM particle simulation for multi-scale phenomena
Plasma phenomena take place in multi-scale processes, from the electron scale to the magnetohydrodynamic scale. For instance:
Space plasmas (solar wind, magnetosphere):
・Electron scale (~10^1 km)
・Magnetosphere scale (~10^5 km)
Magnetic confinement fusion plasmas:
・Electron scale (~10^-5 m)
・Device scale (~10^0 m)

Ordinary Particle-In-Cell (PIC) method with uniform grids:
• A uniform spatial grid system is used.
• Huge computer resources are required.
(Example) 10^5 × 10^5 × 10^5 grids ⇒ 10,000 PB of memory is required!
Adaptive mesh refinement (AMR) technique
The AMR technique is a promising method for realizing high-resolution calculations while saving computer resources. It subdivides the computational cells according to a criterion, locally in space and dynamically in time.

Berger and Oliger (1984), Berger and Colella (1989): applied multi-layer grids to partial differential equations.
・Groth et al. (2000): MHD + AMR (solar wind - magnetosphere)
・Villumsen (1989), Kravtsov et al. (1997), Yahagi and Yoshii (2001): PIC (+ N-body code) + AMR
・Vay et al. (2004): Electrostatic PIC + AMR
・Fujimoto and Machida (2006): Electromagnetic PIC + AMR (with Poisson eq.)

(Example) Refining only 10% of the extent in each direction, a grid with AMR needs the equivalent of 10^4 × 10^4 × 10^4 cells instead of 10^5 × 10^5 × 10^5 ⇒ required memory ... 10 PB!!

We need an AMR EM-PIC code that is available for parallel computation.
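The memory figures above can be checked with a back-of-the-envelope estimate. The per-cell cost of roughly 10 kB (fields plus particles) is an assumption made here for illustration; only the ratio between the two cases matters.

```python
# Hedged back-of-the-envelope check of the memory figures above,
# assuming ~10 kB per cell (fields plus particles). The per-cell
# cost is an assumption; only the ratio of the two cases matters.

PB = 1e15                 # bytes per petabyte
bytes_per_cell = 1e4      # assumed ~10 kB per cell

# uniform grid: 10^5 cells in each direction
uniform = (1e5 ** 3) * bytes_per_cell / PB

# with AMR: a 10^4^3 coarse base grid, plus a region covering ~10%
# of the extent in each direction resolved at the full 10^5 level
amr = ((1e4 ** 3) + (0.1 * 1e5) ** 3) * bytes_per_cell / PB

print(f"uniform: {uniform:.0f} PB")   # -> 10000 PB
print(f"AMR:     {amr:.0f} PB")       # -> 20 PB, same order as the ~10 PB quoted
```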
Developed AMR EM-PIC code “PARMER” (PARticle code with adaptive MEsh Refinement technique)
Features of the PARMER code:

・Adoption of the fully threaded tree (FTT) data structure
Using pointer variables, we can build a very flexible cell hierarchy system which can be easily modified according to various refinement criteria. Particles are also incorporated in the FTT with a beaded (linked) data structure using pointers.
・Use of a local charge conservation scheme
Parallelization can be realized by a domain decomposition scheme without solving the Poisson equation.
・Modified Morton ordering method
For dynamical load balancing in the parallelization, we use a modified Morton ordering scheme, concentrating on the number of particle loops.
・Developed in Fortran90
Fully Threaded Tree (FTT) Data Structure (Khokhlov, 1998)
Cells are treated as independent units (structure variables in Fortran90) organized in a refinement tree, rather than as the usual elements of arrays, supported by a set of pointers: neighbor, parent, child, particle. Therefore, we can build a very flexible hierarchical cell system. Each cell also holds information about:
・the level in the hierarchical structure,
・the spatial position at a corner of the cell,
・physical data (electromagnetic fields, current density, and so on).
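The cell structure above can be sketched as follows. In PARMER the cell is a Fortran90 derived type whose pointer components link it to neighbor, parent, child, and particle; the Python class and attribute names below are hypothetical analogues, shown for a 2-D cell.

```python
# Illustrative Python analogue of the FTT cell described above.
# All names are hypothetical; PARMER uses Fortran90 derived types
# with POINTER components instead of Python references.

class Particle:
    def __init__(self, x):
        self.x = x            # position
        self.next = None      # next bead of the linked particle list

class Cell:
    def __init__(self, level, corner, size):
        self.level = level        # level in the hierarchical structure
        self.corner = corner      # spatial position at a corner of the cell
        self.size = size          # edge length at this level
        self.parent = None        # pointer to the parent cell
        self.children = []        # child cells created by refinement
        self.neighbor = {}        # pointers to neighboring cells
        self.particle = None      # head of the beaded particle list
        self.E = [0.0, 0.0, 0.0]  # physical data: fields and current density
        self.B = [0.0, 0.0, 0.0]
        self.J = [0.0, 0.0, 0.0]

    def refine(self):
        """Subdivide this 2-D cell into 4 children at level + 1."""
        h = self.size / 2
        x0, y0 = self.corner
        for dx in (0, 1):
            for dy in (0, 1):
                child = Cell(self.level + 1, (x0 + dx * h, y0 + dy * h), h)
                child.parent = self
                self.children.append(child)

root = Cell(level=0, corner=(0.0, 0.0), size=1.0)
root.refine()
```

Because every relation is a pointer rather than an array index, refinement and unrefinement only relink a few references, which is what makes the hierarchy "easily modified according to various refinement criteria".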
Calculation Flow of PARMER code
START → initial condition for particles and fields → calculation loop → END

In the calculation loop, according to a certain refinement criterion, the refinement & unrefinement processes are done; then particles & fields are advanced from the max. level down to the min. level using the usual PIC scheme, with half the mesh size & half the time step at each deeper level.

Pseudo code:

  program PARMER
    call initial_condition
    do istep = 1, Nstep
       call refinement          ! according to a certain refinement criterion
       ! For refinement levels
       do level = LvMAX, LvMIN+1, -1
          do istep2 = 1, 2      ! half time steps
             call move_particle
             call advance_field
          enddo
          call interpolate_field
       enddo
       ! For base level
       call move_particle
       call advance_field
    enddo
  end program PARMER
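The level-by-level subcycling in the flow above can be sketched in runnable form. Since each refinement level halves the time step of its parent, level l performs 2^(l - LvMIN) particle/field substeps per base-level step, followed by a field interpolation that couples it to the coarser level. The routine names mirror the pseudo code but are illustrative only.

```python
# Minimal sketch of the subcycling in the PARMER main loop above.
# Each level l takes 2^(l - LvMIN) substeps per base-level step
# (cumulative halving of dt), then interpolates to the coarser level.

LvMIN, LvMAX = 0, 2

def advance_one_step(dt_base, log):
    # refinement levels, from the deepest down to just above the base
    for level in range(LvMAX, LvMIN, -1):
        nsub = 2 ** (level - LvMIN)   # substeps relative to the base level
        dt = dt_base / nsub
        for _ in range(nsub):
            log.append(("move_particle+advance_field", level, dt))
        log.append(("interpolate_field", level))
    # base level: one full step
    log.append(("move_particle+advance_field", LvMIN, dt_base))

log = []
advance_one_step(1.0, log)
```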
Test of refinement routine
We demonstrated the mesh refinement routine using data of the magnetosphere - solar wind interaction obtained from a hybrid PIC simulation.

(Figure: ion density, showing the generation & disappearance of refinement meshes.)
Test calculation of PARMER (1/3)
LPP expansion in a homogeneous B-field

As a test calculation with adaptive mesh refinement in our new PIC code, PARMER, we demonstrated the expansion of a spherical laser-produced plasma (LPP) in an external homogeneous magnetic field. We observed the electron motion and compared the time evolution of the particle energies from PARMER with that from an ordinary PIC code which uses a uniform grid system.

LPP in a B-field:
・Electrons are magnetized.
・Ions are not magnetized.
⇒ Ions can get ahead of the electron surface and generate an inward radial E-field.
Test calculation of PARMER (2/3)

Time evolution of the electron density

We observed the time evolution of the electron density with the adaptive mesh refinement process. As the refinement criterion, we check whether or not the electron density exceeds a certain value α, and the maximum refinement level is 2. The inward E-field causes an Er×B drift motion of the electrons.

(Figure: electron density and refined meshes.)
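The density-based criterion described above can be sketched as a pair of flag functions: a cell is refined while its electron density exceeds the threshold α (up to the maximum level of 2), and unrefined once the density falls back below it. The function names and the threshold value are illustrative.

```python
# Hedged sketch of the refinement criterion from this test: refine
# while electron density > alpha and level < MAX_LEVEL, unrefine
# when the density drops back below alpha. Names are illustrative.

ALPHA = 1.0        # criterion value alpha for the electron density
MAX_LEVEL = 2      # maximum refinement level used in this test

def refine_flag(electron_density, level):
    return electron_density > ALPHA and level < MAX_LEVEL

def unrefine_flag(electron_density, level):
    return electron_density <= ALPHA and level > 0

# a dense cell at the base level is refined; once the plasma has
# expanded away and the density drops, the cell is unrefined again
assert refine_flag(2.5, level=0)
assert not refine_flag(2.5, level=MAX_LEVEL)
assert unrefine_flag(0.3, level=1)
```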
Test calculation of PARMER (3/3)
Time evolution of the particle energy

The time evolutions of the ion and electron energies from the PARMER code are consistent with the results from an ordinary PIC code without AMR. Therefore, we confirmed the validity of the calculation with the adaptive refinement technique in the new code, PARMER.
Load balance in parallelization with domain decomposition
In a system with several refinement meshes, the number of cells changes dynamically as the AMR works. Therefore, we need a parallelization scheme which can be synchronized with the number of cells.

(Example) For a distribution of plasma clusters with AMR, a uniform domain decomposition breaks the load balance!
Morton ordering (Z-ordering) method

From a grid index (i, j, k), we extract each binary digit in the order (k, j, i, k, j, i, ...), and a new binary number L is generated (Morton ordering; Warren et al., 1993). According to the number L, the cells are re-sorted. (Therefore, all cells are one-dimensionalized.) Then the grids are separated by dividing by the number of processors.

(Example) In a 2^3 × 2^3 2D space: (i, j) = (3, 4) ⇒ (011, 100) ⇒ L = 100101 = 37.

(Figure: uniform separation vs. Morton ordering in 1D space with Ncpu = 2.)

We can decompose the calculation domain into parts with almost the same number of cells, even though the system includes several hierarchical meshes.
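The 2D construction above can be written out directly: the bits of (i, j) are interleaved (with j in the higher position of each pair), the cells are sorted by the resulting key L, and the sorted list is cut into Ncpu contiguous chunks. The helper names are illustrative.

```python
# Sketch of the 2-D Morton (Z-ordering) decomposition described
# above. morton2d reproduces the slide's example; decompose sorts
# cells along the Z-curve and splits them evenly among processors.

def morton2d(i, j, nbits=3):
    """Interleave the bits of i and j (j in the higher position)."""
    L = 0
    for b in range(nbits):
        L |= ((i >> b) & 1) << (2 * b)       # bit b of i
        L |= ((j >> b) & 1) << (2 * b + 1)   # bit b of j
    return L

# the slide's example: (i, j) = (3, 4) -> (011, 100) -> 100101 = 37
assert morton2d(3, 4) == 0b100101 == 37

def decompose(cells, ncpu):
    """Sort cells by Morton key, then cut into ncpu contiguous chunks."""
    ordered = sorted(cells, key=lambda ij: morton2d(*ij))
    n = len(ordered)
    return [ordered[r * n // ncpu:(r + 1) * n // ncpu] for r in range(ncpu)]

cells = [(i, j) for i in range(8) for j in range(8)]
parts = decompose(cells, 2)   # Ncpu = 2: two chunks of 32 cells each
```

Because the Z-curve preserves spatial locality, each contiguous chunk of the one-dimensionalized cell list corresponds to a compact region of the domain, regardless of how many hierarchical meshes it contains.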
Modified Morton ordering method

In the hierarchical system, the calculation cost doubles relative to the parent level as we approach a deeper hierarchical level, because the time step size also becomes half of that of the parent level, Δt → Δt/2. And the cost of the particle calculation is extremely dominant in particle codes. Therefore, the number of particle calculation loops should be balanced in the parallel computation. We propose the "modified Morton ordering method": the Morton-ordered cell list is separated so that the assigned numbers of particle loops, not the numbers of particles, are almost equal among processors.
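The weighting described above can be sketched as follows: a cell at refinement level l is visited 2^l times per base-level step, so its cost is (number of particles) × 2^l particle loops, and the Morton-ordered list is cut so that each rank receives an almost equal share of loops rather than of cells or particles. The helper names and the greedy cut are illustrative, not the actual PARMER implementation.

```python
# Hedged sketch of the "modified Morton ordering" weighting: cells
# are weighted by particle loops = nparticles * 2^level, and the
# Morton-ordered list is cut at equal cumulative-loop targets.

def particle_loops(nparticles, level):
    """Cost of a cell: its particles times 2^level substeps per base step."""
    return nparticles * (2 ** level)

def split_by_loops(ordered_cells, ncpu):
    """ordered_cells: list of (nparticles, level) already in Morton order."""
    weights = [particle_loops(n, l) for n, l in ordered_cells]
    total = sum(weights)
    chunks, acc, start = [], 0, 0
    for r in range(ncpu):
        target = total * (r + 1) / ncpu
        end = start
        while end < len(ordered_cells) and acc < target:
            acc += weights[end]
            end += 1
        chunks.append(ordered_cells[start:end])
        start = end
    return chunks

# one deep cell (level 2) costs as much as four base-level cells,
# so a loop-balanced cut gives it a whole rank to itself
cells = [(10, 2)] + [(10, 0)] * 4
chunks = split_by_loops(cells, 2)
```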
Test of modified Morton ordering method (1/3)

We tested the modified Morton ordering method on the previous example of the distributed plasma clusters. (This is only a separation test, with no MPI communications.) With the modified Morton ordering, the boundaries of the decomposed domains change dynamically as the clusters move.

(Figure: plasma density and refinement meshes, decomposed with Ncpu = 16.)
Test of modified Morton ordering method (2/3)

We also tested the modified Morton ordering method on the example of the interaction between the solar wind and the magnetosphere. (This is also a separation test of the method and does not include MPI communications.)

(Figure: solar wind - magnetosphere system with refinement levels 0-2, decomposed with Ncpu = 16.)
Test of modified Morton ordering method (3/3)

With the normal Morton ordering, there is a large unevenness in the number of particle calculation loops. With the modified Morton ordering, however, we can remove this unevenness.
Future work
Coding of the base of the AMR-PIC code "PARMER" is almost completed. For more concrete numerical evaluations on plasma physics, we must overcome the following problems.

・More validation of the PARMER code
Using several examples, we must further confirm the validity of the PARMER code.
・Implementation of the modified Morton ordering method
We will soon complete the implementation of the modified Morton ordering method for the parallelization of PARMER using the MPI technique.
・Efficient parallelization & its improvement
We must evaluate the efficiency of the parallelization using the modified Morton ordering method, and improve it.
・Application to a concrete example
In the next stage, we will apply the PARMER code to magneto plasma sail (MPS) development.
The END

Thank you very much for your kind attention.