real-time massively distributed multi-object adaptive ... · optics simulations for the european...
TRANSCRIPT
Real-Time Massively Distributed Multi-Object AdaptiveOptics Simulations for the European Extremely Large
Telescope
Hatem LtaiefSenior Research Scientist
Extreme Computing Research Center, KAUST
Intel eXtreme Performance Users Group ConferenceThuwal, Jeddah, KSA
April 22-25, 2018
H. Ltaief Extreme Adaptive Optics 1 / 50
Bringing Astronomy Back Home ;-)
Courtesy from CEMSE Communications, KAUST
H. Ltaief Extreme Adaptive Optics 2 / 50
Outline
1 The European Extremely Large Telescope
2 The MOSAIC Instrument
3 Numerical Simulation
4 Experimental Results
5 Moving Forward with Taskification
6 Summary and Future Work
H. Ltaief Extreme Adaptive Optics 3 / 50
Acknowledgments
Students/Collaborators
Extreme Computing Research Center @ KAUSTA. Charara and D. Keyes
L’Observatoire de Paris, LESIAR. Dembet, N. Doucet, E. Gendron, D. Gratadour, C. Morel,A. Sevin and F. Vidal
KAUST Supercomputer LabB. Hadri and S. Feki
Innovative Computing Laboratory @ UTKPLASMA/MAGMA/PaRSEC Teams
INRIA/INP Bordeaux, FranceRuntime/HiePACS Teams
H. Ltaief Extreme Adaptive Optics 4 / 50
Acknowledgments
Support/Funding
Funded partially by the French National Center for Scientific Research(CNRS, 2016)
Funded partially by the European Commission (Horizon 2020program, FET-HPC grant# 671662)
Intel Parallel Computing Center
KAUST/Cray Center of Excellence: Aniello Esposito and AdrianTate
For core-hours:
Shaheen 2.0 from KAUST Supercomputer LabCray KNL internal system
H. Ltaief Extreme Adaptive Optics 5 / 50
E-ELT
Outline
1 The European Extremely Large Telescope
2 The MOSAIC Instrument
3 Numerical Simulation
4 Experimental Results
5 Moving Forward with Taskification
6 Summary and Future Work
H. Ltaief Extreme Adaptive Optics 6 / 50
E-ELT
The World’s Biggest Eye on The Sky
Credits: ESO (http://www.eso.org/public/teles-instr/e-elt/)
H. Ltaief Extreme Adaptive Optics 7 / 50
E-ELT
The World’s Biggest Eye on The Sky
Credits: ESO (http://www.eso.org/public/teles-instr/e-elt/)
The largest optical/near-infrared telescope in the world.
It will weigh about 2700 tons with a main mirror diameter of 39m.
Location: Chile, South America.
H. Ltaief Extreme Adaptive Optics 8 / 50
E-ELT
The Top 10 (present and future) Radio and OpticalGround-based Telescopes
Rank Name Location Diameter Cost Year10 Large Synoptic Survey Telescope (LSST) Chile 8.4m 450 million 20149 South African Large Telescope (SALT) South Africa 9.2m 36 million 20058 Keck USA 10m 100 million 19967 Gran Telescopio Canarias (GTC) Spain 10.4m 130 million 20096 Aricebo Observatory Puerto Rico 305m 9.3 million 19635 Atacama Large Millimeter Array (ALMA) Chile 12m 1.4 billion 20134 Giant Magellan Telescope (GMT) Chile 24.5m 2.2 billion 20243 Thirty Meter Telescope (TMT) USA 30m 1.4 billion 20302 Square Kilometer Array (SKA) Australia 90m 2 billion 20201 European Extremely Large Telescope (E-ELT) Chile 39m 1.3 billion 2024
Consortium: multiple nation initiatives
Src: http://www.space.com/14075-10-biggest-telescopes-earth-comparison.html
H. Ltaief Extreme Adaptive Optics 9 / 50
MOSAIC
Outline
1 The European Extremely Large Telescope
2 The MOSAIC Instrument
3 Numerical Simulation
4 Experimental Results
5 Moving Forward with Taskification
6 Summary and Future Work
H. Ltaief Extreme Adaptive Optics 10 / 50
MOSAIC
MOSAIC
Represents the most challenging embedded instrument in the E-ELT.
Observes the most distant galaxies in parallel to understand theirevolution with a high multiplex
Exploits a Field of View (FoV) of 7 to 10 arcminutes (1/4 of the fullmoon) with a resolution of few tens of milli-arcsec (1/20,000 of thefull moon)
Uses multiple guide stars (tomographic measurement of theturbulence) and multiple deformable mirrors (direction specificcompensation).
H. Ltaief Extreme Adaptive Optics 11 / 50
MOSAIC
The Effect of Atmospheric Turbulence
The sun observed with a compact camera
Disturbs the trajectory of light rays (wavefront perturbations)
Reduces astronomical images quality
H. Ltaief Extreme Adaptive Optics 12 / 50
MOSAIC
Adaptive optics (AO)
AO: technique used compensate in real-time the wavefrontperturbations providing a significant improvement in resolution
The moon observed with a 8m telescope (left: no AO, right: with AO)
H. Ltaief Extreme Adaptive Optics 13 / 50
MOSAIC
How AO works
H. Ltaief Extreme Adaptive Optics 14 / 50
MOSAIC
Multi-object Adaptive Optics (MOAO)
Credits: ESO
Good news: Extremely compute intensive at full telescope scale
H. Ltaief Extreme Adaptive Optics 15 / 50
MOSAIC
Multi-object Adaptive Optics (MOAO)
Credits: ESOGood news: Extremely compute intensive at full telescope scale
H. Ltaief Extreme Adaptive Optics 15 / 50
Simulation
Outline
1 The European Extremely Large Telescope
2 The MOSAIC Instrument
3 Numerical Simulation
4 Experimental Results
5 Moving Forward with Taskification
6 Summary and Future Work
H. Ltaief Extreme Adaptive Optics 16 / 50
Simulation
One of The Key Operations: Tomographic reconstructor
Compute the tomographic reconstructor matrix using covariancematrix between direction specific truth sensor and other sensors andthe inverse of measurements covariance matrix
R ′ = Ctm · C−1mm
Factorize and Solve for R ′ with Cmm, a 100k x 100k matrix, isextremely compute intensive
At the core of system operations (soft real-time, should be achievedin seconds to update the real-time box)
Also a critical component for the numerical simulation of the systembehavior (observation forecast for today’s design studies)
H. Ltaief Extreme Adaptive Optics 17 / 50
Simulation
AO simulation pipeline: observations forecast
From system parameters and turbulence evolution timeline to imagequality at the output of the system
H. Ltaief Extreme Adaptive Optics 18 / 50
Simulation
Global Workflow Chart
SystemParameters
TurbulenceParameters
matcov Cmm Ctm ToR
matcov Cmm Ctm
Ctt
Cee CvvBLAS BLAS
Inter-sample
R
ToR computation
Observing sequence
H. Ltaief Extreme Adaptive Optics 19 / 50
Simulation
Global Workflow Chart
SystemParameters
TurbulenceParameters
matcov Cmm Ctm ToR
matcov Cmm Ctm
Ctt
Cee CvvBLAS BLAS
Inter-sample
R
ToR computation
Observing sequence
H. Ltaief Extreme Adaptive Optics 20 / 50
Simulation
Global Workflow Chart
SystemParameters
TurbulenceParameters
matcov Cmm Ctm ToR
matcov Cmm Ctm
Ctt
Cee CvvBLAS BLAS
Inter-sample
R
ToR computation
Observing sequence
H. Ltaief Extreme Adaptive Optics 21 / 50
Simulation
Observing sequence (close to real-time)
ToR computation (outer loop)
A serie of Level 3 BLAS operations (inner loop): GEMM, SYR2K,SYMM, etc.
H. Ltaief Extreme Adaptive Optics 22 / 50
Results
Outline
1 The European Extremely Large Telescope
2 The MOSAIC Instrument
3 Numerical Simulation
4 Experimental Results
5 Moving Forward with Taskification
6 Summary and Future Work
H. Ltaief Extreme Adaptive Optics 23 / 50
Results
Environment Settings - Shaheen 2.0
Compute Node x 6174
H. Ltaief Extreme Adaptive Optics 24 / 50
Results
Environment Settings - Shaheen 2.0
Hardware and software description:
CPU system (22nm)
Cray XC40 ≈ 5Pflops/s LINPACKIntel(R) Xeon(R) CPU E5-2699 v3 @ 2.3GHz (32 Haswell cores)128 GB of DDR4 main memoryAries Interconnect
Software
GCC compiler suiteIntel MKL 2018.1.163
H. Ltaief Extreme Adaptive Optics 25 / 50
Results
E-ELT Application Apparatus
Diameter telescope: 39m
Number of measurements of the true sensor: 10240
Number of actuators: 5336
Number of science channels (i.e., #galaxies): 16, 32 and 64
Performance of the ToR computation
H. Ltaief Extreme Adaptive Optics 26 / 50
Results
ScaLAPACK 2D Block-Cyclic Data Distribution
Processes Grid Logical View (Matrix) Local View (CPUs)
H. Ltaief Extreme Adaptive Optics 27 / 50
Results
AO simulation pipeline: observations forecast
Guide star positions
arcs
ec
arcsec
100
200
300
0
-100
-200
-3000 100 200 300-100-200-300
0-150 300-300-300
-150
0
300
150
150
Short Exposure PSF
arcs
ec
arcsec
Alt
itud
e in
km
Turbulence profiles
Figure: Overview of the simulation input parameters and output short and longexposure images for a single science channel and during a full night.
H. Ltaief Extreme Adaptive Optics 28 / 50
Results
High Resolution Galaxy Maps
Goal is to produce image quality maps over the instrument’s full FoVdepending on turbulence conditions evolution over the night
H. Ltaief et. al, Real-Time Massively Distributed Multi-Object Adaptive OpticsSimulations for the European Extremely Large Telescope, IPDPS’18, Vancouver, BCCanada H. Ltaief Extreme Adaptive Optics 29 / 50
Results
AO simulation pipeline: observations forecast
2048
4096
8192
16384
32768
65536
32 64 128 256 512 1024 2048 4096 6144
GFlops/s
Nodes
WFS-14
WFS-10
WFS-7
(a) Strong scalability.
1024
2048
4096
8192
16384
32768
32 64 128 256 512 1024 2048 4096 6144
Tim
e (s
)
Nodes
WFS-14
WFS-10
WFS-7
(b) Time to solution.
Figure: Strong scaling and time to solution of the large scale MOAO simulationup to 6144 nodes (196, 608 cores) on Shaheen-2 system.
H. Ltaief et. al, Real-Time Massively Distributed Multi-Object Adaptive OpticsSimulations for the European Extremely Large Telescope, IPDPS’18, Vancouver, BCCanada
H. Ltaief Extreme Adaptive Optics 30 / 50
Results
Performance Bottlenecks
0
5000
10000
15000
20000
25000
32 64 128 256 512 1024 2048 4096 6144
Tim
e (s
)
Nodes
MatCov
TOR
CeeCvv
Figure: Time breakdown of WFS-14.
H. Ltaief et. al, Real-Time Massively Distributed Multi-Object Adaptive OpticsSimulations for the European Extremely Large Telescope, IPDPS’18, Vancouver, BCCanada
H. Ltaief Extreme Adaptive Optics 31 / 50
Moving Forward with Taskification
Outline
1 The European Extremely Large Telescope
2 The MOSAIC Instrument
3 Numerical Simulation
4 Experimental Results
5 Moving Forward with Taskification
6 Summary and Future Work
H. Ltaief Extreme Adaptive Optics 32 / 50
Moving Forward with Taskification
Global Workflow Chart
SystemParameters
TurbulenceParameters
matcov Cmm Ctm ToR
matcov Cmm Ctm
Ctt
Cee CvvBLAS BLAS
Inter-sample
R
ToR computation
Observing sequence
H. Ltaief Extreme Adaptive Optics 33 / 50
Moving Forward with Taskification
Blocked Algorithms: Fork-Join Paradigm
H. Ltaief Extreme Adaptive Optics 34 / 50
Moving Forward with Taskification
ScaLAPACK: Blocked Algorithms
Principles:
Panel-Update sequence
Transformations are blocked/accumulated within the PanelLevel-2 BLAS
Transformations applied at once on the trailing submatrixLevel-3 BLAS
Parallelism hidden inside the BLAS
Fork-join model
A broken model!
H. Ltaief Extreme Adaptive Optics 35 / 50
Moving Forward with Taskification
Tile Data Layout Format
LAPACK: column-major format PLASMA/CHAMELEON: tile format
H. Ltaief Extreme Adaptive Optics 36 / 50
Moving Forward with Taskification
PLASMA/CHAMELEON: Tile Algorithms
PLASMA =⇒ http://icl.cs.utk.edu/plasma/CHAMELEON =⇒ https://gitlab.inria.fr/solverstack/chameleon.git
Break the bulk synchronous programming model
Parallelism is brought to the fore
May require the redesign of linear algebra algorithms
Tile data layout translation
Remove unnecessary synchronization points between Panel-Updatesequences
DAG execution where nodes represent tasks and edges definedependencies between them
Default dynamic runtime system environment StarPU (but could useQuark, PaRSEC, OmpSs, OpenMP etc.)
H. Ltaief Extreme Adaptive Optics 37 / 50
Moving Forward with Taskification
StarPU Runtime System 101
Provides:=⇒ Task scheduling=⇒ Memory management=⇒ Out-of-core
Supports:=⇒ SMP/Multicore Processors(x86, PPC, . . . )=⇒ NVIDIA GPUs (e.g.,multi-GPU)=⇒ Hybrid architectures=⇒ Shared andDistributed-memory
H. Ltaief Extreme Adaptive Optics 38 / 50
Moving Forward with Taskification
StarPU Runtime System: User Productivity!
starpu_Insert_Task(&cl_dpotrf,
VALUE, &uplo, sizeof(char),
VALUE, &n, sizeof(int),
INOUT, Ahandle(k, k),
VALUE, &lda, sizeof(int),
OUTPUT, &info, sizeof(int)
CALLBACK, profiling?cl_dpotrf_callback:NULL, NULL,
0);
H. Ltaief Extreme Adaptive Optics 39 / 50
Moving Forward with Taskification
Global Workflow Chart
SystemParameters
TurbulenceParameters
matcov Cmm Ctm ToR
matcov Cmm Ctm
Ctt
Cee CvvBLAS BLAS
Inter-sample
R
ToR computation
Observing sequence
H. Ltaief Extreme Adaptive Optics 40 / 50
Moving Forward with Taskification
Directed Acyclic Graph for MOAO
1:14 matcov
1
1
11
2
2
22
3
3
33
2:5
matcov
1
1
11
2
2
22
3
3
33
matcov
1
1
11
2
2
22
3
3
33
matcov
1
1
11
2
2
22
3
3
33
matcov
1
matcov
1
matcov
1
matcov
1
1
matcov
matcov
1
1
matcov
matcov
1
1
matcov
matcov
1
1
matcov
1
1
3:1
4:2
5:2
16:2
1
1 1
1
7:2
1 matcov8:4
1matcov9:7
matcov10:12 11 11
1
1
2
1
1
11
1
1
2
1
1 11
11
2
1
1 11
11
2
11:9
matcov
matcov
matcov
matcov matcov matcov
12:10
13:10
14:7
matcovmatcovmatcovmatcov
matcov
2
matcov
matcov
15:3
16:9
17:13
18:16
19:9
2
2
2
matcov
2
2
matcovmatcov
2
2
matcov
2
2
2
2
2 2
2
2 matcov
2matcov
matcov 22 22
2
2
3
2
2
22
2
2
3
2
2 22
2
3
2
2 22
22
3
matcov
matcov
matcov
20:10
21:10
22:7
matcovmatcovmatcovmatcov
matcov
3
matcov
matcov
23:3
24:9
25:12
26:14
27:7
3
3
3
matcov
3
3
matcovmatcov
3
3
matcov
3
3
3
3
3 3
3
3 matcov
3matcov
matcov 33 33
3
3
3
3
33
3
3
3
3 33
3
3
3 33
33
28:4
29:4
30:3
31:1
32:4
33:6
34:4
Figure: DAG representation of the overall MOAO framework execution for thecalculation of three ToR (i.e., three iterations of the outer loop) and a singleobserving sequence (i.e., one iteration in the inner loop) on a two-by-two tilematrix. H. Ltaief Extreme Adaptive Optics 41 / 50
Moving Forward with Taskification
Directed Acyclic Graph for MOAO
Figure: DAG representation of the overall MOAO framework execution for thecalculation of three ToR (i.e., three iterations of the outer loop) and a singleobserving sequence (i.e., one iteration in the inner loop) on a two-by-two tilematrix. H. Ltaief Extreme Adaptive Optics 42 / 50
Moving Forward with Taskification
Zooming in...
H. Ltaief Extreme Adaptive Optics 43 / 50
Moving Forward with Taskification
MOAO Software Release – https://github.com/ecrc/moao
A collaboration of With support from Sponsored by
CentrederechercheBORDEAUX–SUD-OUEST
Place your text here A HIGH PEFORMANCE MULTI-OBJECT ADAPTIVE OPTICS FRAMEWORK FOR GROUND-BASED ASTRONOMY
The Multi-Object Adaptive Optics (MOAO) framework provides a comprehensive testbed for high performance computational astronomy. In particular, the European Extremely Large Telescope (E-ELT) is one of today’s most challenging projects in ground-based astronomy and will make use of a MOAO instrument based on turbulence tomography. The MOAO framework uses a novel compute-intensive pseudo-analytical approach to achieve close to real-time data processing on manycore architectures. The scientific goal of the MOAO simulation package is to dimension future E-ELT instruments and to assess the qualitative performance of tomographic reconstruction of the atmospheric turbulence on real datasets.
DOWNLOAD THE SOFTWARE AT h6p://github.com/ecrc/moao
THE MULTI-OBJECT ADAPTIVE OPTICS TECHNIQUE
Single conjugate AO concept Open-Loop tomography concept Observing the GOODS South cosmological field with MOAO
MOAO 0.1.0 • Tomographic Reconstructor Computation • Dimensioning of Future Instruments • Real Datasets • Single and Double Precisions • Shared-Memory Systems • Task-based Programming Models • Dynamic Runtime Systems • Hardware Accelerators
CURRENT RESEARCH • Distributed-Memory Systems • Hierarchical Matrix Compression • Machine Learning for Atmospheric Turbulence • High Resolution Galaxy Map Generation • Extend to other ground-based telescope projects
PERFORMANCE RESULTS TOMOGRAPHIC RECONSTRUCTOR COMPUTATION – DOUBLE PRECISION
High res. map of the quality of turbulence compensation obtained with MOAO on a cosmological field
THE PSEUDO-ANALYTICAL APPROACH
SystemParameters
TurbulenceParameters
matcov Cmm Ctm ToR
matcov Cmm Ctm
Ctt
Cee CvvBLAS BLAS
Inter-sample
R
ToR computation
Observing sequence
• Compute the tomographic error: Cee = Ctt - Ctm RT – R Ctm
T + R Cmm RT • Compute the equivalent phase map:
Cvv = D Cee DT • Generate the point spread function image
Two-socket 18-core Intel HSW – 64-core Intel KNL – 8 NVIDIA GPU P100s (DGX-1)
• Solve for the tomographic reconstructor R: R x Cmm = Ctm
This is one tomographic reconstructor every 25
seconds!
0
5
10
15
20
25
30
35
40
45
10000 20000 30000 40000 50000 60000 70000 80000 90000 100000110000
TFlo
ps/s
matrix size
DGX-1 peakDGX-1 perf
KNL perfHaswell perf
0
100
200
300
400
500
600
700
10000 20000 30000 40000 50000 60000 70000 80000 90000 100000110000
time(
s)
matrix size
DGX-1KNL
Haswell
H. Ltaief Extreme Adaptive Optics 44 / 50
Summary and Future Work
Outline
1 The European Extremely Large Telescope
2 The MOSAIC Instrument
3 Numerical Simulation
4 Experimental Results
5 Moving Forward with Taskification
6 Summary and Future Work
H. Ltaief Extreme Adaptive Optics 45 / 50
Summary and Future Work
Summary
Adaptive optics require compute-intensive hardware
Pipelining computational stages
Efficient Task-based programming model
MOAO effort for standardization (w/ O. Guyon from Subarutelescope)
Machine learning for the control matrix (e.g., maximum likelihood)
H. Ltaief Extreme Adaptive Optics 46 / 50
Summary and Future Work
Exciting Time for Astronomy at KAUST/ECRC!
Supporting two major worldwide ground-based astronomy efforts
H. Ltaief Extreme Adaptive Optics 47 / 50
Summary and Future Work
Future Work
we need more (smarter) DLA (echoing A. Gara)
We are not cheating by doing approximation! (echoing A. Heinecke)
H. Ltaief Extreme Adaptive Optics 48 / 50
Summary and Future Work
The Hourglass Revisited
@KAUST_ECRC
https://www.facebook.com/ecrckaust
H. Ltaief Extreme Adaptive Optics 49 / 50
Summary and Future Work
Questions?
H. Ltaief Extreme Adaptive Optics 50 / 50