![Page 1: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/1.jpg)
THE SKA IMAGING AND CALIBRATION CHALLENGE
Bojan Nikolic
SKA Science Data Processor Project Engineer
Principal Research Associate
Astrophysics Group, Cavendish Laboratory
University of Cambridge
Delivering SKA Science:
![Page 2: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/2.jpg)
SKA Context Diagram
SKA1 Low: Low Frequency Aperture Array
SKA1 Mid: Dish Antennas with Single-Pixel feeds
LFAA Correlator/Beam Former
SKA1 Mid Correlator/Beam Former
Science Data Processor Implementation (Australia)
Science Data Processor Implementation (South Africa)
Pulsar Search Processor (Australia)
Pulsar Search Processor (South Africa)
Monitor and Control
SKA Regional Centres
These are off-site! (In Perth & Cape Town)
![Page 3: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/3.jpg)
Imaging and Calibration Context
![Page 4: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/4.jpg)
IMAGING AND CALIBRATION PROCESSING IS A MAJOR PART OF THE SKA BY DESIGN
![Page 5: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/5.jpg)
Large “D” – vs – Large “N”
GBT 100-m diameter telescope
SKA LFAA prototype array
No 1 aim: collect as many photons as possible -> high sensitivity
No 2 aim: collect radiation from different directions -> high survey speed
No 3 aim: maximum separation of collectors -> high angular resolution
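The trade-off between a single large dish and many small, widely separated collectors can be illustrated with the standard diffraction estimate (angle ≈ wavelength / aperture). The sketch below is only an order-of-magnitude illustration: the 21 cm wavelength, the 15 m small-dish diameter and the 150 km maximum separation are assumed example values, not figures taken from these slides.

```python
import math


def diffraction_angle_deg(wavelength_m: float, aperture_m: float) -> float:
    """Approximate diffraction-limited angle (wavelength / aperture), in degrees."""
    return math.degrees(wavelength_m / aperture_m)


wavelength = 0.21        # metres; assumed example (21 cm line)
big_dish_d = 100.0       # metres; GBT diameter as quoted on the slide
small_dish_d = 15.0      # metres; assumed small-collector diameter
max_separation = 150e3   # metres; assumed maximum collector separation

# Field of view of one collector scales as wavelength / D (aim No 2).
fov_big = diffraction_angle_deg(wavelength, big_dish_d)
fov_small = diffraction_angle_deg(wavelength, small_dish_d)

# Angular resolution of the array scales as wavelength / B (aim No 3).
resolution_array = diffraction_angle_deg(wavelength, max_separation)

print(f"Field of view, 100 m dish : {fov_big:.3f} deg")
print(f"Field of view, small dish : {fov_small:.3f} deg")
print(f"Resolution, array         : {resolution_array * 3600:.2f} arcsec")
```

Under these assumptions the small collector sees a wider patch of sky per pointing, while the array resolves much finer detail than the single dish; sensitivity (aim No 1) depends on total collecting area and is not modelled here.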
![Page 6: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/6.jpg)
Factors driving the SKA challenge
Very high data rate in
• Unfeasible to permanently store
• Unfeasible to move off-continent
• Expensive to store even temporarily
High computational requirements to process
• Capital and operational expense
• Hardware/software failures that are rare for individual computers become frequent at this scale
Optimal processing strategy, algorithms and parameters unknown:
• Will not be known until the telescope begins operations
• Will depend in part on science goals and demands of individual projects
![Page 8: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/8.jpg)
High data rate for imaging
Direct consequence of:
• Large field of view, fast survey speed (small D)
• High angular resolution (long B)
• High continuum sensitivity (large bandwidth)
• Good sampling (large N)
• Mechanical engineering constraints (SKA1-mid)
-> 0.5 TB/s for each of the telescopes
100000x ALMA sustained data rate
10000x ALMA maximum data rate
1000x JVLA maximum data rate
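As a rough cross-check of these ratios, the following sketch divides the quoted 0.5 TB/s by each factor to show the data rates they imply for the other instruments. The implied rates are simple arithmetic on the slide's numbers, not independently sourced values, and decimal units (1 TB = 10^12 bytes) are assumed.

```python
SKA_RATE_BYTES_PER_S = 0.5e12  # 0.5 TB/s per telescope, from the slide

# Ratios quoted on the slide (SKA rate divided by the other rate).
ratios = {
    "ALMA sustained": 100_000,
    "ALMA maximum": 10_000,
    "JVLA maximum": 1_000,
}

# Rates implied by the ratios alone.
implied_rates = {name: SKA_RATE_BYTES_PER_S / factor for name, factor in ratios.items()}

for name, rate in implied_rates.items():
    print(f"{name}: ~{rate / 1e6:.0f} MB/s implied")

# Volume accumulated by one telescope in a day at the quoted rate.
bytes_per_day = SKA_RATE_BYTES_PER_S * 24 * 3600
print(f"One telescope, one day: ~{bytes_per_day / 1e15:.1f} PB")
```

The daily volume of tens of petabytes per telescope is what makes permanent storage and off-continent transfer impractical.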
![Page 9: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/9.jpg)
SDP Design Phase approach
• Receive, temporarily store incoming data
• Fairly demanding network but in principle
can be done today
• Key challenge is:
– Where to put the data, how to organise it
– How to process the data
![Page 11: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/11.jpg)
Imaging and calibration algorithmic requirements
• Time- and frequency-variable corruption of the incoming signal:
– Atmospheric, mechanical & electronic causes
– Requires iteratively solving for the Sky and the corrupting effects – “Self Calibration”
• Irregular, non-uniform sampling of measurements
– Requires (typically iterative) de-convolution – CLEAN, Wavelets, compressed sensing, etc.
• Non-planar distribution of measurements
– Approximate correction to the plane required if 2D FFTs are to be used
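To make the iterative de-convolution step concrete, here is a minimal one-dimensional sketch of the basic Högbom form of CLEAN in plain Python. It is a toy illustration under simplifying assumptions (a short, known point-spread function with unit peak, and point sources only); the multi-frequency, multi-scale variants used in practice are considerably more involved. The names `hogbom_clean` and `convolve` are chosen for this example.

```python
def convolve(sky, psf):
    """Build a 'dirty' image by placing a scaled PSF copy at each source."""
    half = len(psf) // 2
    out = [0.0] * len(sky)
    for pos, flux in enumerate(sky):
        for k, p in enumerate(psf):
            idx = pos + k - half
            if 0 <= idx < len(out):
                out[idx] += flux * p
    return out


def hogbom_clean(dirty, psf, gain=0.2, threshold=1e-6, max_iter=500):
    """Repeatedly subtract a fraction of the PSF at the residual peak."""
    residual = list(dirty)
    model = [0.0] * len(dirty)
    half = len(psf) // 2
    for _ in range(max_iter):
        peak_pos = max(range(len(residual)), key=lambda i: abs(residual[i]))
        peak_val = residual[peak_pos]
        if abs(peak_val) < threshold:
            break  # residual is consistent with "nothing left to clean"
        component = gain * peak_val
        model[peak_pos] += component
        for k, p in enumerate(psf):
            idx = peak_pos + k - half
            if 0 <= idx < len(residual):
                residual[idx] -= component * p
    return model, residual


# Toy PSF with sidelobes (peak normalised to 1) and two point sources.
psf = [-0.1, 0.3, 1.0, 0.3, -0.1]
sky = [0.0] * 16
sky[4] = 2.0
sky[11] = 1.0

dirty = convolve(sky, psf)
model, residual = hogbom_clean(dirty, psf)
print("recovered fluxes:", round(model[4], 4), round(model[11], 4))
```

The loop gain below 1 is the usual design choice: each pass removes only part of the peak, which makes the procedure more tolerant of overlapping sidelobes at the cost of more iterations.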
![Page 12: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/12.jpg)
Measurements are imperfect – corrupted by slowly changing mechanical, electrical & atmospheric effects
Panels: Uncalibrated; “Offset” Calibration
Source: Rick Perley & Oleg Smirnov: “High Dynamic Range Imaging”, www.astron.nl/gerfeest/presentations/perley.pdf
![Page 13: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/13.jpg)
Iterative & joint solving for the image of the Sky & Calibration
Panels: “Self-Calibration”; “closure-error” calibration
Source: Rick Perley & Oleg Smirnov: “High Dynamic Range Imaging”, www.astron.nl/gerfeest/presentations/perley.pdf
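The calibration half of this loop can be sketched as an alternating least-squares solve for per-antenna complex gains, given a current sky model. The example below is a toy under stated assumptions: noise-free data, a unit point source at the phase centre as the model, and a damped update (averaging old and new estimates) in the style of alternating-direction gain solvers. It is not presented as the method used by the SDP, and in full self-calibration the sky model would itself be re-imaged between gain solves.

```python
import cmath


def solve_gains(vis, model, n_ant, n_iter=200):
    """Estimate gains g so that vis[i][j] ~ g[i] * conj(g[j]) * model[i][j]."""
    gains = [1.0 + 0.0j] * n_ant
    for _ in range(n_iter):
        updated = []
        for i in range(n_ant):
            num = 0.0 + 0.0j
            den = 0.0
            for j in range(n_ant):
                if j == i:
                    continue  # autocorrelations are not used
                z = gains[j].conjugate() * model[i][j]
                num += vis[i][j] * z.conjugate()
                den += abs(z) ** 2
            updated.append(num / den)  # least-squares g[i] with other gains fixed
        # Averaging old and new estimates damps oscillation between iterations.
        gains = [0.5 * (old + new) for old, new in zip(gains, updated)]
    return gains


# Hypothetical "true" gains (amplitude, phase in radians) for five antennas.
true_gains = [cmath.rect(a, p) for a, p in
              [(1.0, 0.0), (0.9, 0.3), (1.1, -0.2), (0.8, 0.5), (1.2, -0.4)]]
n_ant = len(true_gains)

# Assumed sky model: unit point source at the phase centre.
model = [[1.0 + 0.0j] * n_ant for _ in range(n_ant)]
vis = [[true_gains[i] * true_gains[j].conjugate() * model[i][j]
        for j in range(n_ant)] for i in range(n_ant)]

gains = solve_gains(vis, model, n_ant)
worst = max(abs(vis[i][j] - gains[i] * gains[j].conjugate() * model[i][j])
            for i in range(n_ant) for j in range(n_ant) if i != j)
print("largest residual:", worst)
```

The solved gains match the true ones only up to a common phase factor, since visibilities depend on phase differences between antennas; amplitudes are recovered directly.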
![Page 14: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/14.jpg)
SKA/SDP Approach
SKA/SDP Design:
• To support current best-practice algorithms:
– Multi-frequency multi-scale CLEAN
– Self-calibration
– Direction dependent correction using “A” terms
• Flexibility to update and improve in future
Important role for ongoing current research and future optimisation and commissioning
![Page 15: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/15.jpg)
SKA/SDP Approach
SKA/SDP Design:
• To support current best-practice algorithms:
– Multi-frequency multi-scale CLEAN
– Self-calibration
– Direction dependent correction using “A” terms
• Flexibility to update and improve in future
Important role for ongoing current research and future optimisation and commissioning
Challenge: Can these algorithms be expressed scalably? Need >1000x improvement from current proven scales
Challenge: too much flexibility – nothing ever works
![Page 17: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/17.jpg)
Illustrative Computing Requirements
• ~100 PetaFLOPS total achieved
• ~200 PetaByte/s aggregate BW to fast working memory
• ~50 PetaByte Storage
• ~1 TeraByte/s sustained write to storage
• ~10 TeraByte/s sustained read from storage
– ≈ 10000 floating-point operations per byte read from storage
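The ratios between these figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers on the slide and assumes decimal prefixes (peta = 10^15, tera = 10^12); the variable names are chosen for this example.

```python
PETA = 1e15
TERA = 1e12

compute_flops = 100 * PETA       # ~100 PetaFLOPS total achieved
memory_bw = 200 * PETA           # ~200 PetaByte/s to fast working memory
storage_capacity = 50 * PETA     # ~50 PetaByte storage
storage_write_bw = 1 * TERA      # ~1 TeraByte/s sustained write
storage_read_bw = 10 * TERA      # ~10 TeraByte/s sustained read

# Operations available per byte moved at each level of the hierarchy.
flop_per_storage_byte = compute_flops / storage_read_bw   # 10,000
flop_per_memory_byte = compute_flops / memory_bw          # 0.5

# Time to stream the whole store once at the sustained rates.
full_read_hours = storage_capacity / storage_read_bw / 3600
full_write_hours = storage_capacity / storage_write_bw / 3600

print(f"FLOP per byte read from storage: {flop_per_storage_byte:.0f}")
print(f"FLOP per byte from working memory: {flop_per_memory_byte:.1f}")
print(f"Full read of storage:  ~{full_read_hours:.1f} h")
print(f"Full write of storage: ~{full_write_hours:.1f} h")
```

The contrast between roughly 10,000 operations per byte from storage and about half an operation per byte from working memory shows where the pressure lies: each byte fetched from storage must be reused many times in fast memory.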
![Page 18: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/18.jpg)
Illustrative Computing Requirements
• ~100 PetaFLOPS total achieved
• ~200 PetaByte/s aggregate BW to fast working memory
• ~50 PetaByte Storage
• ~1 TeraByte/s sustained write to storage
• ~10 TeraByte/s sustained read from storage
– ≈ 10000 floating-point operations per byte read from storage
Likely to be achievable ~2020
One of the big challenges
Also likely to be achievable well ahead of SDP roll out
One of the big challenges
![Page 19: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/19.jpg)
Parametric Model Example
Computational cost of a transient survey as a function of the integration time of each pointing and the maximum baseline length that is used
![Page 20: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/20.jpg)
Computational requirements breakdown
![Page 21: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/21.jpg)
SDP Design-Phase Approach
• Document the computational requirements, their relationship to the SDP requirements
• Document the roadmap for likely evolution of computing systems
• Ensure the SDP software architecture can make reasonably efficient use of likely future computing systems
• Ensure the maintenance of software is tractable, especially across changes in future computing system architectures
• Prototyping to provide evidential support to the above, demonstrate appropriate technical readiness of potential solutions
![Page 22: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/22.jpg)
Factors driving the SKA challenge
Very high data rate in
• Unfeasible to permanently store
• Expensive to store even temporarily
High computational requirements to process
• Capital and operational expense
• Hardware/software failures that are rare for individual computers become frequent at this scale
Optimal processing strategy, algorithms and parameters unknown:
• Will not be known until the telescope begins operations
• Will depend in part on science goals and demands of individual projects
High Degree of Parallelism, automatic unsupervised pipelines
Good Models, Simulations essential, Early science planning
Critical learning period during commissioning and early operations
SDP
![Page 24: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/24.jpg)
SDP Top-level Components & Key Performance Requirements -- SKA Phase 1
Science Data Processor
External interfaces (diagram labels): Telescope Manager; CSP; Regional Centres & Astronomers
Input data rate (diagram label): 1 TeraByte/s
SDP Local Monitor & Control
Data Processor
• High Performance: ~100 PetaFLOPS
• Data Intensive: ~100 PetaBytes/observation (job)
• Partially real-time: ~10s response time
• Partially iterative: ~10 iterations/job (~3 hour)
Data Preservation
• High Volume & High Growth Rate: ~100 PetaByte/year
• Infrequent Access: ~few times/year max
Delivery System
• Data Distribution: ~100 PetaByte/year from Cape Town & Perth to rest of World
• Data Discovery: Visualisation of 100k by 100k by 100k voxel cubes
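A few of these requirements become easier to picture once converted into rates and sizes. The sketch below is back-of-envelope arithmetic only: the 4 bytes per voxel is an assumed value, the 10 TeraByte/s read rate is taken from the Illustrative Computing Requirements slide, and whether the ~100 PetaBytes/job figure was actually derived from that rate is an assumption.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Data distribution: ~100 PetaByte/year, expressed as an average rate.
yearly_bytes = 100e15
avg_rate_bytes_per_s = yearly_bytes / SECONDS_PER_YEAR
avg_rate_gbit_per_s = avg_rate_bytes_per_s * 8 / 1e9

# Data discovery: one 100k x 100k x 100k voxel cube.
voxels = 100_000 ** 3
BYTES_PER_VOXEL = 4  # assumption: single-precision values
cube_bytes = voxels * BYTES_PER_VOXEL

# Data intensity: bytes read in a ~3 hour job at ~10 TeraByte/s.
job_bytes_read = 10e12 * 3 * 3600

print(f"Average distribution rate: ~{avg_rate_gbit_per_s:.0f} Gbit/s")
print(f"One voxel cube: ~{cube_bytes / 1e15:.0f} PB at {BYTES_PER_VOXEL} B/voxel")
print(f"Read in a 3 h job at 10 TB/s: ~{job_bytes_read / 1e15:.0f} PB")
```

Under these assumptions a single full-resolution cube is petabyte-scale, which is why visualisation is listed as a requirement in its own right, and the volume read during one job is of the same order as the quoted ~100 PetaBytes/observation.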
![Page 25: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/25.jpg)
SDP Top-level Components & Key Performance Requirements -- SKA Phase 1
Science Data Processor
External interfaces (diagram labels): Telescope Manager; CSP; Regional Centres & Astronomers
Input data rate (diagram label): 1 TeraByte/s
SDP Local Monitor & Control
Data Processor
• High Performance: ~100 PetaFLOPS
• Data Intensive: ~100 PetaBytes/observation (job)
• Partially real-time: ~10s response time
• Partially iterative: ~10 iterations/job (~3 hour)
Data Preservation
• High Volume & High Growth Rate: ~100 PetaByte/year
• Infrequent Access: ~few times/year max
Delivery System
• Data Distribution: ~100 PetaByte/year from Cape Town & Perth to rest of World
• Data Discovery: Visualisation of 100k by 100k by 100k voxel cubes
Goal is to extract information from data and then discard the data
![Page 26: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/26.jpg)
Programming model
• Hybrid programming model:
– Dataflow at coarse-grained level:
  • About 1 million tasks/s max over the whole processor (-> ~10s – 100s millisecond tasks), consuming ~100 MegaByte each
  • Static scheduling at coarsest level (down to “data-island”)
    – Static partitioning of the large-volume input data
  • Dynamic scheduling within data island:
    – Failure recovery, dynamic load-balancing
  • Data driven (all data will be used)
– Shared memory model at fine-grained level, e.g. threads/OpenMP/SIMT-like
  • ~100s active threads per shared memory space
  • Allows manageable working memory size, computational efficiency
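The split between static and dynamic scheduling can be sketched in a few lines. The example below is a toy illustration, not the SDP design: input chunks are assigned to "data islands" by a fixed rule (static partitioning), and inside each island a thread pool hands chunks to whichever worker is free (dynamic load-balancing), re-running a task if it fails (failure recovery). All function names and the simulated failure are hypothetical, and the islands are processed one after another here for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_statically(chunks, n_islands):
    """Static scheduling: chunk k is always assigned to island k % n_islands."""
    islands = [[] for _ in range(n_islands)]
    for k, chunk in enumerate(chunks):
        islands[k % n_islands].append(chunk)
    return islands


def run_with_retry(task, chunk, max_attempts=3):
    """Failure recovery: re-run a failed task up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(chunk, attempt)
        except RuntimeError:
            if attempt == max_attempts:
                raise


def process_island(task, island_chunks, n_workers=4):
    """Dynamic scheduling inside an island: free workers take the next chunk."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(lambda c: run_with_retry(task, c), island_chunks))


def flaky_sum(chunk, attempt):
    """Hypothetical task that fails on its first attempt for some chunks."""
    if attempt == 1 and chunk[0] % 3 == 0:
        raise RuntimeError("simulated node failure")
    return sum(chunk)


# Eight input chunks of four values each, split over two islands.
chunks = [list(range(start, start + 4)) for start in range(0, 32, 4)]
islands = partition_statically(chunks, n_islands=2)
results = [process_island(flaky_sum, island) for island in islands]
total = sum(sum(island_result) for island_result in results)
print("total:", total)
```

Fixing the placement of the large input data in advance avoids moving it between islands, while keeping scheduling dynamic inside an island limits the impact of a slow or failed task to that island.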
![Page 27: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/27.jpg)
Challenge: Unsupervised pipelines and processing
• Extremely challenging to deliver early in operations
• Very challenging to deliver for a diverse set of science programmes and goals
• Unsatisfactory performance will lead to low observatory efficiencies
![Page 28: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/28.jpg)
Factors driving the SKA challenge
Very high data rate in
• Unfeasible to permanently store
• Expensive to store even temporarily
High computational requirements to process
• Capital and operational expense
• Hardware/software failures that are rare for individual computers become frequent at this scale
Optimal processing strategy, algorithms and parameters unknown:
• Will not be known until the telescope begins operations
• Will depend in part on science goals and demands of individual projects
High Degree of Parallelism, automatic unsupervised pipelines
Good Models, Simulations essential, Early Science Planning
Critical learning period during commissioning and early operations
SDP
Design Phase
SDP
![Page 29: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/29.jpg)
Long term commissioning and optimisation
Example from ALMA
“First Results from High Angular Resolution ALMA Observations Toward the HL Tau Region”, ALMA Partnership, 2015ApJ...808L...3A
Result of collaboration of observatory staff, institutes and universities to characterise and commission ALMA long baselines
![Page 30: THE SKA IMAGING AND CALIBRATION CHALLENGE · Requirements --SKA Phase 1 SDP Local Monitor & Control High Performance •~100 PetaFLOPS Data Intensive •~100 PetaBytes/observation](https://reader035.vdocuments.site/reader035/viewer/2022071101/5fdae47f1a782210b720e39e/html5/thumbnails/30.jpg)
END