The SKA Science Data Processor and Regional Centres
Paul Alexander
Leader of the Science Data Processor Consortium
SKA: A Leading Big Data Challenge for the 2020 Decade

Signal path: Antennas → Digital Signal Processing (DSP) → High Performance Computing (HPC) Facility

Transfer, antennas to DSP (over 10s to 1000s of km):
• 2020: 5,000 PBytes/day
• 2030: 100,000 PBytes/day

To process in HPC (over 10s to 1000s of km):
• 2020: 50 PBytes/day
• 2030: 10,000 PBytes/day

HPC processing:
• 2023: 250 PFlop
• 2033+: 25 EFlop
Standard interferometer

[Diagram: two antennas separated by baseline B observing a source in direction s; signal chain: detect & amplify → digitise & delay → correlate → process (calibrate, grid, FFT) → integrate → sky image]

• Astronomical signal (EM wave); visibility:
  V(B) = ⟨E₁ E₂*⟩ = ∫ I(s) exp(iω B·s/c) dΩ
• Resolution determined by the maximum baseline: θ_max ≈ λ / B_max
• Field of view (FoV) determined by the size of each dish: θ_dish ≈ λ / D
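The two scaling relations above can be checked numerically. The observing wavelength, maximum baseline and dish diameter below are illustrative assumptions for the sketch, not SKA design values:

```python
import math

def resolution_rad(wavelength_m, baseline_m):
    """Angular resolution, theta_max ~ lambda / B_max (radians)."""
    return wavelength_m / baseline_m

def fov_rad(wavelength_m, dish_diameter_m):
    """Field of view, theta_dish ~ lambda / D (radians)."""
    return wavelength_m / dish_diameter_m

# Assumed example: 21 cm observation, 150 km maximum baseline, 15 m dishes.
wavelength = 0.21   # metres
b_max = 150e3       # metres
dish = 15.0         # metres

arcsec = math.degrees(resolution_rad(wavelength, b_max)) * 3600
fov_deg = math.degrees(fov_rad(wavelength, dish))
print(f"resolution ~ {arcsec:.2f} arcsec, FoV ~ {fov_deg:.2f} deg")
```

Note the mismatch the SDP must bridge: the resolution is set by the longest baseline, while the field of view is set by the individual dish, so images span a huge range of angular scales.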
Image formation is computationally intensive
• Images are formed by an inverse algorithm similar to that used in MRI
• Typical image sizes up to 30k × 30k × 64k voxels
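Those maximum dimensions translate into a large data product. The byte size per voxel below is an assumption (single-precision floats), not a figure from the slides:

```python
# Rough size of a full-size image cube at the quoted maximum
# dimensions, assuming 4 bytes per voxel (single precision).
nx, ny, nchan = 30_000, 30_000, 64_000
bytes_per_voxel = 4

size_bytes = nx * ny * nchan * bytes_per_voxel
print(f"~{size_bytes / 1e12:.0f} TB per full-size cube")
```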
Work Flows: Imaging Processing Model

Correlator → RFI excision and phase rotation → UV data store

Major cycle (running on the UV and imaging processors):
• Subtract the current sky model from the visibilities using the current calibration model
• Grid the UV data (e.g. W-projection)
• Image the gridded data
• Deconvolve the imaged data (minor cycle)
• Solve for the telescope and image-plane calibration model
• Update the current sky model
• Update the calibration model

Output: astronomical-quality data
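The major/minor-cycle structure above can be sketched in a few lines. Everything here is an illustrative stand-in (the "dirty image" is just the residual vector, gridding/FFT and the calibration solve are omitted), not the real SDP pipeline:

```python
def clean_imaging(vis, n_major=3, n_minor=10, gain=0.5):
    """Toy major/minor cycle: CLEAN-style flux transfer into a sky model."""
    sky_model = [0.0] * len(vis)
    for _ in range(n_major):
        # Major cycle: subtract the current sky model from the visibilities.
        residual = [v - m for v, m in zip(vis, sky_model)]
        # Stand-in for grid + FFT: use the residual directly as the dirty image.
        dirty = list(residual)
        # Minor cycle: repeatedly move peak flux into the sky model.
        for _ in range(n_minor):
            i = max(range(len(dirty)), key=lambda k: abs(dirty[k]))
            sky_model[i] += gain * dirty[i]
            dirty[i] *= 1 - gain
        # (The real pipeline would re-solve the calibration model here.)
    return sky_model

model = clean_imaging([0.0, 3.0, 0.0, 1.0])
print([round(m, 3) for m in model])
```

After a few major cycles the model converges on the input "sources" at positions 1 and 3, which is the essential behaviour of the loop in the diagram.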
One SDP, Two Telescopes

Telescope   Ingest (GB/s)
SKA1_Low    500
SKA1_Mid    1000

In total we eventually need to deploy a system providing close to 0.25 EFlop of processing.
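The quoted ingest rates imply substantial daily data volumes, which follow directly from the table:

```python
# Daily ingest implied by the quoted rates (GB/s taken from the slide).
seconds_per_day = 86_400
for scope, gb_per_s in [("SKA1_Low", 500), ("SKA1_Mid", 1000)]:
    pb_per_day = gb_per_s * seconds_per_day / 1e6  # GB -> PB
    print(f"{scope}: {pb_per_day:.1f} PB/day")
```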
Scope of the SDP
The SDP System

Multiple instances (1…N) of the "process data" component, some real-time, some batch.
Illustrative Computing Requirements
• ~250 PetaFLOP system
• ~200 PetaByte/s aggregate bandwidth to fast working memory
• ~80 PetaByte storage
• ~1 TeraByte/s sustained write to storage
• ~10 TeraByte/s sustained read from storage
• ~10,000 FLOPs per byte read from storage
• ~2 Bytes/FLOP memory bandwidth
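One quick consistency check on these illustrative figures: how long the storage buffer can absorb the sustained write stream (both numbers are from the list above):

```python
# ~80 PB of storage filling at a ~1 TB/s sustained write rate.
storage_bytes = 80e15   # ~80 PetaByte
write_bps = 1e12        # ~1 TeraByte/s

fill_seconds = storage_bytes / write_bps
print(f"buffer fills in ~{fill_seconds / 3600:.0f} hours")
```

So the buffer holds roughly a day of sustained ingest, which is why data placement and timely batch scheduling matter so much.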
SDP High-Level Architecture
What does SDP do?

SDP is coupled to the rest of the telescope. We try to make the coupling as loose as possible, but there are some time-critical aspects.

For each observation:
• Controlled by a scheduling block
• Run a real-time (RT) process to ingest data and feed back information
• Schedule batch processing for later
• Must manage resources so that SDP keeps up on a timescale of approximately a week
Real-time activity
Batch activity
SDP Architecture and Architectural Drivers
• How do we achieve this functionality?
• Architecture to deliver at scale
• Some key considerations:
  – SDP is required to evolve over time to support the changing work flows needed to deliver the science
  – Different science experiments have different computational costs and resource demands
    • Use this for load balancing
    • Expect evolution of the system towards more demanding experiments during the operational lifetime
Data Driven Architecture
• Smaller FFT size at the cost of data duplication
• Further data parallelism from spatial indexing (UVW-space)
• Used to balance memory bandwidth per node
• Some overlap regions needed on the target grids
How do we get performance and manage the data volume?

Approach: build on Big Data concepts. A "data driven", graph-based processing approach is receiving a lot of attention; it is inspired by Hadoop, but adapted to our more complex data flow.
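The essence of graph-based processing is that tasks run in dependency order, driven by data availability. A minimal sketch using the standard library's topological sorter; the task names and toy pipeline are illustrative only:

```python
from graphlib import TopologicalSorter

def run_graph(graph, tasks):
    """graph: {task: set of prerequisite tasks}; tasks: {task: callable}.
    Runs each task once its inputs exist, passing predecessor results in."""
    results = {}
    for name in TopologicalSorter(graph).static_order():
        inputs = [results[dep] for dep in sorted(graph.get(name, ()))]
        results[name] = tasks[name](*inputs)
    return results

# Toy imaging-style pipeline as a dependency graph.
graph = {"ingest": set(), "calibrate": {"ingest"},
         "grid": {"calibrate"}, "image": {"grid"}}
tasks = {"ingest": lambda: [1.0, 2.0, 3.0],
         "calibrate": lambda vis: [v * 0.9 for v in vis],
         "grid": lambda vis: sum(vis),
         "image": lambda g: g / 3}

results = run_graph(graph, tasks)
print(results["image"])
```

A real execution engine adds what this sketch lacks: distribution across nodes, data locality, and failure recovery.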
Execution Engine: Hierarchical, Processing Level 1
• Cluster and data distribution controller
• Relatively low data rate and cadence of messaging
• Staging: aggregation of data products
• Processing Level 2: static data distribution exploiting the inherent parallelism
Execution Engine: Hierarchical, Processing Level 2
• Data island: shared file store across the data island
• Worker nodes: task-level parallelism to achieve scaling
• Process controller (duplicated for resilience)
• Cluster manager, e.g. Mesos-like
Data flow for an observation
Execution Framework
What do we need?
• Processing Level 1:
  – Custom framework to provide scale-out
• Processing Level 2:
  – Many similarities to Big Data frameworks
  – Need to support multiple execution frameworks:
    • Streaming processing: possibly Apache Storm
    • Work flows with relatively large task sizes, or where data-driven load balancing is straightforward: possibly a development of prototype code
    • "Big Data" workflows with load balancing and resilience: possibly something like Apache Spark
  – The framework manages tasks; tasks need to be pure functions
• Most tasks are domain specific, but need supporting infrastructure:
  – Processing libraries
  – Interfaces to the execution framework
  – Performance tuning on target hardware
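The "pure functions" requirement can be sketched with the standard library's executors: a pure task depends only on its arguments, so a framework is free to re-run, re-order or relocate it. The function and data names here are illustrative stand-ins, not SDP interfaces:

```python
from concurrent.futures import ThreadPoolExecutor

def calibrate_chunk(chunk, gain):
    """Pure task: output depends only on its arguments, with no shared
    state, so the framework may schedule it anywhere, any number of times."""
    return [v * gain for v in chunk]

# Data-parallel chunks dispatched to a pool, as a framework would.
chunks = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
with ThreadPoolExecutor() as pool:
    out = list(pool.map(calibrate_chunk, chunks, [0.5] * len(chunks)))
print(out)
```

Purity is what makes resilience cheap: a lost task is simply re-submitted with the same inputs.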
What do we need?
• Skills and work, e.g. for Spark:
  – Need to develop for high throughput:
    • High Throughput Data Analytics Framework (HTDAF)
    • New data models
    • Links to external processes
    • Memory management between the framework and processes
  – Likely to have significant applicability elsewhere
  – The shared file system needs to be very intelligent about data placement and management, or be tightly coupled with the HTDAF
  – Integration of in-memory file system / object store systems with the cluster file system / object store
  – Software development in data analytics, distributed processing systems, the JVM, …
Database services
• Tasks running within the processing framework need metadata and must communicate updates and results:
  – High-performance distributed database
  – Prototyping using Redis
  – Needs to interface to the processing requirements
  – Schemas
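The task-metadata pattern being prototyped can be sketched with a plain dict standing in for the distributed store; in the Redis prototype these operations would map onto hash commands (HSET/HGET). The key and field names are illustrative:

```python
# In-memory stand-in for a Redis-style hash store of task metadata.
store = {}

def hset(key, field, value):
    """Set one field of the hash at `key` (cf. Redis HSET)."""
    store.setdefault(key, {})[field] = value

def hget(key, field):
    """Read one field of the hash at `key` (cf. Redis HGET)."""
    return store.get(key, {}).get(field)

# A task publishing its status and a result summary.
hset("task:ingest:0017", "status", "done")
hset("task:ingest:0017", "n_visibilities", "1048576")
print(hget("task:ingest:0017", "status"))
```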
File / object store
• Multiple file/object stores, one per "data island":
  – Integration into the execution framework
  – Resource allocation integrated with the cluster management system
  – Complex resource management
  – Migration to, and the interface with, the long-term persistent store (Hierarchical Storage Manager) likely to be separate
Data lifecycle management
Manage the data lifecycle across the cluster and through time:
• Strong link to the overall processing and observing schedule
• Performance is key
• Determine initial data placements
• Manage data movement and migration
• Local high performance obtained via a light-weight layer (e.g. policies) on top of a file system
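A light-weight policy layer of the kind described can be as simple as a rule mapping a product's age and usage to a storage tier. The tier names and thresholds below are assumptions for the sketch, not SDP policy:

```python
def placement_tier(age_days, pinned=False):
    """Illustrative lifecycle policy: choose a storage tier for a data
    product from its age and whether an active workflow has pinned it."""
    if pinned:
        return "fast"          # needed now by a running workflow
    if age_days < 7:
        return "fast"          # within the ~week processing window
    if age_days < 90:
        return "bulk"          # still plausibly re-read
    return "preservation"      # migrate to the long-term store

print(placement_tier(2), placement_tier(30), placement_tier(400))
```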
Hardware Platform
Complex Network
Performance Prototype Platform

A significant (£400k) STFC/UK investment in a hardware prototype, currently being installed in Cambridge.

Example tests:
• automated deployment from containers (GitHub)
• infrastructure management and orchestration
• simulated streamed visibility data from the correlator to N ingest nodes
• basic calibration on ingest
• calibration and imaging with suitable modelling across 2 compute islands
• data products written to preservation (staging)
• basic quality-assurance metrics visible
• distributed database performance
• some automated built-in testing at various levels
• LMC interface prototyping (platform management and telemetry)
• Execution Framework: Big Data prototyping (Apache Spark, etc.)
• scheduling
Control layer
Complex functionality:
• Resilience required here
• Interface to the cluster manager for resource management
• Prototyping around adding functionality to OpenStack
• Built around various open-source products integrated into a broader framework
Control and system components
Delivering Data

Delivery System:
• User Interfaces
• Delivery System Service Manager
• Delivery System Interfaces Manager
• Tiered Data Delivery Manager

Science Data Catalogue and Prepare Science Product:
• This component manages the catalogue of data products and how they are assembled for distribution, including data model conversion if needed
Domain-specific components
Not discussed in detail here:
• Some performance tuning
• Interfacing
• I/O system?
Other system software

Compute System Software:
• Compute operating system software: the O/S will be based on a widely adopted community version of Linux, such as Scientific Linux (Fermilab et al.) or CentOS (CERN et al.)
• Middleware: a messaging layer based on TANGO's messaging layer and/or open-source messaging layers (such as ZeroMQ) is possible with effort
• Interface to TANGO: TANGO is the selected product for overall monitor and control; interfacing of other components to TANGO will be required, e.g. for monitor information from the cluster manager
• Middleware stack: based on widely adopted versions of community-supported software that will mature up to deployment, such as OpenStack. Effort will be required during the construction period to develop enhancements and SDP-specific customisation, and to maintain those enhancements as the hardware and software requirements evolve. Close collaboration and convergence with large contributors (such as CERN) in the open-source community could lead to reduced long-term maintenance effort.
Other activities
• Expect SDP to be delivered by a consortium with strong links to, and involvement from, SKAO and academic partners:
  – Consortium building and leadership
  – Management
  – Software engineering and systems engineering
  – Integration and acceptance
  – Some ongoing support activities after construction
Archive Estimates: SKA1

Archive growth rate: ~45–120 PBytes/yr
Larger, including discovery space: ~300 PBytes/yr

Data products:
• Standard products:
  – calibrated multi-dimensional sky images
  – time-series data
  – catalogued data for discrete sources (global sky model) and pulsar candidates
• No derived products from science teams

Further processing and science extraction take place at the Regional Centres.
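The quoted growth rates extrapolate directly to cumulative archive volumes (a simple linear extrapolation, not a projection from the slides):

```python
# Cumulative archive volume after 10 years at the quoted rates.
rates_pb_per_yr = {"low": 45, "high": 120, "with discovery space": 300}
years = 10
for label, rate in rates_pb_per_yr.items():
    eb = rate * years / 1000  # PB -> EB
    print(f"{label}: {eb:.2f} EB after {years} years")
```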
Regional Centres
• Data distribution and access for astronomers
• Provide regional data centres and High Performance Computing services
• Software development
• Support for key science teams
• Technology development and engineering support
• Significant compute in their own right
• In the UK, likely to be supported by layering on top of a coordinated compute infrastructure (working title UK-T0)

[Diagram: Global Headquarters linked to the South African and Australian sites and to Regional Centres 1, 2 and 3]

Data analytics support:
• Science analysis
• Source extraction
• Bayesian analysis
• Joining of catalogues
• Visualisation
END