TRANSCRIPT
Data, storage & information management
collaboration at the frontiers of science and technology
Jakub T. Mościcki, CERN IT
eResearch 2019, Oct 22, Brisbane
Accelerator
• 1232 dipole magnets at 1.9 K (-271.3 °C), each 15 m long and 30 t
• 2 beams at 0.99999 c, 362 MJ of stored energy per beam
• energy comparisons: at 5 knots / at 2200 km/h
Experiments & Detectors
• ALICE, ATLAS, CMS, LHCb
• 46 m long, 25 m high, 14,000 t (= 1.5x)
• 11,000 beam revolutions per second, 25 MHz collision rate
• aligned with sub-mm precision!
Model
• 1954: Science for Peace
• Collaboration, Fundamental Research, Education
• 8 Middle East countries working together
• 2019: 23 Member States, 70 countries, 120 nationalities, 600 universities
Generations of accelerators
• 1959: PS, 628 m
• 1976: SPS, 7 km
• 1984: workshop: LHC in the LEP tunnel?
• 1989: LEP (electrons)
• 2008: LHC (protons), 27 km
Illustration of the Run 3 tracking problem: 2 ms timeframe shown (10% of total)
R.Jones, European Strategy for Particle Physics, Grenada May 2019
Data recording & processing at CERN
• peak ingest rates: 60 GB/s and 100 GB/s; sustained ~10 GB/s
• disk: 140/300 PB across 60,000 disks
• tape archive: 330 PB
• ~10^5 CPU cores
Worldwide LHC Computing Grid (WLCG)
• global data distribution system
• storage: ~1 EB and 1 billion files; transfer rates: 60 GB/s
• data processing: ~170 sites, 1 million cores
• principle: send jobs to the data
File Transfer Service at CERN
• dedicated network: LHCOPN
• Tier-0: data recording, long-term archival, distribution of RAW data
• Tier-1: reprocessing, storage, analysis of RAW data
• Tier-2: simulation, end-user analysis
• tier structure driven by access, efficiency and the funding model
Tape Storage (CTA, the CERN Tape Archive)
• Archival and backup
• Cheap media, expensive infrastructure
• Energy efficient
• Access: sequential streaming ++, random access --
• Rewrite ("repack") is O(n^2)
• design questions for the user-facing layer: cache? buffer? IOPS? HSM?
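One way to read the O(n^2) repack note is with a toy model: the archive grows steadily, and each media-generation migration must read back and rewrite everything stored so far, so cumulative repack I/O grows roughly quadratically with archive size. The growth rate and repack period below are illustrative numbers, not CERN's actual figures.

```python
def cumulative_repack_volume(years, growth_pb_per_year, repack_every_years):
    """Toy model: the archive grows linearly; at every media generation
    the *entire* archive is read back and rewritten ("repack")."""
    archive, rewritten = 0, 0
    for year in range(1, years + 1):
        archive += growth_pb_per_year
        if year % repack_every_years == 0:
            rewritten += archive  # repack touches every byte stored so far
    return rewritten

# doubling the archive's lifetime more than triples total repack I/O:
small = cumulative_repack_volume(10, 10, 5)  # archive ends at 100 PB
large = cumulative_repack_volume(20, 10, 5)  # archive ends at 200 PB
```

The quadratic blow-up is why tape archives care so much about media longevity and sequential streaming rates: every extra petabyte stored is a petabyte that must be rewritten on every future migration.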
Disk Storage: Reprocessing and Analysis
• Cost control: use cheap hardware, compensate in software
• Open Source Storage: EOS (cern.ch/eos)
• 100s of nodes, each with 24-196 disks
• Tunable redundancy layouts (RAIN): replication, erasure coding
• Storage hierarchy: SSD for metadata, HDD for data
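The "compensate in software" idea behind erasure coding can be shown with a minimal single-parity sketch (RAID-4 style). This is not EOS's actual RAIN implementation, which uses more general codes, but it shows the principle: redundancy is computed rather than copied, so one disk's worth of parity protects several disks' worth of data.

```python
def xor_parity(chunks):
    # redundancy computed, not copied: parity chunk = XOR of all data chunks
    parity = bytes(len(chunks[0]))
    for chunk in chunks:
        parity = bytes(a ^ b for a, b in zip(parity, chunk))
    return parity

def rebuild_lost_chunk(surviving_chunks, parity):
    # any single lost chunk is the XOR of the parity with the survivors
    return xor_parity(surviving_chunks + [parity])

# three equal-sized stripes on three hypothetical disks, plus one parity disk
data = [b"disk-one", b"disk-two", b"disk-3.."]
parity = xor_parity(data)
restored = rebuild_lost_chunk([data[0], data[2]], parity)  # "disk two" failed
```

With replication, surviving one failure out of three data chunks costs 2x the storage; the parity scheme above costs only 4/3x, which is the trade the slide's "cheap hardware, clever software" point is about.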
Data Formats
• unstructured RAW data: independent events (1, 2, 3, …, 8) compressed into big blocks per file
• structured analysis data, optimized for volume and access: event attributes stored column-wise (x1…x8, y1…y8, z1…z8, v1…v8, …) and compressed per column into blocks C(x), C(y), C(z), C(v), …
A.Peters/CERN
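Why compress per column into C(x), C(y), …? Because an analysis that needs only one attribute can then decompress that attribute's block alone, instead of unpacking whole events. A toy columnar container (not the actual on-disk format used at CERN) makes this concrete:

```python
import json, struct, zlib

# Toy columnar container: each event attribute is packed and compressed as
# its own block, C(x), C(y), ..., with a small index in the header so a
# reader can locate and fetch one column without touching the others.

def write_columnar(columns):
    """columns: dict of name -> list of floats (one value per event)."""
    index, payload, offset = {}, b"", 0
    for name, values in columns.items():
        block = zlib.compress(struct.pack(f"<{len(values)}d", *values))
        index[name] = (offset, len(block), len(values))
        payload += block
        offset += len(block)
    header = json.dumps(index).encode()
    return struct.pack("<I", len(header)) + header + payload

def read_column(buf, name):
    """Decompress only the requested column, leaving the rest untouched."""
    (hlen,) = struct.unpack_from("<I", buf, 0)
    index = json.loads(buf[4:4 + hlen])
    offset, size, count = index[name]
    start = 4 + hlen + offset
    raw = zlib.decompress(buf[start:start + size])
    return list(struct.unpack(f"<{count}d", raw))

# two attributes over three events; reading "x" never decompresses "y"
buf = write_columnar({"x": [1.0, 2.0, 3.0], "y": [4.0, 5.0, 6.0]})
```

Per-column blocks also compress better in practice, since values of one attribute tend to resemble each other more than values within one event do.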
Data Access Patterns
Credit: A.Peters, J.Blomer / CERN
• remote access: sequential forward-seeking IO
• sparse access: optimized LAN & WAN protocol
• latency compensation: predictable async multi-byte-range requests
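The latency-compensation point is that a client whose access pattern is predictable can issue all the byte ranges it will need at once, so their WAN round trips overlap instead of adding up. A minimal sketch, using threads against a local file as a stand-in for a remote storage protocol (the function names are illustrative, not a real client API):

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor

def read_range(path, offset, length):
    # one byte-range request; over the WAN each of these costs a round trip
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

def vector_read(path, ranges):
    # issue all predicted ranges concurrently so their latencies overlap,
    # instead of paying (round-trip latency) * len(ranges) sequentially
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(read_range, path, off, n) for off, n in ranges]
        return [f.result() for f in futures]

# demo against a local temp file standing in for a remote replica
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"event-0|event-1|event-2|event-3|")
    path = tmp.name
blocks = vector_read(path, [(0, 7), (16, 7)])
```

Combined with the columnar layout above, this is what makes sparse remote analysis workable: the reader knows in advance which column blocks it needs and requests them together.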
Preservation
• Bit preservation is not the problem; the question is how to read old formats (LEP data: ~100 TB)
• Metadata: the digital memory of an organization is easily lost
• Data lifetime >> experiment lifetime (years…)
• Data is expensive to produce
Computing and Storage Evolution
Source: B.Panzer / CERN
• 1969: Apollo Guidance Computer, 2K RAM, 1.024 MHz clock
• [charts: CPU and disk capacity evolution; disk today: 100 PB]
Challenges for HL-LHC
• Cannot rely on pure scaling of technology: it slows down
• Cannot purchase more: the budget is flat
Clever solutions needed
• Don't store raw data: pre-processed events only
• Re-organize workflows: organized staging from cheap storage needs less expensive storage (Tape Carousel)
• Smarter global caching (Data Lakes)
• Tune redundancy: erasure coding, 2nd global replica (?)
• …
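The "Tape Carousel" idea above can be sketched as a scheduling transform: recall requests arrive in arbitrary order, but the scheduler batches them by cartridge and sorts by position, so each tape is mounted once and streamed sequentially. The policy below is an illustrative simplification, not CTA's actual scheduling algorithm.

```python
from collections import defaultdict

def carousel_schedule(requests):
    """requests: (tape_id, file_offset) pairs arriving in arbitrary order.
    Group per tape and sort by offset, so each cartridge is mounted once
    and read as one sequential stream rather than with random seeks."""
    by_tape = defaultdict(list)
    for tape, offset in requests:
        by_tape[tape].append(offset)
    return [(tape, sorted(offsets)) for tape, offsets in sorted(by_tape.items())]

# three recalls hitting two tapes become two sequential passes
plan = carousel_schedule([("B", 5), ("A", 9), ("A", 1)])
```

This matches the tape-access bullet earlier (sequential streaming ++, random access --): the same requests cost far less once they are reordered into streams.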
Large collaborations
• Global community: 40,000 email boxes, 600 universities
• 5,000-7,000 users at CERN at any time
• ATLAS: 3,000 physicists from 180 institutions, 1,500 PhD students, 900 published papers (so far)
• long tail of small and medium experiments
• Hierarchical, distributed organization, but people communicate and share information across formal structures
Back in 1989
"Many of the discussions of the future at CERN and the LHC era end with the question: Yes, but how will we ever keep track of such a large project?"
• Hierarchical information does not model the real world: your filesystem, photo collections, email folders, dynamic work environments
• A centralized system is impractical: information is updated continuously and in a distributed way
"The problems of information loss may be particularly acute at CERN (…) CERN is a model in miniature of the rest of world in a few years time. CERN meets now some problems which the rest of the world will have to face soon."
https://www.w3.org/History/1989/proposal.html
http://cds.cern.ch/record/1164396
WWW
Web principles…• Remote access from any type of device
• Decentralised update and modification
• Integrating existing information sources (data)
• Layered: separate the information storage from display
• Well-defined interfaces
…still true in the cloud-services era
• Access from mobile phones, laptops, …
• Collaborative and distributed update
• A connected mesh of services integrating seamlessly into everyday workflows
• ….
Integrated Services for Data Science at CERN
• Users collaborate on data using an increasing number of applications
• Data available on all devices: mobiles, laptops, desktops
• Data easily shareable with individuals and groups
• Concurrent editing
Web-based Analysis
• Ready-to-go environment "one click away"
• Integrated with the entire data repository
Future Federated Analysis Platform
• One click to create user groups, share projects and data
• Domestic and remote users in the same collaborative workflow
Advancing the state of the art
• Application & data workflow
• Grid: integrate distributed computing capabilities with data sharing systems
• Full metadata awareness in the research workflow
• research lifecycle: PROPOSAL → DATA COLLECTION → DATA ANALYSIS → PUBLISH
• Mesh
Commitment to Open Source
http://cds.cern.ch/record/1164399
The document that officially put the World Wide Web into the public
domain on 30 April 1993.
Many exciting challenges ahead for the LHC computing and distributed collaboration
Pushing frontiers of science and technology together with a worldwide community
Community Service
• NRENs, HEP & Physics, Universities, Companies
cs3community.org
Storage Operations at CERN
❖ Physics Online Storage: EOS — 250 PB, 70 GB/s RW
❖ User Storage: AFS, CERNBox — 5B files
❖ Infrastructure Storage: CEPH, S3, Filers — 30 KHz ops
❖ Software Distribution Storage: CVMFS — 900 repositories
❖ Archival Storage: CASTOR — 330 PB