Physics Analysis on RISC Machines: Experiences at CERN, Saclay, 20 June 1994
![Page 1: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/1.jpg)
cn - fhe - jun 94-1
CERN
Physics Analysis on RISC Machines: Experiences at CERN
SACLAY
20 June 1994
Frédéric Hemmer
Computing & Networks Division
CERN, Geneva, Switzerland
![Page 2: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/2.jpg)
CERN - The European Laboratory for Particle Physics
• Fundamental research in particle physics
• Designs, builds & operates large accelerators
• Financed by 19 European countries
• SFR 950M budget: operation + new accelerators
• 3,000 staff
• Experiments conducted by a small number of large collaborations:
400 physicists, 50 institutes, 18 countries, using experimental apparatus costing 100s of MSFR
![Page 3: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/3.jpg)
Computing at CERN
• computers are everywhere
• embedded microprocessors
• 2,000 personal computers
• 1,400 scientific workstations
• RISC clusters, even mainframes
• estimate 40 MSFR per year (+ staff)
![Page 4: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/4.jpg)
Central Computing Services
• 6,000 users
• Physics data processing traditionally:
mainframes + batch
emphasis on:
reliability, utilisation level
• Tapes: 300,000 active volumes, 22,000 tape mounts per week
![Page 5: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/5.jpg)
Application Characteristics
• inherent coarse grain parallelism (at event or job level)
• Fortran
• modest floating point content
• high data volumes
– disks
– tapes, tape robots
• moderate, but respectable, data rates: a few MB/sec per fast RISC cpu
Obvious candidate for RISC clusters
A major challenge
![Page 6: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/6.jpg)
CORE - Centrally Operated Risc Environment
• Single management domain
• Services configured for specific applications, groups
but common system management
• Focus on data: external access to tape and disk services from the CERN network, or even from outside CERN
![Page 7: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/7.jpg)
CORE Physics Services
(diagram by les robertson /cn; equipment installed or on order January 1994)
Components shown:
• CERN Network
• Home directories & registry
• consoles & monitors
• SPARCstations
• CSF - Simulation Facility
• PIAF - Interactive Analysis Facility
• SHIFT - Data intensive services
• Scalable Parallel Processors
• Central Data Services
• Shared Disk Servers
• Shared Tape Servers
Equipment captions:
• 25 H-P 9000-735, H-P 9000-750
• 5 H-P 9000-755, 100 GB RAID disk
• 8 node SPARCcenter, 32 node Meiko CS-2 (Early 1994)
• Processors: 24 SGI; 11 DEC Alpha; 9 H-P; 2 SUN; 1 IBM
• Embedded disk: 1.1 TeraBytes
• 260 GBytes, 6 SGI, DEC, IBM servers
• 3 tape robots, 21 tape drives, 6 EXABYTEs
• 7 IBM, SUN servers
• SPARCservers, Baydel RAID disks, tape juke box
![Page 8: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/8.jpg)
![Page 9: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/9.jpg)
![Page 10: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/10.jpg)
![Page 11: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/11.jpg)
![Page 12: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/12.jpg)
CSF - Central Simulation Facility
• second generation, joint project with H-P
Diagram labels: interactive host; job queues (shared, load balanced); H-P 750; tape servers; ethernet; FDDI
• 25 H-P 735s - 48 MB memory, 400 MB disk
• one job per processor
• generates data on local disk
• staged out to tape at end of job
• long jobs (4 to 48 hours)
• very high cpu utilisation: > 97%
• very reliable: > 1 month MTBI
![Page 13: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/13.jpg)
SHIFT - Scalable, Heterogeneous, Integrated Facility
• Designed in 1990
• fast access to large amounts of disk data
• good tape support
• cheap & easy to expand
• vendor independent
• mainframe quality
• First implementation in production within 6 months
![Page 14: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/14.jpg)
Design choices
• Unix + TCP/IP
• system-wide batch job queues
“single system image”
target Cray style & service quality
• pseudo distributed file system
assumes no read/write file sharing
• distributed tape staging model (disk cache of tape files)
– the tape access primitives are
copy disk file to tape
copy tape file to disk
![Page 15: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/15.jpg)
The Software Model
Diagram labels: disk servers, cpu servers, stage servers, tape servers and queue servers, connected by an IP network
Define functional interfaces → scalable, heterogeneous, distributed
![Page 16: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/16.jpg)
Basic Software
• Unix Tape Subsystem (multi-user, labels, multi-file, operation)
• Fast Remote File Access System
• Remote Tape Copy System
• Disk Pool Manager
• Tape Stager
• Clustered NQS batch system
• Integration with standard I/O packages: FATMEN, RZ, FZ, EPIO, ..
• Network Operation
• Monitoring
![Page 17: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/17.jpg)
Unix Tape Control
• tape daemon
– operator interface / robot interface
– tape unit allocation / deallocation
– label checking, writing
![Page 18: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/18.jpg)
Remote Tape Copy System
• selects a suitable tape server
• initiates the tape-disk copy
tpread -v CUT322 -g SMCF -q 4,6 pathname
tpwrite -v IX2857 -q 3-5 file3 file4 file5
tpread -v UX3465 `sfget -p opaldst file34`
![Page 19: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/19.jpg)
Remote File Access System - RFIO
high performance, reliability (improve on NFS)
• C I/O compatibility library
Fortran subroutine interface
• rfio daemon started by open on remote machine
• optimised for specific networks
• asynchronous operation (read ahead)
• optional vector pre-seek
– ordered list of the records which will probably be read next
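The asynchronous read-ahead idea can be illustrated with a small sketch. This is not RFIO code: the function name `read_ahead`, the `depth` parameter and the use of a background thread with a bounded queue are all assumptions made for illustration. A producer reads records ahead of the consumer, so the next record is usually already in memory when it is requested.

```python
import io
import queue
import threading


def read_ahead(stream, record_size, depth=4):
    """Yield fixed-size records while a background thread reads ahead.

    Hypothetical illustration only: `depth` bounds how many records may be
    buffered before the reader thread has to wait for the consumer.
    """
    buffered = queue.Queue(maxsize=depth)

    def producer():
        while True:
            chunk = stream.read(record_size)
            buffered.put(chunk)      # blocks when the buffer is full
            if not chunk:            # empty bytes marks end of file
                break

    threading.Thread(target=producer, daemon=True).start()

    while True:
        chunk = buffered.get()
        if not chunk:
            return
        yield chunk


# Usage: an in-memory stream stands in for a remote file.
data = bytes(range(10))
records = list(read_ahead(io.BytesIO(data), record_size=4))
```

A vector pre-seek differs in that the client supplies the list of records it expects to need, rather than relying on sequential order; the same buffering structure would apply with the producer seeking to each listed record.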
![Page 20: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/20.jpg)
Disk pool
Diagram labels: file systems on nodes sgi1, dec24 and sun5 grouped into one disk pool
A disk pool is a collection of Unix file systems, possibly on several nodes, viewed as a single chunk of allocatable space.
![Page 21: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/21.jpg)
Disk Pool Management
• allocation of files to pools
– pools can be public or private
• and filesystems
– capacity management
• name server
• garbage collection
– pools can be temporary or permanent
• example:
sfget -p opaldst file26
may create a file like:
/shift/shd01/data6/ws/panzer/file26
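The allocation step can be sketched as follows. This is a toy model, not the Disk Pool Manager: the class names, the "most free space" placement policy and the reading of the example path as `/shift/<node>/<file system>/<group>/<user>/<file>` are all assumptions. It shows a pool spanning several file systems that hands back a path for a file name and remembers it, which is the role the name server plays.

```python
from dataclasses import dataclass, field


@dataclass
class FileSystem:
    node: str            # server hosting the file system (hypothetical names)
    mount: str           # file system name on that server
    capacity_mb: int
    used_mb: int = 0

    @property
    def free_mb(self) -> int:
        return self.capacity_mb - self.used_mb


@dataclass
class DiskPool:
    name: str
    filesystems: list[FileSystem]
    catalogue: dict[str, str] = field(default_factory=dict)  # file name -> path

    def get(self, group: str, user: str, filename: str, size_mb: int) -> str:
        """Return the path for `filename`, allocating space on first request."""
        if filename in self.catalogue:
            return self.catalogue[filename]
        # Assumed policy: place the file on the file system with most free space.
        best = max(self.filesystems, key=lambda fs: fs.free_mb)
        if best.free_mb < size_mb:
            raise OSError(f"pool {self.name} has no file system with {size_mb} MB free")
        best.used_mb += size_mb
        path = f"/shift/{best.node}/{best.mount}/{group}/{user}/{filename}"
        self.catalogue[filename] = path
        return path


# Usage: two file systems on different nodes viewed as one pool.
pool = DiskPool("opaldst", [
    FileSystem("shd01", "data6", capacity_mb=1000, used_mb=900),
    FileSystem("shd02", "data1", capacity_mb=1000, used_mb=100),
])
path = pool.get("ws", "panzer", "file26", size_mb=50)
```

Because the caller only ever sees the returned path, file systems can be added to the pool without changing user jobs.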
![Page 22: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/22.jpg)
Tape Stager
• implements a disk cache of magnetic tape files
• integrates: Remote Tape Copy System & Disk Pool Management
• queues concurrent requests for same tape file
• provides full error recovery:
– restage &/or operator control on hardware/system error
– initiate garbage collection if disk full
• supports disk pools & single (private) file systems
• available from any workstation
![Page 23: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/23.jpg)
Tape Stager
Diagram labels:
• cpu server (user job): stagein tape, file
• stage control: sfget file; tpread tape, file
• tape server: rtcopy tape, file
• disk server
• RFIO
• independent stage control for each disk pool
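The staging model, a disk cache of tape files, can be sketched in a few lines. This is an illustrative toy, not the SHIFT stager: the `Stager` class, the least-recently-used eviction policy used for garbage collection, and the `copy_from_tape` callback (standing in for the tape-to-disk copy) are assumptions.

```python
from collections import OrderedDict


class Stager:
    """Toy disk cache of tape files with least-recently-used eviction."""

    def __init__(self, capacity_mb, copy_from_tape):
        self.capacity_mb = capacity_mb
        self.copy_from_tape = copy_from_tape   # callable(tape, file_seq)
        self.cache = OrderedDict()             # (tape, file_seq) -> size in MB
        self.used_mb = 0

    def stagein(self, tape, file_seq, size_mb):
        """Make a tape file available on disk; return 'hit' or 'staged'."""
        key = (tape, file_seq)
        if key in self.cache:
            self.cache.move_to_end(key)        # mark as recently used
            return "hit"
        if size_mb > self.capacity_mb:
            raise OSError("file larger than the whole pool")
        # Garbage collection when the disk is full (assumed LRU policy).
        while self.used_mb + size_mb > self.capacity_mb:
            _, freed_mb = self.cache.popitem(last=False)
            self.used_mb -= freed_mb
        self.copy_from_tape(tape, file_seq)    # only now is a tape mount needed
        self.cache[key] = size_mb
        self.used_mb += size_mb
        return "staged"


# Usage: count how many tape copies a sequence of requests triggers.
mounts = []
stager = Stager(capacity_mb=300, copy_from_tape=lambda t, f: mounts.append((t, f)))
results = [
    stager.stagein("CUT322", 4, 200),   # first access: copied from tape
    stager.stagein("CUT322", 4, 200),   # second access: served from disk
    stager.stagein("UX3465", 1, 200),   # pool full: oldest file is evicted
]
```

The second request for the same file causes no tape mount, which is the effect the stager relies on to reduce tape traffic.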
![Page 24: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/24.jpg)
SHIFT Status
equipment installed or on order January 1994

| CERN group | configuration | cpu (CU*) | disk (GB) |
|---|---|---|---|
| OPAL | SGI Challenge 4-cpu + 8-cpu (R4400 - 150 MHz); two SGI 340S 4-cpu (R3000 - 33 MHz) | 290 | 590 |
| ALEPH | SGI Challenge 4-cpu (R4400 - 150 MHz); eight DEC 9000-400 | 216 | 200 |
| DELPHI | Two H-P 9000/735 | 52 | 200 |
| L3 | SGI Challenge 4-cpu (R4400 - 150 MHz) | 80 | 300 |
| ATLAS | H-P 9000/755 | 26 | 23 |
| CMS | H-P 9000/735 | 26 | 23 |
| SMC | SUN SPARCserver10, 4/630 | 22 | 4 |
| CPLEAR | DEC 3000-300AXP, 500AXP | 29 | 10 |
| CHORUS | IBM RS/6000-370 | 15 | 15 |
| NOMAD | DEC 3000-500 AXP | 19 | 15 |
| Totals | | 775 | 1380 |

* CERN-Units: one CU equals approx. 4 SPECints (for comparison, CERN IBM mainframe: 120 CU, 600 GB)
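The capacity figures can be cross-checked with a few lines of arithmetic. The per-group numbers below are copied from the table; reading "120 600" as 120 CU and 600 GB for the mainframe is an assumption based on the column order, and the SPECint figure uses the approximate factor of 4 quoted in the footnote.

```python
CU_TO_SPECINT = 4  # approximate conversion quoted on the slide

# (group, cpu in CERN Units, disk in GB), copied from the table.
shift_capacity = [
    ("OPAL", 290, 590),
    ("ALEPH", 216, 200),
    ("DELPHI", 52, 200),
    ("L3", 80, 300),
    ("ATLAS", 26, 23),
    ("CMS", 26, 23),
    ("SMC", 22, 4),
    ("CPLEAR", 29, 10),
    ("CHORUS", 15, 15),
    ("NOMAD", 19, 15),
]

# Assumed reading of "(CERN IBM mainframe 120 600)": 120 CU, 600 GB.
MAINFRAME_CU, MAINFRAME_GB = 120, 600

total_cu = sum(cpu for _, cpu, _ in shift_capacity)
total_gb = sum(disk for _, _, disk in shift_capacity)

print(f"total cpu : {total_cu} CU (about {total_cu * CU_TO_SPECINT} SPECints)")
print(f"total disk: {total_gb} GB")
print(f"cpu relative to mainframe : {total_cu / MAINFRAME_CU:.1f}x")
print(f"disk relative to mainframe: {total_gb / MAINFRAME_GB:.1f}x")
```

The sums reproduce the totals row (775 CU, 1380 GB), so under the assumed reading SHIFT holds roughly six times the mainframe's cpu capacity and a little over twice its disk.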
![Page 25: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/25.jpg)
Current SHIFT Usage
• 60% cpu utilisation
• 9,000 tape mounts per week, 15% write
still some way from holding the active data on disk
• MTBI - cpu and disk servers: 400 hours for an individual server
• MTBF for disks: 160K hours
maturing service, but does not yet surpass the quality of the mainframe
![Page 26: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/26.jpg)
CORE Networking
Diagram labels:
• UltraNet: 1 Gbps backbone, 6 MBytes/sec sustained
• FDDI + GigaSwitch: 2-3 MBytes/sec sustained
• Ethernet + Fibronics hubs: aggregate 2 MBytes/sec sustained
• connected systems: SHIFT cpu servers, SHIFT disk servers, SHIFT tape servers, IBM mainframe, simulation service, home directories
• connection to CERN & external networks
![Page 27: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/27.jpg)
FDDI Performance (September 1993)
100 MByte disk file read/written sequentially using 32 KB records
client: H-P 735; server: SGI Crimson, SEAGATE Wren 9 disk

| system | read | write |
|---|---|---|
| NFS | 1.6 MB/sec | 300 KB/sec |
| RFIO | 2.7 MB/sec | 1.7 MB/sec |
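To put these rates in terms of elapsed time, the short calculation below converts each one into the time needed to move the 100 MByte test file. It assumes 300 KB/sec can be taken as 0.3 MB/sec and ignores any start-up overhead; the function name is illustrative.

```python
FILE_MB = 100  # test file size from the slide

# Sustained rates in MB/sec as quoted; 300 KB/sec is taken as 0.3 MB/sec.
rates = {
    "NFS": {"read": 1.6, "write": 0.3},
    "RFIO": {"read": 2.7, "write": 1.7},
}


def transfer_seconds(rate_mb_per_sec, size_mb=FILE_MB):
    """Elapsed time for a sequential transfer at a constant rate."""
    return size_mb / rate_mb_per_sec


for system, ops in rates.items():
    for op, rate in ops.items():
        print(f"{system:4} {op:5} {transfer_seconds(rate):6.1f} s")

read_speedup = rates["RFIO"]["read"] / rates["NFS"]["read"]
write_speedup = rates["RFIO"]["write"] / rates["NFS"]["write"]
print(f"RFIO vs NFS: read x{read_speedup:.1f}, write x{write_speedup:.1f}")
```

On these figures the gain from RFIO is much larger for writes than for reads: the NFS write takes over five minutes, while the RFIO write completes in about one.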
![Page 28: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/28.jpg)
PIAF - Parallel Interactive Data Analysis Facility
(R.Brun, A.Nathaniel, F.Rademakers CERN)
• the data is “spread” across the interactive server cluster
• the user formulates a transaction on his personal workstation
• the transaction is executed simultaneously on all servers
• the partial results are combined and returned to the user’s workstation
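The four steps above follow a scatter/gather pattern, which can be sketched as below. This is not PIAF code: the histogram request, the function names and the use of a thread pool in place of separate server machines are assumptions chosen to keep the example self-contained.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_histogram(events, bin_width):
    """Worker side: histogram only the events held by this worker."""
    return Counter(int(value // bin_width) for value in events)


def run_transaction(partitions, bin_width):
    """Server side: run the same request on every partition, then merge."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = pool.map(
            partial_histogram, partitions, [bin_width] * len(partitions)
        )
        total = Counter()
        for part in partials:
            total.update(part)   # Counter.update adds the bin counts
    return total


# Usage: the data set is spread over three workers.
partitions = [
    [1.0, 2.5, 7.0],
    [3.0, 8.5],
    [9.9, 0.1],
]
histogram = run_transaction(partitions, bin_width=5)
```

Only the small partial histograms travel back to the user, not the events themselves, which is why the approach suits interactive use over a network.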
![Page 29: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/29.jpg)
PIAF Architecture
Diagram labels:
• user personal workstation: PIAF client, display manager
• PIAF Service: PIAF server, five PIAF workers
![Page 30: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/30.jpg)
Scalable Parallel Processors
• embarrassingly parallel application: therefore in competition with workstation clusters
• SMPs and SPPs should do a better job for SHIFT than loosely coupled clusters
• computing requirements will increase by three orders of magnitude over next ten years
• R&D project started, funded by ESPRIT - GPMIMD2
– 32 processor Meiko CS-2
– 25 man-years development
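The growth figure can be restated as a yearly rate. The calculation below assumes steady compound growth, which the slide does not state; it only shows what "three orders of magnitude over ten years" implies under that assumption.

```python
GROWTH_TOTAL = 1000   # three orders of magnitude
YEARS = 10

# Constant yearly factor f such that f ** YEARS == GROWTH_TOTAL.
annual_factor = GROWTH_TOTAL ** (1 / YEARS)
print(f"required growth per year: x{annual_factor:.2f}")

# Requirement relative to today after every second year.
for year in range(0, YEARS + 1, 2):
    print(f"year {year:2}: x{annual_factor ** year:.0f}")
```

In other words, requirements would have to roughly double every year for a decade.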
![Page 31: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/31.jpg)
Conclusion
• Workstation clusters have replaced mainframes at CERN for physics data processing
• For the first time, we see computing budgets come within reach of the requirements
• Very large, distributed & scalable disk and tape configurations can be supported
• Mixed manufacturer environments work, and allow smooth expansion of the configuration
• Network performance is the biggest weakness in scalability
• Requires a different operational style & organisation from mainframe services
![Page 32: Analyse de Physique sur machines RISC : expériences au CERN SACLAY 20 JUIN 1994](https://reader031.vdocuments.site/reader031/viewer/2022013012/56814400550346895db092cb/html5/thumbnails/32.jpg)
Operating RISC machines
• SMPs are easier to manage
• SMPs require less manpower
• Distributed management not yet robust
• Network is THE problem
• Much easier than mainframes, and
• ... cost effective