TRANSCRIPT
High Performance Computing
Ron Perrott
Chairman, High End Computing Strategy Committee
Queen's University Belfast
What is a high performance computer?
A high performance computer is a hardware and software system that provides close to the maximum performance that can currently be achieved:
=> parallelism
=> state-of-the-art technology
=> pushing the limits

Why do we need them?
Computational fluid dynamics, protein folding, climate modelling, national security (in particular for cryptanalysis and for simulation), etc.
The economy, security, health and well-being of the country.
=> Scientific discovery
=> Social impact
=> Commercial potential
HPC – UK
• Important to research in many scientific disciplines
• Increasing breadth of science involved
• High level of UK involvement in international HPC activities
• Contributions to and benefits for UK industry
UK projects
• Atomic, Molecular & Optical Physics
• Computational Biology
• Computational Radiation Biology and Therapy
• Computational Chemistry
• Computational Engineering - Fluid Dynamics
• Environmental Modelling
• Cosmology
• Particle Physics
• Fusion & Plasma Microturbulence
• Accelerator Modelling
• Nanoscience
• Disaster Simulation
=> Computation has become as important as theory and experiment in the conduct of research
Whole systems
• Electronic Structure - from atoms to matter
• Computational Biology - from molecules to cells and beyond
• Fluid Dynamics - from eddies to aircraft
• Environmental Modelling - from oceans to the earth
• From the earth to the solar system?
• ... and on to the Universe
Technology Trends: Microprocessor Capacity
• "Moore's Law" (Gordon Moore, co-founder of Intel, 1965): the number of devices per chip doubles every 18 months, i.e. 2x transistors per chip every 1.5 years (illustrated below)
• Microprocessors have become smaller, denser and more powerful
• Not just processors: memory, bandwidth, storage, etc. follow the same trend; roughly 2x memory and processor speed, and 1/2 the size, cost and power, every 18 months
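As a rough, illustrative sketch of the doubling rule just quoted (the starting device count and time points below are arbitrary example values, not figures from the slides):

```python
# Rough illustration of the doubling rule: devices per chip double
# roughly every 18 months (1.5 years).  Starting count and horizon
# are arbitrary example values, not figures from the slides.
def projected_devices(initial_count: float, years: float,
                      doubling_period_years: float = 1.5) -> float:
    """Project a device count forward, doubling every `doubling_period_years`."""
    return initial_count * 2 ** (years / doubling_period_years)

if __name__ == "__main__":
    start = 10e6                       # e.g. a chip with ~10 million transistors
    for years in (0, 3, 6, 9):
        print(f"after {years} years: ~{projected_devices(start, years):,.0f} devices")
```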
TOP500 (J. Dongarra)
- Listing of the 500 most powerful computers in the world
- Yardstick: LINPACK, solving Ax = b for a dense matrix A (illustrated below)
- Updated twice a year: at SC'xy in the States in November, and at the meeting in Mannheim, Germany, in June
- All data available from www.top500.org
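The LINPACK yardstick times the solution of a dense linear system Ax = b. The sketch below illustrates that idea only; it is not the official HPL/LINPACK benchmark code, and the matrix size and the use of NumPy's LAPACK-backed solver are assumptions for demonstration:

```python
import time
import numpy as np

def linpack_style_gflops(n: int = 2000, seed: int = 0) -> float:
    """Time a dense solve of Ax = b and return an approximate Gflop/s rate.

    Illustrative only: the conventional LINPACK flop count for an
    LU-based solve is 2/3*n**3 + 2*n**2 operations.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    b = rng.standard_normal(n)

    t0 = time.perf_counter()
    x = np.linalg.solve(A, b)            # LU factorisation + triangular solves
    elapsed = time.perf_counter() - t0

    # Sanity check that the solve actually succeeded.
    assert np.linalg.norm(A @ x - b) / np.linalg.norm(b) < 1e-8

    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
    return flops / elapsed / 1e9

if __name__ == "__main__":
    print(f"~{linpack_style_gflops():.1f} Gflop/s on this machine (rough estimate)")
```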
[Chart: peak supercomputer performance from 1950 to 2010 on a log scale (1 KFlop/s to 1 PFlop/s), tracking Moore's Law through the scalar, super scalar, vector, parallel, and super scalar/vector/parallel eras. Machines plotted include EDSAC 1, UNIVAC 1, IBM 7090, CDC 6600, IBM 360/195, CDC 7600, Cray 1, Cray X-MP, Cray 2, TMC CM-2, TMC CM-5, Cray T3D, ASCI Red, ASCI White Pacific and the Earth Simulator.]

Milestones in peak performance (Flop/s):
1941: 1
1945: 100
1949: 1,000 (1 KFlop/s, 10^3)
1951: 10,000
1961: 100,000
1964: 1,000,000 (1 MFlop/s, 10^6)
1968: 10,000,000
1975: 100,000,000
1987: 1,000,000,000 (1 GFlop/s, 10^9)
1992: 10,000,000,000
1993: 100,000,000,000
1997: 1,000,000,000,000 (1 TFlop/s, 10^12)
2000: 10,000,000,000,000
2003: 35,000,000,000,000 (35 TFlop/s)
TOP500 Performance – November 2003

[Chart: Top500 performance growth, 1993-2003, on a log scale from 100 MFlop/s to 1 PFlop/s, showing three series: the #1 machine (N=1), the #500 machine (N=500) and the sum of all 500 entries (SUM). In 1993 the #1 machine (Fujitsu 'NWT', NAL) delivered 59.7 GFlop/s, the #500 entry 0.4 GFlop/s, and the list total 1.17 TFlop/s; by November 2003 the #1 machine (the NEC Earth Simulator) delivered 35.8 TFlop/s, the #500 entry 403 GFlop/s, and the list total 528 TFlop/s. Intermediate #1 systems include Intel ASCI Red (Sandia) and IBM ASCI White (LLNL); a laptop (around 1 GFlop/s) is marked for comparison.]
Earth Simulator
• Homogeneous, centralized, proprietary, expensive!
• Target application: CFD - weather, climate, earthquakes
• 640 NEC SX-6 nodes (modified); 5120 CPUs with vector operations; each CPU 8 Gflop/s peak
• 40 TFlop/s (peak)
• ~1/2 billion £ for machine, software and building
• Footprint of 4 tennis courts
• 7 MWatts; say 10 cents/kWh: $16.8K/day = $6M/year! (arithmetic checked below)
• Expected to stay at the top of the Top500 until the 60-100 TFlop/s ASCI machine arrives
• From the Top500 (November 2003)
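The running-cost figure above follows directly from the 7 MW power draw; a quick check of the arithmetic (the 10 cents/kWh electricity price is the slide's own assumption):

```python
# The slide's running-cost estimate: 7 MW drawn continuously,
# priced at 10 cents per kWh (the slide's own assumption).
power_mw = 7.0
price_per_kwh = 0.10                          # USD

kwh_per_day = power_mw * 1000 * 24            # 168,000 kWh per day
cost_per_day = kwh_per_day * price_per_kwh    # ~$16,800 per day
cost_per_year = cost_per_day * 365            # ~$6.1M per year

print(f"${cost_per_day:,.0f} per day, ${cost_per_year / 1e6:.1f}M per year")
```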
HPC Trends
• Over the last 10 years the performance range of the Top500 has grown faster than Moore's Law would predict (see the comparison below)
• 1993: #1 = 59.7 GFlop/s; #500 = 422 MFlop/s
• 2003: #1 = 35.8 TFlop/s; #500 = 403 GFlop/s
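A quick calculation with the figures above makes the comparison concrete (the 18-month doubling period is the usual reading of Moore's Law, assumed here):

```python
# Compare the 1993-2003 Top500 growth quoted above with a Moore's Law
# doubling every 18 months over the same 10-year period.
moore_factor = 2 ** (10 / 1.5)        # ~100x in 10 years
top1_factor = 35.8e12 / 59.7e9        # #1 entry: 59.7 GFlop/s -> 35.8 TFlop/s
top500_factor = 403e9 / 422e6         # #500 entry: 422 MFlop/s -> 403 GFlop/s

print(f"Moore's Law over 10 years: ~{moore_factor:.0f}x")
print(f"#1 machine growth:         ~{top1_factor:.0f}x")
print(f"#500 machine growth:       ~{top500_factor:.0f}x")
```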
The TOP500 list – November 2003
(Rank, manufacturer, computer: Rmax, Rpeak, processor count, installation site, year)

1. NEC Earth-Simulator: Rmax 35.8 TFlop/s, Rpeak 40.90 TFlop/s, 5120 processors, Earth Simulator Center, Yokohama, 2002
2. Hewlett-Packard ASCI Q, AlphaServer SC ES45/1.25 GHz: Rmax 13.9 TFlop/s, Rpeak 20.48 TFlop/s, 8192 processors, Los Alamos National Laboratory, Los Alamos, 2002
3. Self-made Apple G5 PowerPC w/InfiniBand 4X: Rmax 10.3 TFlop/s, Rpeak 17.60 TFlop/s, 2200 processors, Virginia Tech, Blacksburg, VA, 2003
4. Dell PowerEdge 1750, P4 Xeon 3.06 GHz w/Myrinet: Rmax 9.82 TFlop/s, Rpeak 15.30 TFlop/s, 2500 processors, University of Illinois, Urbana/Champaign, 2003
5. Hewlett-Packard rx2600, Itanium2 1.5 GHz cluster w/Quadrics: Rmax 8.63 TFlop/s, Rpeak 11.62 TFlop/s, 1936 processors, Pacific Northwest National Laboratory, Richland, 2003
6. Linux NetworX Opteron 2 GHz w/Myrinet: Rmax 8.05 TFlop/s, Rpeak 11.26 TFlop/s, 2816 processors, Lawrence Livermore National Laboratory, Livermore, 2003
7. Linux NetworX MCR Linux Cluster, Xeon 2.4 GHz w/Quadrics: Rmax 7.63 TFlop/s, Rpeak 11.06 TFlop/s, 2304 processors, Lawrence Livermore National Laboratory, Livermore, 2002
8. IBM ASCI White, SP Power3 375 MHz: Rmax 7.30 TFlop/s, Rpeak 12.29 TFlop/s, 8192 processors, Lawrence Livermore National Laboratory, Livermore, 2000
9. IBM SP Power3 375 MHz 16-way: Rmax 7.30 TFlop/s, Rpeak 9.984 TFlop/s, 6656 processors, NERSC/LBNL, Berkeley, 2002
10. IBM xSeries Cluster, Xeon 2.4 GHz w/Quadrics: Rmax 6.59 TFlop/s, Rpeak 9.216 TFlop/s, 1920 processors, Lawrence Livermore National Laboratory, Livermore, 2003
The top 9 machines account for 50% of total Top500 performance; 131 systems exceed 1 TFlop/s; 210 machines are clusters.
Performance Extrapolation
[Chart: extrapolation of the Top500 trend lines (#1, #500 and the sum) on a log scale from 100 MFlop/s to 10 PFlop/s, indicating roughly when 1 TFlop/s will be needed just to enter the list and when the first PFlop/s computer can be expected; Blue Gene (130,000 processors) and ASCI Purple (12,544 processors) are marked as forthcoming systems.]
Taxonomy
Capability computing:
• Special-purpose processors and interconnect
• High-bandwidth, low-latency communication
• Designed for scientific computing
• Relatively few machines will be sold
• High price

Cluster computing:
• Commodity processors and switch
• Processor design point is web servers and home PCs
• Leverages the volume sales of millions of processors
• Price point appears attractive for scientific computing
UK Facilities
• Main centres in Manchester, Edinburgh and Daresbury
• Smaller centres around the UK
HPCx
• Edinburgh and CCLRC
• IBM, 1280-processor POWER4
• Currently 3.5 Tflop/s, rising to 6.0 Tflop/s in July
• Up to 12.0 Tflop/s by October 2006
CSAR
• University of Manchester / Computer Sciences Corporation
• 256-processor Itanium2 SGI Altix (Newton) - June 2006; peak performance of 5.2 Gflop/s
• 512-processor SGI Origin 3800 (Green) - June 2006
HECToR – High End Computing Terascale Resource
• Scientific case
• Business case
• Peak performance of 50 to 100 Tflop/s by 2006, doubling to 100 to 200 Tflop/s after 2 years, and doubling again to 200 to 400 Tflop/s 2 years after that
• Oak Ridge National Laboratory: 100 Tflop/s in 2006, 250 Tflop/s in 2007
UK-US TeraGrid HPC-Grid Experiment
TeraGyroid: Lattice-Boltzmann simulations of defect dynamics in amphiphilic liquid crystals
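For readers unfamiliar with the method, the sketch below shows a minimal two-dimensional lattice-Boltzmann update (D2Q9 lattice, single-relaxation-time BGK collision) in NumPy. It is purely illustrative: the TeraGyroid simulations used a far more elaborate three-dimensional, multi-component amphiphilic fluid model on billion-site lattices, and the lattice size, relaxation time and initial condition below are arbitrary assumptions:

```python
import numpy as np

# Minimal single-component D2Q9 BGK lattice-Boltzmann sketch, only to
# illustrate the kind of collide-and-stream update such simulations perform.
c = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],[1,1],[-1,1],[-1,-1],[1,-1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)
tau = 0.8                                  # BGK relaxation time (assumed)

def equilibrium(rho, ux, uy):
    """Equilibrium distributions for the D2Q9 lattice."""
    cu = 3.0 * (c[:, 0, None, None] * ux + c[:, 1, None, None] * uy)
    usq = 1.5 * (ux**2 + uy**2)
    return w[:, None, None] * rho * (1 + cu + 0.5 * cu**2 - usq)

def step(f):
    """One collide-and-stream update of the distributions f[9, ny, nx]."""
    rho = f.sum(axis=0)
    ux = (f * c[:, 0, None, None]).sum(axis=0) / rho
    uy = (f * c[:, 1, None, None]).sum(axis=0) / rho
    f += (equilibrium(rho, ux, uy) - f) / tau          # BGK collision
    for i, (cx, cy) in enumerate(c):                   # periodic streaming
        f[i] = np.roll(np.roll(f[i], cy, axis=0), cx, axis=1)
    return f

# Tiny demo lattice: uniform fluid with a small density perturbation.
ny, nx = 32, 32
rho0 = np.ones((ny, nx))
rho0[ny // 2, nx // 2] += 0.01
f = equilibrium(rho0, np.zeros((ny, nx)), np.zeros((ny, nx)))
for _ in range(100):
    f = step(f)
print("mass conserved:", np.isclose(f.sum(), rho0.sum()))
```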
TeraGyroid - Project Partners
• TeraGrid sites:
  – ANL: visualization, networking
  – NCSA: compute
  – PSC: compute, visualization
  – SDSC: compute
• RealityGrid partners:
  – University College London: compute, visualization, networking
  – University of Manchester: compute, visualization, networking
  – Edinburgh Parallel Computing Centre: compute
  – Tufts University: compute
• UK High-End Computing services:
  – HPCx (University of Edinburgh and CCLRC Daresbury Laboratory): compute, networking, coordination
  – CSAR (Manchester and CSC): compute and visualization
TeraGyroid - Results
• Linking these resources allowed computation of the largest set of lattice-Boltzmann (LB) simulations ever performed, involving lattices of over one billion sites
• Won SC03 HPC Challenge for “Most Innovative Data-Intensive Application”
• Demonstrated extensive use of the combined US-UK infrastructure