The Only Constant Is Change – Evolution of an HPC Cluster



Title: 26_PfaffLR. Created Date: 12/8/2017 5:44:31 PM

SUPERCOMPUTING SCIENCE MISSION DIRECTORATE

Discover’s Scalable Compute Units (SCUs) 3, 4, 5, 6, 8, and 9 (installed 2008–2013) were IBM iDataPlex systems spanning four generations of Intel processors (Harpertown, Nehalem, Westmere, and Sandy Bridge). Pictured at left is SCU5. Pat Izzo, NASA/Goddard

Bruce Pfaff, NASA Goddard Space Flight Center

Over the past 11 years, the Discover supercomputer cluster at the NASA Center for Climate Simulation (NCCS) has undergone numerous changes. Discover has hosted eight generations of Intel processors, various GPU and co-processor technologies, five generations of high-speed network interconnects, six generations of disk storage systems, and three generations of metadata storage devices in 14 scalable compute units from five different vendors.

Maintaining a heterogeneous resource like Discover requires the NCCS staff to adapt rapidly to new technologies and to develop expertise across processor technologies, high-speed interconnects, networking, rack design and layout, electrical load balancing, advanced cooling technologies, system integration, and facilities design and planning.

SCUs 10, 11, 12, and 13 (installed 2014–2016) are SGI Rackable systems containing Intel Haswell processors and more than 82,000 of Discover’s ~110,000 processing cores. Pictured below is SCU10. Bill Hrybyk, NASA/Goddard

The Base Unit (pictured above) of the Discover supercomputer was installed in 2006 and had 130 compute nodes containing 520 Intel Dempsey processors and a total capacity of 3.3 teraflops. It replaced 100 racks of hardware containing 350 nodes with a capacity of 3.2 teraflops. Bruce Pfaff, NASA/Goddard
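The figures quoted in the captions can be turned into a rough sense of scale. The sketch below is only back-of-the-envelope arithmetic on those numbers: it computes average capacity per node for the Base Unit and for the hardware it replaced, and the approximate share of Discover's cores in SCUs 10 through 13. The variable and function names are illustrative, "~110,000" is treated as exactly 110,000, and "more than 82,000" as exactly 82,000, so the results are approximations rather than official NCCS figures.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the captions.
# All names here are illustrative; the inputs are the rounded values from the text.

def gflops_per_node(teraflops: float, nodes: int) -> float:
    """Average capacity per node in gigaflops (1 teraflop = 1000 gigaflops)."""
    return teraflops * 1000 / nodes

# Discover Base Unit (2006): 130 nodes, 3.3 teraflops.
base_per_node = gflops_per_node(3.3, 130)

# Hardware it replaced: 350 nodes, 3.2 teraflops.
old_per_node = gflops_per_node(3.2, 350)

# How much more capacity each Base Unit node delivered, on average.
per_node_gain = base_per_node / old_per_node

# Share of Discover's cores in SCUs 10-13, treating the rounded
# caption values (82,000 of ~110,000) as exact.
haswell_share = 82_000 / 110_000

print(f"Base Unit: {base_per_node:.1f} gigaflops per node")
print(f"Replaced hardware: {old_per_node:.1f} gigaflops per node")
print(f"Per-node gain: {per_node_gain:.1f}x")
print(f"SCUs 10-13 share of cores: {haswell_share:.0%}")
```

Under these assumptions, the Base Unit delivered slightly more total capacity than its predecessor with well under half the nodes, and SCUs 10 through 13 hold roughly three quarters of Discover's cores.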

The Only Constant Is Change – Evolution of an HPC Cluster