Lessons from the Beowulf
Bob Lucas
USC – Lockheed Martin Quantum Computing Center
Oct 14, 2014



Segue
I met Thomas Sterling in Oct. 1988
  Supercomputing Research Center (SRC)
  MIT-trained dataflow expert
  Really big vocabulary
  Skipper of the Floating Point
Supercomputing 1988
  Spoke of the perils of overhead
  Rebutted by MIT professor in the audience
Guerrilla research
  Often in Thomas's home
  Dataflow execution of a linear solver
  Would have been more efficient than a Y/MP

Supercomputing in the 1980s
ECL shared-memory, vector mainframes
  Primarily from Cray Research
  ~$10M
SRC Cray-2
  Four 250 MHz CPUs
  Three people
Machines were expensive
People were cheap

[Image: NASA Cray-2, from Wikipedia]

FET Technology Revolution
FET patent filed in 1925
MOSFET invented in 1959
COSMIC Cube's 8086s were nMOS
CMOS matured in the mid-1980s
  Latch-up finally addressed
New manufacturing technology launched a broad range of parallel computer architecture research
  1980s and early 1990s

Early 1990s Message Passing Systems
(aka Communicating Sequential Processes)
PC components
  Intel Touchstone Delta
  512 CPUs
  Custom network
  OSF/1
Workstation components
  IBM SP1
  128 RS/6000 CPUs
  Custom network
  AIX
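
For readers unfamiliar with the message passing model named on this slide, here is a minimal sketch in Python. It is an illustration only, not code from the talk: threads stand in for nodes, `queue.Queue` objects stand in for network links, and the names `worker` and `pipeline_sum` are hypothetical. Each rank keeps private data and cooperates only through explicit sends and blocking receives.

```python
import queue
import threading


def worker(rank, inbox, outbox, results):
    # Each "node" owns private data; nothing is shared except messages.
    local = sum(range(rank * 100, (rank + 1) * 100))
    if inbox is not None:
        local += inbox.get()      # blocking receive from the previous rank
    if outbox is not None:
        outbox.put(local)         # send the running total to the next rank
    else:
        results.put(local)        # the last rank reports the final sum


def pipeline_sum(nranks):
    # One channel between each pair of neighbouring ranks.
    channels = [queue.Queue() for _ in range(nranks - 1)]
    results = queue.Queue()
    threads = []
    for rank in range(nranks):
        inbox = channels[rank - 1] if rank > 0 else None
        outbox = channels[rank] if rank < nranks - 1 else None
        t = threading.Thread(target=worker,
                             args=(rank, inbox, outbox, results))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return results.get()


if __name__ == "__main__":
    print(pipeline_sum(4))
```

On a real cluster the queues would be replaced by network sends and receives (for example through an MPI library), but the programming burden is the same: the developer decides where data lives and when it moves.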

Contemporary Shared Memory Alternatives
Convex SPP
  2048 PA-RISC CPUs
  ccNUMA
  SCI network
Cray T3D
  2048 Alpha CPUs
  Shared address space
  3D torus network
  Y/MP packaging

Beowulf was Underwhelming
Lowest Common Denominator
  Cheap PC components
  Mediocre performance (10s of CPUs)
  Large form factor
  Message passing execution model
  OS from a Finnish teenager, and Don
Mosaic was underwhelming too.

As was MPI, merely a std.
  After a decade of prior art

I Began to Take Notice
Tom Blank quit MasPar
  Knew he couldn't compete with Beowulf cost structure
Boeing engineers' office equipment
  IDC's dark matter
LSTC classroom outperformed the SGI Origin
  Not all applications need fancy networks
USC Condo complex
  HPC with modest institutional investment
  Managed by only three people

Beowulf Triumphed
Hardware costs are effectively minimized
  System software too
  ISV license fees often exceed hardware cost
Vendor integrated systems
  Better form factors
  Competitive with custom systems at all but extreme scales
  Low margins
Large users still integrate their own
  Google and Facebook among top five server manufacturers
Outsourcing of infrastructure
  Eliminate labor of system administrators and operators
  Cloud purveyors have economies of scale

Computing Too Cheap to Meter
Flops are free
Applications often used inefficiently
  E.g., rectilinear meshes to track turbulent fluids
  Easier than more sophisticated, adaptive grids
Large parallel systems used inefficiently
  Map-Reduce execution model easy to use
  Virtual machine layers make them easy to manage
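
To make concrete why the Map-Reduce execution model is considered easy to use, here is a minimal single-process sketch in Python. It is an assumption-laden illustration, not any particular framework's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical. The user supplies only a mapper and a reducer; the runtime handles grouping, and on a real cluster the distribution too, which is convenient but gives the developer little control over data placement.

```python
from collections import defaultdict


def map_phase(records, mapper):
    # Apply the user's mapper to every record, collecting (key, value) pairs.
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    # Group values by key; a real framework does this across the network.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    # Apply the user's reducer once per key.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count(lines):
    # The only application-specific code: a mapper and a reducer.
    def mapper(line):
        return [(word, 1) for word in line.split()]

    def reducer(word, counts):
        return sum(counts)

    return reduce_phase(shuffle(map_phase(lines, mapper)), reducer)


if __name__ == "__main__":
    print(word_count(["flops are free", "people are expensive"]))
```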

False Economy?
People are expensive
  Sophisticated codes are costly to write
  Concurrency makes them more so
  Mitigate some of this with libraries
Electricity is expensive too

Tyranny of Beowulf
Not all algorithms parallelize well
CSP execution model limits those that do
  Unpredictable distribution of data and operations
  Commodity hardware overheads further impact scaling
Beowulf cost advantage has squeezed out alternatives
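
The scaling limits listed on this slide can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not a model from the talk: it combines Amdahl's law with a hypothetical per-node communication overhead that grows linearly with node count. The function name `speedup` and the parameters `serial_fraction` and `overhead_per_node` are illustrative choices.

```python
def speedup(n, serial_fraction, overhead_per_node=0.0):
    """Toy speedup model on n nodes, with run time normalized to 1 on one node.

    serial_fraction: share of the work that cannot be parallelized (Amdahl).
    overhead_per_node: assumed communication cost added per extra node;
        the linear form is an illustrative assumption, not a measurement.
    """
    parallel_time = (1.0 - serial_fraction) / n
    overhead_time = overhead_per_node * (n - 1)
    return 1.0 / (serial_fraction + parallel_time + overhead_time)


if __name__ == "__main__":
    for nodes in (1, 8, 64, 512):
        ideal = speedup(nodes, 0.05)
        with_overhead = speedup(nodes, 0.05, overhead_per_node=0.001)
        print(nodes, round(ideal, 2), round(with_overhead, 2))
```

In this model a small serial fraction caps the achievable speedup, and a nonzero overhead term eventually makes adding nodes counterproductive, which is one way to read the point about commodity hardware overheads.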

Looking to the Future
Need to change focus to maximizing human productivity
  Reduce cognitive burden on developers and users
  e.g., shared address spaces
Software legacy represents huge labor investment
  Evolution onto Beowulf an ongoing process, after two decades
  Need to evolve these codes into the future
  Yes, that means Fortran and MPI where they work
  Add new features where needed
  Launch ParalleX applications by typing mpirun
Threatened by diversity of rapidly evolving environment
  Beowulf fostered a stable execution model for two decades
  Gracefully incorporated local node changes
  Shared memory and accelerators

Revisit Execution Model
Pentium core performance asymptoting
  Room for innovation that wasn't possible for two decades
  Rediscover E-registers and other lost 1990s technology
Anton is illustrative of the engineering that's needed
  Order-of-magnitude lower communication overhead
  I expect more application (or domain) specific systems

Thomas Sterling's current research focus
Informed by three decades of prior research
  Dataflow, Beowulf, PIM, HTMT, ParalleX
He set us on the path to Beowulf
He could do it again