astrophysical applications on superclusters matthew bailes swinburne centre for astrophysics and...
![Page 1: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/1.jpg)
Astrophysical Applications on
Superclusters
Matthew Bailes
Swinburne Centre for
Astrophysics and
Supercomputing
![Page 2: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/2.jpg)
![Page 3: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/3.jpg)
Outline
• No:
– Linpack Mflops
– latencies
– bandwidths
– evangelism
• Why a Supercluster?
• What is the Supercluster?
• How do we use the Supercluster?
• What does it do?
![Page 4: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/4.jpg)
Why a Supercluster?
• Swinburne wants reputation.
• Hypothesis:
– 30 times the power
– Six years of Moore’s law
• We can do problems 30x as complex as other groups.
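The hypothesis equates 30 times the power with six years of Moore's law. As a rough check, the sketch below works out the doubling time this implies; the function name is illustrative, and the only inputs are the two figures from the slide.

```python
import math


def implied_doubling_time(years, growth_factor):
    """Doubling time in years if capability grows by growth_factor over `years`."""
    return years / math.log2(growth_factor)


# Slide figures: 30x the power is treated as six years of Moore's law.
doubling_years = implied_doubling_time(6.0, 30.0)
print(f"Implied doubling time: {doubling_years:.2f} yr "
      f"({12 * doubling_years:.1f} months)")
```

The result is a doubling time of a little over 14 months.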
![Page 5: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/5.jpg)
Centre Goals:
• Fundamental Research.
• Public Outreach and Education.
• Commercial Supercomputing.
– Astrophysical Special Effects
– Cluster Monitoring Tools
– Commercial Rendering
![Page 6: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/6.jpg)
What is the Supercluster?
• Supercluster sounds better than Beowulf if you are an astronomer.
• Design Goals SSI I (1998):
– No one component worth more than A$10K.
– Order of magnitude more than a single workstation.
– Dedicated resource. (dispel various myths)
– 10 GB scratch/node.
– 10 MB/s IO node-node.
– Decent Fortran/C/C++ compiler.
![Page 7: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/7.jpg)
Case Study: CSIRO Astronomy
• 1984: VAX 11/780
• 1989: Convex C2 ( > 10 times speed up)
• 1995: Power Challenge ( 10 processors )
• 1999: Linux Boxes
• Unless their package supports parallelism, users won’t use clusters, or even SMP/NUMA machines, until their science is obviously constrained.
![Page 8: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/8.jpg)
Theorists:
• Possess and use clusters effectively.
• Know what MPI is.
• Can’t get money.
![Page 9: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/9.jpg)
SSI I (Jan 1998)
• 16 DEC 500 MHz alphas
• 2MB cache
• 192 MB RAM
• 13 GB disk
• 24-port CISCO switch
• MPICH/f77/C++/FFTw/emacs/gcc
Zeroth Law of Cluster Computing:
Cluster Computing is inevitable if your budget is finite.
![Page 10: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/10.jpg)
SSI II (Nov 1998).
• SSI I + 8 x 600 MHz DECs, 4 MB cache.
Corollary:
Your first cluster is your happiest.
First Law of Cluster Computing:
Your cluster soon becomes heterogeneous.
![Page 11: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/11.jpg)
SSI III (March 1999)
• SSI II +
– 41 x 500 MHz ev6 processors
– 512 MB RAM/node
– 18 GB disk/node
• CISCO 5500 switch
– 3.2 Gb/s backplane
• Virtual Reality Theatrette
– Seats 37
Second Law of Cluster Computing:
MTBF = MTBF₀ / N
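The Second Law says the cluster's mean time between failures is a single node's MTBF divided by the number of nodes. The sketch below puts a number on it. The per-node MTBF of three years is a purely hypothetical figure, and counting every processor listed for SSI I, II and III as one node is also an assumption.

```python
def cluster_mtbf(node_mtbf_hours, n_nodes):
    """Mean time between failures of the whole cluster, in hours."""
    if n_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return node_mtbf_hours / n_nodes


NODE_MTBF_HOURS = 3 * 365 * 24   # hypothetical: one failure per node every 3 years
N_NODES = 16 + 8 + 41            # SSI I + SSI II + SSI III, one node per processor (assumed)

hours = cluster_mtbf(NODE_MTBF_HOURS, N_NODES)
print(f"{N_NODES} nodes: a failure somewhere every {hours / 24:.1f} days on average")
```

Under these assumptions a node that fails once in three years becomes a cluster that fails roughly every two to three weeks.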
![Page 12: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/12.jpg)
How do we use the Supercluster?
• Linux Workstations. (despite free OS)
• No batch system (just 3 “power” users).
• Home-grown MPI programs.
• C++/fortran/java.
![Page 13: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/13.jpg)
Problems:
• Distributed TB disk rarely has > 10% free.
• MPI hangs on FPE or “p4pg” errors.
• CPUs too powerful for fast ethernet and tape drive on some applications.
• Difficult to monitor.
![Page 14: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/14.jpg)
Applications.
• Neutron Star Searches.
– Looked at 10% of the Southern Sky.
– Recorded 1.4 TB in 21 days.
– 1 ev56 workstation would take 7 years.
– SSI III took 25 days.
• Discovered 7 “millisecond” pulsars.
– Could scale to 1000 nodes on TCP/IP.
[Figure: processing pipeline, labelled 17 MB, 256 MB, FFT, Search, Fold, Save]
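The FFT, Search and Fold stages named on this slide can be illustrated with a toy periodicity search. This is a minimal sketch under stated assumptions, not the survey's actual code: the synthetic data, the broad pulse shape, and the function names `fft`, `search` and `fold` are all invented for illustration, and it uses only the Python standard library.

```python
import cmath
import math
import random


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def search(series):
    """Return the strongest non-DC bin of the power spectrum."""
    half = len(series) // 2
    powers = [abs(c) ** 2 for c in fft(series)[1:half]]
    return 1 + powers.index(max(powers))


def fold(series, period_samples, n_bins=16):
    """Average the series modulo the trial period to build a pulse profile."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for i, value in enumerate(series):
        phase = (i % period_samples) / period_samples
        b = min(int(phase * n_bins), n_bins - 1)
        sums[b] += value
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# Toy data (all values hypothetical): a broad pulse, on for half of every
# 64-sample period, buried in unit-variance Gaussian noise.
random.seed(1)
N_SAMPLES, TRUE_PERIOD = 4096, 64
series = [
    random.gauss(0.0, 1.0) + (1.0 if i % TRUE_PERIOD < TRUE_PERIOD // 2 else 0.0)
    for i in range(N_SAMPLES)
]

peak_bin = search(series)          # FFT + Search
period = N_SAMPLES / peak_bin      # candidate period in samples
profile = fold(series, period)     # Fold
print(f"candidate period: {period:.1f} samples")
print("folded profile:", " ".join(f"{p:+.2f}" for p in profile))
```

Each block of data can be searched independently of the others, which is one reason a search of this kind spreads well over many nodes.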
![Page 15: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/15.jpg)
Discovery Implications:
• Discovered the most relativistic neutron star + white dwarf binary known.
• Emit gravitational waves
– Coalesce in 7 Gyr.
• Population of ultra-relativistic systems.
![Page 16: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/16.jpg)
Problems.
• Most interesting systems are relativistic.
• Full sensitivity requires coherent addition.
• If observation time > 10 minutes, computational penalty becomes very large.
![Page 17: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/17.jpg)
Coherent Dedispersion.
• Problem:
– Cosmic signals are weak
– Cosmic radio signals propagate at v ≠ c
• In 1971 a new method was proposed:
– Record the electric field
– Apply a numerical filter to it
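Because the signal travels at v ≠ c through the interstellar plasma, lower radio frequencies arrive later than higher ones. The sketch below evaluates the standard cold-plasma dispersion delay across a narrow band to show the size of the smearing that the numerical filter has to undo. The dispersion measure and the band edges are hypothetical values chosen for illustration, not figures from the talk.

```python
K_DM_MS = 4.148808  # dispersion constant, in ms GHz^2 cm^3 / pc


def dispersion_delay_ms(dm, f_low_ghz, f_high_ghz):
    """Arrival delay (ms) of f_low relative to f_high for dispersion measure dm (pc/cm^3)."""
    return K_DM_MS * dm * (f_low_ghz ** -2 - f_high_ghz ** -2)


# Hypothetical example: DM = 10 pc/cm^3 across a 20 MHz band at 1.4 GHz.
delay_ms = dispersion_delay_ms(10.0, 1.39, 1.41)
print(f"delay across the band: {delay_ms:.3f} ms")
```

For these assumed values the delay is a fraction of a millisecond, which is already comparable to the pulse width of a millisecond pulsar.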
![Page 18: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/18.jpg)
![Page 19: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/19.jpg)
What does this mean?
• 20 MHz = 20 MB/second.
• 200 times real time to process (ev6)
• Gives 50 nanosecond time resolution
• Need 7 x 8 hour observations to do science
– One node: 1.5 yr
– 50 nodes: 9 days
– 1985 VAX 11/780: one century
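The figures on this slide can be checked with a few lines of arithmetic. This sketch assumes one sample per 1/bandwidth for the time resolution, 1 MB = 10^6 bytes, and perfect scaling across nodes; the variable names are illustrative.

```python
BANDWIDTH_HZ = 20e6          # 20 MHz recorded band
BYTES_PER_SECOND = 20e6      # 20 MB/s, taking 1 MB = 1e6 bytes (assumption)
SLOWDOWN = 200               # processing takes 200 times real time on one ev6
OBS_SECONDS = 7 * 8 * 3600   # seven 8-hour observations

time_resolution_ns = 1e9 / BANDWIDTH_HZ                 # one sample per 1/bandwidth
data_volume_tb = BYTES_PER_SECOND * OBS_SECONDS / 1e12
cpu_seconds = SLOWDOWN * OBS_SECONDS
one_node_years = cpu_seconds / (365.25 * 86400)
fifty_node_days = cpu_seconds / 50 / 86400              # assumes perfect scaling

print(f"time resolution : {time_resolution_ns:.0f} ns")
print(f"raw data volume : {data_volume_tb:.2f} TB")
print(f"one node        : {one_node_years:.2f} yr")
print(f"50 nodes        : {fifty_node_days:.1f} days")
```

The 50-node figure matches the slide's 9 days; the one-node figure comes out at about 1.3 years, a little under the slide's rounder 1.5 yr.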
![Page 20: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/20.jpg)
Discovered?
• Millisecond pulsars emit short (1 μs wide) pulses across GHz bandwidths
– Implies seed areas of 30 cm or less
• PSR 0437-4715 in a 5.7 day orbit
– 1 Mkm in radius
[Figure: orbit with axes labelled a and b; a - b = 180.1 mm]
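To get a feel for how circular this orbit is, the sketch below converts the quoted difference between a and b into an eccentricity. It assumes that a and b are the semi-major and semi-minor axes of the orbit and that the 1 Mkm radius can stand in for a; neither assumption is stated on the slide.

```python
import math


def eccentricity_from_axes(semi_major_m, axis_difference_m):
    """e = sqrt(1 - (b/a)^2) with b = a - d, rearranged to avoid cancellation."""
    a, d = semi_major_m, axis_difference_m
    return math.sqrt(d * (2.0 * a - d)) / a


A_METRES = 1.0e9        # "1 Mkm in radius", taken as the semi-major axis (assumption)
A_MINUS_B_M = 0.1801    # a - b = 180.1 mm

e = eccentricity_from_axes(A_METRES, A_MINUS_B_M)
print(f"eccentricity ~ {e:.2e}")
```

Under these assumptions the eccentricity is of order 10^-5: a million-kilometre orbit that departs from a circle by less than the width of a hand.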
![Page 21: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/21.jpg)
Future:
• Search for μs-wide pulses in SN 1987A
– 25 day search
• HIPASS - 600 GB in < 12 hours.
• SSI III + servernet can mimic CSIRO’s correlator
• SSI IV:
– ES40 + TB disk
• SSI V:
– 128 nodes + InfiniBand/servernet II?
![Page 22: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/22.jpg)
Conclusions:
• Clusters are too hard for most astronomers to code for. MPI what?
• Breakthroughs are possible with radical increases in computer power.
![Page 23: Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing](https://reader035.vdocuments.site/reader035/viewer/2022070402/56649f265503460f94c3c972/html5/thumbnails/23.jpg)
www.swin.edu.au/astronomy