
Page 1: NSCC Training  Introductory Class

NSCC High Performance Computing Cluster Introduction

[19-Feb-2016]

Page 2: NSCC Training  Introductory Class

The Discussion

•  Introduction to NSCC
•  About HPC
•  More about NSCC HPC cluster
•  PBS Pro (Scheduler)
•  Compilers and Libraries
•  Developer Tools
•  Co-processor / Accelerators
•  Environment Modules
•  Applications
•  User registration procedures
•  Feedback

Page 3: NSCC Training  Introductory Class


Introduction to NSCC

Page 4: NSCC Training  Introductory Class

CONFERENCE THEMES & INVITED SPEAKERS

•  Efforts to build exascale supercomputers: Horst Simon, Bronis de Supinski
•  New non-standard processor architectures, including Automata Processors & Neuromorphic Processors: Srinivas Aluru, Mircea Stan, Vern Brownell, Thomas Sohmers
•  Convolution of supercomputing, AI and the biological brain: Baroness Susan Greenfield, Roman Yampolskiy
•  Languages for exascale & for human-computer interactivity: Barbara Chapman, Kathy Yelick, Alan Edelman, Wen-Mei Hwu, Andrew Sorenson
•  Applications & other topics: Artur Binczewski, John Feo, Michael Krajecki, Patricia Kovatch, Diego Rossinelli, John Gustafson

KEYNOTE SPEAKERS
•  Bronis R. de Supinski, Lawrence Livermore National Laboratory
•  Horst Simon, Lawrence Berkeley National Laboratory
•  Baroness Susan Greenfield, Oxford University
•  Srinivas Aluru, Georgia Institute of Technology

INVITATION TO PARTICIPATE

Page 5: NSCC Training  Introductory Class

•  State-of-the-art national facility with computing, data and resources that enable users to solve science and technology problems, and that stimulate industry to use computing for problem solving, testing designs and advancing technologies

•  The facility is linked by high-bandwidth networks that connect these resources and provide high-speed access to users anywhere

Introduction: The National Supercomputing Centre (NSCC)

Page 6: NSCC Training  Introductory Class

1. Supporting National R&D Initiatives

2. Attracting Industrial Research Collaborations

3. Enhancing Singapore's Research Capabilities

Introduction: Objectives

Page 7: NSCC Training  Introductory Class


What is HPC?

Page 8: NSCC Training  Introductory Class


What is HPC?

•  The term HPC stands for High Performance Computing or High Performance Computer

•  Tightly coupled computers with a high-speed interconnect
•  Performance is measured in FLOPS (FLoating point Operations Per Second)
•  Architectures

– NUMA (Non-Uniform Memory Access)

Page 9: NSCC Training  Introductory Class

Major Domains where HPC is used

Engineering Analysis

•  Fluid Dynamics

•  Materials Simulation

•  Crash simulations

•  Finite Element Analysis

Scientific Analysis

•  Molecular modelling
•  Computational Chemistry
•  High energy physics
•  Quantum Chemistry

Life Sciences

•  Genomic Sequencing and Analysis
•  Protein folding
•  Drug design
•  Metabolic modelling

Seismic analysis

•  Reservoir Simulations and modelling
•  Seismic data processing

Page 10: NSCC Training  Introductory Class

Major Domains where HPC is used

Chip design & Semiconductor

•  Transistor simulation
•  Logic simulation
•  Electromagnetic field solvers

Computational Mathematics

•  Monte-Carlo methods
•  Time stepping and parallel time algorithms
•  Iterative methods

Media and Animation

•  VFX and visualization
•  Animation

Weather research

•  Atmospheric modelling
•  Seasonal time-scale research

Page 11: NSCC Training  Introductory Class

Major Domains where HPC is used

•  And more:
– Big data
– Information Technology
– Cyber security
– Banking and Finance
– Data mining

Page 12: NSCC Training  Introductory Class


Introduction to NSCC HPC Cluster

Page 13: NSCC Training  Introductory Class

Objectives

•  1 Petaflop system
– About 1,300 nodes
– Homogeneous and heterogeneous architectures

•  13 Petabytes of storage
– One of the largest, state-of-the-art storage architectures

•  Research and Industry
– A*STAR, NUS, NTU, SUTD
– And many more commercial and academic organizations

Page 14: NSCC Training  Introductory Class

HPC Stack in NSCC

•  Hardware: Fujitsu x86 servers, NVIDIA Tesla K40 GPUs, DDN storage
•  Network: Mellanox 100 Gbps
•  Operating system: RHEL 6.6 and CentOS 6.6
•  Parallel file systems: Lustre & GPFS
•  Scheduler: PBS Pro
•  Developer tools: Intel Parallel Studio, Allinea tools
•  HPC application software and application modules

Page 15: NSCC Training  Introductory Class

NSCC Supercomputer Architecture

•  Base compute nodes (1,160 nodes) and accelerated nodes (128 nodes)
•  Fully non-blocking InfiniBand network
•  Tiered storage
•  Ethernet network connecting the NSCC peripheral servers (with VPN access)
•  NTU and NUS peripheral servers
•  GIS FAT node

Page 16: NSCC Training  Introductory Class

NTU Login architecture

[Diagram: the NTU login cluster connects to the NSCC cluster over a 40/80 Gb/s link]

Page 17: NSCC Training  Introductory Class

Connection between GIS and NSCC

•  The Genome Institute of Singapore (GIS) and the National Supercomputing Centre (NSCC) are 2 km apart
•  Large memory node (1 TB)
•  Ultra high speed link, 500 Gbps enabled
•  Data volume grew from 300 Gbytes/week in 2012 to 4,300 Gbytes/week in 2015, a 14x increase

Page 18: NSCC Training  Introductory Class

Direct streaming of sequence data from GIS to the remote supercomputer at NSCC (2 km away)

•  STEP 1: Sequencers stream directly to NSCC storage (NO footprint in GIS)
•  STEP 2: Automated pipeline analysis runs once sequencing completes; processed data resides in NSCC
•  STEP 3: The data manager indexes and annotates processed data and replicates the metadata to GIS, allowing data to be searched and retrieved from GIS

[Diagram: NGSP sequencers at B2 (Illumina + PacBio) and POLARIS, genotyping & other platforms in L4~L8 feed the NSCC Gateway and on to A*CRC-NSCC compute and tiered storage; link speeds shown are 1 Gbps per sequencer/machine, 10 Gbps, 100 Gbps, and a 500 Gbps primary link]

A*CRC: A*STAR Computational Resource Centre
GIS: Genome Institute of Singapore

Page 19: NSCC Training  Introductory Class

NSCC Data Centre (Artist Impression)


Page 20: NSCC Training  Introductory Class

The Hardware

EDR Interconnect
•  Mellanox EDR Fat Tree within the cluster
•  InfiniBand connection to all end-points (login nodes) at the three campuses
•  40/80/500 Gbps throughput network extended to the three campuses (NUS/NTU/GIS)

Over 13 PB Storage
•  HSM tiered, 3 tiers
•  I/O: 500 GB/s flash burst buffer, 10x Infinite Memory Engine (IME)

~1 PFlops System
•  1,288 nodes (dual socket, 12 cores/CPU, E5-2690 v3)
•  128 GB DDR4 per node
•  10 large memory nodes (1x 6 TB, 4x 2 TB, 5x 1 TB)

Page 21: NSCC Training  Introductory Class

Compute nodes

•  Large Memory Nodes
– 9 nodes configured with high memory
– Fujitsu Server PRIMERGY RX4770 M2
– Intel(R) Xeon(R) CPU E7-4830 v3 @ 2.10GHz
– 4x 1 TB, 4x 2 TB, and 1x 6 TB memory configurations
– EDR InfiniBand

•  Standard Compute Nodes
– 1,160 nodes
– Fujitsu Server PRIMERGY CX2550 M1
– 27,840 CPU cores
– Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
– 128 GB per server
– EDR InfiniBand
– Liquid cooling system

Page 22: NSCC Training  Introductory Class

Accelerate your computing

Accelerator nodes

•  128 nodes with NVIDIA GPUs (otherwise identical to the standard compute nodes)
•  NVIDIA Tesla K40 (2,880 cores each)
•  368,640 total GPU cores

Visualization nodes

•  2 Fujitsu Celsius R940 graphic workstations
•  Each with 2x NVIDIA Quadro K4200
•  NVIDIA Quadro Sync support

Page 23: NSCC Training  Introductory Class

Parallel file system

•  Components
– Burst Buffer
  •  265 TB Burst Buffer
  •  500 GB/s throughput
  •  Infinite Memory Engine (IME)
– Scratch
  •  4 PB scratch storage
  •  210 GB/s throughput
  •  SFA12KX EXAScaler storage
  •  Lustre file system
– Home and secure
  •  4 PB persistent storage
  •  GRIDScaler storage
  •  100 GB/s throughput
  •  IBM Spectrum Scale (formerly GPFS)
– Archive storage
  •  5 PB storage
  •  Archive purpose only
  •  WOS-based archive system

Page 24: NSCC Training  Introductory Class

IME Architecture


Page 25: NSCC Training  Introductory Class

Tiered File system


Page 26: NSCC Training  Introductory Class

NSCC Storage

•  Tier 0 Burst Buffer: 265 TB, 500 GB/s (Infinite Memory Engine, IME)
•  Tier 0 Scratch FS: 4 PB, 210 GB/s (EXAScaler Lustre® storage)
•  Tier 1 Home FS and Project FS: 4 PB, 100 GB/s (GRIDScaler GPFS® storage)
•  Tier 2 Archive: 5 PB, 20 TB/h (WOS Active Archive, managed via HSM)

Page 27: NSCC Training  Introductory Class

Software Stack

•  Operating System: CentOS 6.6
•  Scheduler: PBS Pro
•  Compilers: GCC, Intel Parallel Studio
•  Libraries: GNU, Intel MKL
•  Allinea tools
•  GPGPU: CUDA Toolkit 7.5
•  Environment Modules

Page 28: NSCC Training  Introductory Class

NSCC system is expected to be ready by 15th Mar 2016*

* The information on the following slides is indicative only and is likely to be confirmed by 15th Mar 2016.

Page 29: NSCC Training  Introductory Class

PBS Professional (Scheduler)


Page 30: NSCC Training  Introductory Class

Why PBS Professional (Scheduler)?

§  Workload management solution that maximizes the efficiency and utilization of high-performance computing (HPC) resources and improves job turnaround

Robust Workload Management
§  Floating licenses
§  Scalability, with flexible queues
§  Job arrays
§  User and administrator interface
§  Job suspend/resume
§  Application checkpoint/restart
§  Automatic file staging
§  Accounting logs
§  Access control lists

Advanced Scheduling Algorithms
§  Resource-based scheduling
§  Preemptive scheduling
§  Optimized node sorting
§  Enhanced job placement
§  Advance & standing reservations
§  Cycle harvesting across workstations
§  Scheduling across multiple complexes
§  Network topology scheduling
§  Manages both batch and interactive work
§  Backfilling

Reliability, Availability and Scalability
§  Server failover feature
§  Automatic job recovery
§  System monitoring
§  Integration with MPI solutions
§  Tested to manage 1,000,000+ jobs per day
§  Tested to accept 30,000 jobs per minute
§  EAL3+ security
§  Checkpoint support

Page 31: NSCC Training  Introductory Class

Process Flow of a PBS Job

1. User submits job

2. PBS server returns a job ID

3. PBS scheduler requests a list of resources from the server *

4. PBS scheduler sorts all the resources and jobs *

5. PBS scheduler informs PBS server which host(s) that job can run on *

6. PBS server pushes job script to execution host(s)

7. PBS MoM executes job script

8. PBS MoM periodically reports resource usage back to PBS server *

9. When the job is completed, PBS MoM copies output and error files

10. Job execution completed/user notification sent

[Diagram: the PBS server and PBS scheduler place the pbsworks job, based on resources such as ncpus, mem and host, onto execution hosts A/B/C over the cluster network]

Note: * This information is for debugging purposes only. It may change in future releases.
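As a sketch of how a job enters this flow, a minimal PBS Pro job script and its submission might look like the following (the job name, queue and resource selection are illustrative; see the queue table on a later slide):

$ cat myjob.pbs
#!/bin/bash
#PBS -N myjob                  # job name
#PBS -q normal                 # queue (the default batch queue in this deck)
#PBS -l select=1:ncpus=24      # 1 node, 24 cores
#PBS -l walltime=01:00:00      # 1 hour run time limit
cd $PBS_O_WORKDIR              # run from the directory where qsub was invoked
./my_program > output.log      # my_program is a placeholder for your executable
$ qsub myjob.pbs               # step 1 above; the server returns a job ID (step 2)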


Page 32: NSCC Training  Introductory Class

Compute Manager GUI: Job Submission Page

•  Applications panel
– Displays the applications available on the registered PAS server

•  Submission Form panel
– Displays a job submission form for the application selected in the Applications panel

•  Directory Structure panel
– Displays the directory structure of the location specified in the Address box

•  Files panel
– Displays the files and subdirectories of the directory selected in the Directory Structure panel

Page 33: NSCC Training  Introductory Class

Job Queues & Scheduling Policies

Queue Name      Queue Type      Run Time Limit   Cores Available    Description
Long            Batch           240 hours        1,024              Jobs expected to run for a longer time
Development     Interactive     24 hours         48                 Coding, profiling and debugging
Normal          Default batch   3 days           27,000             Default queue
Large Memory    Batch           -                360                Jobs dispatched based on memory requirement
GPU             GPU batch       -                368,640 (CUDA)     Specific for GPU jobs
Visualization   Interactive     8 hours          1                  High end graphics card
Production      Batch           -                480                GIS queue
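For instance, jobs can be directed to a specific queue with qsub's -q flag; the queue names below follow the table above, while the other flags are illustrative:

$ qsub -q long myjob.pbs                        # batch job on the Long queue
$ qsub -q development -I -l walltime=02:00:00   # interactive session for coding/debugging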

Page 34: NSCC Training  Introductory Class

Compilers & Libraries

Page 35: NSCC Training  Introductory Class


Compilers and Libraries at a glance

Page 36: NSCC Training  Introductory Class

Parallel programming OpenMP

•  Available compilers: gcc/gfortran/icc/ifort
– OpenMP (not OpenMPI; used mainly in SMP programming)

•  OpenMP (Open Multi-Processing)
•  OpenMP is an approach (an API standard), whereas OpenMPI is an implementation of MPI
•  An API for shared-memory parallel programming in C/C++ and Fortran
•  Parallelization in OpenMP is achieved through threads
•  Programming with OpenMP is comparatively easy, as it mainly involves pragma directives
•  An OpenMP program cannot communicate with other nodes over the network
•  Different stages of a program can use different numbers of threads
•  A typical approach is sketched below
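As a minimal sketch of the pragma-based approach (assuming gcc is available; the file and program names are illustrative):

$ cat > hello_omp.c <<'EOF'
#include <omp.h>
#include <stdio.h>
int main(void) {
    /* every thread in the parallel region executes this printf */
    #pragma omp parallel
    printf("Hello from thread %d of %d\n",
           omp_get_thread_num(), omp_get_num_threads());
    return 0;
}
EOF
$ gcc -fopenmp -o hello_omp hello_omp.c   # with icc, the flag is -qopenmp
$ OMP_NUM_THREADS=4 ./hello_omp           # choose the thread count at run time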

Page 37: NSCC Training  Introductory Class

Parallel Programming MPI

•  MPI
– MPI stands for Message Passing Interface
– MPI is a library specification
– An MPI implementation is typically a wrapper around standard compilers such as C/C++/Fortran (bindings also exist for languages such as Python and Java)
– Typically used for distributed-memory communication (see the sketch below)
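A comparable sketch for MPI (the impi module version is copied from the Allinea MAP slide later in this deck; the file names are illustrative):

$ cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>
int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);                  /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of ranks */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
$ module load impi/5.1.2            # Intel MPI, version as on the MAP slide
$ mpiicc -o hello_mpi hello_mpi.c   # Intel MPI wrapper around the Intel compiler
$ mpirun -n 4 ./hello_mpi           # launch 4 ranks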

Page 38: NSCC Training  Introductory Class

38

DeveloperTools

Page 39: NSCC Training  Introductory Class


Allinea DDT

•  DDT – Distributed Debugging Tool from Allinea
•  Graphical interface for debugging:

– Serial applications/codes
– OpenMP applications/codes
– MPI applications/codes
– CUDA applications/codes

•  You control the pace of the code execution and examine execution flow and variables

•  Typical scenario:
– Set a point in your code where you want execution to stop
– Let your code run until that point is reached
– Check the variables of concern
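A typical launch, sketched by analogy with the MAP commands shown later in this deck (the ddt module name and version are assumptions):

$ module load ddt/a.b.c              # assumed module name, mirroring map/a.b.c
$ mpiicc -g -O0 -o wave_c wave_c.c   # debug symbols on, optimization off
$ ddt mpiexec -n 4 ./wave_c 20       # run the MPI job under the DDT GUI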

Page 40: NSCC Training  Introductory Class


Allinea MAP

•  MAP – Application profiling tool from Allinea
•  Graphical interface for profiling:

– Serial applications/codes
– OpenMP applications/codes
– MPI applications/codes

Page 41: NSCC Training  Introductory Class


Allinea MAP

•  Running your code with MAP:
$ module load impi/5.1.2
$ mpiicc -g -O0 -o wave_c wave_c.c
$ module load map/a.b.c
$ map mpiexec -n 4 ./wave_c 20

Page 42: NSCC Training  Introductory Class


Allinea MAP

Page 43: NSCC Training  Introductory Class


Co-processor/Accelerators

Page 44: NSCC Training  Introductory Class

GPU

•  GPUs (Graphics Processing Units) were initially made to deliver better graphics rendering performance
•  With the amount of research put into GPUs, it was found that they also perform well on floating point operations
•  The term GPU evolved into GPGPU (General Purpose GPU)
•  The CUDA Toolkit includes a compiler, math libraries, tools, and debuggers

Page 45: NSCC Training  Introductory Class

GPU in NSCC

•  GPU configuration
– Total of 128 GPU nodes
– Each server with 1 Tesla K40 GPU
– 128 GB host memory per server
– 12 GB device memory
– 2,880 CUDA cores

•  Connecting to a GPU server
– To compile a GPU application (as sketched below):
  •  Submit an interactive job requesting a GPU resource
  •  Compile with the NVCC compiler
– To submit a GPU job:
  •  Use qsub from the login nodes
  •  OR log in to Compute Manager
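An illustrative sketch of that workflow (the queue name and resource selection are assumptions based on the queue table; the CUDA version follows the software stack slide):

$ qsub -I -q gpu -l select=1:ngpus=1   # interactive job on a GPU node (illustrative flags)
$ module load cuda/7.5                 # CUDA Toolkit 7.5, as in the software stack
$ nvcc -o vecadd vecadd.cu             # compile a CUDA source file with NVCC
$ ./vecadd                             # run on the allocated K40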

Page 46: NSCC Training  Introductory Class


Environment Modules

Page 47: NSCC Training  Introductory Class

What are Environment Modules?

•  Environment Modules help to dynamically load/unload environment variables such as PATH, LD_LIBRARY_PATH, etc.

•  Environment Modules are based on modulefiles, which are written in the Tcl language

•  Environment Modules are shell independent
•  Helpful for maintaining different versions of the same software
•  Users have the flexibility to create their own modulefiles
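Typical usage looks like this (the cuda/7.5 modulefile name follows the software stack slide; the names available will depend on what is installed):

$ module avail             # list the modulefiles available on the system
$ module load cuda/7.5     # prepend the tool's paths to PATH, LD_LIBRARY_PATH, ...
$ module list              # show currently loaded modules
$ module unload cuda/7.5   # undo those environment changes
$ module purge             # unload all loaded modules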

Page 48: NSCC Training  Introductory Class

Applications

Page 49: NSCC Training  Introductory Class

Compatible Applications

•  Molecular Dynamics
•  Computational Chemistry

Page 50: NSCC Training  Introductory Class

Compatible Applications

•  Engineering Applications
•  Quasiparticle calculation
•  Quantum Chemistry
•  Numerical Analysis
•  Weather research

Page 51: NSCC Training  Introductory Class

Compatible Applications

•  Genomic analysis
•  Quantum mechanics calculation

Page 52: NSCC Training  Introductory Class


Compatible Applications

Page 53: NSCC Training  Introductory Class

Proposed services to be offered

•  Computational resources
•  Storage services
•  Interactive job submission portal
•  Customized portal to report issues
•  Request a service via the portal
•  Report your issue via portal/e-mail/phone
•  Compile your own code
•  Get advice on compiling/optimizing your code
•  Compilation/optimization can also be done on your behalf
•  Share and collaborate with others

Page 54: NSCC Training  Introductory Class

Where is NSCC

•  NSCC Petascale supercomputer in Connexis building

•  40Gbps links extended to NUS, NTU and GIS

•  Login nodes are placed in NUS, NTU and GIS datacenters

•  Access to NSCC is just like your local HPC system

1 Fusionopolis Way, Level 17, Connexis South Tower, Singapore 138632

Page 55: NSCC Training  Introductory Class

Supported Login methods

•  How do I log in?
– SSH
  •  From a Windows PC, use PuTTY or any standard SSH client software; the hostname is <to be decided>; use NSCC credentials
  •  From a Linux machine, use ssh username@<to be confirmed>
  •  From a Mac, open a terminal and ssh username@<to be confirmed>

– File transfer
  •  Use SCP or any other secure shell file transfer software from Windows
  •  Use the scp command to transfer files from Mac/Linux

– Compute Manager
  •  Open any standard web browser; in the address bar, type https://<to be decided>; use NSCC credentials to log in

– Outside campus
  •  Connect to the campus VPN to gain access to the above services
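For example, from a Linux or Mac terminal (the hostname placeholders are kept as on this slide, since the final name is still to be confirmed):

$ ssh username@<to be confirmed>                     # interactive login with NSCC credentials
$ scp results.tar.gz username@<to be confirmed>:~/   # copy a file into your home directory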

Page 56: NSCC Training  Introductory Class

NSCC HPC Support (Proposed to be available by 15th Mar)

•  Corporate Info – web portal http://nscc.sg http://beta.nscc.sg

•  NSCC HPC web portal http://help.nscc.sg

•  NSCC support email [email protected]

•  NSCC Workshop portal http://workshop.nscc.sg


Page 57: NSCC Training  Introductory Class


Help us improve. Take the online survey!

Visit: http://workshop.nscc.sg >> Survey

Page 58: NSCC Training  Introductory Class

Proposed Help portal


FAQs of NSCC

Login to NSCC

Page 59: NSCC Training  Introductory Class

Registration Procedures

Page 60: NSCC Training  Introductory Class

Registration Procedure


Page 61: NSCC Training  Introductory Class

Web Site: http://www.nscc.sg
Helpdesk: https://help.nscc.sg
Email: [email protected]
Phone: +65 6645 3412