High Performance Computing Systems Exascale Computing Doug Shook


Post on 27-Sep-2020


Page 1: High Performance Computing Systemsdshook/cse566/lectures/Exascale.pdf · Programming Models ... Deliver two exascale systems by 2023 – First in 2021 based on “advanced architecture”

High Performance Computing Systems

Exascale Computing

Doug Shook


Exascale Computing

How many flops?

Why is this an important milestone?

What challenges would you anticipate?


Challenges

DOE has identified ten challenges to exascale computing

What must we do to meet these challenges?

When will these challenges be met?

Lucas et al., “Top Ten Exascale Research Challenges,” 2014


#1 Energy Efficiency

Current usage
– Can we simply increase energy usage?

What parts of the system are affected by energy efficiency?

Evolutionary vs. revolutionary


Near Threshold Voltage (NTV)

What is the threshold voltage?

Energy advantages of operating near threshold voltage

Problems?
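
As a rough illustration of the energy advantage mentioned above, the sketch below uses the standard first-order CMOS model in which dynamic switching energy scales with the square of the supply voltage. The function name and the two voltages (1.0 V nominal, 0.5 V near threshold) are assumptions chosen for illustration, not figures from the lecture. The model deliberately ignores leakage, the drop in clock frequency, and device variability, which are typical candidates for the "Problems?" prompt.

```python
def switching_energy_ratio(v_nominal: float, v_reduced: float) -> float:
    """Ratio of dynamic switching energy at v_reduced vs. v_nominal.

    Uses the first-order model E ~ C * V^2 with capacitance held fixed,
    so the capacitance cancels and only the voltage ratio remains.
    """
    if v_nominal <= 0 or v_reduced <= 0:
        raise ValueError("voltages must be positive")
    return (v_reduced / v_nominal) ** 2


# Illustrative, assumed values: 1.0 V nominal supply, 0.5 V near-threshold supply.
ratio = switching_energy_ratio(1.0, 0.5)
print(f"dynamic energy per operation: {ratio:.2f}x of nominal")
```

Halving the supply voltage cuts switching energy to about a quarter under this model, but each operation also takes longer, so more parallelism is needed to hold throughput constant.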


Energy Efficient Architecture

How much energy does it take to perform 1 FP operation?
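
One way to put a bound on this question is to divide a system power budget by the exascale rate of 10^18 floating-point operations per second. The sketch below does that arithmetic for the 20-30 MW envelope listed on the Goals slide later in this deck. The function and constant names are hypothetical. The result is a whole-system budget per operation, so it also has to cover memory, interconnect, and cooling, not just the arithmetic unit.

```python
EXAFLOP = 1e18  # floating-point operations per second at exascale


def energy_budget_per_flop_pj(system_power_mw: float,
                              flops_per_second: float = EXAFLOP) -> float:
    """Average energy available per floating-point operation, in picojoules."""
    watts = system_power_mw * 1e6            # MW -> W (joules per second)
    joules_per_flop = watts / flops_per_second
    return joules_per_flop * 1e12            # J -> pJ


# Power envelope taken from the Goals slide (20-30 MW).
for power_mw in (20, 30):
    budget = energy_budget_per_flop_pj(power_mw)
    print(f"{power_mw} MW -> {budget:.1f} pJ per flop")
```

At 20 MW the whole machine can spend roughly 20 pJ per operation on average, which is why data movement energy matters as much as the cost of the arithmetic itself.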


Energy Efficient Interconnects


On Chip Power Management


System Scale Power Management

Power Distribution

Cooling

Packaging


#2 Interconnect Technology

Data Movement Energy and Bandwidth


#2 Interconnect Technology

On-die Interconnect Fabric

Inter-chip Network Integration

Photonics


#3 Memory Technology

Memory Capacity


#3 Memory Technology

Energy

Scaling

Resilience


#4 Scalable System Software

Research Directions

Lightweight OS

Runtime Systems

Introspection

Energy Management


#5 Programming Systems


#5 Programming Systems

Programming Models

Compilers


#6 Data Management

Offensive vs. Defensive I/O

General I/O Challenges

Offensive I/O Challenges

Defensive I/O Challenges
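
Assuming "defensive I/O" here means checkpoint/restart traffic, as the term is commonly used, one classic way to reason about its cost is Young's first-order approximation for the checkpoint interval: the interval is approximately sqrt(2 x checkpoint write time x mean time between failures). The sketch below applies it. The function name and the sample numbers (a 10-minute checkpoint, a 4-hour system MTBF) are illustrative assumptions, not values from the lecture, and the approximation is only reasonable when the checkpoint time is small compared with the MTBF.

```python
import math


def young_checkpoint_interval(checkpoint_seconds: float, mtbf_seconds: float) -> float:
    """Young's first-order estimate of the compute time between checkpoints.

    interval ~= sqrt(2 * checkpoint_cost * MTBF)
    """
    if checkpoint_seconds <= 0 or mtbf_seconds <= 0:
        raise ValueError("checkpoint cost and MTBF must be positive")
    return math.sqrt(2.0 * checkpoint_seconds * mtbf_seconds)


# Assumed, illustrative inputs: 600 s to write a checkpoint, 4 h system MTBF.
interval = young_checkpoint_interval(600.0, 4 * 3600.0)
overhead = 600.0 / (interval + 600.0)   # fraction of wall time spent writing checkpoints
print(f"checkpoint every {interval / 60:.0f} min, ~{overhead:.0%} of time in defensive I/O")
```

As MTBF shrinks with system size, the interval shrinks too, and a growing share of machine time goes to writing checkpoints rather than to the science output ("offensive" I/O).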


#7 Exascale Algorithms

Multicore friendly vs. Multicore aware

Communication Avoiding Algorithms

Synchronization Reduction

Multi-physics algorithms

Multi-scale algorithms

Energy Efficient Algorithms


#8 Algorithms for Discovery, Design, and Decision

Uncertainty Quantification

Optimization


#9 Resilience and Correctness

Hardware Support

Programming Models

Algorithm-based fault tolerance

Correctness
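
To make the fault-tolerance item on this slide concrete, here is a minimal sketch of the classic checksum scheme for matrix multiplication (in the style of Huang and Abraham): a checksum row is appended to A and a checksum column to B, so the product carries checksums that expose a corrupted entry. This is one well-known instance of the idea, not necessarily the one covered in the lecture. All function names are hypothetical, and integers are used so that the checks are exact. Floating-point data would need a tolerance.

```python
def matmul(a, b):
    """Plain triple-loop matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def with_column_checksum(a):
    """Append a row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append to each row of b the sum of that row."""
    return [row + [sum(row)] for row in b]


def find_inconsistencies(c_full):
    """Return (bad_rows, bad_cols) whose data no longer matches the checksums."""
    n, m = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n) if sum(c_full[i][:m]) != c_full[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c_full[i][j] for i in range(n)) != c_full[n][j]]
    return bad_rows, bad_cols


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c_full = matmul(with_column_checksum(a), with_row_checksum(b))
print(find_inconsistencies(c_full))   # no fault injected yet

c_full[0][1] += 1                     # simulate a silent bit error in one entry
print(find_inconsistencies(c_full))   # the failing row and column locate the entry
```

A single corrupted entry shows up as exactly one failing row check and one failing column check, which both locates it and gives the amount by which to correct it.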


#10 Scientific Productivity

Research Directions


Co-Design and Integration Framework

Execution Model

Architecture

Performance Metrics


Integration Framework


Design Process


Modeling and Simulation


Recommendations


Current State

Four countries currently have plans for exascale systems
– Economic considerations?
– Political considerations?

Reed, Dongarra, “Exascale Computing and Big Data: The Next Frontier” 2015


Exascale Computing Project

National Strategic Computing Initiative (2015)
– Unite HPC and Big Data
– Preserve US dominance in HPC
– Improve interoperability between supercomputers
– Provide widespread access and training to researchers
– Develop post-silicon technologies

Messina, Lee, “The Exascale Computing Project” 2017


Exascale Computing Project

Lead Agencies
– DoE, NSF, DoD

7-year project
– $3.5-5.7 billion


Four Key Challenges

Parallelism

Memory and Storage efficiencies

Reliability

Energy Consumption


Performance from Parallelism
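
The slide gives only a title, so the following is a swapped-in standard model rather than the lecture's own material: Amdahl's law, which bounds the speedup from parallelism by the fraction of work that stays serial. The function name and sample inputs are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / workers)


# Assumed example: 99% of the work parallelizes perfectly.
for n in (100, 10_000, 1_000_000):
    print(f"{n:>9} workers -> speedup {amdahl_speedup(0.99, n):.1f}")
```

Even with a million workers, a 1% serial fraction caps the speedup below 100, which is why exascale codes need both very high parallel fractions and problem sizes that grow with the machine.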


Goals

Deliver two exascale systems by 2023
– First in 2021 based on “advanced architecture”
– 20-30 MW
– Sufficiently resilient
– Support a broad range of workloads


Strategic Pillars

National Security

Energy Security

Economic Security

Scientific Discovery

Earth System

Health Care


Software and Hardware Goals

Software

Hardware


Schedule


Current Status

Recent Actions

Risks


Alternative Architectures

What does this mean?

Options?


Course Recap / Discussion