The rCUDA Technology: Improvements Towards a Production Ready Software

Federico Silla
Universitat Politècnica de València, Spain

HPC Advisory Council Swiss Conference 2017

Outline

1. The rCUDA technology

2. P2P copies between GPUs

3. GPU job migration within rCUDA

4. Support for Deep Learning

5. Making rCUDA ready for industry

1. The rCUDA technology

rCUDA… CUDA… they sound similar

rCUDA… remote CUDA

rCUDA is a software technology that enables a more flexible use of GPUs

rCUDA enables a new vision of GPU deployment:

[Figure: physical configuration versus logical configuration. In the physical configuration, each node (node 1 to node n) has its own CPUs, RAM, network interface, and PCIe-attached GPU. In the logical configuration, the nodes keep their CPUs, RAM, and network interface, while the GPUs (each with its own RAM) are drawn as a separate pool reached through logical connections over the interconnection network.]

Basics of rCUDA

[Figure: diagram including a node labeled "No GPU".]

rCUDA is a development by Universitat Politècnica de València, Spain

Performance of rCUDA

“Ideas Are Easy, Implementation Is Hard”

Guy Kawasaki, marketing specialist and Silicon Valley venture capitalist

[Figures, three slides: plots for "CPU to GPU" and "GPU to CPU"; higher is better.]

Performance of applications using rCUDA

• K20 GPU and FDR InfiniBand

• K40 GPU and EDR InfiniBand

[Figure: results for the two configurations above; lower is better.]

Performance of applications using rCUDA

EDR InfiniBand and P100 GPU

[Figures: results for CUDA-MEME and BarraCUDA; lower is better.]

2. P2P copies between GPUs

Why is P2P copy support needed?

[Figure: the CUDA model compared with the rCUDA model, shown as "rCUDA scenario 1" and "rCUDA scenario 2".]

rCUDA must provide the same semantics as CUDA
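As a rough illustration of why peer-to-peer copies need explicit support, the sketch below models GPUs that may live in different servers. It assumes, which the slide does not state, that the two rCUDA scenarios differ in whether the source and destination GPUs share a server. `RemoteGpu` and `peer_copy` are hypothetical names made up for this example and do not correspond to the CUDA or rCUDA API. The caller sees the same copy result in both cases; only the assumed data path changes.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteGpu:
    """Hypothetical model of a GPU exposed by a server node."""
    server: str
    index: int
    memory: dict = field(default_factory=dict)  # buffer name -> bytes


def peer_copy(src: RemoteGpu, dst: RemoteGpu, name: str) -> str:
    """Copy buffer `name` from src to dst and return the assumed data path."""
    data = src.memory[name]
    dst.memory[name] = bytes(data)  # same observable result on either path
    if src.server == dst.server:
        return "intra-server"  # assumption: copy stays inside one server
    return "inter-server"      # assumption: data crosses the network


gpu_a = RemoteGpu("node1", 0, {"x": b"\x01\x02\x03"})
gpu_b = RemoteGpu("node1", 1)
gpu_c = RemoteGpu("node2", 0)

path_same = peer_copy(gpu_a, gpu_b, "x")   # both GPUs in node1
path_cross = peer_copy(gpu_a, gpu_c, "x")  # GPUs in different nodes
```

In this toy model the application code is identical for both calls, which is the point of preserving CUDA semantics: the placement of the GPUs is a deployment detail rather than something the program has to handle.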

Performance of P2P copies

[Figure: P2P copy performance; higher is better.]


3. GPU job migration within rCUDA

Server consolidation

[Figure: server consolidation diagram, with numeric labels 1, 1, 37, 13, 14, 14.]

GPU-job migration within rCUDA

• rCUDA supports migrating jobs from one GPU in the cluster to another GPU located in the same node or in a different one

• Only the GPU part of the application is migrated. The CPU part is not moved around

• Migration is transparent to applications, which are not aware that their GPU data and kernels have moved from one GPU to another

• When several jobs share a GPU, each of them can be migrated independently to a different destination GPU
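The bullets above can be pictured with a small bookkeeping model. This is only a sketch: `GpuJob`, `Cluster`, and `migrate` are hypothetical names invented for illustration, not part of rCUDA, and a real migration also has to move kernels and live device state, which the model ignores. It shows two properties from the slide: only GPU-side state changes placement, and jobs sharing a GPU can be moved independently.

```python
from dataclasses import dataclass, field


@dataclass
class GpuJob:
    """GPU-side state of one application (hypothetical model)."""
    name: str
    regions: dict = field(default_factory=dict)  # region id -> size in MB


class Cluster:
    def __init__(self, gpus):
        self.placement = {}                     # job name -> GPU id
        self.resident = {g: {} for g in gpus}   # GPU id -> {job name: GpuJob}

    def launch(self, job, gpu):
        self.resident[gpu][job.name] = job
        self.placement[job.name] = gpu

    def migrate(self, job_name, dest):
        """Move the GPU state of one job; other jobs on the source stay put.

        Returns the number of MB that had to be moved.
        """
        src = self.placement[job_name]
        if src == dest:
            return 0
        job = self.resident[src].pop(job_name)
        self.resident[dest][job_name] = job
        self.placement[job_name] = dest
        return sum(job.regions.values())


# Two jobs share one GPU and are then migrated to different destinations.
cluster = Cluster(["node1:0", "node2:0", "node3:0"])
cluster.launch(GpuJob("job_a", {0: 256, 1: 128}), "node1:0")
cluster.launch(GpuJob("job_b", {0: 64}), "node1:0")

moved = cluster.migrate("job_a", "node2:0")
cluster.migrate("job_b", "node3:0")
```

Nothing in the model touches the CPU side of the application: the job keeps its name and handle, and only the `placement` entry changes, which mirrors the transparency described on the slide.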

Example of migration performance

The GPU-Blast application is migrated up to 5 times among K40 GPUs

• The aggregated volume of GPU data is 1300 MB (consisting of 9 memory regions)

The “Reference” line is the execution time of the application when using CUDA with a local GPU and without any migration

[Figure: GPU-Blast execution time with migrations, compared against the "Reference" line; lower is better.]

4. Support for Deep Learning

Deep Learning with CUDA

work in progress!!

Caffe

Deep Learning with rCUDA

Very preliminary results

Caffe

5. Making rCUDA ready for industry

RoCE

[Figure: plots for "CPU to GPU" and "GPU to CPU" over RoCE.]

Scheduling the shared use of GPUs

[Figure: a cluster of nodes 1 to 7 plus node n joined by an interconnection network, drawn twice. In the first drawing every node has CPUs, RAM, a network interface, and a PCIe-attached GPU. In the second drawing the nodes keep CPUs, RAM, and network interface, and the GPUs (each with its own RAM) are reached through logical connections.]

Which GPU should I use?
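The slide poses the question without naming a policy, so the following is purely a sketch of one plausible answer: among the GPUs with enough free memory, prefer the one running the fewest jobs, and break ties by the most free memory. The function name `pick_gpu`, the dictionary layout, and the policy itself are assumptions for illustration, not something rCUDA is stated to implement.

```python
def pick_gpu(gpus, needed_mb):
    """Return the id of the chosen GPU, or None if no GPU has enough memory.

    gpus: dict mapping GPU id -> {"free_mb": int, "jobs": int}
    Policy (assumed): fewest running jobs first, then most free memory.
    """
    candidates = [
        (info["jobs"], -info["free_mb"], gpu_id)
        for gpu_id, info in gpus.items()
        if info["free_mb"] >= needed_mb
    ]
    if not candidates:
        return None
    return min(candidates)[2]


# Hypothetical cluster state.
gpus = {
    "node1:0": {"free_mb": 2000, "jobs": 2},
    "node2:0": {"free_mb": 6000, "jobs": 1},
    "node3:0": {"free_mb": 500, "jobs": 0},
}

choice_large = pick_gpu(gpus, 1000)   # node3:0 is excluded: too little memory
choice_small = pick_gpu(gpus, 100)    # the idle GPU wins
choice_none = pick_gpu(gpus, 10000)   # nothing fits
```

Because any node can reach any GPU in the logical configuration, a selection step of this kind has the whole cluster to choose from rather than only the GPUs installed in the node where the job runs.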

Get a free copy of rCUDA at http://www.rcuda.net

@rcuda_

More than 800 requests worldwide

rCUDA is a development by Technical University of Valencia

Jaime Sierra, Pablo Higueras, Carlos Reaño, Javier Prades, Tony Díaz

Thanks!

Questions?
