CUDA Cloud: Enabling HPC Workloads in OpenStack | GTC 2013

CUDA in the Cloud – Enabling HPC Workloads in OpenStack

John Paul Walters, Computer Scientist, USC Information Sciences Institute

[email protected]

With special thanks to Andrew Younge (Indiana Univ.) and Massimo Bernaschi (IAC-CNR)

Outline

§ Introduction
§ OpenStack Background
§ Supporting Heterogeneity
§ Supporting GPUs
§ Performance Results
§ Current Status
§ Future Work

Introduction

§ Cloud computing is traditionally seen as a resource for IT
– Web servers, databases
§ More recently, researchers have begun to leverage the public cloud as an HPC resource
– An AWS virtual cluster is ranked 102 on the Top500 list
§ Major difference between HPC and IT in the cloud:
– Types of resources, heterogeneity
§ Our contribution: we’re developing heterogeneous HPC extensions for the OpenStack cloud computing platform

Why A Heterogeneous Cloud?

§ Typical cloud infrastructures focus on commodity hardware
– Choose number of CPUs, memory, and disk
– Often not a good match for technical computing
§ Advantage of heterogeneity
– Leverage GPUs and other accelerators
– Power efficiency, performance
– Process technical workloads
§ Move away from batch/grid schedulers
– Give users access to customizable instances
– Dedicated virtual machine

Benefits of Heterogeneity

Each architecture performs well on different applications.

Figure panels: SGI UV100 rendering 1926 objects; Tilera vs. x86 video transcoding. Labels shown: 136.2 seconds, 139.5 seconds; CPU: 108 samples, GPU: 1010 samples.

OpenStack Background

§ OpenStack founded by Rackspace and NASA
§ In use by Rackspace, HP, and others for their public clouds
§ Open source with hundreds of participating companies
§ In use for both public and private clouds
§ Current stable release: OpenStack Folsom
– OpenStack Grizzly to be released in April

Figure: Google Trends searches for common open source IaaS projects (openstack, cloudstack, opennebula, eucalyptus cloud). Y-axis: 0 to 120.

OpenStack Architecture

Image Source: Ken Pepple (http://ken.pepple.info/openstack/2012/09/25/openstack-folsom-architecture/)

Supporting Heterogeneity

• Current cloud services have limited resource options
– m1.large = 7.5 GB RAM, 2 virtual cores, 850 GB disk
– x86 and x86_64 only
• Users should be able to customize instance types to their needs
– E.g. 16 GB RAM, 4 virtual cores, SSE4.2, 2x Kepler GPUs
– Hypervisor type
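One way to picture such a request is as an instance type that carries extra capability requirements, checked against what each host advertises. The following is a minimal Python sketch under that assumption; `InstanceType`, `Host`, `host_satisfies`, and the capability keys are hypothetical names for illustration and are not OpenStack's API.

```python
from dataclasses import dataclass, field


@dataclass
class InstanceType:
    """Hypothetical instance type: base resources plus extra requirements."""
    name: str
    ram_gb: int
    vcpus: int
    # Extra requirements, e.g. CPU features, GPU count, GPU architecture.
    extra: dict = field(default_factory=dict)


@dataclass
class Host:
    """Hypothetical compute host with free resources and capabilities."""
    name: str
    free_ram_gb: int
    free_vcpus: int
    capabilities: dict = field(default_factory=dict)


def host_satisfies(host, itype):
    """Return True if the host can provide everything the type asks for."""
    if host.free_ram_gb < itype.ram_gb or host.free_vcpus < itype.vcpus:
        return False
    for key, wanted in itype.extra.items():
        have = host.capabilities.get(key)
        if isinstance(wanted, set):
            # Set requirement: every wanted feature must be present.
            if not wanted <= set(have or ()):
                return False
        elif isinstance(wanted, int):
            # Count requirement: host needs at least this many.
            if not isinstance(have, int) or have < wanted:
                return False
        elif have != wanted:
            # Anything else must match exactly.
            return False
    return True


# The example request from the slide: 16 GB RAM, 4 cores, SSE4.2, 2 Kepler GPUs.
request = InstanceType(
    "custom", ram_gb=16, vcpus=4,
    extra={"cpu_features": {"sse4.2"}, "gpus": 2, "gpu_arch": "kepler"},
)
hosts = [
    Host("cpu-node", 64, 16, {"cpu_features": {"sse4.2", "avx"}, "gpus": 0}),
    Host("gpu-node", 32, 8,
         {"cpu_features": {"sse4.2"}, "gpus": 2, "gpu_arch": "kepler"}),
]
candidates = [h.name for h in hosts if host_satisfies(h, request)]
```

Here only the hypothetical `gpu-node` ends up in `candidates`, since `cpu-node` has enough memory and cores but no GPUs.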

Supporting GPUs

§ We’re pursuing two approaches for GPU support in OpenStack
– LXC support for container-based VMs
• Good for non-virtualization-friendly GPUs
• Near-native performance
• Can’t run non-Linux guests
– Xen support for fully virtualized guests
• Fully virtualized guests, can run Windows, etc.
• Some overhead, especially for PCIe transfers
§ Our OpenStack work currently supports GPU-enabled LXC containers
– Xen is in development, preliminary results shown today

Integrating GPU Support into OpenStack

§ Our GPU work is based on OpenStack’s libvirt module
– Libvirt supports KVM/QEMU, LXC
– Libvirt-based Xen support is experimental
§ After instances boot, LXC’s GPUs are whitelisted and their major/minor device IDs are passed into the VM
§ GPU instances are launched just like any other:
euca-run-instances -k <my key> -t cg1.small <machine image>
– cg1 instance types represent GPUs
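The slides do not say how the whitelisting is implemented. A common mechanism for containers is the cgroup devices controller, whose allow entries have the form `c <major>:<minor> <access>` for character devices. The Python sketch below assumes that mechanism; the helper names are hypothetical, and the numbers 195:0 and 195:255 are used only as typical values for `/dev/nvidia0` and `/dev/nvidiactl`, which should be checked on the actual host.

```python
import os
import stat


def char_device_numbers(path):
    """Return (major, minor) for a character device node such as /dev/nvidia0."""
    st = os.stat(path)
    if not stat.S_ISCHR(st.st_mode):
        raise ValueError(f"{path} is not a character device")
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def device_allow_entry(major, minor, access="rwm"):
    """Format a cgroup devices-controller allow entry for a character device.

    access is a combination of r (read), w (write), and m (mknod).
    """
    return f"c {major}:{minor} {access}"


# Illustrative numbers only (assumed, not taken from the slides); on a real
# host they would come from char_device_numbers("/dev/nvidia0") and similar.
gpu_devices = [(195, 0), (195, 255)]
entries = [device_allow_entry(major, minor) for major, minor in gpu_devices]
```

Each string in `entries` is what would be written to the container's device whitelist so that the guest can open the corresponding GPU device node.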

Test Results: LXC and Xen

§ 3 data sets (all using CUDA 5)
– NVIDIA CUDA samples
– SHOC multi-GPU
– BFS/Graph500 multi-GPU

Hardware

LXC tests
§ 2x Xeon E5520
§ 96 GB RAM
§ 1x Tesla S2050 (4 GPUs)
§ RHEL 6.1
– 2.6.38.2 kernel (non-stock)

Xen tests
§ 2x Xeon X5660
§ 192 GB RAM
§ 2x NVIDIA Tesla C2075
§ CentOS 6.3 (with Xen 4.1.2)
– 2.6.32-279 kernel (guest, stock)

LXC CUDA 5 Samples

Figure: LXC CUDA samples, performance relative to base. Y-axis: relative performance, 0.92 to 1.02; higher is better.
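The charts in this section report performance relative to base. The slides do not state the exact formula, so the following Python sketch assumes the usual convention: for throughput-style metrics the ratio is virtualized over base, and for time-style metrics it is base over virtualized, so that 1.0 means parity and higher is better in both cases. The function name and the numbers are hypothetical.

```python
def relative_performance(base, virt, higher_is_better=True):
    """Normalize a virtualized measurement against the bare-metal baseline.

    Returns 1.0 at parity; values below 1.0 mean the virtualized run is slower.
    """
    if base <= 0 or virt <= 0:
        raise ValueError("measurements must be positive")
    # Throughput (e.g. MB/s): more is better, so divide virt by base.
    # Elapsed time (e.g. seconds): less is better, so invert the ratio.
    return virt / base if higher_is_better else base / virt


# Made-up measurements, for illustration only.
throughput_ratio = relative_performance(4.0, 3.0)                     # MB/s
time_ratio = relative_performance(3.0, 4.0, higher_is_better=False)   # seconds
```

Both made-up cases come out at 0.75, that is, the virtualized run reaching three quarters of baseline performance.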

LXC Bandwidth Results

Figure: LXC vs. base, host-to-device bandwidth, pinned. X-axis: data size (KB); y-axis: bandwidth (MB/s), 0 to 5000. Series: Base, LXC VM.

Figure: LXC vs. base, device-to-host bandwidth, pinned. X-axis: data size (KB); y-axis: bandwidth (MB/s), 0 to 3500. Series: Base, LXC VM.

LXC mirrors base bandwidth

LXC SHOC Results

Figure: LXC SHOC benchmarks, 1 node, 4 GPUs, performance relative to base. Y-axis: relative performance, 0.86 to 1.02; higher is better.

LXC Graph500 Results

Figure: BFS/Graph500, 2 and 4 GPUs, 1 node. X-axis: problem size, 16 to 22; y-axis: median TEPS, 0 to 3.5E+07. Series: LXC 2 GPUs, Base 2 GPUs, LXC 4 GPUs, Base 4 GPUs.

TEPS: Traversed Edges per Second
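As a small illustration of the metric, TEPS for a single BFS run is the number of edges traversed divided by the elapsed time, and the charts plot the median over runs. The Python sketch below uses hypothetical function names and made-up run data; it is not the Graph500 reference code.

```python
from statistics import median


def teps(edges_traversed, seconds):
    """Traversed edges per second for one BFS run."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return edges_traversed / seconds


def median_teps(runs):
    """Median TEPS over an iterable of (edges_traversed, seconds) pairs."""
    return median(teps(edges, secs) for edges, secs in runs)


# Made-up runs for illustration: the same edge count with different timings.
runs = [(1_000_000, 0.5), (1_000_000, 0.25), (1_000_000, 0.125)]
result = median_teps(runs)
```

With these made-up runs the per-run values are 2.0E+06, 4.0E+06, and 8.0E+06 TEPS, so `result` is 4.0E+06.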

Xen CUDA 5 Samples

Figure: Xen CUDA samples, performance relative to base. Y-axis: relative performance, 0.92 to 1.02; higher is better.

Xen Bandwidth Results

Figure: Xen vs. base, host-to-device bandwidth, pinned. X-axis: data size (KB); y-axis: bandwidth (MB/s), 0 to 7000. Series: Base, Xen VM.

Figure: Xen vs. base, device-to-host bandwidth, pinned. X-axis: data size (KB); y-axis: bandwidth (MB/s), 0 to 6000. Series: Base, Xen VM.

Xen SHOC Benchmarks

Figure: Xen SHOC benchmarks, 1 node, 2 GPUs, performance relative to base. Y-axis: relative performance, 0.86 to 1.02; higher is better.

Xen Graph500 Results

Figure: BFS/Graph500, 2 GPUs, 1 node. X-axis: problem size, 16 to 22; y-axis: median TEPS, 0 to 3.5E+07. Series: Xen VM, Base.

TEPS: Traversed Edges per Second

Current Status

§ Source code is available now
– https://github.com/usc-isi/nova
§ Includes support for heterogeneity
– GPU-enabled LXC instances
– Bare-metal provisioning
– Architecture-aware scheduler
§ We’re working towards integrating additional support into the OpenStack Grizzly and H releases

Future Work

§ Add support for high-speed networking as a resource
– InfiniBand via SR-IOV
– Enable features like GPUDirect
§ Add support for the Kepler architecture
§ Merge the Xen support work into upstream OpenStack