understanding the use cases driving nvme-over-fabrics adoption · 12/5/2018  · gpu farm: nvidia...

13
Understanding the Use Cases Driving NVMe-over-Fabrics Adoption Julie Herd Director, Technical Marketing E8 Storage NVMe Developer Days 2018 San Diego, CA 1

Upload: others

Post on 14-Aug-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

Understanding the Use Cases Driving NVMe-over-Fabrics Adoption

Julie Herd Director, Technical Marketing

E8 Storage

NVMe Developer Days 2018 San Diego, CA

1

Page 2: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

NVMe Market Adoption

§  Adoption shifting over time •  PC / Servers continue to lead •  Storage adoption ramping

§  NVMe-oF for Storage •  Standard critical for ramp •  Major players entering market

NVMe Developer Days 2018 San Diego, CA

2

Page 3: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

Early NVMe Use Cases

Applications that drive business revenue §  Financial Analytics §  Genome Research §  Artificial Intelligence / Machine Learning / Deep Learning §  Fluid Dynamics

NVMe Developer Days 2018 San Diego, CA

3

Page 4: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

NVMe for Financial Analytics

§  “Latency is King”

§  FinTech driving automated trading

§  Real-time transaction processing

§  Daily trade analytics

NVMe Developer Days 2018 San Diego, CA

4

Page 5: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

Market Data Analytics for Financials

Before •  1152 local SSDs in 72 servers •  Market data copied nightly to all servers •  Restricted to 10TB-20TB

After •  48 SSDs in 2 E8-D24 appliances •  Market data shared to all 72 servers •  Easily scalable to 300TB

In production with 2 of the world’s Top-10 largest hedge funds

NVMe Developer Days 2018 San Diego, CA

5

70%Costreduction!

Page 6: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

Accelerating the Science of Life

§  Primary genome sequencing •  Historically – 10 hours per genome •  NVMe – 10 genomes per hour!

§  Secondary genomic analysis •  Analyzing genomes per population •  Requires TB of data

NVMe Developer Days 2018 San Diego, CA

6

Page 7: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

Genomic Acceleration

"We were keen to test E8 by trying to integrate it with our Univa Grid Engine cluster as a consumable resource of ultra-performance scratch space. Following some simple tuning and using a single EDR link we were able to achieve about 5GB/s from one node and 1.5M 4k IOPS from one node. Using the E8 API we were quickly able to write a simple Grid Engine prolog/epilog that allowed for a user-requestable scratch volume to be automatically created and destroyed by a job. The E8 box behaved flawlessly and the integration with InfiniBand was simpler than we could have possibly expected for such a new product." Dr. Robert Esnouf, Director of Research Computing Oxford Big Data Institute + Wellcome Center for Human Genetics

Shared NVMe as a fast tier for parallelizing genomic processing

NVMe Developer Days 2018 San Diego, CA

7

Page 8: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

Example Architecture

NVMe Developer Days 2018 San Diego, CA

8

Page 9: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

Artificial Intelligence / Machine Learning

§  Training phase is critical to AI •  Massive amounts of data required for deep learning •  Fast storage required to keep expensive GPUs busy

§  Data profile •  Large and small I/O •  Millions of files •  Low latency

NVMe Developer Days 2018 San Diego, CA

9

Page 10: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

AI/ML with IBM GPFS and NVIDIA

10

Shared NVMe Accelerates Deep Learning GPUFarm:NvidiaDGX-1•  Upto8GPUspernode•  GPFSClient+E8Agentrunon

x86withinGPUServer•  Upto126GPUnodesincluster

Mellanox100GIB

SharedNVMeStorage•  E8-D242U24-HA•  Dual-port2.5”NVMeDrives•  Upto184TB(raw)per2U•  PatentedDistributedRAID6

NVMe Developer Days 2018 San Diego, CA

Page 11: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

Fluid Dynamics and EDA

§  Electronic designs are validated using fluid dynamics •  High performance clusters •  High throughput and low latency are key

§  Examples •  Formula 1 •  Architecture •  Airplanes

NVMe Developer Days 2018 San Diego, CA

11

Page 12: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

High Frequency Trading with Storage Class Memory

Before •  20TB cache RAM dedicated to 13

servers •  20µs latency over 40G Infiniband •  26U @ 9100W storage solution

After •  24 Intel Optane™ 1TB SSDs in E8-X24 •  Data shared to all database nodes •  2U @ 1200W storage solution

Extreme performance for real-time market adjustments

12 NVMe Developer Days 2018 San Diego, CA

Page 13: Understanding the Use Cases Driving NVMe-over-Fabrics Adoption · 12/5/2018  · GPU Farm: Nvidia DGX-1 • Up to 8 GPUs per node • GPFS Client + E8 Agent run on x86 within GPU

Thank You

Twitter: @JulieHerd [email protected]

NVMe Developer Days 2018 San Diego, CA

13