Confidential │ ©2019 VMware, Inc.
GTC 2019
Hari Sivaraman, Dimitrios Skarlatos, Lan Vu, Uday Kurkure
vMotion for NVIDIA GRID vGPU Virtual Machines: Case Study of vMotion Using MLaaS
vMotion for NVIDIA GRID vGPU - Agenda
• GPUs in vSphere.
• vMotion for vGPU Architecture.
• Performance of vMotion for vGPU.
• MLaaS – a case study for vMotion performance.
• Conclusions and future work.
vMotion for NVIDIA GRID vGPU – GPUs in vSphere
[Diagram: three ways to attach GPUs to virtual machines on the vSphere hypervisor, each labeled for vMotion and Sharing]
• VMware DirectPath I/O (pass-through): each virtual machine (guest OS, GPU driver, applications) is given a physical GPU through pass-through.
• Nvidia GRID vGPU: the Nvidia GRID vGPU manager runs in the vSphere hypervisor; each virtual machine (guest OS, GPU driver, applications) is given a vGPU backed by a GRID GPU.
• vSGA: each virtual machine (guest OS, VMware GPU driver, applications) reaches the GPU through the Nvidia driver in the vSphere hypervisor.
vMotion for NVIDIA GRID vGPU – vGPU
[Diagram: the Nvidia GRID vGPU manager in the hypervisor contains a scheduler and presents vGPUs, each with dedicated device memory, to virtual machines (guest OS, GPU driver, applications).]
• GPU Memory is statically shared
• The amount of GPU memory per VM is set by the vGPU profile
• For example, the P40-1q profile for the P40 GPU gives each vGPU 1 GB of device memory, allowing 24 vGPUs per physical P40
• CUDA cores are time-shared
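The static split of device memory can be worked through as simple arithmetic. The sketch below assumes a 24 GB P40, which is what the slide's figure of 24 vGPUs at 1 GB each implies. The function name and the list of profile sizes are illustrative only and are not a catalog of supported profiles.

```python
def vgpus_per_gpu(gpu_memory_gb: int, profile_memory_gb: int) -> int:
    """Number of vGPUs that fit when device memory is split statically.

    Assumes every vGPU on the physical GPU uses the same profile size.
    """
    if profile_memory_gb <= 0:
        raise ValueError("profile size must be positive")
    return gpu_memory_gb // profile_memory_gb


P40_MEMORY_GB = 24  # assumption implied by "24 vGPUs per 1 physical P40" at 1 GB each

if __name__ == "__main__":
    # Illustrative profile sizes in GB, not an official list.
    for profile_gb in (1, 2, 4, 8, 24):
        count = vgpus_per_gpu(P40_MEMORY_GB, profile_gb)
        print(f"{profile_gb:>2} GB profile -> {count} vGPUs per P40")
```

Device memory is divided up front, while the CUDA cores are time-shared among whichever vGPUs are active.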
vMotion for NVIDIA GRID vGPU – Types of vMotion
[Diagram: a source ESX host and a destination ESX host (VMware ESXi & ESX), a vMotion network, and a datastore; the virtual machine moves from source to destination by vMotion.]
vMotion for NVIDIA GRID vGPU – vMotion
1. Pre-copy memory pages.
2. Stun the VM.
3. Checkpoint devices.
4. Transfer device checkpoint data (includes vGPU memory data).
5. Power on the VM and transfer pages from main memory.
[Diagram: vMotion between two VMware ESXi & ESX hosts.]
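The step order suggests that the vGPU memory data moves while the VM is stunned, so its size puts a lower bound on stun time. The sketch below estimates that bound for a 10GbE link like the one in the test-bed. It is an idealized calculation that assumes the full line rate with no protocol or checkpoint overhead. The function name and profile sizes are hypothetical.

```python
def transfer_seconds(size_gib: float, link_gbps: float) -> float:
    """Ideal time to send size_gib GiB over a link of link_gbps gigabits/s.

    Ignores protocol overhead, so real transfers take longer.
    """
    bits = size_gib * 1024**3 * 8
    return bits / (link_gbps * 1e9)


if __name__ == "__main__":
    # Illustrative vGPU memory sizes in GiB.
    for vgpu_gib in (1, 2, 4, 8, 24):
        seconds = transfer_seconds(vgpu_gib, link_gbps=10)
        print(f"{vgpu_gib:>2} GiB of vGPU memory over 10GbE: at least {seconds:.2f} s")
```

Even under these assumptions, 1 GiB of vGPU memory takes most of a second to send at 10 Gb/s.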
vMotion for NVIDIA GRID vGPU - Workloads
Workloads on VMware vSphere:
• MLaaS
• VDI
• Cloud Hosted CAD
vMotion for NVIDIA GRID vGPU – Test-bed
Two identical hosts, each connected to a switch over 10GbE:
• VMware ESXi 6.7u1
• Dell R730 – Intel Broadwell CPUs + 1 x NVIDIA GRID P40
• 40 cores (2 x 20-core sockets), E5-2698 v4
• 768 GB RAM
• ESX: 6.7u1, Nvidia driver: 410.68
vMotion for NVIDIA GRID vGPU – Performance of Word
Increase in vMotion time due to vGPU is just marginally more than measurement noise.
vMotion for NVIDIA GRID vGPU – Performance of SPECapc for 3dsmax 2015
Benchmark: SPECapc for 3dsmax 2015
Software: Autodesk 3dsmax 2015
Negligible increase in run-time due to vMotion!
Revenues from the Artificial Intelligence (AI) market worldwide from 2016 to 2025
The largest share of revenue comes from ML/AI enterprise applications.
ML/AI Enterprise Application Deployment
[Diagram: in enterprise datacenters and clouds, several ML/AI apps use Machine Learning as a Service, which runs on GPUs, FPGAs and CPUs.]
Machine Learning as a Service
Example #1 of deploying MLaaS on VMware vSphere
[Diagram: a physical server (CPUs, GPUs) runs VMware vSphere. Virtual machines run ML frameworks; GPUs are attached to a virtual machine by pass-through.]
Machine Learning as a Service
Example #2 of deploying MLaaS on VMware vSphere
[Diagram: a physical server (CPUs, GPUs) runs VMware vSphere. Virtual machines run ML frameworks; GPUs are shared through mediated pass-through, with NVIDIA GRID presenting vGPUs to the virtual machines.]
Machine Learning as a Service
Example #3 of deploying MLaaS on VMware vSphere with Container
[Diagram: a physical server (CPUs, GPUs) runs VMware vSphere with NVIDIA GRID vGPUs. Each virtual machine runs ML frameworks inside a Docker container.]
Machine Learning as a Service
Example #4 of deploying MLaaS on VMware vSphere with Container & Kubernetes
[Diagram: a physical server (CPUs, GPUs) runs VMware vSphere with NVIDIA GRID vGPU. One virtual machine is a Kubernetes worker running ML frameworks in Docker containers with a vGPU; another virtual machine is the Kubernetes master.]
Machine Learning as a Service
Example #4 of deploying MLaaS on VMware vSphere with Container & Kubernetes
[Diagram: two physical servers (CPUs, GPUs), each running VMware vSphere with NVIDIA GRID vGPU. The first hosts a Kubernetes worker virtual machine (ML frameworks in Docker containers) and the Kubernetes master virtual machine; the second hosts two Kubernetes worker virtual machines, each running ML frameworks in Docker containers.]
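In the Kubernetes deployment above, a container has to ask for a GPU before it can be scheduled on a worker VM that has a vGPU. The sketch below builds a pod manifest that requests one GPU through the `nvidia.com/gpu` extended resource, which is the name advertised by the NVIDIA device plugin for Kubernetes. It assumes that plugin is installed on the worker and that the vGPU appears to the guest as an ordinary NVIDIA GPU. The pod name and image are hypothetical, and the slides do not describe the manifests used.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a minimal Kubernetes Pod manifest that requests NVIDIA GPUs."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resource exposed by the NVIDIA device plugin.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical pod name and image, for illustration only.
    manifest = gpu_pod_manifest("ml-inference", "example.com/ml-serving:latest")
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts JSON manifests as well as YAML.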
Experiments of MLaaS on VMware vSphere: Hardware and Software
• MLaaS server: VMware ESXi 6.5 on a Dell R730 with Intel Haswell CPUs (36 cores) + NVIDIA P40 GPU
• MLaaS clients: VMware ESXi 6.5 on Intel Haswell CPUs, 1 VM with 18 vCPU
• The clients request predictions and receive responses
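The experiments report inference throughput and mean latency as seen by the clients. The sketch below shows one way a client could measure both for a request/response loop. It is not the authors' harness: `fake_predict` is a hypothetical stand-in for a call to the MLaaS endpoint, and the loop sends requests one at a time.

```python
import time
from statistics import mean


def measure(predict, requests):
    """Send requests one by one; return (throughput in req/s, mean latency in s)."""
    latencies = []
    start = time.perf_counter()
    for request in requests:
        t0 = time.perf_counter()
        predict(request)  # request a prediction and wait for the response
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return len(latencies) / elapsed, mean(latencies)


def fake_predict(image):
    """Hypothetical stand-in for a remote inference call."""
    time.sleep(0.001)
    return "label"


if __name__ == "__main__":
    throughput, latency = measure(fake_predict, range(100))
    print(f"throughput: {throughput:.1f} req/s, mean latency: {latency * 1000:.2f} ms")
```

With a single sequential client, throughput is roughly the inverse of mean latency; running several clients concurrently is what separates the two metrics.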
Experiment #1: Inference Throughput
Deep Neural Network: Inception V3 vs. MobileNet – Higher is better
Models:
• Inception V3: 48 layers, 5000 million MAC
• MobileNet: 28 layers, 569 million MAC
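The MAC counts give a rough sense of how much more work one model does per inference. The sketch below computes the ratio from the figures on the slide. The dictionary layout and function name are illustrative, and MAC count alone does not determine measured throughput or latency.

```python
# Layer and MAC (multiply-accumulate) counts as listed on the slide.
MODELS = {
    "Inception V3": {"layers": 48, "mac_millions": 5000},
    "MobileNet": {"layers": 28, "mac_millions": 569},
}


def mac_ratio(heavy: str, light: str) -> float:
    """Ratio of MAC operations per inference between two models."""
    return MODELS[heavy]["mac_millions"] / MODELS[light]["mac_millions"]


if __name__ == "__main__":
    ratio = mac_ratio("Inception V3", "MobileNet")
    print(f"Inception V3 needs about {ratio:.1f}x the MACs of MobileNet per inference")
```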
Experiment #1: Inference Mean Latency
Deep Neural Network: Inception V3 vs. MobileNet
Models:
• Inception V3: 48 layers, 5000 million MAC
• MobileNet: 28 layers, 569 million MAC
Experiment #2: Inference Throughput
Configurations compared: 36 CPU cores vs. 8 CPU cores & 1 GPU
Higher is better
Experiment #2: Mean Inference Latency
Configurations compared: 36 CPU cores vs. 8 CPU cores & 1 GPU
Lower is better
Machine Learning as a Service
vMotion for NVIDIA GRID vGPU - MLaaS
[Diagram: two physical servers, each running VMware vSphere with CPUs, GPUs and NVIDIA GRID vGPU. A virtual machine running ML frameworks in a Docker container on a Kubernetes worker is moved from one server to the other by vMotion. Clients are shown connected to the service.]
vMotion for NVIDIA GRID vGPU – Test-bed
Two identical hosts, each connected to a switch over 10GbE:
• VMware ESXi 6.7u1
• Dell R730 – Intel Broadwell CPUs + 1 x NVIDIA GRID P40
• 40 cores (2 x 20-core sockets), E5-2698 v4
• 768 GB RAM
• ESX: 6.7u1, Nvidia driver: 410.68
vMotion for Nvidia GRID vGPU: Conclusions and Upcoming Improvements
Conclusions:
• vMotion for Nvidia GRID vGPU is now available.
• The performance impact of vMotion on VDI, CAD and ML applications is negligible or small.
• The performance impact of multiple vMotions running concurrently is small.
Upcoming Improvements:
• Speed up the transfer rate of device checkpoint and vGPU memory data.
• Pre-copy vGPU memory data to reduce stun time to meet or exceed vMotion’s standard of 1 second.
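Why pre-copying can shrink stun time is easier to see with a small model. The sketch below is a generic, idealized model of iterative pre-copy as used in live migration, not VMware's implementation. Each round resends what was dirtied during the previous round, and the VM is stunned once the remainder fits within the stun budget. All parameter names and values are hypothetical.

```python
def precopy_rounds(memory_gib, dirty_gib_per_s, link_gib_per_s,
                   stun_budget_s=1.0, max_rounds=20):
    """Idealized iterative pre-copy.

    Returns (rounds, remaining_gib, stun_seconds), where stun_seconds is the
    time needed to send what is still dirty once the VM is stunned.
    """
    remaining = memory_gib
    rounds = 0
    while rounds < max_rounds and remaining / link_gib_per_s > stun_budget_s:
        copy_time = remaining / link_gib_per_s
        # Memory dirtied while this round was copying must be sent again.
        remaining = min(memory_gib, dirty_gib_per_s * copy_time)
        rounds += 1
    return rounds, remaining, remaining / link_gib_per_s


if __name__ == "__main__":
    rounds, remaining, stun = precopy_rounds(
        memory_gib=8, dirty_gib_per_s=0.25, link_gib_per_s=1.0)
    print(f"{rounds} pre-copy rounds, {remaining} GiB left, stun of about {stun} s")
```

In this model pre-copy converges only when the workload dirties memory more slowly than the link can send it; otherwise the loop stops at `max_rounds` with the full memory still to transfer.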