TRANSCRIPT
HUAWEI TECHNOLOGIES CO., LTD.
www.huawei.com
Huawei Confidential
Weidong Han <[email protected]>
August, 2015
Live Migration with SR-IOV Pass-through
Agenda
Background
Prototype
Evaluation
Summary
Background
VM live migration is one of the most important features of virtualization
SR-IOV migration is required
NICs are becoming more powerful: 10 Gbit -> 40 Gbit -> 100 Gbit
Live Migration Algorithm
Source:
Connect
Send Config
Enable dirty page logging
Send dirty pages iteratively
Suspend
Send last dirty pages
Send VM state
Disconnect
Destroy VM
Destination:
Create VM
Receive memory
Receive memory
Restore VM state
Unpause VM
Timeline labels in the figure: VM Downtime, Migration Time
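The iterative phase of the algorithm above can be sketched as a small simulation. This is only an illustration, not the QEMU implementation: the function name, the `dirty_rate` model (the guest re-dirties a fixed fraction of the pages sent in each round), and the `stop_threshold` and `max_rounds` parameters are all assumptions made for the example.

```python
import random


def precopy_migrate(num_pages, dirty_rate, max_rounds=30, stop_threshold=8, seed=0):
    """Simulate pre-copy migration.

    Returns (rounds, pages sent while the VM runs, pages sent during downtime).
    """
    rng = random.Random(seed)
    dirty = set(range(num_pages))  # first pass: every page must be sent
    sent_live = 0
    rounds = 0
    # Iterative phase: the VM keeps running and re-dirties pages.
    while len(dirty) > stop_threshold and rounds < max_rounds:
        sent_live += len(dirty)
        # Hypothetical model: pages dirtied during this round are a fixed
        # fraction of the pages that were just sent.
        redirtied = int(len(dirty) * dirty_rate)
        dirty = set(rng.sample(range(num_pages), redirtied))
        rounds += 1
    # Stop-and-copy phase: the VM is suspended, so the dirty set is final.
    sent_down = len(dirty)
    return rounds, sent_live, sent_down
```

In this model, downtime grows with the number of pages left for the final stop-and-copy step, which is why a workload that dirties memory as fast as it can be sent never converges and has to be cut off by `max_rounds`.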
Challenges
How to migrate hardware state of the assigned device?
Some registers of existing NICs are not writable
Bonding driver (VF and virtio-net) in VM
Performance is not consistent
CPU consumption is not consistent
Hot plugging device increases downtime
Agenda
Background
Prototype
Evaluation
Summary
Ideally, Hardware can help
I/O registers are readable and writable
NIC Driver provides suspend and resume functions
Suspend: save hardware state
Resume: restore hardware state
Prototype Overview
[Figure: architecture diagram. Labels: Host, VM, libvirt, QEMU, iproute2, PF driver, VF driver, and a NIC exposing PF … VF VF VF]
libvirt
Migration check, prepare VM config
QEMU
Implement savevm handlers (save and load) for the assigned device
Use iproute2 commands to notify the SR-IOV driver of migration
iproute2
Add commands: migrate, cancelmigration, suspend, resume
PF driver
Notify the VF driver of migration operations
VF driver
DMA dirty page logging
Suspend and resume VF state
Note: based on a Huawei NIC prototype
Live Migration Algorithm with SR-IOV Pass-through
Source:
Connect
Send Config
Enable dirty page logging
Send dirty pages iteratively
Suspend
Send last dirty pages
Send VM state
Disconnect
Destroy VM
Destination:
Create VM
Receive memory
Receive memory
Restore VM state
Unpause VM
SR-IOV additions (annotations in the figure):
Notify VF driver to track DMA dirty pages
Configure destination VF
Notify VF driver to suspend assigned VF; save state of assigned device in QEMU
Load state of assigned VF in QEMU; notify VF driver to resume VF
Iproute2 Migration Commands
iproute2 can set the VF state from kernel 3.12 onward
# ip link set <pf> vf <vf_index> state auto|enable|disable
Extend the iproute2 VF state set command
# ip link set <pf> vf <vf_index> state auto|enable|disable|migrate|cancelmigration|suspend|resume
The PF driver receives migration commands from iproute2 and passes them to the VF driver via mailbox
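As a sketch of how a management layer might build these commands, the helper below assembles the argument list for the `ip link set ... vf ... state ...` form shown above. The helper name and the interface name `eth0` are hypothetical. The states `migrate`, `cancelmigration`, `suspend`, and `resume` are the prototype's extension described on this slide; they are assumed to exist only in the patched iproute2, so the command would be rejected by a stock build.

```python
UPSTREAM_STATES = ("auto", "enable", "disable")
# Assumption: these four states exist only in the prototype's patched iproute2.
PROTOTYPE_STATES = ("migrate", "cancelmigration", "suspend", "resume")


def vf_state_argv(pf, vf_index, state):
    """Build the argv for setting a VF's state through its PF."""
    if state not in UPSTREAM_STATES + PROTOTYPE_STATES:
        raise ValueError(f"unknown VF state: {state!r}")
    if vf_index < 0:
        raise ValueError("vf_index must be non-negative")
    return ["ip", "link", "set", pf, "vf", str(vf_index), "state", state]


# Example: ask the PF driver to forward a suspend request to VF 2.
# The list could be passed to subprocess.run() on a host with the patched tool.
argv = vf_state_argv("eth0", 2, "suspend")
```

Building an argument list rather than a shell string avoids quoting problems if the interface name ever comes from configuration.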
DMA Dirty Pages Logging
Memory accesses by DMA cannot be tracked by the page table (e.g., EPT)
The VF driver uses dummy writes (read and write back a byte at the same address) to mark the page dirty, so the memory can then be tracked
This has almost no impact on performance
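The dummy-write idea can be illustrated with a toy model. This is not driver code: `TrackedMemory` is a hypothetical stand-in for guest memory with page-table dirty logging, where only writes on the CPU path are recorded, and `dummy_write` plays the role of the VF driver touching each page of a DMA buffer.

```python
PAGE_SIZE = 4096


class TrackedMemory:
    """Toy guest memory: CPU writes are logged per page, DMA writes are not."""

    def __init__(self, num_pages):
        self.data = bytearray(num_pages * PAGE_SIZE)
        self.dirty = set()

    def cpu_write(self, addr, value):
        self.data[addr] = value
        self.dirty.add(addr // PAGE_SIZE)  # visible to the dirty-page tracker

    def dma_write(self, addr, payload):
        # Device writes bypass the CPU page tables, so nothing is logged.
        self.data[addr:addr + len(payload)] = payload


def dummy_write(mem, addr, length):
    """Touch one byte in every page of [addr, addr + length) via the CPU path."""
    first = addr // PAGE_SIZE
    last = (addr + length - 1) // PAGE_SIZE
    for page in range(first, last + 1):
        a = max(addr, page * PAGE_SIZE)
        mem.cpu_write(a, mem.data[a])  # read the byte, write it back unchanged


mem = TrackedMemory(4)
mem.dma_write(4000, b"x" * 200)       # spans pages 0 and 1, not logged
dirty_before = set(mem.dirty)
dummy_write(mem, 4000, 200)           # now both pages are marked dirty
```

One byte per page is enough because dirty logging works at page granularity, and the write-back leaves the DMA payload unchanged.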
VF State Migration
VF suspend
The VF driver saves internal hardware state and brings the interface down
QEMU saves the state of the assigned VF via registered savevm handlers
VF resume
QEMU restores the state of the assigned VF via registered savevm handlers
The VF driver restores internal hardware state, brings the interface up, and sends an ARP
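The ordering of the suspend and resume steps can be sketched with a toy model. Everything here is hypothetical: `ToyVF`, the `rx_head`/`tx_head` fields, and the JSON encoding only stand in for the real hardware state and for the stream written by QEMU's savevm handlers, neither of which is described on the slide.

```python
import json


class ToyVF:
    """Toy VF with a small dict standing in for internal hardware state."""

    def __init__(self):
        self.link_up = True
        self.hw_state = {"rx_head": 0, "tx_head": 0}  # hypothetical fields
        self.log = []

    def suspend(self):
        saved = dict(self.hw_state)   # driver saves internal hardware state
        self.link_up = False          # then brings the interface down
        self.log += ["save", "down"]
        return saved

    def resume(self, saved):
        self.hw_state = dict(saved)   # driver restores internal hardware state
        self.link_up = True           # brings the interface up
        self.log += ["restore", "up", "send_arp"]  # and announces itself


def migrate_vf_state(src, dst):
    saved = src.suspend()
    stream = json.dumps(saved)        # stand-in for the savevm "save" handler
    loaded = json.loads(stream)       # stand-in for the savevm "load" handler
    dst.resume(loaded)


src, dst = ToyVF(), ToyVF()
src.hw_state["rx_head"] = 7
dst.link_up = False                   # destination VF starts out inactive
migrate_vf_state(src, dst)
```

The ARP at the end matters because the VM's traffic now has to reach a different physical port, so the network must relearn where the VM's MAC address lives.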
Agenda
Background
Prototype
Evaluation
Summary
Test Environment
Host
CPU: Huawei RH2288v2 (Xeon CPU E5-2620 [email protected])
NIC:
Huawei smart NIC prototype (for pass-through)
Broadcom Corporation NetXtreme BCM5719 Gigabit (VM data transfer for migration)
Storage: Huawei OceanSpace S5500T, through IPSAN
VM
SLES11 SP3 64bit, 4 CPU, 4GB Memory
Results
Our prototype has little impact on VM migration time and downtime.
[Figure: bar chart of VM Migration Time (ms), y-axis 9000 to 9350, for VM with virtio-net (idle), VM with virtio-net (1G), VM with VF (idle), VM with VF (1G), VM with VF (5G)]
[Figure: bar chart of VM Downtime (ms), y-axis 0 to 180, for the same five configurations]
Note: tested with the default QEMU max_downtime setting, so this is not the minimal downtime
Results (cont.)
Normally the network downtime of a VM with a VF is a bit larger than that of a VM with virtio-net
Additional time for VF suspend and resume via the VF driver: suspend takes about 5 ms, resume takes about 20 ms (needs optimization)
The network downtime in the 5G workload case is large (needs fixing)
Configuration                VM Downtime (ms)   VM Network Downtime (ms)
VM with virtio-net (idle)    141.4              196
VM with virtio-net (1G)      120.6              282
VM with VF (idle)            143.8              362
VM with VF (1G)              157.6              386
VM with VF (5G)              139.4              1622
Agenda
Background
Prototype
Evaluation
Summary
Summary
Demonstrated a prototype of SR-IOV migration with hardware and driver help
The evaluation results show that it basically performs well
Improvements are still needed
Hope more future NICs will be friendly to live migration!
Thank you