Post on 12-May-2018
HUAWEI TECHNOLOGIES CO., LTD.
www.huawei.com
Status Update About COLO FT
Hailiang Zhang (Huawei)
Randy Han (Huawei)
Agenda
Introduce COarse-grain LOck-stepping
COLO Design and Technology Details
Current Status Of COLO In KVM
Further Work About COLO
Non-Stop Service with VM Replication
Virtual Machine (VM) replication
A software solution for business continuity and disaster recovery.
It provides application-agnostic hardware fault tolerance by replicating the
state of the primary VM (PVM) to a secondary VM (SVM) on a different physical
node.
Existing VM Replication Approaches
Replication Per Instruction: Lock-stepping
Execute in parallel for deterministic instructions
Lock and step for non-deterministic instructions
Replication Per Epoch: Continuous Checkpoint
The secondary VM is synchronized with the primary VM once per epoch
Output is buffered within an epoch
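As a rough illustration of why continuous checkpointing adds latency, the sketch below holds all output back until the end of the epoch. The class and method names (EpochCheckpointer, emit, end_epoch) are hypothetical and chosen only for this example; they do not come from any real implementation.

```python
class EpochCheckpointer:
    """Toy model of per-epoch replication with output buffering (hypothetical)."""

    def __init__(self):
        self.pending = []    # output held back during the current epoch
        self.released = []   # output that clients are allowed to see

    def emit(self, packet):
        # Output produced inside an epoch is buffered, not sent.
        self.pending.append(packet)

    def end_epoch(self, sync_to_secondary):
        # First bring the secondary VM up to date, then release the output.
        sync_to_secondary()
        self.released.extend(self.pending)
        self.pending.clear()


sync_calls = []
cp = EpochCheckpointer()
cp.emit("reply-1")
cp.emit("reply-2")
visible_before_sync = list(cp.released)   # nothing is visible yet
cp.end_epoch(lambda: sync_calls.append("checkpoint"))
visible_after_sync = list(cp.released)
```

Every reply waits for the end of its epoch, which is the extra network latency this approach is criticized for.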
Problems
Lock-stepping
Excessive replication overhead
Memory access in a multiprocessor (MP) guest is non-deterministic
Continuous Checkpoint
Extra network latency
Excessive VM checkpoint overhead
What Is COLO
VM and Clients model
The VM and its clients form a networked request-response system
Clients only care about the response from the VM
COarse-grain LOck-stepping VMs (COLO)
PVM and SVM execute in parallel
Duplicate the clients' request stream to the SVM
Compare the output packets from PVM and SVM
Synchronize SVM state with PVM when their responses (network
packets) are not identical
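The idea above can be sketched as a small loop: both VMs receive the same request, their responses are compared, and a checkpoint happens only on a mismatch. ToyVM, its drift field, and colo_step are hypothetical names for this illustration; a real VM's state is far richer than a counter.

```python
from dataclasses import dataclass


@dataclass
class ToyVM:
    """Stand-in for a VM whose whole state is one counter (hypothetical)."""
    counter: int = 0
    drift: int = 0  # hypothetical source of non-deterministic divergence

    def handle(self, request: int) -> int:
        self.counter += request + self.drift
        return self.counter


def colo_step(pvm: ToyVM, svm: ToyVM, request: int, stats: dict) -> int:
    """Feed one request to both VMs; checkpoint only if the responses differ."""
    primary_out = pvm.handle(request)
    secondary_out = svm.handle(request)  # duplicated request stream
    if primary_out != secondary_out:
        # Responses diverged: synchronize the SVM state with the PVM.
        svm.counter = pvm.counter
        svm.drift = 0
        stats["checkpoints"] += 1
    # The client only ever sees the primary's response.
    return primary_out


pvm, svm = ToyVM(), ToyVM()
stats = {"checkpoints": 0}
outputs = []
for i, req in enumerate([1, 2, 3, 4]):
    if i == 2:
        svm.drift = 5  # inject a one-off divergence in the secondary
    outputs.append(colo_step(pvm, svm, req, stats))
```

In this run only the third request triggers a checkpoint; the other three responses match and cost no synchronization.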
Why Is COLO Better
Comparing with Continuous VM checkpoint
No buffering-introduced latency
Lower checkpoint frequency
On demand vs periodic
Comparing with lock-stepping
Eliminates the excessive overhead of non-deterministic instruction
execution caused by MP-guest memory access
Architecture Of COLO
[Figure: COLO architecture, with a Primary Node and a Secondary Node.
Primary Node: Primary VM; Qemu with Heartbeat, Failover, VM Checkpoint and COLO Disk manager; Virtual Machine Monitor; COLO proxy (compare packets); Kernel; Storage; External Network.
Secondary Node: Secondary VM; Qemu with Heartbeat, Failover, VM Checkpoint and COLO Disk manager; Virtual Machine Monitor; COLO proxy (adjust sequence); Kernel; Storage; External Network.
Between the nodes: Internal Network, Disk IO, Net IO.]
COarse-grain LOck-stepping Virtual Machine for Non-stop Service
How Block Replication Works
From the SVM’s point of view, its storage is a disk buffer whose parent backing file is
the Secondary Disk (or Shared Disk)
Checkpoint: the disk buffer is emptied to achieve block replication
Failover: the disk buffer is written back to the ‘parent’ disk
[Figure: two block replication workflows.
Shared disk workflow: Primary Write requests, Shared Disk, Secondary Write Requests, Disk Buffer, Backing file, Read & Forward.
Non-Shared disk workflow: Primary Write requests, Primary Disk, Secondary Write Requests, Disk Buffer, Secondary Disk, Copy & Forward.]
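The checkpoint and failover behavior of the disk buffer can be pictured as a small overlay on top of a parent disk. This is a minimal sketch, assuming the disk is a plain dictionary of sectors; DiskBuffer and its methods are hypothetical names, not QEMU's block replication API.

```python
class DiskBuffer:
    """Toy overlay that sits on top of a parent disk (hypothetical model)."""

    def __init__(self, parent: dict):
        self.parent = parent   # Secondary Disk (or Shared Disk)
        self.buffer = {}       # SVM writes land here, never in the parent

    def write(self, sector: int, data: bytes) -> None:
        self.buffer[sector] = data

    def read(self, sector: int):
        # Prefer buffered data, fall back to the parent backing file.
        if sector in self.buffer:
            return self.buffer[sector]
        return self.parent.get(sector)

    def checkpoint(self) -> None:
        # Checkpoint: drop the SVM's private writes.
        self.buffer.clear()

    def failover(self) -> None:
        # Failover: the SVM takes over, so its writes become permanent.
        self.parent.update(self.buffer)
        self.buffer.clear()


secondary_disk = {0: b"boot"}
disk = DiskBuffer(secondary_disk)
disk.write(1, b"svm-only")
read_before_checkpoint = disk.read(1)
disk.checkpoint()
read_after_checkpoint = disk.read(1)   # the buffered write is gone
disk.write(2, b"keep")
disk.failover()
```

After the failover call the parent disk holds sector 2, while the write discarded at the checkpoint never reached it.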
VM State Checkpointing
Based on live migration
The PVM’s memory and device data are stored in an extra memory buffer
on the SVM before being synchronized to the SVM
[Figure: flowchart with one column per node.
Primary Node: VM Running; Need checkpoint?; Send ‘checkpoint-request’; Wait for reply; Suspend; Migrate VM’s state; Wait for SVM update; Resume; VM Running.
Secondary Node: VM Running; Wait for new checkpoint; Suspend; Send ‘checkpoint-reply’; Store VM’s state into memory buffer; Update SVM’s state from memory buffer; Send ‘vmstate-load’; Resume; VM Running.]
Execution and Checkpoint Flow in COLO
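The handshake in this flow can be sketched with two threads and two queues. The message names 'checkpoint-request', 'checkpoint-reply' and 'vmstate-load' come from the slide; the 'vmstate' message, the helper expect, and the assumption that the secondary is the side that sends 'vmstate-load' are illustrative guesses, not the actual QEMU protocol.

```python
import queue
import threading


def expect(q, name):
    """Block until the next message arrives and check its type."""
    kind, payload = q.get(timeout=5)
    if kind != name:
        raise RuntimeError(f"expected {name!r}, got {kind!r}")
    return payload


def primary_checkpoint(to_svm, from_svm, pvm_state, log):
    to_svm.put(("checkpoint-request", None))
    expect(from_svm, "checkpoint-reply")
    log.append("suspend")
    to_svm.put(("vmstate", dict(pvm_state)))  # hypothetical message name
    expect(from_svm, "vmstate-load")          # wait for the SVM update
    log.append("resume")


def secondary_checkpoint(from_pvm, to_pvm, svm_state, log):
    expect(from_pvm, "checkpoint-request")
    log.append("suspend")
    to_pvm.put(("checkpoint-reply", None))
    memory_buffer = expect(from_pvm, "vmstate")  # stage state in a buffer
    svm_state.clear()
    svm_state.update(memory_buffer)              # then apply it to the SVM
    to_pvm.put(("vmstate-load", None))
    log.append("resume")


p2s, s2p = queue.Queue(), queue.Queue()
pvm_state = {"ram": "page-v2", "cpu": 7}
svm_state = {"ram": "page-v1", "cpu": 3}
pvm_log, svm_log = [], []
threads = [
    threading.Thread(target=primary_checkpoint, args=(p2s, s2p, pvm_state, pvm_log)),
    threading.Thread(target=secondary_checkpoint, args=(p2s, s2p, svm_state, svm_log)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Staging the incoming state in a buffer before applying it means a transfer that fails halfway leaves the SVM's previous state untouched.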
COLO Proxy Design
Scheme:
Kernel scheme: Based on the kernel TCP/IP stack and the netfilter component
Supports vhost-net, virtio, e1000, rtl8139, etc.
Better performance but less flexible (requires modifying
netfilter/iptables and the kernel)
Userspace scheme: Implemented entirely in QEMU
Based on QEMU’s netfilter and SLIRP components
Does not support vhost-net, but supports e1000 and rtl8139
More flexible
Proxy Design (Userspace scheme)
Filter mirror: copy and forward the client’s packets to the SVM
Filter redirector: redirect network packets
COLO compare: compare the PVM’s and SVM’s network packets
Filter rewriter: adjust the ack and seq numbers of TCP packets
[Figure: userspace proxy data path.
Primary QEMU: Guest, Filter mirror, Filter redirector, COLO compare (if identical then release packets), tap.
Secondary QEMU: Guest, Filter redirector, Filter rewriter (adjust ack, adjust seq) for TCP, Filter redirector, tap.
Client side: Client.
Legend: filter execute order for tx, rx and all.]
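One plausible reason the rewriter has to touch seq and ack is that the SVM picks a different TCP initial sequence number (ISN) than the PVM, while the client only ever talks in the PVM's sequence space. The sketch below illustrates that assumption with modulo-2^32 arithmetic; the function names are hypothetical and this is not QEMU's filter-rewriter code.

```python
MOD = 1 << 32  # TCP sequence numbers wrap at 2**32


def seq_offset(pvm_isn: int, svm_isn: int) -> int:
    """Distance from the SVM's sequence space to the PVM's."""
    return (pvm_isn - svm_isn) % MOD


def adjust_seq(svm_seq: int, offset: int) -> int:
    """Outgoing SVM packet: shift its seq into the PVM's sequence space."""
    return (svm_seq + offset) % MOD


def adjust_ack(client_ack: int, offset: int) -> int:
    """Incoming client packet: shift its ack into the SVM's sequence space."""
    return (client_ack - offset) % MOD


pvm_isn, svm_isn = 4294967290, 100   # illustrative values only
offset = seq_offset(pvm_isn, svm_isn)

# Ten bytes into the stream, both guests should describe the same position.
svm_seq_rewritten = adjust_seq(svm_isn + 10, offset)
pvm_seq = (pvm_isn + 10) % MOD

# The client acknowledges in the PVM's space; the SVM needs its own numbering.
ack_seen_by_svm = adjust_ack(pvm_seq, offset)
```

With both directions rewritten, packets from the two guests that carry the same payload also carry the same sequence numbers, so a byte-for-byte comparison becomes meaningful.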
COLO Performance in KVM (based on kernel proxy)
Benchmark        Performance
WEBBENCH         90.07%
FTP GET          82.13%
FTP PUT          77.83%
NETPERF TCP      86.08%
NETPERF UDP      92.98%
SIEGE            96.93%
PGBENCH          86.51%
Kernel build     89.10%
The experimental data is normalized to the native system
Status of COLO In KVM
COLO Framework:
Includes the VM state checkpoint process and the failover process
Patch set v18 has been posted
COLO block replication:
Only includes the non-shared storage replication scheme
Already merged into the branch
https://github.com/stefanha/qemu/commits/block-next
COLO proxy:
netfilter base/buffer/mirror plugins have been merged
Userspace packet comparison is WIP; v11 has been posted
TODO
Continuous VM replication development
Support shared storage
Optimize performance
Reduce VM downtime during checkpoints
Improve storage and network performance
Implement the heartbeat component
Support COLO in libvirt
Copyright©2011 Huawei Technologies Co., Ltd. All Rights Reserved.
The information in this document may contain predictive statements including, without limitation,
statements regarding the future financial and operating results, future product portfolio, new technology,
etc. There are a number of factors that could cause actual results and developments to differ materially
from those expressed or implied in the predictive statements. Therefore, such information is provided for
reference purpose only and constitutes neither an offer nor an acceptance. Huawei may change the
information at any time without notice.
Thank you
www.huawei.com