status update about colo ft - linux-kvm.org · huawei technologies co., ltd. status update about...

19
HUAWEI TECHNOLOGIES CO., LTD. www.huawei.com Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

Upload: lamkiet

Post on 12-May-2018

222 views

Category:

Documents


7 download

TRANSCRIPT

Page 1: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

www.huawei.com

Status Update About COLO FT

Hailiang Zhang (Huawei)

Randy Han (Huawei)

Page 2: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

Agenda

Introduce COarse-grain LOck-stepping

COLO Design and Technology Details

Current Status Of COLO In KVM

Further Work About COLO

Page 3: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

Non-Stop Service with VM Replication

Virtual Machine (VM) replication

A software solution for business continuity and disaster recovery

through application-agnostic hardware fault tolerance by replicating the

state of primary VM (PVM) to secondary VM (SVM) on different physical

node.

Page 4: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

Existing VM Replication Approaches

4

Replication Per Instruction: Lock-stepping

Execute in parallel for deterministic instructions

Lock and step for un-deterministic instructions

Replication Per Epoch: Continuous Checkpoint

Secondary VM is synchronized with Primary VM per

epoch

Output is buffered within an epoch

Page 5: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

Problems

5

Lock-stepping

Excessive replication overhead

memory access in an MP-guest is un-deterministic

Continuous Checkpoint

Extra network latency

Excessive VM checkpoint overhead

Page 6: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

What Is COLO

VM and Clients model

VM and Clients are a system of networked request-response system

Clients only care about the response from the VM

COarse-grain LOck-stepping VMs (COLO)

PVM and SVM execute in parallel

Duplicates client’s request stream to SVM

Compare the output packets from PVM and SVM

Synchronize SVM state with PVM when their responses (network

packets) are not identical

Page 7: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

Why Is COLO Better

7

Comparing with Continuous VM checkpoint

No buffering-introduced latency

Less checkpoint frequency

On demand vs periodic

Comparing with lock-stepping

Eliminate excessive overhead of un-deterministic instruction

execution due to MP-guest memory access

Page 8: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

Agenda

Introduce COarse-grain LOck-stepping

COLO Design and Technology Details

Current Status Of COLO In KVM

Further Work About COLO

Page 9: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

Architecture Of COLO

9

Primary VM Heartbeat

Qemu

Failover

VM Checkpoint

COLO Disk

manger

Virtual Machine MonitorCOLO proxy

(compare packets)

Kernel

Storage External Network

Primary Node

Virtual Machine Monitor

Kernel

StorageExternal Network

Secondary Node

Secondary VMHeartbeat

Qemu

Failover

VM Checkpoint

COLO Disk

manger

Internal Network

Disk IO

Net IO

COLO proxy

(adjust sequence)

COarse-grain LOck-stepping Virtual Machine for Non-stop Service

Page 10: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

How Block Replication Work

10

From SVM’s point of view: Its storage is disk-buffer whose parent backing file is

Secondary Disk (Or Shared Disk)

Checkpoint: Disk buffer will be emptied to achieve block replication

Failover: Disk buffer will be written back to the ‘parent’ disk

Shared disk workflowNon-Shared disk workflow

Primary Write

requests

Shared Disk

Secondary Write

Requests

Disk Buffer

Backing file

Read & Forward

Primary Write

requests

Primary Disk

Secondary Write

Requests

Disk Buffer

Secondary

Disk

Copy & Forward

Page 11: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

VM State Checkpointing

11

Based on live migration

PVM’s memory/device

data be stored in extra

memory-buffer of SVM

before be synchronized to

SVM

Need

checkpoint ?

Send ‘checkpoint-

request’

Wait for new

checkpoint

Suspend

Send ‘checkpoint-

reply’Wait for reply

Suspend

Migrate VM’s stateStore VM’s state into

memory buffer

Update SVM’s state from

memory buffer

Send ‘vmstate-load’Wait for SVM update

Resume Resume

VM Running

VM Running

Secondary NodePrimary Node

Execution and Checkpoint Flow in COLO

Page 12: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

COLO Proxy Design

Scheme:

Kernel scheme: Based on kernel TCP/IP stack and netfilter component

Can support vhost-net, virtio, e1000, rtl8139, etc

Better performance but less flexible (Need modify

netfilter/iptables and kernel)

Userspace scheme: Totally realized in QEMU

Based on QEMU’s netfilter components and SLIRP component

Not support vhost-net, but e1000, rtl8139

More flexible

Page 13: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

Proxy Design (Userspace scheme)

13

Filter mirror: copy and forward client’s packets to SVM

Filter redirector: redirect net packets

COLO compare: compare PVM’s and SVM’s net packets;

Filter rewriter: adjust tcp packets’ ack and tcp packets’ seq

Guest

Filter

mirror

Filter

redirector

COLO

compare

tap

Guest

Filter

redirector

Adjust

ack

Adjust

seq

Filter

redirector

If identical then release packets

Primary QEMU Secondary QEMU

tap

Client

tx rxall

Filter

rewriter

TCPFilter excute order

tx rx

Filter excute order

Page 14: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

COLO Performance In KVMPerformance (Based on kernel proxy)

14

90.07%

82.13%

77.83%

86.08%

92.98%

96.93%

86.51%89.10%

50.00%

55.00%

60.00%

65.00%

70.00%

75.00%

80.00%

85.00%

90.00%

95.00%

100.00%

WEBBENCH FTP GET FTP PUT NETPERF TCPNETPERF UDP SIEGE PGBENCH Kernel build

The experimental data is normalized to the native system

Page 15: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

Agenda

Introduce COarse-grain LOck-stepping

COLO Design and Technology Details

Current Status Of COLO In KVM

Further Work About COLO

Page 16: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

Status of COLO In KVM

16

COLO Framework:

Include VM state checkpoint process, failover process

Patch set v18 had been post

COLO block replication:

Only including non-shared storage replication scheme

Already been merged to branch

https://github.com/stefanha/qemu/commits/block-next

COLO proxy:

netfilter base/buffer/mirror plugins have been merged

Userspace packets compare is WIP and v11 version has been posted

Page 17: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

Agenda

Introduce COarse-grain LOck-stepping

COLO Design and Technology Details

Current Status Of COLO In KVM

Further Work About COLO

Page 18: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

HUAWEI TECHNOLOGIES CO., LTD.

TODO

18

Continuous VM replication development

Support shared storage

Optimize performance

Reduce VM’s downtime while do checkpoint

Improve storage and network performance

Implement the heartbeat component

Support COLO in libvirt

Page 19: Status Update About COLO FT - linux-kvm.org · HUAWEI TECHNOLOGIES CO., LTD.  Status Update About COLO FT Hailiang Zhang (Huawei) Randy Han (Huawei)

Copyright©2011 Huawei Technologies Co., Ltd. All Rights Reserved.

The information in this document may contain predictive statements including, without limitation,

statements regarding the future financial and operating results, future product portfolio, new technology,

etc. There are a number of factors that could cause actual results and developments to differ materially

from those expressed or implied in the predictive statements. Therefore, such information is provided for

reference purpose only and constitutes neither an offer nor an acceptance. Huawei may change the

information at any time without notice.

Thank youwww.huawei.com