live block device operations in qemu - fosdem › 2018 › schedule › event › vai... ·...

54
Live Block Device Operations in QEMU Kashyap Chamarthy <[email protected]> FOSDEM 2018 Brussels 1 / 34

Upload: others

Post on 06-Jun-2020

21 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Live Block Device Operationsin QEMUKashyap Chamarthy <[email protected]>FOSDEM 2018Brussels

1 / 34

Page 2: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Part IBackground

2 / 34

Page 3: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

KVM / QEMU virtualization components

Linux: KVM – /dev/kvm

VM1QEMU

VM2QEMU

libvirtd

OpenStackCompute (Nova) KubeVirtlibguestfs

guestfish

(Virt Driver)

QMP QMP

3 / 34

Page 4: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

In this talk

libvirtd

QEMU

Host: Linux - VFS, FS, Block Layer

Hardware: physical disk

Block Layer

qcow2 raw ...

file NBD ...

QMP

Talk focusImage format

I/O protocol

4 / 34

Page 5: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

QEMU’s block subsystem

Emulated storage devices: SCSI, IDE, virtio-blk, . . .

$ qemu-system-x86_64 -device help

Block driver types:– Format: qcow2, raw, vmdk– I/O Protocol: NBD, file, RBD/Ceph

Block device operations:– Offline image manipulation: qemu-img, qemu-nbd– Live: snapshots, image streaming, storage migration, . . .

5 / 34

Page 6: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

QEMU’s block subsystem

Emulated storage devices: SCSI, IDE, virtio-blk, . . .

$ qemu-system-x86_64 -device help-device help

Block driver types:– Format: qcow2, raw, vmdk– I/O Protocol: NBD, file, RBD/Ceph

Block device operations:– Offline image manipulation: qemu-img, qemu-nbd– Live: snapshots, image streaming, storage migration, . . .

5 / 34

Look for "Storage devices:"

Page 7: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

QEMU Copy-On-Write overlays

base (raw) overlay (qcow2)

– Read from the overlay if allocated, otherwise from base– Write to overlay only

Use cases: Thin provisioning, snapshots, backups, . . .

Create a minimal backing chain:

$ qemu-img create -f raw base.raw 5G

$ qemu-img create -f qcow2 overlay.qcow2 2G \-b base.raw -F raw

↑ ↑(Backing file) (Backing file format)

6 / 34

Page 8: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

QEMU Copy-On-Write overlays

base (raw) overlay (qcow2)

– Read from the overlay if allocated, otherwise from base– Write to overlay only

Use cases: Thin provisioning, snapshots, backups, . . .

Create a minimal backing chain:

$ qemu-img create -f raw base.raw 5G

$ qemu-img create -f qcow2 overlay.qcow2 2G \-b base.raw-b base.raw -F raw-F raw

↑ ↑(Backing file) (Backing file format)

6 / 34

Page 9: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Accessing disk images opened by QEMU

base overlay1 overlay2

(Live QEMU)

Disk images that are opened by QEMU should notbe accessed by external tools– QEMU offers equivalent monitor commands

For safe, read-only access, use libguestfs:

$ guestfish –ro -i -a disk.img

7 / 34

Page 10: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Disk image locking

Prevents two concurrent writers to a disk image– Using Linux Open File Description (OFD) Locks

To query an in use disk image:

$ qemu-img info foo.qcow2 ––force-share

When launching QEMU (2.10+):

$ qemu-system-x86_64\-blockdev driver=qcow2,file.driver=file\file.filename=./foo.qcow2,file.locking=auto,\[...]

8 / 34

Page 11: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Disk image locking

Prevents two concurrent writers to a disk image– Using Linux Open File Description (OFD) Locks

To query an in use disk image:

$ qemu-img info foo.qcow2 ––force-share––force-share

When launching QEMU (2.10+):

$ qemu-system-x86_64\-blockdev driver=qcow2,file.driver=file\file.filename=./foo.qcow2,file.locking=auto,\[...]

8 / 34

Allows read-only access to an activedisk; may return stale data

Page 12: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Disk image locking

Prevents two concurrent writers to a disk image– Using Linux Open File Description (OFD) Locks

To query an in use disk image:

$ qemu-img info foo.qcow2 ––force-share

When launching QEMU (2.10+):

$ qemu-system-x86_64\-blockdev driver=qcow2,file.driver=file\file.filename=./foo.qcow2,file.locking=auto,\[...]

8 / 34

Page 13: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Disk image locking

Prevents two concurrent writers to a disk image– Using Linux Open File Description (OFD) Locks

To query an in use disk image:

$ qemu-img info foo.qcow2 ––force-share

When launching QEMU (2.10+):

$ qemu-system-x86_64\-blockdev driver=qcow2,file.driver=file\file.filename=./foo.qcow2,file.locking=autofile.locking=auto,\[...]

8 / 34

Defaults to OFD lockingon Linux 3.15+

Page 14: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Part IIPrimer on operating QEMU

9 / 34

Page 15: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

QEMU’s QMP monitor

Provides a JSON RPC interface– Send commands to query / modify VM state– QMP (asynchronous) events on certain state changes

If you zoom into libvirt-generated QEMU command-line:

/usr/bin/qemu-system-x86_64 [...]\-chardev socket,id=charmonitor,\path=[...]/monitor.sock,server,nowait\-mon chardev=charmonitor,id=monitor,mode=control

Shorthand for the above:

-qmp unix:./qmp-sock,server,nowait

10 / 34

Page 16: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

QEMU’s QMP monitor

Provides a JSON RPC interface– Send commands to query / modify VM state– QMP (asynchronous) events on certain state changes

If you zoom into libvirt-generated QEMU command-line:

/usr/bin/qemu-system-x86_64 [...]\-chardev socket,id=charmonitor,\-chardev socket,id=charmonitor,\path=[...]/monitor.sock,server,nowait\path=[...]/monitor.sock,server,nowait\-mon chardev=charmonitor,id=monitor,mode=control-mon chardev=charmonitor,id=monitor,mode=control

Shorthand for the above:

-qmp unix:./qmp-sock,server,nowait

10 / 34

UNIX stream socket setup forlibvirt ←→ QEMUcommunication

Page 17: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

QEMU’s QMP monitor

Provides a JSON RPC interface– Send commands to query / modify VM state– QMP (asynchronous) events on certain state changes

If you zoom into libvirt-generated QEMU command-line:

/usr/bin/qemu-system-x86_64 [...]\-chardev socket,id=charmonitor,\path=[...]/monitor.sock,server,nowait\-mon chardev=charmonitor,id=monitor,mode=control

Shorthand for the above:

-qmp unix:./qmp-sock,server,nowait

10 / 34

Page 18: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Interacting with the QMP monitor

$ socat UNIX:./qmp-sock \READLINE,history=$HOME/.qmp_history \

{"QMP": {"version":{"qemu": {"micro": 50, "minor": 11, "major": 2},"package": "(v2.11.0-355-g281f327487)"},"capabilities": []}}

{"execute": "qmp_capabilities"}{"return": {}}

{"execute": "query-status"}{"return": {"status": "running", "singlestep": false,

"running": true} }

Send arbitrary commands: query-kvm, blockdev-backup, ...

11 / 34

Page 19: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Interacting with the QMP monitor

$ socat UNIX:./qmp-sock \READLINE,history=$HOME/.qmp_history \

{"QMP": {"version":{"QMP": {"version":{"qemu": {"micro": 50, "minor": 11, "major": 2},{"qemu": {"micro": 50, "minor": 11, "major": 2},"package": "(v2.11.0-355-g281f327487)"},"package": "(v2.11.0-355-g281f327487)"},"capabilities": []}}"capabilities": []}}

{"execute": "qmp_capabilities"}{"return": {}}

{"execute": "query-status"}{"return": {"status": "running", "singlestep": false,

"running": true} }

Send arbitrary commands: query-kvm, blockdev-backup, ...

11 / 34

Indicates successfulconnection

Page 20: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Interacting with the QMP monitor

$ socat UNIX:./qmp-sock \READLINE,history=$HOME/.qmp_history \

{"QMP": {"version":{"qemu": {"micro": 50, "minor": 11, "major": 2},"package": "(v2.11.0-355-g281f327487)"},"capabilities": []}}

{"execute": "qmp_capabilitiesqmp_capabilities"}{"return": {}}

{"execute": "query-status"}{"return": {"status": "running", "singlestep": false,

"running": true} }

Send arbitrary commands: query-kvm, blockdev-backup, ...

11 / 34

Prerequisite

Page 21: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Interacting with the QMP monitor

$ socat UNIX:./qmp-sock \READLINE,history=$HOME/.qmp_history \

{"QMP": {"version":{"qemu": {"micro": 50, "minor": 11, "major": 2},"package": "(v2.11.0-355-g281f327487)"},"capabilities": []}}

{"execute": "qmp_capabilities"}{"return": {}}

{"execute": "query-statusquery-status"}{"return": {"status": "running", "singlestep": false,

"running": true} }

Send arbitrary commands: query-kvm, blockdev-backup, ...

11 / 34

Issue regular QMP commands

Page 22: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Interacting with the QMP monitor

$ socat UNIX:./qmp-sock \READLINE,history=$HOME/.qmp_history \

{"QMP": {"version":{"qemu": {"micro": 50, "minor": 11, "major": 2},"package": "(v2.11.0-355-g281f327487)"},"capabilities": []}}

{"execute": "qmp_capabilities"}{"return": {}}

{"execute": "query-status"}{"return": {"status": "running", "singlestep": false,

"running": true} }

Send arbitrary commands: query-kvm, blockdev-backup, ...

11 / 34

Invoking JSON manually is no fun —thankfully, libvirt automates it all

Page 23: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Other ways to interact with QMP monitor

qmp-shell: A low-level tool located in QEMU source.Takes key-value pairs (& JSON dicts):

$ qmp-shell -v -p ./qmp-sock(QEMU) block-job-complete device=virtio1

virsh: libvirt’s shell interface

$ virsh qemu-monitor-command \vm1 ––pretty '{"execute":"query-kvm"}'

NB: Modifying VM state behind libvirt’s back voids support warranty!

Useful for test / development

12 / 34

Page 24: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Part IIIConfiguring block devices

13 / 34

Page 25: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Aspects of a QEMU block device

QEMU block devices have a notion of a:

Frontend — guest-visible devices (IDE, SCSI, virtio-blk, ...)

-device: command-linedevice_add: run-time; like any other type of guest device

Backend — block drivers (NBD, qcow2, raw, ...)

-drive (legacy) / -blockdev: command-lineblockdev-add: run-time

14 / 34

Page 26: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Configure on command-line: -blockdev

Provides fine-grained control over configuring block devices

Since QEMU 2.9; successor to -drive option

E.g. Attach a qcow2 disk to a virtio-blk guest device:

$ qemu-system-x86_64 [...]\-blockdev node-name=node1,driver=qcow2,\file.driver=file,file.filename=./base.qcow2\-device virtio-blk,drive=node-Base

→ More details: Talks from previous KVM Forums

15 / 34

Page 27: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Configure on command-line: -blockdev

Provides fine-grained control over configuring block devices

Since QEMU 2.9; successor to -drive option

E.g. Attach a qcow2 disk to a virtio-blk guest device:

$ qemu-system-x86_64 [...]\-blockdev node-name=node1,driver=qcow2,\-blockdev node-name=node1,driver=qcow2,\file.driver=file,file.filename=./base.qcow2\file.driver=file,file.filename=./base.qcow2\-device virtio-blk,drive=node-Base

→ More details: Talks from previous KVM Forums

15 / 34

Configures the ‘backend’

Page 28: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Configure on command-line: -blockdev

Provides fine-grained control over configuring block devices

Since QEMU 2.9; successor to -drive option

E.g. Attach a qcow2 disk to a virtio-blk guest device:

$ qemu-system-x86_64 [...]\-blockdev node-name=node1,driver=qcow2,\file.driver=file,file.filename=./base.qcow2\-device virtio-blk,drive=node-Base-device virtio-blk,drive=node-Base

→ More details: Talks from previous KVM Forums

15 / 34

Configures the ‘frontend’(guest-visible device)

Page 29: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Configure on command-line: -blockdev

Provides fine-grained control over configuring block devices

Since QEMU 2.9; successor to -drive option

E.g. Attach a qcow2 disk to a virtio-blk guest device:

$ qemu-system-x86_64 [...]\-blockdev node-name=node1,driver=qcow2,\file.driver=file,file.filename=./base.qcow2\-device virtio-blk,drive=node-Base

→ More details: Talks from previous KVM Forums

15 / 34

Page 30: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Configure at run-time: QMP blockdev-add

Add a qcow2 block device at run-time:

{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "node-A","file": {"driver": "file",

"filename": "disk-A.qcow2"}}}

Command-line is 1:1 mapping of JSON (from above):

-blockdev driver=qcow2,node-name=node-A,\file.driver=file,file.filename=./disk-A.qcow2

→ Here too, refer to previous KVM Forum talks

16 / 34

Page 31: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Part IVLive block operations

17 / 34

Page 32: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

blockdev-snapshot-sync: External snapshots

When invoked while the guest is running:

1. the existing disk becomes the backing file; and2. a QCOW2 overlay is created to track new writes

Base image can be of any format; overlays must be QCOW2

Allows atomic live snapshot of multiple disks

No guest downtime — snapshot creation is instantaneous

18 / 34

Page 33: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

blockdev-snapshot-sync: Example

base

(Live QEMU)

Create an external snapshot (qmp-shell invocation):

blockdev-snapshot-sync node-name=node-Base \snapshot-file=overlay1.qcow2 \snapshot-node-name=node-Overlay1

Result:

base overlay1

(Live QEMU)

19 / 34

Page 34: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

blockdev-snapshot-sync: Example

base

(Live QEMU)

Create an external snapshot (qmp-shell invocation):

blockdev-snapshot-sync node-name=node-Base \blockdev-snapshot-sync node-name=node-Base \snapshot-file=overlay1.qcow2 \snapshot-file=overlay1.qcow2 \snapshot-node-name=node-Overlay1snapshot-node-name=node-Overlay1

Result:

base overlay1

(Live QEMU)

19 / 34

libvirt equivalent:$ virsh snapshot-create-as vm1 ––disk-only

Page 35: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

blockdev-snapshot-sync: Example

base

(Live QEMU)

Create an external snapshot (qmp-shell invocation):

blockdev-snapshot-sync node-name=node-Base \snapshot-file=overlay1.qcow2 \snapshot-node-name=node-Overlay1

Result:

base overlay1

(Live QEMU)

19 / 34

Page 36: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

But...long image chains can get cumbersome

base overlay1 overlay2 overlay3

(Live QEMU)

Problems:- Revert to external snapshot is non-trivial- Multiple files to track- I/O penalty with a long disk image chain

There are some solutions. . .

20 / 34

Page 37: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

commit: Live merge a disk image chain (1)

base overlay1 overlay2 overlay3

(Live QEMU)

Problem: Shorten the chain of overlays

Simplest case: Merge all of them into ’base’

base

overlay1 overlay2 overlay3

21 / 34

Page 38: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

commit: Live merge a disk image chain (2)

base

overlay1 overlay2 overlay3

Run-time invocation (using qmp-shell):

blockdev-snapshot-sync [...]block-commit device=node-Overlay3 job-id=jobAblock-job-complete device=jobA

22 / 34

Page 39: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

commit: Live merge a disk image chain (2)

base

overlay1 overlay2 overlay3

Run-time invocation (using qmp-shell):

blockdev-snapshot-sync [...]blockdev-snapshot-sync [...]block-commit device=node-Overlay3 job-id=jobAblock-job-complete device=jobA

22 / 34

Invoke it thrice – to create 3 overlays

Page 40: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

commit: Live merge a disk image chain (2)

base

overlay1 overlay2 overlay3

Run-time invocation (using qmp-shell):

blockdev-snapshot-sync [...]block-commit device=node-Overlay3 job-id=jobAblock-commit device=node-Overlay3 job-id=jobAblock-job-complete device=jobA

22 / 34

Copy content from all the overlays into base

Page 41: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

commit: Live merge a disk image chain (2)

base

overlay1 overlay2 overlay3

Run-time invocation (using qmp-shell):

blockdev-snapshot-sync [...]block-commit device=node-Overlay3 job-id=jobAblock-job-complete device=jobAblock-job-complete device=jobA

22 / 34

Gracefully end the ‘commit’ job, and pivot QEMU

Page 42: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

commit: Live merge a disk image chain (2)

base

overlay1 overlay2 overlay3

Run-time invocation (using qmp-shell):

blockdev-snapshot-sync [...]blockdev-snapshot-sync [...]block-commit device=node-Overlay3 job-id=jobAblock-commit device=node-Overlay3 job-id=jobAblock-job-complete device=jobAblock-job-complete device=jobA

22 / 34

libvirt equivalent:$ virsh blockcommit vm1 vda ––pivot

Page 43: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

commit: Live merge a disk image chain (3)

base

overlay1 overlay2 overlay3

Two phase (sync + pivot) operation = a consolidated disk:

base overlay1 overlay2 overlay3

(Live QEMU)

(invalid) (invalid) (invalid)

23 / 34

Page 44: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

stream: Copy from backing files to overlays

A bit similar to ‘commit’, but in the opposite direction

Specifics:

‘stream’ operation is safe – data is being pulled forwardIntermediate overlays remain valid – unlike ‘commit’

Intermediate image streaming (from QEMU 2.8+)

24 / 34

Page 45: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

mirror: Synchronize active disk to another image

base overlay1 overlay2 overlay3

copy

(Live QEMU)

Synchronization modes:‘full’ – copy the entire chain‘top’ – only from the topmost (active) image‘none’ – copy only new writes from now on

25 / 34

Page 46: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

mirror: Operation

base overlay1 overlay2 overlay3

copy

(Live QEMU)

drive-mirror [...] target=copy1.qcow2 sync=fullquery-block-jobsblock-job-complete device=virtio0

26 / 34

Page 47: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

QEMU NBD server

Network Block Device server — built into QEMU– Lets you export images while in use

QMP commands:

nbd-server-start addr={"type":"unix","data":{"path":"./nbd-sock"}}}

nbd-server-add device=TargetDisk writable=true

nbd-server-stop

→ External program for offline use: qemu-nbd

27 / 34

Page 48: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Combining ‘mirror’ and NBD

Use case: Live VM migration with non-shared storage

1. Destination QEMU sets up the NBD server

2. Source QEMU issues drive-mirror to synchronize disk(s):

{"execute": "drive-mirror",\"arguments":{

"device": "disk0",\"target": "nbd:dest:49153:exportname=disk0",\"sync": "top", "mode": "existing"

}

Details: qemu/docs/interop/live-block-operations.rst

28 / 34

Page 49: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

mirror + NBD: libvirt automation

NBD-based live storage migration as done by libvirt:

$ virsh migrate \–-live \––verbose \––p2p \––copy-storage-all \vm1 \qemu+ssh://root@desthost/system

Higher layers, such as OpenStack, use the equivalentlibvirt Python APIs: migrateToURI[2,3]()

29 / 34

Page 50: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

backup: Point-in-time copy of a block device

Point-in-time:– For backup: when you start the operation– For mirror: when you end the sync

(via block-job-complete)

Synchronization modes for backup:– ‘top’, ‘full’, ‘none’– ‘incremental’

↖(For incremental backups; WIP as of 2.12)

qemu/docs/interop/bitmaps.rst

30 / 34

Page 51: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Combining ‘backup’ and NBD

Use case: Examine guest I/O patterns

base overlay1 overlay2 overlay3

copy

(sync=none)

(Live QEMU)

{"execute":"blockdev-backup","arguments":{"device": "node-Overlay3", "target": "copy",

"job-id": "job0", "sync": "none"}}[...] # Start NBD server and export the ‘copy’

31 / 34

Page 52: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Summary

commit: Move data from overlays into backing files

stream: Move data from backing files into overlays

mirror: Live VM migration with non-shared storage

backup: Point-in-time copy

qemu/docs/interop/live-block-operations.rst qemu/docs/interop/bitmaps.rst

32 / 34

Page 53: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

References

Detailed docs with examples: Live Block Device Operationshttps://kashyapc.fedorapeople.org/QEMU-Docs/_build/html/docs/live-block-operations.html

"Incremental Backups - Good things come in small packages!" by John Snowhttps://fosdem.org/2017/schedule/event/backup_dr_incr_backups/

"Managing the New Block Layer" by Kevin Wolf & Max Reitzhttps://events.static.linuxfound.org/sites/events/files/slides/talk_11.pdf

"Backing Chain Management in libvirt and qemu" by Eric Blakehttp://events.linuxfoundation.org/sites/events/files/slides/2015-qcow2-expanded.pdf

"Towards a more expressive and introspectable QEMU command line"https://events.static.linuxfound.org/sites/events/files/slides/armbru-qapi-cmdline_1.pdf

33 / 34

Page 54: Live Block Device Operations in QEMU - FOSDEM › 2018 › schedule › event › vai... · KVM/QEMUvirtualizationcomponents Linux: KVM–/dev/kvm VM1 QEMU VM2 QEMU libvirtd OpenStack

Questions?

34 / 34