Dive Into VM Live Migration

OpenStack Liberty Summit 2015, Vancouver

Michał Dulko

Michał Jastrzębski

Paweł Koniszewski

Why Bother?

Use Cases

o Imminent host failure
o Maintenance mode
o Optimal resource placement

Imminent Host Failure

o Cooling issues
o Storage problems
o Networking problems
o Your datacenter was struck by a flood

Maintenance Mode

o Firmware upgrades
o Hardware upgrades
o Kernel upgrades

Optimal Resource Placement

o Reduce costs
    o Move VMs closer to their storage to lessen network latency
    o Stack more VMs on hosts to save power
o Increase resiliency
    o Noisy neighbour separation
    o Spread VMs across more hosts

General Flow

Assumptions

o Live
o Consistent
o Transparent
o Minimal service disruption

Migrations in OpenStack

Non-live migration (cold migration)
    o nova migrate <server>

True live migration (shared storage or volume-based)
    o nova live-migration <server> [<host>]

Block live migration
    o nova live-migration --block-migrate <server> [<host>]

Compatibility

Migration type                     Local storage   Volumes   Shared storage
Block LM                           ✓               ✗         ✗
True LM                            ✗               ✓         ✓
Block LM with read-only devices    ✗               ✗         ✗
True LM with read-only devices     ✗               ✗         ✓
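
The compatibility table can be expressed as a small lookup, which may help when scripting the choice between block and true live migration. This is only a sketch: the dictionary keys ("block", "true", "local", "volumes", "shared") and the function name are illustrative and are not part of Nova.

```python
# Transcription of the compatibility table; keys and names are illustrative.
COMPATIBILITY = {
    # (migration type, has read-only devices): support per storage backend
    ("block", False): {"local": True, "volumes": False, "shared": False},
    ("true", False): {"local": False, "volumes": True, "shared": True},
    ("block", True): {"local": False, "volumes": False, "shared": False},
    ("true", True): {"local": False, "volumes": False, "shared": True},
}


def usable_migration_types(storage, read_only_devices=False):
    """Return the live migration types the table marks as supported."""
    return [
        kind
        for (kind, read_only), row in COMPATIBILITY.items()
        if read_only == read_only_devices and row[storage]
    ]


print(usable_migration_types("local"))
print(usable_migration_types("shared", read_only_devices=True))
```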

Live Migration Process

o Pre-migration
o Reservation
o Iterative pre-copy
o Stop and copy
o Commitment

Pre-migration

Active VM on physical host A; host B is selected by the scheduler or preselected.

[Diagram: compute nodes A and B; VM A is Active on node A.]

Reservation

Confirm availability of resources on host B; reserve a new VM.

[Diagram: VM A is Active on compute node A; VM A is Reserved on compute node B.]

Iterative pre-copy

Memory is transferred from A to B; pages dirtied during the transfer are then copied again, iteratively.

[Diagram: VM A is Active on compute node A and Paused on compute node B; memory is being copied from A to B.]
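
The iterative pre-copy phase can be illustrated with a toy model. The sketch below is not QEMU's algorithm: it assumes a constant dirty rate and constant bandwidth, and every name and number in it is illustrative. Each round resends the memory dirtied during the previous round, and the loop ends once the remainder fits into the allowed downtime.

```python
def simulate_precopy(memory_mib, bandwidth_mib_s, dirty_rate_mib_s,
                     max_downtime_s, max_rounds=30):
    """Return (rounds, converged, remaining_mib) for a simplified pre-copy."""
    remaining = float(memory_mib)
    rounds = 0
    # Keep copying while the remainder cannot be sent within the downtime budget.
    while remaining / bandwidth_mib_s > max_downtime_s:
        if rounds == max_rounds:
            return rounds, False, remaining
        transfer_time = remaining / bandwidth_mib_s
        # Pages dirtied while this round was in flight must be sent again;
        # the guest cannot dirty more memory than it has.
        remaining = min(float(memory_mib), dirty_rate_mib_s * transfer_time)
        rounds += 1
    return rounds, True, remaining


# A 16 GiB guest dirtying 100 MiB/s over a 1000 MiB/s link converges quickly.
print(simulate_precopy(16384, 1000, 100, 0.5))
# If the guest dirties memory faster than the link can carry it, it never does.
print(simulate_precopy(16384, 1000, 1200, 0.5))
```

In this model the second call gives up after `max_rounds`, which mirrors the pitfall that VMs with intensive memory workloads are hard to migrate.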

Stop and copy

Suspend the VM and copy the remaining pages and CPU state.

[Diagram: VM A is Paused on both compute node A and compute node B while the final copy takes place.]

Commitment

Host B becomes the primary host for VM A.

[Diagram: VM A is Active on compute node B.]

Performance & reliability

Pitfalls

o OpenStack does not allow triggering any operation on a VM during LM
o VMs with intensive memory workloads are hard to migrate
o LM generates heavy load on the network
o Migrations between compute nodes (CN) with different CPUs
o Memory oversubscription

Interacting With Live Migration

o OpenStack disallows any operation on an ongoing LM
o You can use virsh to interact with it instead

Diagnosis

o Information about ongoing LM
    virsh domjobinfo <domain>

    Time elapsed        1918595 ms
    Data processed      410.137 GiB
    Data remaining      4.600 GiB
    Data total          16.008 GiB
    Constant pages      144658
    Normal pages        107307605
    Normal data         409.346 GiB
    Expected downtime   1023 ms
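
The sample output is easier to interpret once the numbers are compared with each other. The sketch below parses the text shown on the slide and computes how many times the guest's memory has already been sent; the parser, its regular expression, and the function name are illustrative, and the optional colon is tolerated because real `virsh` output may place one after each label.

```python
import re

# Sample copied from the slide.
SAMPLE = """\
Time elapsed 1918595 ms
Data processed 410.137 GiB
Data remaining 4.600 GiB
Data total 16.008 GiB
Constant pages 144658
Normal pages 107307605
Normal data 409.346 GiB
Expected downtime 1023 ms
"""

# Label, optional colon, numeric value, optional unit.
LINE = re.compile(
    r"^(?P<key>[A-Za-z ]+?):?\s+(?P<value>\d+(?:\.\d+)?)(?:\s+(?P<unit>\S+))?\s*$"
)


def parse_domjobinfo(text):
    """Map each label to a (value, unit) tuple; unit is None when absent."""
    stats = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            stats[match["key"]] = (float(match["value"]), match["unit"])
    return stats


stats = parse_domjobinfo(SAMPLE)
processed, _ = stats["Data processed"]
total, _ = stats["Data total"]
passes = processed / total
print(f"memory sent {passes:.1f} times over")
```

A ratio this far above 1 after roughly half an hour suggests the guest dirties pages about as fast as they are copied, so the migration is unlikely to finish without intervention.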

Forcing Migration Finish

o Cancel ongoing LM
    virsh domjobabort <domain>
o Pause VM during LM
    virsh suspend <domain>

Tuning Maximum Downtime

o QEMU
    virsh qemu-monitor-command --hmp <domain> migrate_set_downtime <time (sec)>
o libvirt
    virsh migrate-setmaxdowntime <domain> <time (ms)>
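
The two commands take the downtime in different units: the QEMU monitor command `migrate_set_downtime` expects seconds, whereas `virsh migrate-setmaxdowntime` expects milliseconds. The helper below is a sketch that builds both argument lists from a single value in seconds; the function name and the domain name in the usage line are hypothetical, and nothing is executed.

```python
def set_downtime_commands(domain, downtime_seconds):
    """Build argv lists for both ways of setting the maximum downtime."""
    # QEMU human monitor command: value in seconds.
    hmp = [
        "virsh", "qemu-monitor-command", "--hmp", domain,
        f"migrate_set_downtime {downtime_seconds}",
    ]
    # libvirt command: value in milliseconds.
    libvirt = [
        "virsh", "migrate-setmaxdowntime", domain,
        str(round(downtime_seconds * 1000)),
    ]
    return hmp, libvirt


hmp_cmd, libvirt_cmd = set_downtime_commands("instance-0000002a", 0.5)
print(" ".join(hmp_cmd))
print(" ".join(libvirt_cmd))
```

Either list could be passed to `subprocess.run` on the source compute node while the migration is in progress.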

Auto Converge

o nova.conf setting
    live_migration_flag += VIR_MIGRATE_AUTO_CONVERGE

Tunneled Migration

o nova.conf setting
    live_migration_flag += VIR_MIGRATE_TUNNELLED

[Diagram: source host and destination host, each with a hypervisor and libvirt.]

Tunneled Migration

o nova.conf setting
    live_migration_flag -= VIR_MIGRATE_TUNNELLED

[Diagram: source host and destination host, each with a hypervisor and libvirt.]
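
The `+=` and `-=` notation on these slides is shorthand for adding a flag to, or removing it from, the `live_migration_flag` value. Assuming that value is a comma-separated list of libvirt flag names, the edit could be sketched as follows; the helper names and the starting value are illustrative and not necessarily Nova's default.

```python
def _split(flags):
    """Split a comma-separated flag string into a clean list."""
    return [item.strip() for item in flags.split(",") if item.strip()]


def add_flag(flags, flag):
    """Return the flag string with `flag` appended if it is missing."""
    items = _split(flags)
    if flag not in items:
        items.append(flag)
    return ", ".join(items)


def remove_flag(flags, flag):
    """Return the flag string without `flag`."""
    return ", ".join(item for item in _split(flags) if item != flag)


current = "VIR_MIGRATE_PEER2PEER, VIR_MIGRATE_LIVE, VIR_MIGRATE_TUNNELLED"
current = add_flag(current, "VIR_MIGRATE_AUTO_CONVERGE")
current = remove_flag(current, "VIR_MIGRATE_TUNNELLED")
print(f"live_migration_flag = {current}")
```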

Tuning Bandwidth

o libvirt
    virsh migrate-setspeed <domain> <speed (MiB/s)>
o nova.conf settings
    live_migration_bandwidth = <speed (MiB/s)>

XBZRLE Compression

o nova.conf settings
    live_migration_flag += VIR_MIGRATE_COMPRESSED

[Diagram: on the source host, an updated page is delta-compressed against the sent page cache and the delta is sent; on the destination host, the delta is applied to the received pages to rebuild the updated page.]
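
The idea behind XBZRLE (XOR-based zero run-length encoding) can be shown with a simplified sketch: XOR the updated page with the cached copy that was sent earlier, then store only the non-zero stretches together with the length of the zero runs between them. The run representation and function names below are assumptions made for illustration; QEMU's actual wire format differs.

```python
def xor_zero_rle_encode(old, new):
    """Encode `new` as (zero_run_length, changed_bytes) pairs relative to `old`."""
    if len(old) != len(new):
        raise ValueError("pages must have the same size")
    delta = bytes(a ^ b for a, b in zip(old, new))
    runs = []
    i, n = 0, len(delta)
    while i < n:
        start = i
        while i < n and delta[i] == 0:   # unchanged bytes XOR to zero
            i += 1
        zero_run = i - start
        start = i
        while i < n and delta[i] != 0:   # changed bytes
            i += 1
        runs.append((zero_run, delta[start:i]))
    return runs


def xor_zero_rle_decode(old, runs):
    """Rebuild the updated page by applying the delta to the old page."""
    page = bytearray(old)
    pos = 0
    for zero_run, changed in runs:
        pos += zero_run
        for offset, value in enumerate(changed):
            page[pos + offset] ^= value
        pos += len(changed)
    return bytes(page)


old_page = bytes(4096)
new_page = bytearray(old_page)
new_page[100:104] = b"\x01\x02\x03\x04"
new_page[2000] = 0xFF
runs = xor_zero_rle_encode(old_page, bytes(new_page))
payload = sum(len(changed) for _, changed in runs)
print(f"{payload} changed bytes sent instead of {len(new_page)}")
```

The scheme pays off only when a page changes a little between rounds; a page that is rewritten entirely produces a delta as large as the page itself.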

LM On Dedicated Network

o nova.conf
    o live_migration_uri = qemu+tcp://%s/system

[Diagram: compute nodes A and B connected by the management network; VM A is shown as Active on one node and Paused on the other.]

LM On Dedicated Network

o nova.conf
    o live_migration_uri = qemu+tcp://%s-lm/system
o Set up your DNS to resolve hostnames with the -lm suffix to IPs in your dedicated network.

[Diagram: compute nodes A and B connected by the management network and by a separate LM network; VM A is shown as Active on one node and Paused on the other.]
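
The `%s` in `live_migration_uri` is a placeholder for the destination host name, so the `-lm` suffix changes which name gets resolved. The snippet below only illustrates that substitution; the host name `compute-b` and the helper are hypothetical.

```python
DEFAULT_URI = "qemu+tcp://%s/system"
DEDICATED_URI = "qemu+tcp://%s-lm/system"


def migration_uri(destination_host, template):
    """Fill the destination host name into the URI template."""
    return template % destination_host


print(migration_uri("compute-b", DEFAULT_URI))
print(migration_uri("compute-b", DEDICATED_URI))
```

With the second template, libvirt connects to `compute-b-lm`, which DNS should resolve to the address of that host on the dedicated network.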

Different CPUs Between Compute Nodes

o The CPU instruction set of the source node needs to be a subset of the CPU instruction set of the destination node

[Diagram: compute node A supports AVX, SSE2 and MMX; compute node B supports AVX and MMX. Live migration passes in one direction and fails in the other.]
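
The subset rule can be checked with plain set operations. The sketch below uses the feature names from the diagram; the function name is illustrative, and a real check would compare the full CPU definitions reported by libvirt rather than three flags.

```python
def can_live_migrate(source_features, destination_features):
    """True if every CPU feature of the source exists on the destination."""
    return set(source_features) <= set(destination_features)


node_a = {"avx", "sse2", "mmx"}
node_b = {"avx", "mmx"}

print(can_live_migrate(node_b, node_a))  # B -> A: A offers everything B has
print(can_live_migrate(node_a, node_b))  # A -> B: B lacks sse2
```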

Different CPUs Between Compute Nodes

o This check can be bypassed by explicitly setting the VM CPU model in nova.conf:
    o cpu_mode = custom
    o virt_type = kvm or virt_type = qemu
    o Then set cpu_model to a named CPU model
o The list of supported named CPUs is in libvirt/cpu_map.xml

Memory Oversubscription

o LM to a specific host does not use memory oversubscription
    o ram_allocation_ratio

[Diagram: compute node A has 2 GB RAM; reported RAM = available - reserved. nova-conductor sees 2 GB, while nova-scheduler with ram_allocation_ratio = 2.0 sees 4 GB.]

Memory Oversubscription

o Work around it by setting
    o reserved_host_memory_mb = -2048

[Diagram: compute node A has 2 GB RAM; reported RAM = available - reserved. With ram_allocation_ratio = 1.0, nova-conductor and nova-scheduler both see 4 GB.]
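
The arithmetic behind this workaround can be written out explicitly. The functions below model only what the two Memory Oversubscription slides describe, under the assumption that the scheduler multiplies reported RAM by `ram_allocation_ratio` while the check for a migration to a specific host uses reported RAM as is; they are not Nova code, and their names are illustrative.

```python
def reported_ram_mb(physical_mb, reserved_host_memory_mb):
    """Reported RAM = available - reserved, as stated on the slide."""
    return physical_mb - reserved_host_memory_mb


def scheduler_capacity_mb(physical_mb, reserved_host_memory_mb, ram_allocation_ratio):
    """Capacity as seen by the scheduler, which applies the allocation ratio."""
    return reported_ram_mb(physical_mb, reserved_host_memory_mb) * ram_allocation_ratio


def targeted_lm_capacity_mb(physical_mb, reserved_host_memory_mb):
    """Capacity used when migrating to a specific host: no ratio applied."""
    return reported_ram_mb(physical_mb, reserved_host_memory_mb)


# Oversubscription via ratio: the two views disagree (4096 vs 2048 MB).
print(scheduler_capacity_mb(2048, 0, 2.0), targeted_lm_capacity_mb(2048, 0))
# Negative reservation with ratio 1.0: both views agree on 4096 MB.
print(scheduler_capacity_mb(2048, -2048, 1.0), targeted_lm_capacity_mb(2048, -2048))
```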

Secure Live Migration

Why Does Security Matter?

o Everything can be sniffed!
o Migrated machines can contain sensitive data
o Legal issues with unencrypted data transfer

Encryption

o Hypervisor native encryption
    o QEMU doesn’t support it
o libvirt tunneled transport
    o live_migration_uri = qemu+ssh://%s/system
    o live_migration_flag += VIR_MIGRATE_TUNNELLED
    o Uses only one core
o IPSec tunnel between hosts

Memory Access Is Critical

[Bar chart: transfer rate in GBps (axis from 0 to 3) for QEMU+SSH and QEMU+TCP, measured on Intel(R) Xeon(R) CPU E5-2690 v2 and Intel(R) Xeon(R) CPU E5-2660 v3.]

Future Of Live Migration

Multithreaded Compression

o Compress every page sent during LM
o zlib used for compression
o Configurable:
    o Number of threads
    o Compression ratio
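
The approach can be approximated in a few lines: split memory into pages and compress them with zlib on a pool of worker threads. This is a sketch of the idea rather than QEMU's implementation; the function name, the page contents, and the use of zlib's compression level as the tuning knob are assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_pages(pages, threads=4, level=1):
    """Compress each page with zlib, spreading the work over `threads` workers."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() preserves the order of the input pages.
        return list(pool.map(lambda page: zlib.compress(page, level), pages))


# Eight highly compressible 4 KiB pages, used purely as sample data.
pages = [bytes([i]) * 4096 for i in range(8)]
compressed = compress_pages(pages, threads=2, level=6)
print(sum(len(c) for c in compressed), "bytes instead of", 8 * 4096)
```

A higher level usually shrinks the data further at the cost of more CPU time per page, which is the trade-off the two configurable settings expose.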

Post-copy Live Migration

o Move workload immediately to destination host

[Diagram: VM A is Paused on compute node A and Active on compute node B while memory is still being copied.]

Post-copy Live Migration

o Cheap solution to finish live migration in a finite time
o VM needs to be rebooted in case of failure
o Heavy performance impact

Active LM Monitoring In OpenStack

o Track memory transfer progress
o Detect possible problems and take actions

Actions On Ongoing Live Migration

o Pause VM
o Abort LM
o See progress
o Change configuration on the fly:
    o Maximum tolerable VM downtime
    o Transfer bandwidth

Q&A (& disclaimers)

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.

© 2015 Intel Corporation.