KVM in OpenStack - files.meetup.com/10602292/kvm features.pdf

KVM in OpenStack

Dexin(Mark) Wu

Agenda

● Overview
● CPU
● Memory
● Storage
● Network

Architecture Overview

[Diagram] nova-api (REST API) → nova-scheduler / nova-conductor (RPC Call, DB) → nova-compute → libvirt driver → libvirt → qemu driver (QMP Monitor) → Qemu/KVM. Storage is attached through disk drivers (disk/tap) backed by Cinder; networking goes through a switch/router managed by Neutron.

Agenda

● Overview
● CPU
● Memory
● Storage
● Network

KVM/Qemu Model

● Each VM is a qemu process

● Each vCPU is a qemu thread

● Reuses kernel facilities

Cgroup

● Weight
  – quota:cpu_shares
  – No hard limit
● Bandwidth Control
  – quota:cpu_period
  – quota:cpu_quota
  – CPU time can't exceed 'quota' µs in a period

[Diagram] cpu_shares hierarchy:
Root (100%)
├─ Gold (3072) → 60%
│   ├─ A (1024) → 30%
│   └─ B (1024) → 30%
└─ Silver (2048) → 40%
    ├─ C (1024) → 20%
    └─ D (1024) → 20%
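The percentages in the hierarchy above follow from simple arithmetic: under contention, a cgroup receives its parent's share scaled by its cpu_shares relative to its siblings. A quick sketch with the weights from the diagram:

```shell
# cpu_shares weights from the hierarchy above
gold=3072; silver=2048
total=$((gold + silver))

# First level: Gold and Silver split the root's 100%
echo "Gold:   $((100 * gold / total))%"       # 60%
echo "Silver: $((100 * silver / total))%"     # 40%

# Second level: A and B (1024 each) split Gold's share evenly,
# C and D (1024 each) split Silver's
echo "A or B: $((100 * gold / total / 2))%"   # 30%
echo "C or D: $((100 * silver / total / 2))%" # 20%
```

These are guarantees under full contention only; on an idle host any group may use more, since shares alone impose no hard limit.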

CPU topology

[Diagram] A two-socket host. Each socket has four cores with two hardware threads per core (cpu0-cpu7 on Socket 0, cpu8-cpu15 on Socket 1), private L1 I/D and L2 caches per core, and a shared L3 per socket. Each socket has its own memory: Socket 0 is NUMA node 0 and Socket 1 is NUMA node 1, so each node's memory is local to its own CPUs and remote to the other socket's.

vCPU topology

● Benefit
  – Remove licensing restrictions
  – Improve performance by working with vCPU pinning
● Implemented in Juno

* hw:cpu_sockets=NN - preferred number of sockets to expose to the guest
* hw:cpu_cores=NN - preferred number of cores to expose to the guest
* hw:cpu_threads=NN - preferred number of threads to expose to the guest
* hw:cpu_max_sockets=NN - maximum number of sockets to expose to the guest
* hw:cpu_max_cores=NN - maximum number of cores to expose to the guest
* hw:cpu_max_threads=NN - maximum number of threads to expose to the guest
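The extra specs above are set per flavor; a minimal sketch (the flavor name m1.topology is hypothetical):

```shell
# Hypothetical 4-vCPU flavor: expose 2 sockets x 2 cores to the guest
nova flavor-key m1.topology set hw:cpu_sockets=2 hw:cpu_cores=2

# Cap what image properties may request on top of this flavor
nova flavor-key m1.topology set hw:cpu_max_sockets=2 hw:cpu_max_cores=2
```

Fewer sockets with more cores per socket is the usual workaround for per-socket guest licensing limits.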

vNUMA

● Benefit
  – Increase the effective utilization of compute resources
● Implemented in Juno
  – virt-driver-numa-placement.rst

* hw:numa_nodes=NN - number of NUMA nodes to expose to the guest
* hw:numa_mempolicy=preferred|strict - memory allocation policy
* hw:numa_cpus.0=<cpu-list> - mapping of vCPUs N-M to NUMA node 0
* hw:numa_cpus.1=<cpu-list> - mapping of vCPUs N-M to NUMA node 1
* hw:numa_mem.0=<ram-size> - mapping N MB of RAM to NUMA node 0
* hw:numa_mem.1=<ram-size> - mapping N MB of RAM to NUMA node 1

● Qemu and libvirt dependencies

-object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 \
-numa node,nodeid=0,cpus=0,memdev=ram-node0
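Putting the extra specs together, a flavor sketch for a 4-vCPU / 4096 MB guest split across two virtual NUMA nodes (the flavor name m1.numa is hypothetical):

```shell
nova flavor-key m1.numa set hw:numa_nodes=2 hw:numa_mempolicy=strict
nova flavor-key m1.numa set "hw:numa_cpus.0=0,1" "hw:numa_cpus.1=2,3"
nova flavor-key m1.numa set hw:numa_mem.0=2048 hw:numa_mem.1=2048
```

With strict policy, scheduling fails unless the host can satisfy the placement; preferred falls back to best effort.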

Other Features

● vCPU Pinning
  – Approved in Kilo: virt-driver-cpu-pinning.rst
  – Dedicated CPU
  – Forbids overcommit of CPU
● vCPU hotplug
  – 'live-resize' proposed, but not approved yet
  – virsh command
    ● setvcpus <domain> <count> --live
  – Auto-online new vCPUs in guest
    ● udev rule
    ● Guest agent
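A sketch of the hotplug flow (the domain name guest1 is hypothetical):

```shell
# On the host: raise the running domain's vCPU count to 4
virsh setvcpus guest1 4 --live

# Inside the guest, without a udev rule or guest agent,
# the newly added vCPU must be onlined by hand:
echo 1 > /sys/devices/system/cpu/cpu3/online
```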

Agenda

● Overview
● CPU
● Memory
● Storage
● Network

Physical memory virtualization

● Guest physical memory is mapped into qemu virtual address space

● Mapping is maintained in memory slots

● Qemu uses malloc or mmap to allocate memory

● Reuses kernel memory features

– Overcommit

– Hugepage

– KSM

Memory Hugepage

● Approved in Kilo: virt-driver-large-pages.rst
● Benefit
  – Increase TLB hit ratio
  – Less page table footprint
● Why not THP?
  – No hard guarantees
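The footprint argument is simple arithmetic; a sketch assuming an 8 GiB guest:

```shell
ram=$((8 * 1024 * 1024 * 1024))                  # 8 GiB of guest RAM (assumed)
echo "4 KiB pages: $((ram / (4 * 1024)))"        # mappings to walk and cache
echo "2 MiB pages: $((ram / (2 * 1024 * 1024)))" # ~500x fewer mappings
```

Orders of magnitude fewer mappings means fewer TLB misses and much smaller page tables. THP would give the same win, but its pages can be split or fail to coalesce at runtime, hence "no hard guarantees".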

Memory Balloon (1)

● Balloon device is added by default
● Missing an "Overcommit Manager"
● Memory Overcommit

[Diagram] Several qemu guests on one host: inflating the balloon in one guest releases its memory back to the host; deflating returns memory to the guest. This is how host memory is overcommitted across guests.

Memory Balloon (2)

● Guest Memory Stats Query
  – More detailed and accurate
  – Retrieved by polling instead of asynchronous events
  – Not real time
  – Nova support available in Juno
    ● CONF.libvirt.mem_stats_period_seconds
  – Ceilometer support available in Kilo

[Diagram] A balloon thread inside Qemu polls the guest for memory stats; a client fetches the last update synchronously.
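The polled stats are visible through libvirt; a sketch (the domain name guest1 is hypothetical):

```shell
# Dump the balloon statistics last collected by the polling thread
virsh dommemstat guest1
```

Fields such as unused and available come from the guest balloon driver and are only as fresh as the last polling interval (CONF.libvirt.mem_stats_period_seconds).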

Memory Hotplug

● Added in qemu 2.1
● Libvirt support is under development
● Qemu commands

(qemu) object_add memory-backend-ram,id=ram1,size=1G
(qemu) device_add pc-dimm,id=d1,memdev=ram1

● Auto-online via udev

SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1"
SUBSYSTEM=="memory", ACTION=="add", TEST=="state", ATTR{state}=="offline", ATTR{state}="online"

Agenda

● Overview
● CPU
● Memory
● Storage
● Network

Storage Architecture

● Frontend
  – IDE, SCSI, Virtio
● Image format
  – Raw, Qcow2, QED, VMDK
● Backend
  – File, host device, ceph, glusterfs, sheepdog, iscsi

Cache Mode

Cache Mode    Host Page Cache  Guest Disk Cache           Semantics
none          No               Yes                        direct
directsync    No               No                         direct + flush
writeback     Yes              Yes
writethrough  Yes              No                         writeback + flush
unsafe        Yes              Yes, but flush is ignored  writeback - flush

direct = O_DIRECT; flush = fdatasync or fsync

● Configuration
  – disk_cachemodes="file=directsync,block=none"

● Is writeback safe?
  – Data lost on power failure
  – Data corruption
    ● Guest FS barrier
    ● Live migration
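For illustration, the cache mode Nova picks ends up on the qemu command line; a sketch (memory size and disk path hypothetical):

```shell
qemu-system-x86_64 -m 2048 -enable-kvm \
  -drive file=/var/lib/nova/instances/demo/disk,if=virtio,cache=none
```

cache=none keeps the host page cache out of the data path (O_DIRECT) while still letting the guest batch writes, which is why it is a common choice for block-backed disks.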

I/O throttling

● Why not cgroup?
● Exposed by Cinder qos spec
● Currently missing online update support
● New version qemu re-implements throttling based on leaky bucket
  – Supports burst
● Missing cluster-level I/O throttling
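The Cinder side is driven by QoS specs; a sketch (spec and type names hypothetical):

```shell
# front-end = limits enforced by qemu on the compute host
cinder qos-create limited-io consumer=front-end \
  read_iops_sec=500 write_iops_sec=500

# Attach the spec to a volume type so new volumes inherit it
cinder qos-associate <qos-spec-id> <volume-type-id>
```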

Discard

● Return freed blocks to the storage
● Two underlying specifications
  – ATA TRIM command
  – SCSI UNMAP
● Nova configuration
  – hw_disk_discard=unmap
  – Image metadata hw_scsi_model=virtio-scsi
● Issued from guest
  – fstrim
  – mount option '-o discard'
● Supported in file, qcow2, rbd, glusterfs, sheepdog, iscsi
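From inside the guest, the two issuing paths look like this (device and mount point hypothetical):

```shell
# One-shot: trim currently-free blocks on a mounted filesystem
fstrim -v /mnt/data

# Continuous: issue discards as blocks are freed
# (can add latency on some devices)
mount -o discard /dev/sdb1 /mnt/data
```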

Virtio SCSI

● vHBA
● Improves scalability
● Enables advanced SCSI features
● Recognized as 'sda', not 'vda'
● vhost-scsi
  – Better performance
  – No format driver support
  – Disallows live migration
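Selecting virtio-scsi is done with image properties; a sketch (the image name demo-image is hypothetical):

```shell
glance image-update demo-image \
  --property hw_scsi_model=virtio-scsi \
  --property hw_disk_bus=scsi
```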

Other features

● Snapshot
  – quiesced-image-snapshots-with-qemu-guest-agent.rst
● drive-mirror
  – storage live migration
● Multi-queue virtio-disk

Agenda

● Overview
● CPU
● Memory
● Storage
● Network

Network

● vhost-net
  – Less context switching
  – Zero-copy transmit
● vhost-net + macvtap + SR-IOV
  – Live migration
● Multi-queue virtio NIC
  – Scales performance as vCPU count increases
● vhost-user
  – Approved in Kilo
  – Userspace equivalent of vhost-net
  – Used with a userspace switch
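For the multi-queue virtio NIC, the guest must enable the extra queues itself; a sketch (interface name and queue count hypothetical):

```shell
# Inside the guest: spread packet processing over 4 queue pairs
ethtool -L eth0 combined 4

# Verify current and maximum queue counts
ethtool -l eth0
```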

Reference

● http://www.slideshare.net/meituan/kvmopt-osforce-27669119

● http://www.linux-kvm.org/wiki/images/7/7b/Kvm-forum-2013-openstack.pdf

● http://www.linux-kvm.org/wiki/images/f/f6/01x07a-Vhost.pdf

● http://www.virtualizemydc.ca/2014/01/26/understanding-vnuma-virtual-non-uniform-memory-access/

● http://www.searchtb.com/2012/12/%E7%8E%A9%E8%BD%ACcpu-topology.html

● http://www.virtualopensystems.com/en/solutions/guides/snabbswitch-qemu/

● http://log.amitshah.net/wp-content/uploads/2014/11/virt-6-7-centos-dojo.pdf

Thanks!

Email: [email protected]