integrating kvm with linux · 8/10/2010  · suspend/resume preempt notifiers mmu notifiers events...

24
Integrating KVM with Linux Avi Kivity August 10, 2010

Upload: others

Post on 26-May-2020

11 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Integrating KVM with Linux

Avi KivityAugust 10, 2010

Page 2: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential2

Agenda

To Integrate or Not To Integrate?

Issues

Suspend/resume

Preempt notifiers

MMU notifiers

Events

User return notifiers

Lazy FPU

Conclusions

Page 3: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential3

Advantages and Disdvantages

● Historically, zero impact made the merge into Linux 2.6.20 quick and painless

● Special hardware has special needs

● Performance and functionality

● kvm-kmod

Page 4: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential4

Issues

● Blend in with the rest of the infrastructure

● Zero impact when not configured

● Minor impact when configured and unused

● Find more users

● Coordinating multiple staging trees

Page 5: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential5

Suspend/resume, cpu hotplug

● Linux does power management for us

● But... VMX has on-core registers that Linux doesn't know about!

Page 6: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential6

DEAD

Death or a processor

UP_PREPARE UP_CANCELED

ONLINE

DOWN_PREPARE DOWN_FAILED

Page 7: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential7

DYING

Death or a processor

UP_PREPARE UP_CANCELED

ONLINE

DOWN_PREPARE DOWN_FAILED

DEAD

Page 8: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential8

Suspend/resume, cpu hotplug

● Merged 2.6.23

Page 9: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential9

Preempt notifiers

● Lightweight exits: host kernel runs with some guest state loaded

● VMPTR● FPU● SYSCALL MSRs

● Task switch will clobber these register

● Initial solution: preemption disabled● Low QoS● Cannot allocate/swap/etc

Page 10: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential10

Control flow

Native GuestExecution

Kernelexit handler

Userspaceexit handler

Switch toGuest Mode

ioctl()

Userspace Kernel Guest

No preempt

Page 11: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential11

Preempt notifiers

● Task can ask schedulers for callbacks● sched_out() - save guest state, load host state● sched_in() - load guest state

● Kernel code no runs with preemption enabled

● Allocation, page-in allowed

● Merged in 2.6.23

Page 12: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential12

Preempt notifiers – additional users

● Pending work for concurrency managed work queues

● New work threads spawned when a work thread blocks

● Can also be used for modular perf events

Page 13: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential13

Control flow

Native GuestExecution

Kernelexit handler

Userspaceexit handler

Switch toGuest Mode

ioctl()

Userspace Kernel Guest

Page 14: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential14

MMU notifiers

● The K in KVM...

● Two way synchronization between Linux page tables and KVM shadow page tables

● Linux updates a PTE (page out, recency scan, COW), then updates KVM

● Hardware updates KVM shadow PTE, transferred to Linux page tables

● Also used for SGI's GRU, XPMEM

● May also be used for PCI ATS – device assignment

● Merged in 2.6.27

Page 15: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential15

Events

● Native I/O model is synchronous● Guest issues I/O instruction● KVM emulates● Exits to userspace for fulfillment● Guest is blocked while userspace processes I/O

● Good model for lightweight processing● No context switch overhead● Cache hot

● Interrupts asynchronous, but driven from userspace

Page 16: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential16

Events - problems

● Want to process events in kernel, not userspace

● Want to inject interrupts from kernel, not userspace

● Want asynchronous exits for heavyweight processing

● Do not want a KVM specific interface● Generic interface = more users = less bugs

Page 17: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential17

Events - eventfd

● Generalized kernel mechanism for signalling events

● Based on file descriptor, so can pass around

● Either a kernel task or a user task may signal...

● ... and either a kernel task or user task may wait for ...

● ... using any of the wait APIs

Page 18: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential18

Control flow - events

Native GuestExecution

Kernelexit handler

Userspaceexit handler

Switch toGuest Mode

ioctl()

Userspace Kernel Guest

Inject

Wake

Page 19: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential19

Events

● Users● Vhost family – in-kernel virtio● Shared memory – guest-to-guest wakeups● Device assignment with vfio

● Needed changes to eventfd

● Merged in 2.6.32

Page 20: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential20

User return notifiers

● Part of guest/host state is syscall/sysenter MSRs

● Save/restore on preempt notifiers, before exit to userspace

● In host, only used when returning to userspace or entering kernel

● Task switch to kernel thread triggers unneeded save/restore

● Task switch to guest with same values trigger unneeded save/restore

● Make it even lazier!

Page 21: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential21

User return notifiers - implementation

● Invoke callback just before return to userspace● Only if guest state is loaded● Save guest state, load host state

● Issue: real hot path (syscall)

● Piggyback on signal check● Zero impact on not-taken path – just a mask change

● Merged 2.6.33

Page 22: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential22

Lazy FPU

● FPU saved/restored on switch to other thread

● Or return to userspace (which may not touch FPU)

● Extend kernel lazy FPU to support KVM● Problem: kernel lazy FPU is only lazy in one direction● When switching out of a task, FPU is eagerly saved● Problem: FPU code is very old● Introduce FPU API● Problem: really lazy FPU requires an IPI to save state● May be slower!

● Work in progress

Page 23: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential23

Conclusions

● Integrating Linux and KVM key to success

● Must be done in a way the benefits, or at least does not harm, core kernel

● lkml sometimes less friendly than kvm@vger

● Need persistence and care

Page 24: Integrating KVM with Linux · 8/10/2010  · Suspend/resume Preempt notifiers MMU notifiers Events User return notifiers Lazy FPU Conclusions. 3 Avi Kivity / Red Hat Confidential

Avi Kivity / Red Hat Confidential24

Questions

?