Keith Adams, Ole Agesen 1st October 2009 Presented by Chwa Hoon Sung, Kang Joon Young A Comparison of Software and Hardware Techniques for x86 Virtualization

Keith Adams, Ole Agesen

1st October 2009
Presented by Chwa Hoon Sung, Kang Joon Young

A Comparison of Software and Hardware Techniques for x86 Virtualization


Virtualization


Outline

- Classic Virtualization
- Software Virtualization
- Hardware Virtualization
- Comparison and Results
- Discussion


Classic Virtualization (Trap-and-Emulate)

- De-privilege the OS: execute the guest operating system directly, but at a lesser privilege level (user level).

[Figure: without a VMM, the OS runs in kernel mode and apps run in user mode.]


Classic Virtualization (Trap-and-Emulate), continued

[Figure: with de-privileging, the virtual machine monitor runs in kernel mode; the guest OS and apps run in user mode.]


Trap-and-Emulate

- Runs the guest operating system deprivileged.
- All privileged instructions trap into the VMM.
- The VMM emulates instructions against virtual state.
- Resumes direct execution from the next guest instruction.
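
The four steps above can be sketched as a small simulation. This is only an illustrative toy, not how a real VMM is built: the instruction names, the `VirtualCPU` fields, and the `PrivilegeTrap` exception are all assumptions made for the example, and a Python exception stands in for a hardware trap.

```python
from dataclasses import dataclass, field

PRIVILEGED = {"cli", "sti"}  # toy stand-ins for privileged instructions


class PrivilegeTrap(Exception):
    """Raised when deprivileged guest code issues a privileged instruction."""


@dataclass
class VirtualCPU:
    interrupts_enabled: bool = True  # virtual state owned by the VMM
    regs: dict = field(default_factory=lambda: {"eax": 0})
    traps: int = 0


def execute_deprivileged(instr, vcpu):
    """Stand-in for direct execution at user level."""
    if instr in PRIVILEGED:
        raise PrivilegeTrap(instr)
    if instr == "inc eax":
        vcpu.regs["eax"] += 1


def emulate(instr, vcpu):
    """VMM handler: apply the instruction to virtual state only."""
    vcpu.traps += 1
    if instr == "cli":
        vcpu.interrupts_enabled = False
    elif instr == "sti":
        vcpu.interrupts_enabled = True


def run_guest(program, vcpu):
    pc = 0
    while pc < len(program):
        try:
            execute_deprivileged(program[pc], vcpu)
        except PrivilegeTrap as trap:
            emulate(trap.args[0], vcpu)
        pc += 1  # resume at the next guest instruction
    return vcpu


vcpu = run_guest(["cli", "inc eax", "inc eax", "sti"], VirtualCPU())
print(vcpu.traps, vcpu.regs["eax"], vcpu.interrupts_enabled)
```

Only the two privileged instructions reach `emulate`; the two `inc eax` instructions run without involving the VMM, which is the point of direct execution.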


Classic Virtualization (Cont’d)

Architectural obstacles:
- Traps are expensive (~3000 cycles).
- Many traps are unavoidable (e.g., page faults).
- Not all architectures support trap-and-emulate (e.g., x86).


System Virtualization

[Figure: taxonomy of system virtualization.]
- Classic virtualization (Popek & Goldberg): trap-and-emulate
- Enhancement:
  - Software VMM: full virtualization (VMware), para-virtualization (Xen)
  - Hardware VMM: hardware support for virtualization (Intel VT & AMD SVM)


Software Virtualization

- Until recently, the x86 architecture did not permit classical trap-and-emulate virtualization.
  - Some privileged state is visible in user mode: the guest OS can observe its current privilege level (CPL) in the code segment selector (%cs).
  - Not all privileged operations trap when run in user mode: dual-purpose instructions such as popf do not trap.
- Software VMMs for x86 have instead used binary translation of the guest code.
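
The popf problem can be made concrete with a toy model. The sketch below is an assumption-laden simplification: it models only two EFLAGS bits, and it assumes the deprivileged case is CPL 3 with IOPL below 3, where popf leaves the interrupt flag unchanged and raises no trap. The function name and parameters are invented for the example.

```python
IF_BIT = 1 << 9  # interrupt-enable flag in EFLAGS
ZF_BIT = 1 << 6  # zero flag, an ordinary arithmetic flag


def popf(eflags, popped, cpl):
    """Toy model of popf: return the new flags value.

    At CPL 0 every modelled bit is loaded from the popped value.
    At CPL 3 (assuming IOPL < 3) the IF bit keeps its old value
    and no trap is raised.
    """
    if cpl == 0:
        return popped
    keep_if = eflags & IF_BIT
    return (popped & ~IF_BIT) | keep_if


flags = IF_BIT   # interrupts currently enabled
wanted = ZF_BIT  # guest kernel tries to clear IF and set ZF

if_after_ring0 = bool(popf(flags, wanted, cpl=0) & IF_BIT)
if_after_ring3 = bool(popf(flags, wanted, cpl=3) & IF_BIT)
print(if_after_ring0, if_after_ring3)
```

A deprivileged guest kernel that believes it disabled interrupts would still see them enabled, and because nothing trapped, a trap-and-emulate VMM never gets the chance to correct the virtual state.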


Binary Translation

- Translates the kernel code, replacing privileged instructions with new instruction sequences that have the intended effect on the virtual hardware.
- The software VMM uses a translator with these properties:
  - Binary: input is machine-level code.
  - Dynamic: translation occurs at runtime.
  - On demand: code is translated only when it is needed for execution.
  - System level: makes no assumptions about the guest code.
  - Subsetting: translates from the full instruction set to a safe subset.
  - Adaptive: adjusts translated code based on guest behavior to improve efficiency.


Binary Translation (Cont’d)

- The translator's input is the full x86 instruction set, including all privileged instructions; its output is a safe subset of user-mode instructions.


Binary Translation

[Figure: binary translation components: guest code, translator, translation cache (TC), TC index, callouts, CPU emulation routines.]


Basic Block

Guest Code

    vPC  mov ebx, eax
         cli
         and ebx, ~0xfff
         mov ebx, cr3
         sti
         ret

[Figure labels: straight-line code (mov through sti), control flow (ret).]


Guest Code

    vPC  mov ebx, eax
         cli
         and ebx, ~0xfff
         mov ebx, cr3
         sti
         ret

Translation Cache

    start  mov ebx, eax
           call HANDLE_CLI
           and ebx, ~0xfff
           mov [CO_ARG], ebx
           call HANDLE_CR3
           call HANDLE_STI
           jmp HANDLE_RET
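
The mapping from guest code to translation cache shown here can be imitated with a table-driven toy translator. This is a sketch under heavy assumptions: it works on instruction strings rather than machine code, the callout names are simply copied from the slide, and the `translation_cache` dictionary keyed by guest address is an invented stand-in for the real TC and its index.

```python
# Callout sequences copied from the slide's translated code.
CALLOUTS = {
    "cli": ["call HANDLE_CLI"],
    "mov ebx, cr3": ["mov [CO_ARG], ebx", "call HANDLE_CR3"],
    "sti": ["call HANDLE_STI"],
    "ret": ["jmp HANDLE_RET"],
}

translation_cache = {}  # guest address (vPC) -> translated block
translations_done = 0


def translate_block(guest_block):
    """Copy innocuous instructions; replace sensitive ones with callouts."""
    out = []
    for instr in guest_block:
        out.extend(CALLOUTS.get(instr, [instr]))
    return out


def lookup(vpc, guest_memory):
    """On-demand translation: translate only on a translation-cache miss."""
    global translations_done
    if vpc not in translation_cache:
        translation_cache[vpc] = translate_block(guest_memory[vpc])
        translations_done += 1
    return translation_cache[vpc]


guest_memory = {
    0x1000: ["mov ebx, eax", "cli", "and ebx, ~0xfff",
             "mov ebx, cr3", "sti", "ret"],
}
first = lookup(0x1000, guest_memory)
second = lookup(0x1000, guest_memory)  # served from the cache
print("\n".join(first))
```

The second lookup returns the cached block without translating again, which is the "on demand" property; the table of replacements is the "subsetting" property in miniature.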


Guest Code

    vPC  mov ebx, eax
         cli
         and ebx, ~0xfff
         mov ebx, cr3
         sti
         ret

Translation Cache

    start  mov ebx, eax
           mov [CPU_IE], 0
           and ebx, ~0xfff
           mov [CO_ARG], ebx
           call HANDLE_CR3
           mov [CPU_IE], 1
           test [CPU_IRQ], 1
           jne
           call HANDLE_INTS
           jmp HANDLE_RET
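
In this version the cli and sti callouts are replaced by in-TC code that updates a virtual interrupt flag directly. The sketch below mirrors that logic in Python as a rough illustration; the field names `CPU_IE` and `CPU_IRQ` and the `HANDLE_INTS` callout come from the slide, while the class, the `delivered` counter, and the reading that `HANDLE_INTS` runs only when an interrupt is pending are assumptions.

```python
class VirtualCPUState:
    """Hypothetical in-memory virtual CPU fields named after the slide."""

    def __init__(self):
        self.CPU_IE = 1     # virtual interrupt-enable flag
        self.CPU_IRQ = 0    # 1 when a virtual interrupt is pending
        self.delivered = 0  # interrupts delivered by the callout


def handle_ints(cpu):
    """Stand-in for the HANDLE_INTS callout."""
    cpu.delivered += 1
    cpu.CPU_IRQ = 0


def translated_cli(cpu):
    cpu.CPU_IE = 0  # mov [CPU_IE], 0


def translated_sti(cpu):
    cpu.CPU_IE = 1       # mov [CPU_IE], 1
    if cpu.CPU_IRQ & 1:  # test [CPU_IRQ], 1
        handle_ints(cpu)  # assumed: callout only when an interrupt is pending


cpu = VirtualCPUState()
translated_cli(cpu)
cpu.CPU_IRQ = 1  # an interrupt arrives while virtually disabled
translated_sti(cpu)
print(cpu.CPU_IE, cpu.delivered)
```

In the common case where no interrupt is pending, both operations are plain memory writes with no callout at all.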


Performance Advantages of BT

- Avoids privileged-instruction traps.
- Example: rdtsc (read time-stamp counter), a privileged instruction:
  - Trap-and-emulate: 2030 cycles
  - Callout-and-emulate: 1254 cycles (not in the TC)
  - In-TC emulation: 216 cycles
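
To put the three rdtsc figures in proportion, the following short calculation derives speedups relative to trap-and-emulate. It uses only the cycle counts quoted above; the dictionary and variable names are made up for the example.

```python
cycles = {
    "trap-and-emulate": 2030,
    "callout-and-emulate": 1254,
    "in-TC emulation": 216,
}
baseline = cycles["trap-and-emulate"]
speedups = {name: baseline / cost for name, cost in cycles.items()}

for name, factor in speedups.items():
    print(f"{name}: {cycles[name]} cycles, {factor:.1f}x vs. trap-and-emulate")
```

By this arithmetic the callout is roughly 1.6 times faster than the trap, and in-TC emulation roughly 9.4 times faster.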


Hardware Virtualization

- Recent x86 extensions
  - 1998 – 2005: software-only VMMs using binary translation.
  - 2005: Intel and AMD start extending x86 to support virtualization.
- First-generation hardware
  - Allows classical trap-and-emulate VMMs.
  - Intel VT (Virtualization Technology)
  - AMD SVM (Secure Virtual Machine)
- Performance
  - VT/SVM help avoid BT, but not MMU operations (which are actually slower!).
  - The main problem is efficient virtualization of the MMU and I/O, not executing the virtual instruction stream.


New Hardware Features

- VMCB (Virtual Machine Control Block)
  - In-memory data structure.
  - Contains the state of the guest virtual CPU.
- Modes
  - Non-root mode: the guest OS runs at its intended privilege level (ring 0), but is not fully privileged.
  - Root mode: the VMM runs at a new ring with an even higher privilege level (fully privileged).
- Instructions
  - vmrun: transfers from root to non-root mode.
  - exit: transfers from non-root to root mode.
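
The interplay of the VMCB, vmrun, and exit can be sketched as the VMM's main loop. This is a simplified model, not the hardware interface: the `VMCB` fields, the set of exit-causing operations, and the function names are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class VMCB:
    """Toy control block: guest state plus the reason for the last exit."""
    guest_pc: int = 0
    exit_reason: str = ""


def vmrun(vmcb, guest_code):
    """Stand-in for vmrun: run the guest until an operation forces an exit."""
    while vmcb.guest_pc < len(guest_code):
        instr = guest_code[vmcb.guest_pc]
        if instr in ("in", "out", "hlt"):  # assumed exit-causing operations
            vmcb.exit_reason = instr
            return
        vmcb.guest_pc += 1  # runs without VMM involvement
    vmcb.exit_reason = "done"


def vmm_loop(guest_code):
    vmcb, exits = VMCB(), []
    while True:
        vmrun(vmcb, guest_code)  # root -> non-root; returns on exit
        if vmcb.exit_reason == "done":
            return exits
        exits.append((vmcb.guest_pc, vmcb.exit_reason))
        vmcb.guest_pc += 1  # emulate the operation, then skip past it


exits = vmm_loop(["mov", "syscall", "in", "add", "hlt"])
print(exits)
```

Note that the `syscall` entry never appears in the exit list: in this model, as in the slides, the guest runs at its intended privilege level and system calls need no VMM intervention.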


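Intel VT-x Operations

[Figure: the VMM runs in ring 0 of VMX root mode; VM 1 through VM n each have their own ring 0 and ring 3 in VMX non-root mode. VMLAUNCH / VM Run enters a guest and VM Exit returns to the VMM; each VM has its own control block (VMCB1, VMCB2, ..., VMCBn).]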


Benefits of Hardware Virtualization

- Hardware VMM reduces guest OS dependency
  - Eliminates the need for binary translation.
  - Facilitates support for legacy OSes.
- Hardware VMM improves robustness
  - Eliminates the need for complex software techniques.
  - Simpler and smaller VMMs.
- Hardware VMM improves performance
  - Fewer unwanted guest/VMM transitions.


Software VMM vs. Hardware VMM

BT tends to win in these areas:
- Trap elimination: BT can replace most traps with faster callouts.
- Emulation speed: callouts jump to a predecoded emulation routine.
- Callout avoidance: for frequent cases, BT may use in-TC emulation routines, avoiding even the callout cost.

The hardware VMM wins in these areas:
- Code density: there is no translation.
- Precise exceptions: BT performs extra work to recover guest state for faults.
- System calls: they run without VMM intervention.


Experiments

- Software VMM: VMware Player 1.0.1.
- Hardware VMM: an experimental hardware-assisted VMM implemented by VMware.
- Host: HP workstation with a VT-enabled 3.8 GHz Intel Pentium.
- All experiments are run natively, on the software VMM, and on the hardware-assisted VMM.


Forkwait Test

- Stresses process creation and destruction: system calls, context switching, page table modifications, page faults, etc.
- Results: time to create and destroy 40,000 processes
  - Host: 0.6 seconds
  - Software VMM: 36.9 seconds
  - Hardware VMM: 106.4 seconds
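
A test of this shape can be approximated in a few lines. The sketch below is not the benchmark used in the experiments: it is a POSIX-only Python stand-in (it relies on `os.fork`), the function name is invented, and interpreter overhead means its timings are not comparable to the figures above. The second half just computes slowdowns from the reported times.

```python
import os
import time


def forkwait(n):
    """Create and reap n child processes; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            os._exit(0)     # child: exit immediately
        os.waitpid(pid, 0)  # parent: wait for the child
    return time.perf_counter() - start


# Slowdowns relative to native, using the times reported above.
reported = {"host": 0.6, "software VMM": 36.9, "hardware VMM": 106.4}
slowdown = {name: secs / reported["host"] for name, secs in reported.items()}

if __name__ == "__main__":
    print(f"forkwait(100) took {forkwait(100):.3f} s on this machine")
    for name, factor in slowdown.items():
        print(f"{name}: {factor:.1f}x native")
```

From the reported times, the software VMM is about 61.5 times slower than native on this test and the hardware VMM about 177 times slower.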


Nanobenchmarks

- Benchmark
  - Custom guest OS: FrobOS.
  - Tests the performance of a single virtualization-sensitive operation.
- Observations
  - syscall (Native == HW << SW)
    - Hardware: no VMM intervention, so near native.
    - Software: traps.
  - in (SW << Native << HW)
    - Native: accesses an off-CPU register.
    - Software VMM: translates "in" into a short sequence of instructions that access a virtual model of the same register.
    - Hardware: VMM intervention.


Nanobenchmarks (Cont’d)

- Observations (Cont’d)
  - ptemod (Native << SW << HW)
    - Both VMMs use the shadowing technique to implement guest paging, with traces for coherency.
    - PTE writes cause significant overhead compared to native.
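
Why PTE writes are costly under shadowing can be seen in a toy model of shadow page tables with traces. Everything here is an assumption made for illustration: the class and method names, the dictionary-based page tables, and the made-up guest-to-host address mapping. A counter stands in for the fault that a write to a traced (write-protected) guest PTE would raise.

```python
class ShadowMMU:
    """Toy model: guest page table plus a VMM-maintained shadow copy."""

    def __init__(self):
        self.guest_pt = {}     # guest virtual page -> guest physical page
        self.shadow_pt = {}    # guest virtual page -> host physical page
        self.traced = set()    # guest PTEs that are write-protected (traced)
        self.trace_faults = 0

    def to_host(self, guest_phys):
        return guest_phys + 0x1000  # made-up guest-to-host mapping

    def guest_write_pte(self, vpage, guest_phys):
        if vpage in self.traced:
            # The write faults into the VMM, which updates the shadow entry.
            self.trace_faults += 1
            self.guest_pt[vpage] = guest_phys
            self.shadow_pt[vpage] = self.to_host(guest_phys)
        else:
            self.guest_pt[vpage] = guest_phys  # untraced: plain memory write

    def hidden_page_fault(self, vpage):
        """Fill a missing shadow entry from the guest PTE and trace it."""
        self.shadow_pt[vpage] = self.to_host(self.guest_pt[vpage])
        self.traced.add(vpage)


mmu = ShadowMMU()
mmu.guest_write_pte(7, 0x20)  # not yet shadowed: no fault
mmu.hidden_page_fault(7)      # first access builds the shadow entry
for frame in (0x21, 0x22, 0x23):
    mmu.guest_write_pte(7, frame)  # each traced write costs a fault
print(mmu.trace_faults, hex(mmu.shadow_pt[7]))
```

Natively each of these writes is an ordinary store; in the model every write to a traced entry involves the VMM, which is the overhead the ptemod result reflects.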


Opportunities

- Microarchitecture
  - Hardware overheads will shrink over time as implementations mature.
  - Measurements on a desktop system using a pre-production version of Intel’s Core microarchitecture.
- Hardware VMM algorithmic changes
  - Drop trace faults upon guest PTE modification, allowing temporary incoherency with shadow page tables to reduce costs.
- Hybrid VMM
  - Dynamically selects the execution technique, combining:
    - the hardware VMM’s superior system call performance
    - the software VMM’s superior MMU performance
- Hardware MMU support
  - Trace faults, context switches, and hidden page faults can be handled effectively with hardware assistance in MMU virtualization.


Conclusion

- Hardware extensions allow classical virtualization on the x86 architecture.
- The extensions remove the need for binary translation and simplify VMM design.
- The software VMM fares better than the hardware VMM in many cases, such as context switches, page faults, trace faults, and I/O.
- New MMU algorithms might narrow the performance gap.


Server Workload

- Benchmark
  - Apache ab benchmarking tool, run against Apache HTTP server installations on Linux and on Windows.
  - Tests I/O efficiency.
- Observations
  - Both VMMs perform poorly.
  - Performance on Windows and Linux differs widely.
  - Reason: Apache configuration
    - Windows: single address space (less paging); the hardware VMM is better.
    - Linux: multiple address spaces (more paging); the software VMM is better.


Desktop-Oriented Workload

- Benchmark
  - PassMark on Windows XP Professional.
  - The suite of microbenchmarks tests various aspects of workstation performance.
- Observations
  - Large RAM test
    - Exhausts memory; intended to test paging capability.
    - The software VMM is better.
  - 2D Graphics test
    - Involves system calls.
    - The hardware VMM is better.


Less Synthetic Workload

- Compilation times
  - Linux kernel and Apache (on Cygwin).
- Observation
  - Big compilation jobs cause lots of page faults.
  - The software VMM is better at handling page faults.
