Our work on virtualization

Chen Haogang, Wang Xiaolin

{hchen, wxl}@pku.edu.cn

Institute of Network and Information Systems

School of Electrical Engineering and Computer Science

Peking University

2008.11

http://ncis.pku.edu.cn

Agenda

Work at PKU
• Remote paging for VM
• Transparent paravirtualization
• Virtual resource sharing
• Cache management in Multi-Core and Virtualization environment

REMOCA: Hypervisor Remote Disk Cache

Motivation
• Improve paging performance for memory-intensive or I/O-intensive workloads by using free memory on another physical machine.

Solution
• The remote memory acts as a storage cache between a VM’s virtual memory and its virtual disk devices.
• In most cases, network latency is one to two orders of magnitude lower than disk latency.

REMOCA: The design of REMOCA

• Local Module: a ghost buffer
  • REMOCA is an exclusive cache
• Remote Module: the memory service
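
One possible reading of this design, sketched in Python: the local module keeps only an index (the "ghost buffer") of which blocks currently sit in remote memory, and the cache is exclusive, so a block that is read back is removed from the remote side. All class and method names here are hypothetical, the remote service is simulated by an in-process dictionary, the LRU replacement at the remote side is an assumption, and pages are assumed clean (the disk copy is always valid).

```python
from collections import OrderedDict


class RemoteMemory:
    """Stand-in for the remote memory service (simulated in-process)."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.store = OrderedDict()  # block number -> data, oldest first

    def put(self, block, data):
        """Store a block; return the block numbers dropped to make room."""
        self.store[block] = data
        self.store.move_to_end(block)
        dropped = []
        while len(self.store) > self.capacity:
            old, _ = self.store.popitem(last=False)
            dropped.append(old)
        return dropped

    def take(self, block):
        """Return a block and delete it remotely (exclusive cache)."""
        return self.store.pop(block)


class ExclusiveRemoteCache:
    """Local module: tracks which blocks live remotely, not their data."""

    def __init__(self, remote, disk):
        self.remote = remote
        self.disk = disk      # block number -> data (simulated virtual disk)
        self.ghost = set()    # block numbers believed to be in remote memory
        self.remote_hits = 0
        self.disk_reads = 0

    def on_evict(self, block, data):
        """A clean page leaves the VM's memory: push it to remote memory."""
        self.ghost.add(block)
        for old in self.remote.put(block, data):
            self.ghost.discard(old)

    def read(self, block):
        """Serve a read from remote memory if indexed, otherwise from disk."""
        if block in self.ghost:
            self.ghost.discard(block)
            self.remote_hits += 1
            return self.remote.take(block)
        self.disk_reads += 1
        return self.disk[block]
```

Because the index is local, a miss is decided without a network round trip; only hits pay the network latency.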

Summary of REMOCA
• REMOCA effectively alleviates the impact of thrashing and significantly improves performance for real-world I/O-intensive applications.
• Remote cache size: 768 MB

Future work
• Cluster-wide memory balancing
• Predicting miss ratio before allocating

Transparent paravirtualization

Limitations of current hardware-assisted virtualization
• Too many VM exits incur significant overhead.
• Most VM exits are related to page faults or I/O operations.

Reason and count of VM exits (% is the share of that benchmark's total VM exits)

KVM
Benchmark        Page faults       %         I/O       %       Total
Kernel Compile       7459199   83.36      768433    8.59     8948235
SpecJBB               329286    2.99     6560472   59.57    11012466
SpecCPU             34215800   18.69   102468771   55.98   183041637
SpecWeb              5668822   15.50    14921351   40.79    36584589

Xen (HVM)
Benchmark        Page faults       %         I/O       %       Total
Kernel Compile      11219395   81.23     1247207    9.03    13812089
SpecJBB               352520    4.22     5000529   59.89     8350197
SpecCPU            113793259   50.41    56215828   24.90   225727994
SpecWeb             11835408   31.46    12428086   33.04    37618375

Top 10 trap instructions by VM exit frequency in KVM-54

Rank   Kernel Compile   SpecJBB        SpecCPU        SpecWeb
1      26.21% (pf)      32.48% (io)    12.54% (pf)    7.65% (io)
2      26.07% (pf)      32.47% (io)    9.02% (io)     7.46% (io)
3      3.71% (pf)       32.47% (io)    8.47% (io)     7.46% (io)
4      2.44% (rd cr)    0.31% (io)     8.17% (ot)     7.46% (io)
5      2.44% (rd cr)    0.31% (pf)     8.01% (io)     6.08% (rd cr)
6      2.16% (pf)       0.31% (ot)     7.44% (io)     5.77% (io)
7      1.91% (pf)       0.31% (io)     7.44% (io)     5.71% (ot)
8      1.14% (pf)       0.31% (io)     7.44% (io)     5.67% (hlt)
9      0.99% (io)       0.31% (io)     4.54% (io)     5.53% (clts)
10     0.99% (io)       0.18% (ot)     2.07% (ot)     5.50% (rd cr)
Total  68.06%           99.46%         75.14%         64.30%

(io: I/O operation, pf: page fault, rd cr: read control registers, clts and hlt: x86 instructions, ot: others)

Hot Instructions detection and translation

How to reduce VM exits

Paravirtualization
• Xen and KVM apply paravirtualization to improve performance.
• The need to change the guest source code limits its applicability.

Transparent paravirtualization
• Detecting Hot Instructions
  • An efficient mechanism: the top 64 instructions catch 97% of VM exits.
• Replacing Hot Instructions
  • New, possibly complex, assistant mechanisms must be introduced into the VMM to make the replacement safe and feasible.
• Implanting Replaced Instructions into the Guest OS
  • Adaptive code implantation
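
The detection step can be illustrated with a small Python sketch that counts VM exits per trapping instruction and reports how much of the total the top k instructions cover. The input format (one guest instruction address per VM exit) and the function name are assumptions made for illustration; the slides do not describe how the detector is actually implemented inside the VMM.

```python
from collections import Counter


def hot_instructions(exit_trace, k=64):
    """Return the k most frequent trapping addresses and their coverage.

    exit_trace: iterable of guest instruction addresses, one per VM exit
                (hypothetical trace format).
    Coverage is the fraction of all VM exits caused by the returned addresses.
    """
    counts = Counter(exit_trace)
    total = sum(counts.values())
    top = counts.most_common(k)
    covered = sum(n for _, n in top)
    coverage = covered / total if total else 0.0
    return [addr for addr, _ in top], coverage


# Toy trace: three trapping instructions with very uneven frequencies.
trace = [0x1000] * 6 + [0x2000] * 3 + [0x3000]
hot, coverage = hot_instructions(trace, k=2)
print([hex(a) for a in hot], coverage)
```

A skewed distribution like the one in the top-10 table is what makes a small k worthwhile: replacing a few instructions removes most exits.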

Implementation in KVM

[Figure: architecture diagram. Labels: Host OS, KVM, HIK, IOCTL, Guest OS, VM exits, Page Directory, Page Table Page, Function implant, Implanted Function Space, TMP Engine]

Hot Instruction Killer:
• Analyze hot instructions
• Trace call stacks
• Generate code fragments and new functions

TMP Engine:
• Turn on/off TMP
• Detect hot instructions
• Adaptive code implantation

Transparent Memory Paravirtualization

A New Memory Virtualization Mechanism
• Transforms the guest OS page table so that it maps guest virtual addresses directly to host physical addresses.
• The transformed guest page table, called the direct page table, is registered directly with the hardware MMU.
• A process that uses a direct page table is called a para-virtualized process.

The guest OS must still be given an independent view of its own physical address space, which it uses for memory management.
• When the guest OS accesses the direct page table, it expects guest physical addresses rather than the host physical addresses actually stored in the direct page table.
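
A toy Python model may make the two views concrete. It assumes a single-level table and a side table that keeps the original guest entry, so the hardware walk sees host frames while guest reads see guest frames. The class, its method names, and the way guest accesses reach these methods are all hypothetical; the slides do not specify how the real mechanism intercepts guest page-table accesses.

```python
PAGE_SIZE = 4096


class DirectPageTable:
    """Single-level toy model of a direct page table plus a recovery table."""

    def __init__(self, gfn_to_hfn):
        self.gfn_to_hfn = gfn_to_hfn  # guest frame -> host frame (VMM-owned)
        self.hw_ptes = {}             # virtual page -> host frame (MMU view)
        self.recovery = {}            # virtual page -> guest frame (guest view)

    def guest_write_pte(self, vpn, gfn):
        """Guest installs a mapping using a guest physical frame number."""
        self.recovery[vpn] = gfn
        self.hw_ptes[vpn] = self.gfn_to_hfn[gfn]

    def guest_read_pte(self, vpn):
        """Guest reads back the entry and sees its own guest frame number."""
        return self.recovery[vpn]

    def mmu_translate(self, vaddr):
        """Hardware walk: guest virtual address -> host physical address."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        return self.hw_ptes[vpn] * PAGE_SIZE + offset


dpt = DirectPageTable({5: 42})   # guest frame 5 is backed by host frame 42
dpt.guest_write_pte(3, 5)        # guest maps virtual page 3 to guest frame 5
print(dpt.guest_read_pte(3), hex(dpt.mmu_translate(3 * PAGE_SIZE + 0x10)))
```

The point of the model is that translation needs one lookup, with no second guest-physical to host-physical step on the hardware path.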

[Figure: The direct page table structure of the new memory virtualization mechanism. Labels: Page Directory, Page Table Pages, Recovery Table, PTE, P-PTE, Original PTE]

Evaluation

Transparent Paravirtualization: Future work

TMP Evaluation
• Impact on cache hits
• Compare with: EPT/NPT, Shadow Page Table
• Compare with: KVM Para-MMU, Xen Para-MMU

Transparent MMU Extension
• Linux
• Windows
• Emulate all Guest OS page faults

TMP
• Transparent Para-IO
• Other hot instructions

Limitation of Transparent Paravirtualization
• Security vs. performance

Virtual resource sharing

Motivation
• In a homogeneous environment, how can a high degree of resource sharing be achieved while preserving isolation?

Example: Network classroom @ PKU Zhongzhi
• Teaching Windows, MS Office or VC++ programming
• About 30 students per class
• Homogeneous OS, software, data and application instances

[Figure: terminals connected to servers]

Virtual resource sharing

Limitations of current solutions
• Terminal server: poor isolation
  • Running a separate OS instance per student is preferred
• VM live clone: cannot provide data persistence
• Content-based sharing: high scanning overhead
• Difference Engine (OSDI ’08): unable to share during OS startup or application startup

Goal
• Fast startup of VMs and applications
• Accurate resource sharing
• Low management overhead

Virtual resource sharing

Solution: a bottom-up approach
• Start from disk sharing
  • Map identical disk blocks to a single storage location
  • Manage a shared disk cache within the VMM
  • Replace disk reads with page remapping
• Fast application startup

Challenges
• How to discover identical disk blocks?
  • CoW (copy-on-write) disk / CAS (content-addressable storage)
• How to handle sharable application data, especially the “zero pages”?
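
As an illustration of the content-addressed option, the following Python sketch maps identical blocks written by different VMs to one stored copy, so a read can hand back the shared copy instead of touching the disk again. It is a simplified model under stated assumptions: the class and method names are hypothetical, SHA-256 is an arbitrary choice of hash, and reference counting and copy-on-write for later modifications are left out.

```python
import hashlib


class SharedBlockStore:
    """Content-addressed block store shared by several VMs (toy model)."""

    def __init__(self):
        self.blocks = {}   # content digest -> block data (one copy per content)
        self.mapping = {}  # (vm_id, block number) -> content digest

    def write(self, vm_id, lba, data):
        """Record a block; identical content is stored only once."""
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)
        self.mapping[(vm_id, lba)] = digest

    def read(self, vm_id, lba):
        """Return the shared copy backing this VM's block."""
        return self.blocks[self.mapping[(vm_id, lba)]]

    def unique_blocks(self):
        """Number of distinct block contents currently stored."""
        return len(self.blocks)


store = SharedBlockStore()
zero_block = bytes(4096)
store.write("vm1", 0, zero_block)        # e.g. a zero-filled block
store.write("vm2", 7, zero_block)        # same content in another VM
store.write("vm1", 1, b"x" * 4096)
print(store.unique_blocks())
```

In this model the two zero-filled blocks resolve to the same object, which is the analogue of remapping one cached page into several VMs.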

Cache management in Multi-Core

Motivation
• Current VMMs cannot make efficient use of the cache hierarchy on a multi-core platform.

Objectives
• Explore new compiling and profiling techniques to analyze and predict the memory access behavior of a program
• Implement cache-aware memory allocation and CPU scheduling in the VMM
• Dynamic memory balancing among VMs

Lower-level cache partitioning
• Avoid cache contention for concurrent VMs
• Using page-coloring technique
  • Restricting the number of cache sets that a VM can use
• Transparent to the guest OS
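
The page-coloring idea can be sketched in a few lines of Python. The sketch assumes a physically indexed cache whose set index is a plain modulo of the address (no hashed slices), so a frame's color is its frame number modulo the number of page-sized regions in one cache way. Function names and the example cache geometry are illustrative only.

```python
PAGE_SIZE = 4096


def num_colors(cache_size, associativity, page_size=PAGE_SIZE):
    """Number of page colors: bytes per cache way divided by the page size."""
    return cache_size // associativity // page_size


def page_color(frame_number, cache_size, associativity, page_size=PAGE_SIZE):
    """Color of a physical frame; frames of one color share cache sets."""
    return frame_number % num_colors(cache_size, associativity, page_size)


def allocate_frame(free_frames, allowed_colors, cache_size, associativity):
    """Remove and return the first free frame whose color the VM may use."""
    for i, frame in enumerate(free_frames):
        if page_color(frame, cache_size, associativity) in allowed_colors:
            return free_frames.pop(i)
    return None  # no free frame of an allowed color


# Example geometry (assumed): 8 MiB, 16-way cache with 4 KiB pages.
CACHE, WAYS = 8 * 2**20, 16
free = [0, 1, 2, 3, 128, 129]
print(num_colors(CACHE, WAYS), allocate_frame(free, {1}, CACHE, WAYS))
```

Giving each VM a disjoint set of colors confines it to a disjoint subset of cache sets, and since only the VMM's frame allocation changes, the guest OS needs no modification.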

Challenges
• Predicting the performance impact on the application before partitioning
• Online profiling and dynamic re-partitioning
• Reducing page migration overhead
• Cooperating with VM scheduling, especially CPU allocation and migration
• New micro-architectures
  • Example: Intel Nehalem, with 256 KB dedicated L2 per core and a shared L3

Thanks!

Q&A

Discussion