

QEMU Intro.

Chiawei, Wang

2015/07/17

1

Story Time

• Emulation vs. Virtualization

• Why we love emulators

• QEMU

• ISA translation of QEMU

• Guest Insn. → Intermediate Representation → Host Insn.

• Code Block Translation

• Translation Block Cache

• Translation Block Chaining

• Helper Feature of QEMU

2

Emulation vs. Virtualization

• Both can be used to host VMs (i.e., to build a hypervisor).

• Virtualization

• Shares the underlying hardware among VM instances as disjoint sets, one per instance.

• Host ISA is the same as Guest ISA.

• Guest operations can be dispatched directly to the hardware.

• Fast

• Emulation

• Everything in the Guest ISA is realized in software.

• Registers, memory, I/O

• Host ISA can differ from Guest ISA.

• Guest operations are translated into operations on the emulated devices.

• Slow

3

Why We LOVE Emulators

• Everything is implemented in software!

• Everything can be customized on demand!

• Welcome to code-tracing hell…

• Popular emulators

• QEMU

• Bochs

• We prefer QEMU because of its better performance.

• We will give more details later.

4

QEMU

• QEMU, a Fast and Portable Dynamic Translator

• http://static.usenix.org/event/usenix05/tech/freenix/full_papers/bellard/bellard.pdf

• Supports emulation of numerous ISAs

• i386

• x86_64

• arm

• mips

• ppc

• etc.

• QEMU can also serve as the client of Linux KVM (virtualization).

• Here we focus only on its emulation functionality.

5

QEMU Snapshot & Console

6

Before we dig into QEMU … let us walk through the emulation of a code snippet as an example.

7

Example of Emulation

• Emulate ARM Guest on x86 Host

ARM

Register Value

R0 0

R1 0

code:

E3 A0 00 01 MOV R0, #1

E3 A0 10 02 MOV R1, #2

E0 80 00 01 ADD R0, R0, R1

8

Example of Emulation

• Emulate ARM Guest on x86 Host

ARM

Register Value

R0 0

R1 0

code:

E3 A0 00 01 MOV R0, #1

E3 A0 10 02 MOV R1, #2

E0 80 00 01 ADD R0, R0, R1

x86

Register Value

EAX 0

ECX 0

Emulate R0, R1 w/ EAX, ECX

9

Example of Emulation

• Emulate ARM Guest on x86 Host

ARM

Register Value

R0 1

R1 0

code:

E3 A0 00 01 MOV R0, #1

E3 A0 10 02 MOV R1, #2

E0 80 00 01 ADD R0, R0, R1

x86

Register Value

EAX 0

ECX 0

Emulate R0, R1 w/ EAX, ECX

10

Example of Emulation

• Emulate ARM Guest on x86 Host

ARM

Register Value

R0 1

R1 0

code:

E3 A0 00 01 MOV R0, #1

E3 A0 10 02 MOV R1, #2

E0 80 00 01 ADD R0, R0, R1

x86

Register Value

EAX 1

ECX 0

Emulate R0, R1 w/ EAX, ECX

translate

code:

B8 01 00 00 00 MOV EAX, 0x1

11

Example of Emulation

• Emulate ARM Guest on x86 Host

ARM

Register Value

R0 1

R1 2

code:

E3 A0 00 01 MOV R0, #1

E3 A0 10 02 MOV R1, #2

E0 80 00 01 ADD R0, R0, R1

x86

Register Value

EAX 1

ECX 0

Emulate R0, R1 w/ EAX, ECX

code:

B8 01 00 00 00 MOV EAX, 0x1

12

Example of Emulation

• Emulate ARM Guest on x86 Host

ARM

Register Value

R0 1

R1 2

code:

E3 A0 00 01 MOV R0, #1

E3 A0 10 02 MOV R1, #2

E0 80 00 01 ADD R0, R0, R1

x86

Register Value

EAX 1

ECX 2

Emulate R0, R1 w/ EAX, ECX

translate

code:

B8 01 00 00 00 MOV EAX, 0x1

B9 02 00 00 00 MOV ECX, 0x2

13

Example of Emulation

• Emulate ARM Guest on x86 Host

ARM

Register Value

R0 3

R1 2

code:

E3 A0 00 01 MOV R0, #1

E3 A0 10 02 MOV R1, #2

E0 80 00 01 ADD R0, R0, R1

x86

Register Value

EAX 1

ECX 2

Emulate R0, R1 w/ EAX, ECX

code:

B8 01 00 00 00 MOV EAX, 0x1

B9 02 00 00 00 MOV ECX, 0x2

14

Example of Emulation

• Emulate ARM Guest on x86 Host

ARM

Register Value

R0 3

R1 2

code:

E3 A0 00 01 MOV R0, #1

E3 A0 10 02 MOV R1, #2

E0 80 00 01 ADD R0, R0, R1

x86

Register Value

EAX 3

ECX 2

Emulate R0, R1 w/ EAX, ECX

code:

B8 01 00 00 00 MOV EAX, 0x1

B9 02 00 00 00 MOV ECX, 0x2

01 C8 ADD EAX, ECX

translate
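The walkthrough above can be sketched in code. This is a minimal sketch, not QEMU's implementation: it decodes only the two ARM encodings used in the example (MOV with an immediate, ADD with a register operand) and emits the x86 bytes shown on the slides, using the same R0 → EAX, R1 → ECX mapping. The names `translate_arm_insn` and `translate` are made up for illustration, and the ARM condition field and S bit are ignored.

```python
import struct

# Guest-to-host register mapping from the slides: R0 -> EAX, R1 -> ECX.
# The values are x86 register numbers (EAX = 0, ECX = 1).
ARM_TO_X86_REG = {0: 0, 1: 1}

OPCODE_ADD = 0b0100  # ARM data-processing opcode for ADD
OPCODE_MOV = 0b1101  # ARM data-processing opcode for MOV


def translate_arm_insn(word: int) -> bytes:
    """Translate one 32-bit ARM data-processing instruction into x86 bytes."""
    is_immediate = (word >> 25) & 1
    opcode = (word >> 21) & 0xF
    rn = (word >> 16) & 0xF
    rd = (word >> 12) & 0xF

    # MOV Rd, #imm8 with a zero rotate field (the only form in the example)
    if opcode == OPCODE_MOV and is_immediate and (word >> 8) & 0xF == 0:
        imm = word & 0xFF
        # x86: B8+r imm32 encodes MOV r32, imm32
        return bytes([0xB8 + ARM_TO_X86_REG[rd]]) + struct.pack("<I", imm)

    # ADD Rd, Rd, Rm with no shift applied to Rm
    if (opcode == OPCODE_ADD and not is_immediate
            and rd == rn and (word >> 4) & 0xFF == 0):
        rm = word & 0xF
        # x86: 01 /r encodes ADD r/m32, r32; ModRM = 11 | src << 3 | dst
        modrm = 0xC0 | (ARM_TO_X86_REG[rm] << 3) | ARM_TO_X86_REG[rd]
        return bytes([0x01, modrm])

    raise NotImplementedError(f"unsupported instruction {word:#010x}")


def translate(words):
    """Translate a sequence of ARM instruction words, one after another."""
    return b"".join(translate_arm_insn(w) for w in words)


guest_code = [0xE3A00001, 0xE3A01002, 0xE0800001]
host_code = translate(guest_code)
print(host_code.hex(" "))  # b8 01 00 00 00 b9 02 00 00 00 01 c8
```

Even this toy version has to know both encodings in detail, which is the pain point the next slides build on.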

15

Translation Between Different ISA

Guest Host

16


Translation Between Different ISA

Guest Host

code translation

Are you fucking kidding me ?

21

Translation Between Different ISA of QEMU

• QEMU adopts an abstraction layer in the middle of the translation.

• Tiny Code Generator (TCG), whose operations serve as an intermediate representation (IR) code.

Guest → TCG → Host
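Why the middle layer helps can be shown with a small sketch. Everything here is hypothetical: the toy IR of `(name, dst, src)` tuples is not QEMU's real TCG op set, and `arm_to_ir`, `ir_to_x86`, and `translators_needed` are names invented for illustration. The point is only that each Guest ISA needs one front end and each Host ISA one back end, instead of one translator per Guest/Host pair.

```python
# Toy IR op: (name, dst, src). This is NOT QEMU's real TCG op set.

def arm_to_ir(line):
    """Hypothetical front end for two ARM-like instructions."""
    mnemonic, operands = line.split(maxsplit=1)
    ops = [o.strip() for o in operands.split(",")]
    if mnemonic == "MOV":            # MOV Rd, #imm
        return ("movi", ops[0], int(ops[1].lstrip("#")))
    if mnemonic == "ADD":            # ADD Rd, Rd, Rm
        return ("add", ops[0], ops[2])
    raise NotImplementedError(line)


def ir_to_x86(op):
    """Hypothetical back end emitting x86 assembly text."""
    regs = {"R0": "eax", "R1": "ecx"}
    name, dst, src = op
    if name == "movi":
        return f"mov {regs[dst]}, {src:#x}"
    if name == "add":
        return f"add {regs[dst]}, {regs[src]}"
    raise NotImplementedError(op)


FRONT_ENDS = {"arm": arm_to_ir}      # one entry per Guest ISA
BACK_ENDS = {"x86": ir_to_x86}       # one entry per Host ISA


def translate(guest, host, lines):
    """Guest text -> IR -> Host text, via the chosen front and back end."""
    to_ir, from_ir = FRONT_ENDS[guest], BACK_ENDS[host]
    return [from_ir(to_ir(line)) for line in lines]


def translators_needed(n_guests, n_hosts, with_ir):
    """Count the translators to write for every Guest/Host pairing."""
    return n_guests + n_hosts if with_ir else n_guests * n_hosts


host_asm = translate("arm", "x86",
                     ["MOV R0, #1", "MOV R1, #2", "ADD R0, R0, R1"])
print(host_asm)
print(translators_needed(5, 4, with_ir=False),
      translators_needed(5, 4, with_ir=True))
```

With five Guest ISAs and four Host ISAs, direct translation needs 20 translators while the IR approach needs 9.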

22


Example of QEMU Code Translation

Guest Code TCG

mov eax, ds

mov_i64 tmp0, rax

movi_i64 tmp3, 0xfd194

st_i64 tmp3, env, 0x80

mov_i32 tmp5, tmp0

movi_i32 tmp11, 0x3

call load_seg, 0x0, 0, env, tmp11, tmp5

movi_i64 tmp3, 0xfd196

st_i64 tmp3, env, 0x80

exit_tb 0x0

set_label L0

exit_tb 0x7f77499ff3cb

• x86_64 (Guest) → TCG

26

Example of QEMU Code Translation

TCG Host Code

mov_i64 tmp0, rax

movi_i64 tmp3, 0xfd194

st_i64 tmp3, env, 0x80

mov_i32 tmp5, tmp0

movi_i32 tmp11, 0x3

call load_seg, 0x0, 0, env, tmp11, tmp5

movi_i64 tmp3, 0xfd196

st_i64 tmp3, env, 0x80

exit_tb 0x0

set_label L0

exit_tb 0x7f77499ff3cb

mov rax, 0x3

mov [0x7f779478f008], rax

mov [r14 + 0x80], 0xfd194

mov rdi, r14

mov esi, 0x3

mov edx, 0x10

call 0x7f776ce8e500

mov [r14 + 0x80], 0xfd196

xor eax, eax

jmp 0x7f776a9fec16

lea rax, [rip - 0x110005ed]

jmp 0x7f776a9fec16

• TCG → x86_64 (Host)

27


I just wanna execute mov eax, ds

Code Block-based Translation

• First thought on code translation

• Interpret each encountered Guest instruction and execute the translated code on the Host (Bochs' way).

• Recall the emulation example on page 7.

• QEMU uses code block-based translation instead of one-by-one interpretation.

• Performance improvement

29

What Is a Code Block

• Code Block/Basic Block (also called Translation Block in QEMU)

• A collection of instructions that can be executed SEQUENTIALLY.

• Each block ends with a control-flow transfer instruction.
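The rule above can be sketched as a simple splitter. This is an illustration under stated assumptions, not QEMU code: instructions are plain text lines, the set of control-flow mnemonics is a small hand-picked subset, and `split_into_blocks` is a made-up name. It applies only the rule on the slide (a block ends at a control-flow transfer) and does not split at branch targets.

```python
# Mnemonics treated as control-flow transfers; an illustrative subset only.
CONTROL_FLOW = {"jmp", "jnz", "jz", "call", "ret"}


def split_into_blocks(instructions):
    """Group instructions into blocks, each ending at a control-flow transfer."""
    blocks, current = [], []
    for insn in instructions:
        current.append(insn)
        if insn.split()[0] in CONTROL_FLOW:
            blocks.append(current)   # the transfer instruction closes the block
            current = []
    if current:
        blocks.append(current)       # trailing instructions without a terminator
    return blocks


code = [
    "mov eax, dword ptr [00406120]",
    "call eax",
    "sub esp, 1C",
    "mov dword ptr [ebp-C], eax",
    "cmp dword ptr [ebp-C], -1",
    "jnz short 00401557",
    "mov eax, -1",
    "jmp short 0040156C",
]
blocks = split_into_blocks(code)
for number, block in enumerate(blocks, start=1):
    print(f"block {number}: {len(block)} instructions, ends with '{block[-1]}'")
```

The eight instructions fall into three blocks, closed by `call eax`, `jnz short 00401557`, and `jmp short 0040156C` respectively.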

30

Performance Improvement

31

• Translation block optimization

mov eax, 1

add eax, 2

mov eax, 3

• Translation block cache (coming up)
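One reading of the three-instruction example is that, once a whole block is visible to the translator, writes that are overwritten before anyone reads them can be dropped. The sketch below shows that idea as dead-write elimination on a toy instruction format. It is an assumption about what the slide intends, not a description of QEMU's actual passes; `eliminate_dead_writes` is a made-up name, and CPU flags (which a real `add` updates) are deliberately ignored.

```python
def eliminate_dead_writes(block):
    """Drop writes to a register that is overwritten before being read.

    Each instruction is (op, dst, src): "mov" sets dst = src and
    "add" sets dst = dst + src. src is a register name or an int.
    CPU flags are deliberately ignored in this toy model.
    """
    dead = set()          # registers overwritten later without being read
    kept = []
    for op, dst, src in reversed(block):
        if dst in dead:
            continue      # the result is never observed, so skip it
        kept.append((op, dst, src))
        if op == "mov":
            dead.add(dst)             # a mov fully redefines dst
        else:
            dead.discard(dst)         # an add reads its destination
        if isinstance(src, str):
            dead.discard(src)         # a register source is read here
    kept.reverse()
    return kept


block = [("mov", "eax", 1), ("add", "eax", 2), ("mov", "eax", 3)]
optimized = eliminate_dead_writes(block)
print(optimized)  # [('mov', 'eax', 3)]
```

A one-instruction-at-a-time interpreter cannot do this, because it never sees that the later `mov eax, 3` makes the earlier work unnecessary.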


Translation Block Cache

• Since executing code doesn't change often, why translate code that has already been translated?

33


• YES! QEMU caches each translation block and indexes it by the Guest physical address where the code resides.

34

Translation Block Cache

• Workflow

35

main:

mov dword ptr [esp+18], 0

mov dword ptr [esp+14], 80

mov dword ptr [esp+10], 1

mov dword ptr [esp+C], 0

mov dword ptr [esp+8], 0

mov dword ptr [esp+4], C0000000

mov dword ptr [esp], 00404020

mov eax, dword ptr[00406120]

call eax // f = CreateFileA( … )

sub esp, 1C

mov dword ptr[ebp-C], eax

cmp dword ptr[ebp-C], -1

jnz short 00401557 // if(f == -1)

mov eax, -1

jmp short 0040156C // return -1

mov eax, dword ptr [ebp-C]

mov dword ptr [esp], eax

mov eax, dword ptr[0040611C]

call eax // CloseHandle( f )

sub esp, 4

mov eax, 0

mov ecx, dword ptr [ebp-4]

leave // return 0

• Guest: EIP = 0x11223344, so GVA = 0x11223344.

• Look up the Guest page table for the GVA: GPA = 0x5566.

• Is TB_cache[GPA] valid?

• True: execute the TB (translated code inside) on the Host.

• False: run code translation on the block starting at the GVA (here, from the first mov through the first call eax), set TB_cache[GPA] = TB, then execute the TB.

• GVA: Guest Virtual Address. GPA: Guest Physical Address.
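The workflow can be sketched as a lookup that translates only on a miss. This is a simplified model, not QEMU's data structures: a plain dictionary stands in for the Guest page-table walk, `fake_translate` stands in for real code translation, and the class and method names are invented for illustration. The addresses are the ones from the slide.

```python
class TBCache:
    """Toy translation block cache keyed by Guest physical address."""

    def __init__(self, gva_to_gpa, translate):
        self.gva_to_gpa = gva_to_gpa   # stand-in for the Guest page-table walk
        self.translate = translate     # stand-in for Guest -> Host translation
        self.cache = {}                # GPA -> translated block
        self.misses = 0

    def lookup(self, eip):
        gpa = self.gva_to_gpa[eip]     # GVA -> GPA
        tb = self.cache.get(gpa)
        if tb is None:                 # "Is TB_cache[GPA] valid?" -> False
            self.misses += 1
            tb = self.translate(gpa)
            self.cache[gpa] = tb       # TB_cache[GPA] = TB
        return tb


def fake_translate(gpa):
    """Return a callable that plays the role of the translated Host code."""
    return lambda: f"ran TB at GPA {gpa:#x}"


tbs = TBCache({0x11223344: 0x5566}, fake_translate)
first = tbs.lookup(0x11223344)()    # miss: translate, cache, execute
second = tbs.lookup(0x11223344)()   # hit: execute the cached TB
print(first, "| translations:", tbs.misses)
```

Both executions run the same cached block, and translation happens once.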

Translation Block Cache

• Cache space is limited.

• Policy for cache replacement upon full cache is required.
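The slides do not say which replacement policy is used. As an assumption for illustration only, the sketch below uses about the simplest policy possible: when the cache is full, throw everything away and start over. `BoundedTBCache` and its fields are hypothetical names.

```python
class BoundedTBCache:
    """Toy TB cache with a flush-everything-when-full policy (an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = {}       # GPA -> translated block
        self.flushes = 0

    def insert(self, gpa, tb):
        if gpa not in self.cache and len(self.cache) >= self.capacity:
            self.cache.clear()        # no per-entry bookkeeping needed
            self.flushes += 1
        self.cache[gpa] = tb


cache = BoundedTBCache(capacity=2)
for gpa in (0x1000, 0x2000, 0x3000):
    cache.insert(gpa, f"TB@{gpa:#x}")
print(cache.flushes, sorted(cache.cache))
```

The trade-off is that a full flush forces retranslation of hot blocks, in exchange for never having to track usage or unlink individual blocks.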

36


Translation Block Chaining

• Assume TB1, TB2, and TB3 are all cached and will be executed sequentially.

• Six control-flow transfers.

38

[Figure: timeline of QEMU, TB 1, TB 2, and TB 3. QEMU finds TB1 in the cache and executes it, then TB1 returns to QEMU; the same happens for TB2 and TB3.]

Translation Block Chaining

• When TB1, TB2, and TB3 are executed sequentially in most cases, chain them so that each TB jumps directly to the next.

• Four control-flow transfers. Faster.

39


Translation Block Chaining

• What if a TB ends with a conditional branch (e.g. the Jcc group of x86)?

• Each TB has two slots for chaining.

[Figure: TB 1 links to TB 2 and TB 3 through a True chain and a False chain.]

40
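The six-versus-four count can be reproduced with a small simulation. This is a sketch under simplifying assumptions, not QEMU's mechanism: TBs are Python objects, "patching a jump" is modelled as filling one of two slots, and the Guest PCs 1, 2, 3 are made up. Each TB carries two slots, matching the slide, although this linear example only ever uses slot 0.

```python
class TB:
    def __init__(self, targets, pick_slot=lambda: 0):
        self.targets = targets        # Guest PCs for slot 0 and slot 1
        self.pick_slot = pick_slot    # which exit this execution takes
        self.chain = [None, None]     # two chaining slots, initially unpatched


def run(tbs, entry_pc, chaining):
    """Execute from entry_pc until a TB exits to PC None; count transfers."""
    transfers = 0
    pc = entry_pc
    while pc is not None:
        tb = tbs[pc]                  # main loop finds the TB in the cache
        transfers += 1                # QEMU -> TB
        slot = tb.pick_slot()
        while tb.chain[slot] is not None:
            tb = tb.chain[slot]       # chained jump, no return to QEMU
            transfers += 1            # TB -> TB
            slot = tb.pick_slot()
        transfers += 1                # TB -> QEMU
        pc = tb.targets[slot]
        if chaining and pc is not None:
            tb.chain[slot] = tbs[pc]  # patch the slot for later executions
    return transfers


def make_tbs():
    # TB1 -> TB2 -> TB3 -> exit, using made-up Guest PCs 1, 2, 3.
    return {1: TB((2, None)), 2: TB((3, None)), 3: TB((None, None))}


unchained = run(make_tbs(), 1, chaining=False)
tbs = make_tbs()
first_pass = run(tbs, 1, chaining=True)    # patches the slots as it goes
second_pass = run(tbs, 1, chaining=True)   # follows the patched slots
print(unchained, first_pass, second_pass)  # 6 6 4
```

For a TB ending in a conditional branch, `pick_slot` would return 0 or 1 depending on the branch outcome, so the True and False successors are chained independently.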

41

So far so good ?

Helper Feature of QEMU

• A helper transfers TB execution immediately to Host code written as a C function.

• Advantages

• Eases the burden of coding complex code translation

• Allows interception during TB execution

• Disadvantage

• Overhead caused by switching QEMU's state from “executing translated Guest code” to “executing Host code”
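The idea can be sketched with a toy block executor in which one op simply calls out to a Host function. All names here are hypothetical and this is not QEMU's helper API: `env` is a dictionary standing in for the emulated Guest CPU state, and `helper_div` mimics an unsigned 32-bit x86 `div`, which divides EDX:EAX by the operand.

```python
def helper_div(env, divisor):
    """Host-side helper: unsigned divide of EDX:EAX, updating Guest state."""
    if divisor == 0:
        raise ZeroDivisionError("Guest divide error")
    dividend = (env["edx"] << 32) | env["eax"]
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFFFFFF:
        raise OverflowError("Guest divide error: quotient too large")
    env["eax"] = quotient
    env["edx"] = remainder


HELPERS = {"div": helper_div}


def exec_tb(tb, env):
    """Run a toy TB; a 'call' op leaves the translated code for a helper."""
    for op, *args in tb:
        if op == "movi":
            reg, value = args
            env[reg] = value
        elif op == "call":
            name, reg = args
            HELPERS[name](env, env[reg])   # transfer to Host code
        else:
            raise NotImplementedError(op)


env = {"eax": 0, "ecx": 0, "edx": 0}
exec_tb([("movi", "eax", 100), ("movi", "ecx", 7), ("call", "div", "ecx")], env)
print(env)  # {'eax': 14, 'ecx': 7, 'edx': 2}
```

The translator only has to emit a call, while the awkward cases (divide by zero, overflow) are written once in ordinary Host code; the cost is the extra call and state hand-off on every execution.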

42

Example of Helper Use

43

qemu/helper.c

void helper_div( arg1, arg2 )

{

// Do the division job &

// Update the emulated Guest CPU/Memory

}

Guest Code TCG IR Host Code

• x86_64 → TCG → x86_64

Example of Helper Use

44

qemu/helper.c

void helper_div( arg1, arg2 )

{

// Do the division job &

// Update the emulated Guest CPU/Memory

}

Guest Code TCG IR Host Code

• x86_64 → TCG → x86_64

compile

helper_div:

0x7f776ce80440: push rbp

0x7f776ce80441: mov rbp, rsp

Example of Helper Use

45

qemu/helper.c

void helper_div( arg1, arg2 )

{

// Do the division job &

// Update the emulated Guest CPU/Memory

}

Guest Code TCG IR Host Code

div ecx

• x86_64 → TCG → x86_64

compile

helper_div:

0x7f776ce80440: push rbp

0x7f776ce80441: mov rbp, rsp

Example of Helper Use

46

qemu/helper.c

void helper_div( arg1, arg2 )

{

// Do the division job &

// Update the emulated Guest CPU/Memory

}

Guest Code TCG IR Host Code

div ecx

mov_i64 tmp0,rcx

movi_i64 tmp3,$0xf0544

st_i64 tmp3,env,$0x80

call divl_EAX,$0x0,$0,env,tmp0

• x86_64 → TCG → x86_64

compile

helper_div:

0x7f776ce80440: push rbp

0x7f776ce80441: mov rbp, rsp

Translation by calling gen_helper_div

Example of Helper Use

47

qemu/helper.c

void helper_div( arg1, arg2 )

{

// Do the division job &

// Update the emulated Guest CPU/Memory

}

Guest Code TCG IR Host Code

div ecx

mov_i64 tmp0,rcx

movi_i64 tmp3,$0xf0544

st_i64 tmp3,env,$0x80

call divl_EAX,$0x0,$0,env,tmp0

movq $0xf0544,0x80(%r14)

mov %r14,%rdi

mov $0xa,%esi

callq 0x7f776ce80440

• x86_64 → TCG → x86_64

compile

helper_div:

0x7f776ce80440: push rbp

0x7f776ce80441: mov rbp, rsp

Translation by calling gen_helper_div

Generate Host Code for div ecx emulation


When Does QEMU Use a Helper?

• When translating an instruction would require generating numerous, complex TCG IR operations.

• e.g. div of x86

• When the execution of a translated instruction must be intercepted (like a hook).

• e.g. jcc of x86

• … (there might be more cases; I haven't fully comprehended them all)

50

More QEMU-related Systems of DSNSLab

• SecMap

• Two-layer Disk Forensics

• MrKIP

• VMaware Detector

• Cloudebug

• ProbeBuilder

• Android Taint

51

Q & A

52