Looking Inside the Virtualization Layer for Performance, Security and Software Fault-Tolerance
Sorav Bansal, IIT Delhi
Virtualization Software
• VMware Workstation/ESX Server
• Citrix XenServer
• Microsoft Hyper-V
• Virtual Iron
• Parallels Desktop
• …
Classification of Virtual Machine Monitors
• Binary Translation
  – VMware (1998)
• Hardware-Assisted Virtualization
  – VMware, Hyper-V, XenServer, Virtual Iron, …
• Para-virtualization
  – XenServer
Missing Features
• Optimize code
• Security
• Bug-tolerance
What are we doing
• A virtualization layer for x86, built from the ground up
  – Runs an unmodified OS
  – Can dynamically optimize code (binary translation)
  – Can specify security policies enforceable at instruction-level granularity
  – Can record and replay an execution
  – Can be installed on an existing OS
  – Transparent to the user
  – Simple
Traditional Picture
Layers, top to bottom:
Application 1 | Application 2
OS
Hardware
Virtualized Picture
Layers, top to bottom:
Application 1 | Application 2
OS
Optimizing VMM
Translation Blocks
• Divide code into “translation blocks”
  – A translation block ends if
    • a control-flow instruction is reached
    • or MAX_INSNS instructions have been translated
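The block-ending rule above can be sketched as follows. This is a minimal illustration, not the talk's implementation: the toy mnemonics in CONTROL_FLOW and the value MAX_INSNS = 4 are assumptions (the slides do not state the real limit), and a real translator would discover blocks dynamically from an entry address rather than from a linear list.

```python
MAX_INSNS = 4  # assumed value; the slides do not give the real limit
CONTROL_FLOW = {"jmp", "jcc", "call", "ret"}  # hypothetical toy mnemonics


def split_into_blocks(insns, max_insns=MAX_INSNS):
    """Split a linear instruction list into translation blocks."""
    blocks, current = [], []
    for insn in insns:
        current.append(insn)
        mnemonic = insn.split()[0]
        # A block ends at a control-flow instruction or at the size limit.
        if mnemonic in CONTROL_FLOW or len(current) == max_insns:
            blocks.append(current)
            current = []
    if current:  # trailing instructions that hit neither condition
        blocks.append(current)
    return blocks


code = ["mov r0, r1", "add r1, r2", "jcc L1",
        "ld M, r0", "ld M, r1", "mov r0, r2", "add r2, r3",
        "sub r3, r4", "ret"]
blocks = split_into_blocks(code)
```

Here the first block ends at the conditional jump, the second at the size limit, and the third at ret.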
A Simple Scheme
Original code fragment (x:) → Binary Translator → Translated code fragment (tx:)
Use a Cache
Lookup using x in the Translation Cache
• found: run the translated code fragment (tx:)
• not-found: Original code fragment (x:) → Binary Translator → Translated code fragment (tx:) → save in the Translation Cache
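A minimal sketch of the lookup/translate/save flow, assuming the cache is keyed by the original address x. The class name, the misses counter, and toy_translate are hypothetical names introduced for illustration only.

```python
class TranslationCache:
    """Maps an original code address x to its translated fragment tx."""

    def __init__(self, translate):
        self._translate = translate  # callable: original address -> translation
        self._cache = {}
        self.misses = 0

    def lookup(self, x):
        tx = self._cache.get(x)
        if tx is None:               # not-found: translate, then save
            self.misses += 1
            tx = self._translate(x)
            self._cache[x] = tx
        return tx                    # found: reuse the saved translation


def toy_translate(x):
    # Stand-in for the binary translator; returns a label, not real code.
    return f"T{x:#x}"


tc = TranslationCache(toy_translate)
first = tc.lookup(0x1000)
second = tc.lookup(0x1000)  # served from the cache, no retranslation
```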
Direct Jump Chaining
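Original code: a → b, a → c, b → d, c → d
Translated code: Ta → Tb, Ta → Tc, Tb → Td, Tc → Td
Exits before chaining: Ta ends in lookup(b) and lookup(c); Tb and Tc each end in lookup(d)
Chaining replaces each lookup(...) exit with a direct jump to the translated target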
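One way to picture chaining: each block exit goes through lookup() the first time it is taken and is then patched to point straight at the translated target. The sketch below models that with a per-block dictionary; TBlock, Translator, and follow are hypothetical names, and real chaining patches jump instructions in the translated code rather than updating a dictionary.

```python
class TBlock:
    def __init__(self, src, targets):
        self.src = src          # original address of this block
        self.targets = targets  # original addresses of its successors
        self.chained = {}       # target address -> TBlock, filled lazily


class Translator:
    def __init__(self, cfg):
        self.cfg = cfg          # original address -> list of successors
        self.cache = {}
        self.lookups = 0        # counts slow-path lookups

    def lookup(self, addr):
        self.lookups += 1
        if addr not in self.cache:
            self.cache[addr] = TBlock(addr, self.cfg[addr])
        return self.cache[addr]

    def follow(self, block, target):
        nxt = block.chained.get(target)
        if nxt is None:                   # exit not chained yet
            nxt = self.lookup(target)
            block.chained[target] = nxt   # "patch" the exit
        return nxt


cfg = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
t = Translator(cfg)
ta = t.lookup("a")
for _ in range(3):             # take the path a -> b -> d three times
    tb = t.follow(ta, "b")
    td = t.follow(tb, "d")
```

Only the first pass through a → b → d pays for lookups; later passes follow the chained exits.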
Indirect Jumps
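Original code: a ends in call f; f ends in ret, which returns to b
Translated code:
  Ta: push b; jmp Tf
  Tf: pop retaddr; lookup(retaddr) → Tb
Jump table fast path, in place of lookup(retaddr):
  tmp ← JTABLE[retaddr & MASK]
  if (tmp.src == retaddr) goto tmp.dst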
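The jump-table check can be sketched as a direct-mapped table indexed by the low bits of the return address. The table size, the fill-on-miss policy, and the slow_lookup helper are assumptions; the slide shows only the hit path.

```python
JTABLE_SIZE = 256            # assumed size; must be a power of two
MASK = JTABLE_SIZE - 1


class Entry:
    __slots__ = ("src", "dst")

    def __init__(self):
        self.src = None      # original target address
        self.dst = None      # translated target


JTABLE = [Entry() for _ in range(JTABLE_SIZE)]
stats = {"slow": 0}


def slow_lookup(retaddr):
    # Stand-in for the full translation-cache lookup.
    stats["slow"] += 1
    return f"T{retaddr:#x}"


def resolve_indirect(retaddr):
    tmp = JTABLE[retaddr & MASK]
    if tmp.src == retaddr:   # hit: go straight to the translated target
        return tmp.dst
    dst = slow_lookup(retaddr)          # miss: fall back (assumed policy)
    tmp.src, tmp.dst = retaddr, dst     # and overwrite the table slot
    return dst


r1 = resolve_indirect(0x4010)
r2 = resolve_indirect(0x4010)   # hits the jump table
r3 = resolve_indirect(0x4110)   # same slot as 0x4010, so it evicts that entry
```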
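[Figure: bar chart. Overhead (percentage of native; lower is better) for bubsort, emptyloop, fibo_iter, hanoi1, hanoi2, hanoi3 and printf; series: default, no jumptable; y-axis from -15 to 20. Data labels as extracted (bar assignment not recoverable): -0.8%, 0.3%, 0.3%, 12.9%, -3.1%, 9.1%, -9.0%, 5.3%, 13.0%, 14.5%, -1.7%, -5.9%.]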
[Figure: bar chart. Overhead (percentage of native; lower is better) for euclid, fibo_rec and erastothenes; series: default, no jumptable; y-axis from 0 to 400. Data labels as extracted: 156.7%, 114.3%, 37.4%; 36x, 46x, 11x.]
[Figure: Overheads for fibo_rec and printf, logarithmic scale; series: no chaining, no jumptable, default. Data labels as extracted: 1.1x, 45x, 710x; -9%, -6%, 58%.]
Effect of Maximum Size of Translation Block
[Figure: one panel per benchmark (bubsort, emptyloop, euclid, fibo_iter, fibo_rec, hanoi1, hanoi2, hanoi3, printf). x-axis: Max Size of Translation Block (1 to 23); y-axis: Overhead.]
Effect of Translation Cache Size
[Figure: one panel per benchmark (bubsort, emptyloop, euclid, fibo_iter, hanoi1, hanoi2, hanoi3, printf, erastothenes). x-axis: Number of 4k pages in Translation Cache (16 to 128); y-axis: Overhead; series: clock, random.]
Optimizations
• Peephole Optimizations
• Trace Optimizations
• Cross-layer optimizations
An Example
Original:
  ld M, r1
  ld M, r0
Optimized:
  ld M, r0
  mov r0, r1
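A peephole rule for this example might look like the sketch below. It assumes instructions are (opcode, source, destination) tuples, that M is ordinary memory (not memory-mapped I/O), and that the address does not depend on the first load's destination register. The rule and the function name are illustrative, not taken from the talk.

```python
def peephole(insns):
    """Rewrite 'ld A, X; ld A, Y' into 'ld A, Y; mov Y, X'."""
    out = []
    i = 0
    while i < len(insns):
        cur = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        if (nxt is not None and cur[0] == "ld" and nxt[0] == "ld"
                and cur[1] == nxt[1]):
            addr, first_dst, second_dst = cur[1], cur[2], nxt[2]
            # One memory access instead of two, plus a register copy.
            out.append(("ld", addr, second_dst))
            out.append(("mov", second_dst, first_dst))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


original = [("ld", "M", "r1"), ("ld", "M", "r0")]
optimized = peephole(original)
```

Both sequences leave r0 and r1 holding the value at M; the rewritten one reads memory once.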
Interrupts
Original:
  ld M, r1
  ld M, r0
Optimized:
  ld M, r0
  mov r0, r1
• Delay interrupt delivery until the end of the current translation
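The reason for the delay: between ld M, r0 and mov r0, r1 the registers match no state the original code could have produced, so an interrupt handler must not observe that point. A minimal sketch of deferring delivery to the block boundary follows; VCpu, run_block, and the irq_during parameter are hypothetical names used only to simulate an interrupt arriving mid-block.

```python
from collections import deque


class VCpu:
    def __init__(self):
        self.pending = deque()  # interrupts raised but not yet delivered
        self.trace = []         # order in which things happened

    def raise_interrupt(self, vector):
        self.pending.append(vector)

    def run_block(self, insns, irq_during=None):
        """Run one translated block.

        irq_during=(index, vector) simulates an interrupt that arrives
        just before the instruction at `index`.
        """
        for i, insn in enumerate(insns):
            if irq_during is not None and irq_during[0] == i:
                self.raise_interrupt(irq_during[1])  # queued, not delivered
            self.trace.append(insn)
        # Deliver only at the block boundary, where guest state is consistent.
        while self.pending:
            self.trace.append(f"deliver irq {self.pending.popleft()}")


cpu = VCpu()
cpu.run_block(["ld M, r0", "mov r0, r1"], irq_during=(1, 32))
```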
Precise Exceptions
Original code: ret
Translated code:
  ld (sp), t0
  add $4, sp
  …
  jmp t0
On a page fault in the translated code:
  rollback code: sub $4, sp; restore t0
  then the page fault handler
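The rollback idea can be sketched as follows: if a fault hits partway through the translated sequence, undo its partial effects so the guest's handler sees the state as it was before ret. The fault injection point, the saved copy of t0, and all names here are assumptions made for illustration; the slide does not say where t0 is saved.

```python
class PageFault(Exception):
    pass


def translated_ret(state, memory, fault_before_jmp=False):
    """Simulate the translation of `ret` with rollback on a page fault."""
    saved_t0 = state["t0"]                 # assumed: t0 is saved on entry
    try:
        state["t0"] = memory[state["sp"]]  # ld (sp), t0
        state["sp"] += 4                   # add $4, sp
        if fault_before_jmp:               # injected fault, for illustration
            raise PageFault()
        state["pc"] = state["t0"]          # jmp t0
    except PageFault:
        state["sp"] -= 4                   # rollback: sub $4, sp
        state["t0"] = saved_t0             # rollback: restore t0
        raise                              # hand over to the page fault handler


memory = {100: 0x2000}
state = {"sp": 100, "t0": 7, "pc": 0}
try:
    translated_ret(state, memory, fault_before_jmp=True)
except PageFault:
    faulted_state = dict(state)  # what the guest's handler would observe
translated_ret(state, memory)    # retry once the fault has been serviced
```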
Security: A Simple Scheme to Prevent Stack-Overflows
call is translated to:
  …
  push ra, shadow
  …
ret is translated to:
  …
  ra ← pop
  ra1 ← pop shadow
  if (ra != ra1) error
  …
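A minimal sketch of the shadow-stack check, assuming the VMM keeps a second stack the guest cannot write to. Guest, StackSmashError, and the list-based stacks are hypothetical stand-ins for the code the translator would emit.

```python
class StackSmashError(Exception):
    pass


class Guest:
    def __init__(self):
        self.stack = []   # guest-visible stack; guest code can corrupt it
        self.shadow = []  # shadow stack maintained by the VMM

    def call(self, ra):
        self.stack.append(ra)
        self.shadow.append(ra)   # push ra, shadow

    def ret(self):
        ra = self.stack.pop()    # ra <- pop
        ra1 = self.shadow.pop()  # ra1 <- pop shadow
        if ra != ra1:
            raise StackSmashError(
                f"return address {ra:#x} != shadow copy {ra1:#x}")
        return ra


g = Guest()
g.call(0x1000)
clean_return = g.ret()           # matches the shadow copy

g.call(0x2000)
g.stack[-1] = 0xDEADBEEF         # simulate an overflow overwriting ra
try:
    g.ret()
    detected = False
except StackSmashError:
    detected = True
```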
Record-Replay
• Record
  – Direct I/O (in instructions)
  – Interrupts
  – Memory-mapped I/O
• Can use this to tolerate certain classes of bugs
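The record/replay idea can be sketched by logging every nondeterministic input during recording and feeding the same values back during replay. This is a simplification under stated assumptions: Recorder, Replayer, and guest are hypothetical names, and a real system would also have to log when each interrupt arrives (for example, an instruction count), which this sketch omits.

```python
class Recorder:
    def __init__(self, source):
        self.source = source  # callable returning the next real input
        self.log = []

    def read(self, kind):
        value = self.source()
        self.log.append((kind, value))  # record every nondeterministic input
        return value


class Replayer:
    def __init__(self, log):
        self._log = iter(log)

    def read(self, kind):
        logged_kind, value = next(self._log)
        if logged_kind != kind:
            raise RuntimeError("replay diverged from the recording")
        return value


def guest(io):
    # Toy guest whose result depends only on its inputs.
    a = io.read("port_in")
    b = io.read("mmio")
    c = io.read("interrupt")
    return a * 100 + b * 10 + c


inputs = iter([3, 1, 4])
rec = Recorder(lambda: next(inputs))
recorded_result = guest(rec)
replayed_result = guest(Replayer(rec.log))  # no live inputs needed
```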
Slowdowns with Record/Replay
Program Slowdown
bubsort 216x
emptyloop 507x
euclid 320x
fibo_iter 282x
fibo_rec 309x
hanoi1 236x
hanoi2 182x
hanoi3 233x
printf 7x
Conclusions
• The virtualization layer is a good place to do many interesting things
• Can we make the virtual machine appear __________________ than the real machine?
  – faster
  – more secure
  – more reliable