We are living in a New Virtualized World
Sorav Bansal, IIT Delhi
Feb 26, 2011
Old Virtualized World
IBM Mainframes (circa 1960)
IBM Mainframe
VMM
OS OS OS
App App App App App App
New Virtualized World
“Cloud-OS”
OS OS OS
App App App App App App
“Cloud-OS” (stuff that you have heard many times before… uh yawn…)
• Infrastructure Layer (slave) + Management layer (master)
• Divide hardware into resource pools
• Unit of abstraction = VM
• Efficient
• Effective Isolation
• Dynamic
• Fault-Tolerant
“Cloud-OS” (more exciting stuff)
• Dynamic Performance Optimizations
– Compiler Optimizations
– OS-level Optimizations
• Providing Determinism
– Efficient Para-virtual Record/Replay
• Improving Reliability
– Micro-Replays
“Cloud-OS” (more exciting stuff)
• Security
– VMM-level security checks
• Efficient Thin Clients
– Remote Desktopping using VM Record/Replay
Performance Optimizations
• Dynamic Binary Translation (Compiler Optimizations)
– Translation Blocks
– Direct Jump Chaining
– Peephole Optimizations
– Trace Optimizations
– Exception rollbacks
– Interrupt delays
Performance Optimizations
• Dynamic Binary Translation (OS-level Optimizations)
– Eliminate traps from system calls
– Better TLB/cache locality by using dedicated OS cores
Traditional Picture
Application 1 | Application 2
OS
Hardware
Virtualized Picture
Application 1 | Application 2
OS
Optimizing VMM
Some Initial Results
[Figure: bar chart of overhead (percentage of native; lower is better) for bubsort, emptyloop, fibo_iter, hanoi1, hanoi2, hanoi3, and printf. Y-axis range: -15 to 15. Data labels recovered: -0.8%, 0.3%, 0.3%, 12.9%, -3.1%, 9.1%.]
Providing Determinism: Record/Replay
• Uniprocessor
– Non-determinism is quite low. Can be efficiently recorded.
• Multiprocessor
– Non-determinism high due to shared memory.
– Recording overhead scales poorly with multiple processors
– Assuming we can patch the guest in some way, can we improve this situation?
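A minimal sketch of the uniprocessor case, assuming the only non-determinism is an asynchronous event (a hypothetical timer interrupt) that can be identified by the instruction count at which it arrives. The toy guest and the names `run`, `record`, and `replay` are illustrative assumptions, not code from the talk.

```python
import random

MOD = 1_000_003  # arbitrary modulus for the toy guest state


def run(n_steps, interrupt_source):
    """Run a toy guest for n_steps 'instructions'.

    interrupt_source(step) returns a value if an interrupt fires before
    this step, else None. Returns the final guest state.
    """
    state = 0
    for step in range(n_steps):
        event = interrupt_source(step)
        if event is not None:
            state = (state * 31 + event) % MOD   # toy interrupt handler
        state = (state + step) % MOD             # ordinary deterministic work
    return state


def record(n_steps, seed):
    """Run with a non-deterministic source and log (step, value) per event."""
    rng = random.Random(seed)
    log = []

    def source(step):
        if rng.random() < 0.01:                  # rare asynchronous event
            value = rng.randrange(256)
            log.append((step, value))
            return value
        return None

    return run(n_steps, source), log


def replay(n_steps, log):
    """Re-run, injecting each logged event at its recorded instruction count."""
    events = dict(log)
    return run(n_steps, events.get)
```

The log holds one entry per event rather than one per instruction, which is why recording stays cheap when non-determinism is low.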
Micro-Replays
Execution timeline:
• Snapshot
• Recording non-determinism
• Hit a bug (e.g., assertion failure)
• Replay: choose a rollback point. Also guess the bug-inducing non-deterministic choice
• Potentially bug-free execution
Tolerating Non-deterministic Bugs using Record/Replay

    /* shared vars */
    debit = 0;
    credit = total;

    void transfer(void) {
      for (i = 0; i < 1000; i++) {
        /* unprotected critical section */
        debit--;
        credit++;
        assert(debit + credit == total);
      }
    }

    for (t = 0; t < max_threads; t++) {
      thread_create(transfer);
    }

• VMM records an execution
• On assert failure, the VMM interposes and rolls back the execution a few milliseconds
• VMM guesses the non-deterministic choices that could have caused the failure (e.g., instruction at timer interrupt)
• VMM replays the execution avoiding the previous non-deterministic choices
• In this example, VMM infers the critical section after a few runs and avoids interrupting it
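The steps above can be sketched as a toy simulation. This is an illustration under loud assumptions: two threads, a three-instruction loop body (PC 0 is `debit--`, PC 1 is `credit++`, PC 2 is the assert), each instruction atomic, and rollback all the way to the start of the run rather than a few milliseconds. The functions `execute` and `micro_replay` are hypothetical names, not the VMM's interface.

```python
def execute(interrupt_steps, avoid_pcs, iters=50, total=100):
    """Deterministically run two threads under a given interrupt schedule.

    Returns None if every assert passes, otherwise the PC at which the most
    recent preemption happened (the VMM's guess for the culprit).
    """
    debit, credit = 0, total                 # shared variables
    pc, done = [0, 0], [0, 0]                # per-thread PC and iteration count
    cur, pending, last_preempt_pc = 0, False, None
    interrupts = set(interrupt_steps)
    step = 0
    while done[0] < iters or done[1] < iters:
        if step in interrupts:
            pending = True
        # Deliver the timer interrupt unless the VMM learned to avoid this PC.
        if pending and pc[cur] not in avoid_pcs:
            last_preempt_pc, cur, pending = pc[cur], 1 - cur, False
        if done[cur] >= iters:               # thread finished: run the other
            cur = 1 - cur
        if pc[cur] == 0:
            debit -= 1
        elif pc[cur] == 1:
            credit += 1
        elif debit + credit != total:        # the assert at PC 2 fails
            return last_preempt_pc
        else:
            done[cur] += 1
        pc[cur] = (pc[cur] + 1) % 3
        step += 1
    return None


def micro_replay(interrupt_steps):
    """Replay until the run passes, avoiding one more guessed PC each time."""
    avoid = set()
    for replays in range(4):                 # the loop body has only 3 PCs
        bad_pc = execute(interrupt_steps, avoid)
        if bad_pc is None:
            return avoid, replays
        avoid.add(bad_pc)
    raise RuntimeError("no bug-free schedule found")
```

With a timer that first fires between `debit--` and `credit++`, one replay is enough in this toy: the VMM learns to delay interrupts that arrive at PC 1, which is exactly the inside of the unprotected critical section.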
Number of Replays Required?
[Figure: number of micro-replays required (y-axis, 0 to 35) versus critical section size in bytes (x-axis, 360 to 3600).]
Technical Report: Micro-Replays: Improving Reliability in Presence of Non-deterministic Software Bugs
http://www.cse.iitd.ac.in/~sbansal/pubs/micro_replays.pdf
Security Example: A Simple Scheme to Prevent Stack-Overflows
call: … push ra, shadow …
ret: … ra = pop; ra1 = pop shadow; if (ra != ra1) error …
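The call/ret instrumentation above can be sketched as follows. This is a toy model, assuming the guest stack is a list that guest code may overwrite while the shadow stack is private to the inserted instrumentation. The class and exception names are hypothetical.

```python
class StackSmashed(Exception):
    """Raised when the return address disagrees with its shadow copy."""


class ShadowStackVM:
    def __init__(self):
        self.stack = []    # guest-visible stack (can be corrupted)
        self.shadow = []   # shadow stack maintained by the instrumentation

    def call(self, return_address):
        self.stack.append(return_address)    # what the original call does
        self.shadow.append(return_address)   # inserted: push ra, shadow

    def ret(self):
        ra = self.stack.pop()                # ra  = pop
        ra1 = self.shadow.pop()              # ra1 = pop shadow
        if ra != ra1:
            raise StackSmashed(f"return address {ra:#x} != shadow {ra1:#x}")
        return ra
```

A buffer overflow that rewrites the return address on the guest stack is caught at the matching `ret`, because the shadow copy still holds the original value.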
Remote Desktop Using Streaming VM Record/Replay
Typical Remote Desktop
Remote Desktop Using Streaming VM Record/Replay
Record
Replay
Remote Desktop using Streaming VM Rec/Rep
Bandwidth Comparison
Cumulative Data Transfer as function of time
Steady-state Bandwidth Comparison
[Figure: y-axis is Rate (MiB/s).]
Steady State Bandwidth Requirement
Conclusions
• We are living in a new virtualized world
– Many implications in different application areas
Backup Slides
Translation Blocks
• Divide code into “translation blocks”
– A translation block ends if
  • Reach a control-flow instruction
  • Or, MAX_INSNS instructions have been translated
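A minimal sketch of the block-splitting rule, assuming instructions are plain strings whose first word is the opcode. The value of `MAX_INSNS` and the set of control-flow opcodes are assumptions for illustration. A real translator would typically discover blocks lazily from an entry address rather than splitting a whole listing up front.

```python
MAX_INSNS = 4                                  # assumed limit
CONTROL_FLOW = {"jmp", "jcc", "call", "ret"}   # assumed opcode names


def split_into_blocks(insns):
    """Split a straight-line listing into translation blocks."""
    blocks, current = [], []
    for insn in insns:
        current.append(insn)
        opcode = insn.split()[0]
        # A block ends at a control-flow instruction or at the size limit.
        if opcode in CONTROL_FLOW or len(current) == MAX_INSNS:
            blocks.append(current)
            current = []
    if current:                                # trailing partial block
        blocks.append(current)
    return blocks
```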
A Simple Scheme
x: Original code fragment → Binary Translator → tx: Translated code fragment
Use a Cache
x: Original code fragment → lookup using x in the Translation Cache
– found: run tx, the translated code fragment
– not-found: Binary Translator produces tx and saves it in the Translation Cache
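The lookup/save flow can be sketched as a dictionary keyed by the original address `x`. This is a simplified illustration: the cache here is unbounded, and `translate` stands in for the binary translator (a hypothetical callable supplied by the caller).

```python
class TranslationCache:
    def __init__(self, translate):
        self.translate = translate   # stand-in for the binary translator
        self.cache = {}              # x -> tx
        self.misses = 0

    def lookup(self, x):
        tx = self.cache.get(x)
        if tx is None:               # not-found: translate and save
            self.misses += 1
            tx = self.translate(x)
            self.cache[x] = tx
        return tx                    # found: reuse the earlier translation
```

Hot code is then translated once and reused on every later visit.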
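Direct Jump Chaining
Original blocks: a branches to b and c; b and c both continue to d
Translated blocks: Ta, Tb, Tc, Td
– Ta exits: lookup(b), lookup(c)
– Tb exit: lookup(d)
– Tc exit: lookup(d)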
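A sketch of the chaining idea, assuming each translated block remembers which of its exits have already been resolved. The first time an exit is taken it goes through `lookup`; afterwards the exit is patched to point straight at the target block. The classes and the `lookups` counter are illustrative assumptions.

```python
class TBlock:
    def __init__(self, pc, successors):
        self.pc = pc
        self.successors = successors
        self.chained = {}            # successor pc -> TBlock, filled lazily


class Translator:
    def __init__(self, cfg):
        self.cfg = cfg               # pc -> list of successor pcs
        self.cache = {}
        self.lookups = 0             # how often the slow lookup path runs

    def lookup(self, pc):
        self.lookups += 1
        if pc not in self.cache:
            self.cache[pc] = TBlock(pc, self.cfg[pc])
        return self.cache[pc]

    def run(self, path):
        """Follow a path of pcs, patching each exit the first time it is taken."""
        block = self.lookup(path[0])
        for nxt in path[1:]:
            assert nxt in block.successors
            if nxt not in block.chained:
                block.chained[nxt] = self.lookup(nxt)   # patch the exit
            block = block.chained[nxt]                  # direct jump
        return block.pc
```

On a second pass over the same path only the entry lookup remains; every block-to-block transfer is a direct jump.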
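Indirect Jumps
Original code: a calls f; f returns to b
Translated code: Ta, Tf, Tb
– call in Ta becomes: push b; jmp Tf
– ret in Tf becomes: pop retaddr; lookup(retaddr)
– Faster ret using a jump table: tmp = JTABLE[retaddr & MASK]; if (tmp.src == retaddr) goto tmp.dst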
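The jump-table check can be sketched as a small direct-mapped table in front of the full lookup. The table size (`MASK`), the refill policy on a miss, and the `slow_lookup` callable are assumptions for illustration.

```python
MASK = 0xF   # assumed: a 16-entry table


class JumpTable:
    def __init__(self, slow_lookup):
        self.entries = [None] * (MASK + 1)   # each entry is (src, dst)
        self.slow_lookup = slow_lookup       # full translation-cache lookup
        self.slow_path_hits = 0

    def resolve(self, retaddr):
        tmp = self.entries[retaddr & MASK]
        if tmp is not None and tmp[0] == retaddr:
            return tmp[1]                    # fast path: goto tmp.dst
        # Miss or conflicting entry: fall back and refill the slot.
        self.slow_path_hits += 1
        dst = self.slow_lookup(retaddr)
        self.entries[retaddr & MASK] = (retaddr, dst)
        return dst
```

The `tmp.src == retaddr` comparison matters because two return addresses can share a slot; without it a conflicting entry would send control to the wrong translated block.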
[Figure: bar chart of overhead (percentage of native; lower is better) for euclid, fibo_rec, and erastothenes, comparing two configurations: default and no jumptable. Y-axis range: 0 to 400. Data labels recovered: 156.7%, 114.3%, 37.4% and 36x, 46x, 11x.]
Overheads
[Figure: overheads on a logarithmic scale for fibo_rec and printf under three configurations: no chaining, no jumptable, default. Data labels recovered: 1.1x, 45x, 710x and -9%, -6%, 58%.]
Effect of Maximum Size of Translation Block
[Figure: one panel per benchmark (bubsort, emptyloop, euclid, fibo_iter, fibo_rec, hanoi1, hanoi2, hanoi3, printf) plotting overhead (y-axis) against the maximum size of a translation block (x-axis, 1 to 23).]
Effect of Translation Cache Size
[Figure: one panel per benchmark (bubsort, emptyloop, euclid, fibo_iter, hanoi1, hanoi2, hanoi3, printf, erastothenes) plotting overhead (y-axis) against the number of 4k pages in the translation cache (x-axis, 16 to 128), with two series: clock and random.]
Optimizations
• Peephole Optimizations
• Trace Optimizations
• Cross-layer optimizations
An Example
ld M, r1; ld M, r0 → ld M, r0; mov r0, r1
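The rewrite in this example can be sketched as a pattern match over adjacent instructions: two loads from the same memory operand become one load plus a register move. The textual instruction format and the regular expression are assumptions. The sketch also assumes the memory operand is ordinary memory (not a device register) and skips the rewrite when the operand mentions either destination register.

```python
import re

LD = re.compile(r"ld\s+(\S+),\s*(\w+)$")   # matches "ld <mem>, <reg>"


def peephole(insns):
    """Replace 'ld M, ra; ld M, rb' with 'ld M, rb; mov rb, ra'."""
    out, i = [], 0
    while i < len(insns):
        first = LD.match(insns[i])
        second = LD.match(insns[i + 1]) if i + 1 < len(insns) else None
        if (first and second
                and first.group(1) == second.group(1)       # same memory operand
                and first.group(2) != second.group(2)       # different registers
                and first.group(2) not in first.group(1)    # address unaffected
                and second.group(2) not in first.group(1)):
            mem, ra, rb = first.group(1), first.group(2), second.group(2)
            out += [f"ld {mem}, {rb}", f"mov {rb}, {ra}"]
            i += 2
        else:
            out.append(insns[i])
            i += 1
    return out
```

The rewritten pair touches memory once instead of twice while leaving both registers with the same final values.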
Interrupts
ld M, r1; ld M, r0 → ld M, r0; mov r0, r1
Delay interrupt delivery till the end of the current translation
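A sketch of delayed delivery, assuming the VMM marks an interrupt as pending when it arrives and only runs the guest's handler at a translation boundary. A plausible reason for the delay is that, after a rewrite like the one above, the state in the middle of a translation need not match any boundary between original instructions. The function name and the trace format are illustrative assumptions.

```python
def run_blocks(blocks, interrupt_at):
    """Execute translated blocks, delivering interrupts only between blocks.

    blocks: list of lists of instruction strings.
    interrupt_at: set of global instruction indices at which a (hypothetical)
    interrupt arrives. Returns the trace of what ran, in order.
    """
    trace, pending, index = [], 0, 0
    for block in blocks:
        for insn in block:
            if index in interrupt_at:
                pending += 1                 # note it, but do not deliver yet
            trace.append(insn)
            index += 1
        while pending:                       # end of the current translation
            trace.append("<interrupt handler>")
            pending -= 1
    return trace
```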
Precise Exceptions
ret → ld (sp), t0; add $4, sp; … jmp t0
Page fault → rollback code (sub $4, sp; restore t0) → page fault handler