Memory-efficient Virtual Machine High Availability
Karen Kai-Yuan Hou, Prof. Kang G. Shin (University of Michigan); Mustafa Uysal (VMware); Arif Merchant (HP Labs); Sharad Singhal (HP Labs)
![Page 1: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/1.jpg)
Memory-efficient Virtual Machine High Availability
Karen Kai-Yuan Hou, Prof. Kang G. Shin
University of Michigan
Mustafa Uysal (VMware), Arif Merchant (HP Labs), Sharad Singhal (HP Labs)
![Page 2: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/2.jpg)
Protect VM from Host Failures
• Set up backup by primary VM replication
• Backup takes over execution promptly if primary fails
• High memory cost: e.g., to protect a 1G VM, an additional 1G of memory is reserved just to hold the backup.
Figure: the primary VM (App 1, App 2) runs on the primary host's hypervisor, and a backup VM with the same applications sits on the backup host's hypervisor; the backup takes over on a physical host failure.
![Page 3: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/3.jpg)
Use a Shared Storage
• “Maintain” backup VM in storage instead of RAM
• Improve resource and energy efficiency. Recover anywhere.
Figure: the primary VM (App 1, App 2) runs on Host 1; its backup VM is kept in shared storage reachable from Host 1 through Host n, so the other hosts can run other primary (active) VMs and any of them can restore the backup.
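As a rough illustration of the memory argument above, the sketch below counts the RAM reserved purely to hold backups under each scheme. The function name and the VM sizes are hypothetical, and the model is deliberately simplified: it ignores any buffering the storage-based scheme may need and the RAM consumed once a backup is actually restored.

```python
def reserved_backup_ram_gib(vm_sizes_gib, backup_in_ram):
    """Return the RAM (GiB) set aside only to hold idle backup VMs.

    backup_in_ram=True  models replication to a standby VM in memory.
    backup_in_ram=False models keeping the fail-over image in shared storage.
    """
    return sum(vm_sizes_gib) if backup_in_ram else 0


# Hypothetical set of protected primary VMs, sizes in GiB.
protected_vms = [1, 2, 4]
ram_replication = reserved_backup_ram_gib(protected_vms, backup_in_ram=True)
ram_shared_storage = reserved_backup_ram_gib(protected_vms, backup_in_ram=False)
```

Under these assumptions, every GiB of protected VM memory costs one GiB of standby RAM with in-memory replication and none with a storage-resident image.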
![Page 4: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/4.jpg)
Protection: Tracking Primary VM State
• Take checkpoints of the primary VM
– Incremental, periodic, copy-on-write checkpoints
Figure: the memory space of the primary VM (App 1, App 2) is checkpointed into the VM fail-over image.
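A minimal sketch of incremental copy-on-write checkpointing, assuming memory is modeled as a list of pages and the fail-over image as a dictionary. The class and method names are illustrative and are not taken from the Xen prototype; a real hypervisor would typically detect writes through page protection rather than an explicit `write` call.

```python
class CheckpointedMemory:
    """Toy model of a VM memory space with incremental CoW checkpoints."""

    def __init__(self, num_pages):
        self.pages = [b""] * num_pages
        self.dirty = set()    # pages written since the last checkpoint began
        self.pending = set()  # pages captured by the in-flight checkpoint
        self.cow_copies = {}  # page number -> content as of checkpoint start

    def write(self, page_no, data):
        # Copy-on-write: preserve the checkpointed content the first time
        # a page is overwritten before it has been committed.
        if page_no in self.pending and page_no not in self.cow_copies:
            self.cow_copies[page_no] = self.pages[page_no]
        self.pages[page_no] = data
        self.dirty.add(page_no)

    def begin_checkpoint(self):
        # The only step that needs the VM paused: swap the dirty set.
        self.pending, self.dirty = self.dirty, set()
        self.cow_copies = {}

    def commit_checkpoint(self, image):
        # Incremental: only pages dirtied in the last interval are written.
        for page_no in sorted(self.pending):
            image[page_no] = self.cow_copies.get(page_no, self.pages[page_no])
        committed = len(self.pending)
        self.pending = set()
        self.cow_copies = {}
        return committed


mem = CheckpointedMemory(num_pages=4)
image = {}
mem.write(0, b"a")
mem.write(2, b"b")
mem.begin_checkpoint()
mem.write(2, b"c")  # the VM keeps running while the commit is outstanding
first_commit = mem.commit_checkpoint(image)
image_after_first = dict(image)
mem.begin_checkpoint()
second_commit = mem.commit_checkpoint(image)
```

The first commit stores page 2 as it was when the checkpoint began, even though the VM overwrote it afterwards; the next checkpoint picks up the newer content and writes only that one page.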
![Page 5: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/5.jpg)
Fail-over: Bringing Up Backup VM
• Slim VM Restore
– Load only necessary information and switch on backup VM quickly
– Fetch pages on-demand as the backup VM executes
Figure: the restored backup VM (App 1, App 2) fills its memory space from the VM fail-over image.
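The restore path can be sketched in the same toy model: load a small set of pages up front, start running, and fetch everything else on first access. The names `DemandPagedMemory`, `preload`, and `fetches` are assumptions made for illustration; the slides do not say which pages count as "necessary information".

```python
class DemandPagedMemory:
    """Toy model of a restored VM whose pages are fetched lazily."""

    def __init__(self, image, preload=()):
        self.image = image
        # Slim restore: only the preloaded pages are resident at start-up.
        self.resident = {page_no: image[page_no] for page_no in preload}
        self.fetches = 0  # number of on-demand reads from the image

    def read(self, page_no):
        if page_no not in self.resident:
            # First touch: fetch the page from the fail-over image.
            self.resident[page_no] = self.image[page_no]
            self.fetches += 1
        return self.resident[page_no]


fail_over_image = {0: b"kernel", 1: b"app1", 2: b"app2"}
backup = DemandPagedMemory(fail_over_image, preload=[0])
backup.read(0)  # already resident, no fetch
backup.read(2)  # fetched on demand
backup.read(2)  # now resident, no second fetch
```

Pages the backup VM never touches (page 1 here) are never read from storage, which is what keeps the initial restore small.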
![Page 6: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/6.jpg)
Improving I/O Efficiency with SSDs
• Small, random I/Os are more efficient on SSDs than on hard disks
Primary side: updating the VM fail-over image continuously (small, random writes).
Restore side: fetching from the VM fail-over image on-demand (small, random reads).
![Page 7: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/7.jpg)
Preliminary Evaluation
• Prototype built on Xen 3.3.2
• Questions
– How much overhead does continuous checkpointing introduce on the primary VM?
– How does the shared storage support continuous updating of the fail-over image?
– How quickly can our system bring up a backup VM?
– How does the backup VM perform when it executes by fetching pages on-demand?
![Page 8: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/8.jpg)
Checkpointing Overheads
Figure: two bar charts, Kernel Compilation and RUBiS, plotting overhead (%) against checkpoint interval (every 10s, every 5s, every 2s) for four configurations: HD; HD, COW; SSD; SSD, COW. The y-axes run from 0 to 40% and from 0 to 7%, in extraction order.
![Page 9: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/9.jpg)
CoW and SSD Enhancements
• CoW reduces VM pause time for taking checkpoints
Figure: pause time (ms, axis 0 to 150) at checkpoint intervals of every 10s, 5s, and 2s, without CoW versus with CoW.
• Checkpoints commit faster on an SSD
Figure: commit time (sec, axis 0 to 6) at checkpoint intervals of every 10s, 5s, and 2s, HD versus SSD.
![Page 10: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/10.jpg)
Fail-over Time and Demand Fetching
• Time required to bring up a backup VM
Figure: fail-over time (sec, axis 0 to 2) for Kernel Compilation, RUBiS, and Video Transcoding, HD versus SSD.
• Overheads of fetching VM pages on-demand
Figure: overhead (%, axis 0 to 15) for Kernel Compilation, RUBiS, and Video Transcoding, HD versus SSD.
![Page 11: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/11.jpg)
Interesting Observations: Page Fetching Behavior
• How a VM uses (demand fetches) its pages while compiling a kernel:
![Page 12: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/12.jpg)
Interesting Observations: Page Fetching Behavior
• What actually happens on disk (recorded by blktrace):
![Page 13: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/13.jpg)
Conclusions
Figure: summary diagram labeled "save" and "restore"; the extracted values are 113 ms, 10.1 ms, 10.1 ms; 20 s, 20 s, 20 s; 1.47 s; and 35 s.
![Page 14: Memory-efficient Virtual Machine High Availability](https://reader036.vdocuments.site/reader036/viewer/2022081008/568131a5550346895d98154e/html5/thumbnails/14.jpg)
• Thank you!