Integrating Paravirtualization into the Linux Kernel · 2006. 9. 11.
![Page 1: Integrating Paravirtualization into the Linux Kernel · 2006. 9. 11. · Integrating Paravirtualization into the Linux Kernel Rusty Russell IBM Linux Technology Center (OzLabs)](https://reader033.vdocuments.site/reader033/viewer/2022051808/600c53cce044ba17d972a4a5/html5/thumbnails/1.jpg)
Integrating Paravirtualization into the Linux Kernel
Rusty Russell
IBM Linux Technology Center (OzLabs)
Contents
• Introduction to Paravirtualization
• The Choices: Xen, VMI, Native...
• The Solution
• The Implementation
• Future Work
Introduction to Paravirt.
• Operating System on normal hardware:
Introduction to Paravirt.
• Operating System “virtualized” under hypervisor:
Introduction to Paravirt.
• Full virtualization is easy (for Linux!)
Introduction to Paravirt.
• Paravirtualization is more efficient because Operating System cooperates:
The Choices
• Various x86 hypervisors exist for Linux:
  – Xen
    • Xen-specific Linux modifications
  – VMWare
    • “VMI” proposed generic interface
  – Native hardware
    • Linux must also support this!
The Choices
• Various x86 hypervisors exist for Linux: Xen, VMI, Native
The Choices: Xen Patches
• Chris Wright's CONFIG_XEN patches:
The Choices: VMI Patches
• Zach Amsden's VMI patches:
The Solution
• If we can't agree on a single ABI for all hypervisors, at least we can have a Linux API:
The Solution: Paravirt Ops
• I expect you all to write hypervisors!
The Implementation
• “struct paravirt_ops”
  – Function pointers for every sensitive instruction.
  – Similar to PowerPC's “ppc_md”
• Each hypervisor replaces these with the operations for that interface.
  – Designed so that porting Linux to a new hypervisor is as easy as writing a new Linux driver.
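A minimal user-space sketch of the function-pointer idea on this slide. Everything here is an assumption for illustration: `pv_ops_sketch`, the `hv_*` backend, and the simulated flag variables are hypothetical names, not the kernel's actual `struct paravirt_ops` or any real hypervisor's code. The hypothetical backend masks events through a byte in memory instead of touching the interrupt flag.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for struct paravirt_ops (not the kernel layout). */
struct pv_ops_sketch {
    unsigned long (*save_fl)(void);
    void (*restore_fl)(unsigned long flags);
    void (*irq_disable)(void);
    void (*irq_enable)(void);
};

/* Simulated machine state so the sketch can run in user space. */
static unsigned long sim_if = 1;          /* stands in for the EFLAGS interrupt flag */
static unsigned char sim_event_mask = 0;  /* stands in for a hypervisor mask byte */

/* Native backend: on real hardware these would be pushf/popf/cli/sti. */
static unsigned long native_save_fl(void)      { return sim_if; }
static void native_restore_fl(unsigned long f) { sim_if = (f != 0); }
static void native_irq_disable(void)           { sim_if = 0; }
static void native_irq_enable(void)            { sim_if = 1; }

/* Hypothetical hypervisor backend: masks events in memory instead. */
static unsigned long hv_save_fl(void)          { return !sim_event_mask; }
static void hv_restore_fl(unsigned long f)     { sim_event_mask = (f == 0); }
static void hv_irq_disable(void)               { sim_event_mask = 1; }
static void hv_irq_enable(void)                { sim_event_mask = 0; }

/* Starts out native; a hypervisor port overwrites the pointers at boot. */
struct pv_ops_sketch pv_ops = {
    native_save_fl, native_restore_fl, native_irq_disable, native_irq_enable
};

void use_hypervisor_backend(void)
{
    pv_ops.save_fl = hv_save_fl;
    pv_ops.restore_fl = hv_restore_fl;
    pv_ops.irq_disable = hv_irq_disable;
    pv_ops.irq_enable = hv_irq_enable;
}

/* Generic code: a critical section that never names a backend.
 * Returns the flag value observed while "interrupts" are off. */
unsigned long critical_section(void)
{
    unsigned long flags, inside;

    assert(pv_ops.save_fl && pv_ops.restore_fl && pv_ops.irq_disable);
    flags = pv_ops.save_fl();
    pv_ops.irq_disable();
    inside = pv_ops.save_fl();
    pv_ops.restore_fl(flags);
    return inside;
}
```

The caller in `critical_section()` is identical for both backends; only the table contents change, which is the property the slide relies on.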
The Implementation
• struct paravirt_ops contains function pointers, e.g.:
  – unsigned long (fastcall *save_fl)(void);
    void (fastcall *restore_fl)(unsigned long);
    void (fastcall *irq_disable)(void);
    void (fastcall *irq_enable)(void);
  – Currently 75 functions
    • Not all hypervisors need all of them!
The Implementation
• But indirect function calls can be slow:
  – up to 20% slowdown on microbenchmarks (lmbench) on 3GHz P4.
• We can regain this speed by using runtime patching of call sites.
  – Only need to patch interrupt operations
    • These are well over 90% of operations
  – We already do something similar for SMP kernels on single processor systems.
The Implementation
• “struct paravirt_ops” provides a method for patching:
  – Returns length of patch, so remainder can be padded with NOPs.
  – unsigned (*patch)(u8 type, u16 clobber, void *firstinsn, unsigned len);
    • type describes which operation
    • clobber indicates what registers you can use
    • firstinsn is pointer to instructions
    • len is length of instructions
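The caller's side of this contract could be sketched as follows. This is a user-space illustration under stated assumptions: `patch_site_sketch`, `apply_patches_sketch`, and `toy_patch` are hypothetical names, and the way the kernel actually records its call sites is not shown on the slide. The only parts taken from the slide are the hook's signature and the rule that whatever the hook does not use is filled with NOPs.

```c
#include <assert.h>
#include <string.h>

typedef unsigned char u8;
typedef unsigned short u16;

#define NOP_OPCODE 0x90  /* single-byte x86 NOP */

/* Same shape as the patch hook on the slide. */
typedef unsigned (*patch_fn)(u8 type, u16 clobbers, void *firstinsn, unsigned len);

/* Hypothetical record describing one patchable call site. */
struct patch_site_sketch {
    u8 *instr;     /* first byte of the call sequence */
    u8 type;       /* which operation this site performs */
    u8 len;        /* bytes available at the site */
    u16 clobbers;  /* registers the replacement may use */
};

/* Let the backend rewrite each site, then pad the unused tail with NOPs. */
void apply_patches_sketch(struct patch_site_sketch *sites, unsigned nsites,
                          patch_fn patch)
{
    unsigned i;

    for (i = 0; i < nsites; i++) {
        unsigned used = patch(sites[i].type, sites[i].clobbers,
                              sites[i].instr, sites[i].len);
        assert(used <= sites[i].len);  /* the hook must not overrun the site */
        memset(sites[i].instr + used, NOP_OPCODE, sites[i].len - used);
    }
}

/* Toy backend: type 0 becomes a one-byte cli (0xFA); other types are left
 * alone by returning len, which means "nothing to pad". */
unsigned toy_patch(u8 type, u16 clobbers, void *firstinsn, unsigned len)
{
    (void)clobbers;
    if (type != 0 || len < 1)
        return len;
    *(u8 *)firstinsn = 0xFA;
    return 1;
}
```

Returning `len` unchanged is how a backend declines to patch a site: the padding length becomes zero and the original indirect call stays in place.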
The Implementation
• Unpatched code looks like this:
  – asm volatile("pushl %%ecx; pushl %%edx;"
                 "call *%0;"
                 "popl %%edx; popl %%ecx"
                 : : "m" (paravirt_ops.irq_disable)
                 : "memory", "eax", "cc");
• This calls paravirt_ops.irq_disable().
• It is allowed to use the %eax register.
• It is 10 bytes long:
  – 2 push (2x1 bytes)
  – 1 indirect call (6 bytes)
  – 2 pops (2x1 bytes)
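The 10-byte figure can be checked against the IA-32 encodings. The sketch below assumes the `"m"` operand resolves to an absolute 32-bit address (as in a non-relocatable 32-bit kernel image), so the call is encoded as `FF 15` plus a 4-byte address; the table and helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One instruction of the unpatched call site and its encoded size. */
struct insn_size {
    const char *text;
    size_t bytes;
};

static const struct insn_size call_site[] = {
    { "pushl %ecx",   1 },  /* opcode 0x51 */
    { "pushl %edx",   1 },  /* opcode 0x52 */
    { "call *disp32", 6 },  /* 0xFF 0x15 + 4-byte absolute address (assumed) */
    { "popl %edx",    1 },  /* opcode 0x5A */
    { "popl %ecx",    1 },  /* opcode 0x59 */
};

/* Total size of the call site in bytes. */
size_t call_site_bytes(void)
{
    size_t i, total = 0;

    for (i = 0; i < sizeof(call_site) / sizeof(call_site[0]); i++)
        total += call_site[i].bytes;
    return total;
}

/* NOP bytes left over once a replacement of the given size is patched in. */
size_t nop_padding_for(size_t replacement_bytes)
{
    assert(replacement_bytes <= call_site_bytes());
    return call_site_bytes() - replacement_bytes;
}
```

A one-byte native replacement such as `cli` therefore leaves nine bytes of the site to be filled with NOPs.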
The Implementation
• Example: (native)
#define DEF_NATIVE(name, code) \
    extern const char start_##name[], end_##name[]; \
    asm("start_" #name ": " code "; end_" #name ":")
DEF_NATIVE(cli, "cli");
DEF_NATIVE(sti, "sti");
DEF_NATIVE(popf, "push %eax; popf");
DEF_NATIVE(pushf, "pushf; pop %eax");
DEF_NATIVE(pushf_cli, "pushf; pop %eax; cli");
DEF_NATIVE(iret, "iret");
DEF_NATIVE(sti_sysexit, "sti; sysexit");
static const struct native_insns
{
    const char *start, *end;
} native_insns[] = {
    [PARAVIRT_IRQ_DISABLE] = { start_cli, end_cli },
    [PARAVIRT_IRQ_ENABLE] = { start_sti, end_sti },
    ...
};
The Implementation
• Example: (native)
unsigned native_patch(u8 type, u16 clobbers, void *insns, unsigned len)
{
    unsigned int insn_len;

    /* Don't touch it if we don't have a replacement */
    if (type >= ARRAY_SIZE(native_insns) || !native_insns[type].start)
        return len;

    insn_len = native_insns[type].end - native_insns[type].start;

    if (len < insn_len)
        return len;

    memcpy(insns, native_insns[type].start, insn_len);
    return insn_len;
}
The Implementation
• Example: (Xen)
unsigned native_patch(u8 type, u16 clobbers, void *insns, unsigned len)
{
    return len;
}
The Implementation
• Example: (Xen)
  – Why no patching?
  – Their current local_irq_disable():
    vcpu_info_t *_vcpu;
    preempt_disable();
    _vcpu = &HYPERVISOR_shared_info->vcpu_info[smp_processor_id()];
    _vcpu->evtchn_upcall_mask = 1;
    preempt_enable_no_resched();
The Implementation
• Example: (Xen)
  – If we use %gs to point to per-cpu data, we could do this in one instruction:
    • movb $1, %gs:offset
    • 8 bytes, no registers needed!
  – Kernel does not use %gs register currently.
  – Jeremy Fitzhardinge has patches to change this
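The 8-byte size follows from the 32-bit encoding of a byte store through a segment override. The sketch below builds that encoding in a buffer; `encode_movb_gs` is a hypothetical helper for illustration, and the displacement passed to it is a placeholder, since the slide does not give the real per-cpu offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emit "movb $imm, %gs:disp32" (32-bit mode) into buf.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t encode_movb_gs(uint8_t *buf, size_t cap, uint32_t disp, uint8_t imm)
{
    const size_t need = 8;

    assert(buf != NULL);
    if (cap < need)
        return 0;

    buf[0] = 0x65;                    /* GS segment-override prefix */
    buf[1] = 0xC6;                    /* MOV r/m8, imm8 */
    buf[2] = 0x05;                    /* ModRM: mod=00, reg=/0, rm=101 (disp32) */
    buf[3] = (uint8_t)(disp);         /* displacement, little-endian */
    buf[4] = (uint8_t)(disp >> 8);
    buf[5] = (uint8_t)(disp >> 16);
    buf[6] = (uint8_t)(disp >> 24);
    buf[7] = imm;                     /* the value stored, e.g. 1 to mask */
    return need;
}
```

At 8 bytes the instruction would fit inside the 10-byte call site from the earlier slide, leaving 2 bytes for NOP padding.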
The Implementation
• We also need a way to recognize which hypervisor Linux is booting under.
  – We branch early in vmlinux entry if we are not in ring0.
  – We then call “probe” functions for each hypervisor type we support.
    • Xen uses its own entry point at the moment, but branches to same probing routine.
    • probe routine populates paravirt_ops struct.
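The probing flow described here might look roughly like the following user-space sketch. All names are hypothetical (`boot_info_sketch`, `probe_hv_a`, `probe_hv_b`, `choose_backend`), and matching on a signature string is an assumption made purely for illustration; the slide does not say how each hypervisor is actually recognized.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down ops table: a name plus one operation. */
struct pv_ops_sketch {
    const char *name;
    void (*irq_disable)(void);
};

static void native_irq_disable(void) { }
static void hv_a_irq_disable(void) { }
static void hv_b_irq_disable(void) { }

/* Defaults to native until a probe routine claims the machine. */
struct pv_ops_sketch pv_ops = { "native", native_irq_disable };

/* Hypothetical summary of what early boot code can observe. */
struct boot_info_sketch {
    int ring;               /* privilege ring the kernel was entered in */
    const char *signature;  /* assumed hypervisor identification string */
};

/* Each probe returns 1 and populates pv_ops if it recognizes its hypervisor. */
static int probe_hv_a(const struct boot_info_sketch *bi)
{
    if (bi->signature == NULL || strcmp(bi->signature, "hv_a") != 0)
        return 0;
    pv_ops.name = "hv_a";
    pv_ops.irq_disable = hv_a_irq_disable;
    return 1;
}

static int probe_hv_b(const struct boot_info_sketch *bi)
{
    if (bi->signature == NULL || strcmp(bi->signature, "hv_b") != 0)
        return 0;
    pv_ops.name = "hv_b";
    pv_ops.irq_disable = hv_b_irq_disable;
    return 1;
}

typedef int (*probe_fn)(const struct boot_info_sketch *bi);
static const probe_fn probes[] = { probe_hv_a, probe_hv_b };

/* Returns the name of the selected backend, or NULL if none matched. */
const char *choose_backend(const struct boot_info_sketch *bi)
{
    size_t i;

    assert(bi != NULL);
    if (bi->ring == 0)
        return pv_ops.name;  /* ring 0: native hardware, keep the defaults */
    for (i = 0; i < sizeof(probes) / sizeof(probes[0]); i++)
        if (probes[i](bi))
            return pv_ops.name;
    return NULL;             /* not in ring 0 and no probe recognized it */
}
```

Adding support for another hypervisor in this arrangement means adding one probe routine to the table, which matches the "as easy as a driver" goal stated earlier in the deck.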
Future Work
• Kernel inclusion in 2.6.20?
• I expect many more hypervisors to be implemented in the future.
  – Linux will be the simplest “real” operating system to port to such hypervisors.
• Some work is being done on an in-tree example hypervisor (“lhype”)
  – Should lead to better understanding of hypervisor issues by Linux Kernel community.
References
• Patches can be found here:
  – http://ozlabs.org/~rusty/paravirt
• Main contributors (so far!):
  – Zach Amsden (VMWare)
  – Jeremy Fitzhardinge (XenSource)
  – Chris Wright (RedHat)
  – Rusty Russell (IBM)
Legal Statement
This work represents the views of the author(s) and does not necessarily reflect the views of IBM Corporation.
The following terms are trademarks or registered trademarks of International Business Machines Corporation in the United States and/or other countries: IBM (logo). A full list of U.S. trademarks owned by IBM may be found at http://www.ibm.com/legal/copytrade.shtml.
Linux is a registered trademark of Linus Torvalds. Other company, product, and service names may be trademarks or service marks of others.