TRANSCRIPT
Intel® Virtualization Technology
Xen Architecture
Lv Zheng
2
Objectives
Coverage:
- Xen’s paravirtualization architecture
- Xen’s hardware virtual machine framework
- Intel® Virtualization Technology for IA-32 (VT-x)
- Intel® Virtualization Technology for Directed I/O (VT-d)
Limitations:
- Does not cover PAE & EM64T
Requirements:
- IA-32 basic architecture: memory management, interrupts & exceptions, task management, input & output
- Quick calculation of 2^N
3
Agenda
Xen’s Architecture Terminology & Concepts Bootstrap Process Physical Memory Management Virtual Memory Management Domains Virtualization
4
Address Terminology
vaddr, virt (virtual address): translated virtual addresses in the Xen heap; the addressing Xen uses.
maddr, phys (machine address): real host machine address; the addresses the processor understands.
paddr (physical address): a catch-all for any kind of physical address. "Physical" here can mean guest-physical, machine-physical or guest-machine-physical.
mfn (machine frame number): frame number corresponding to a maddr.
pfn (physical frame number): frame number corresponding to a paddr.
gpfn (guest pseudo frame number): guests run in an illusory contiguous physical address space, which is probably not contiguous in the machine address space.
gmfn (guest machine frame number): equivalent to GPFN for an auto-translated guest, and equivalent to MFN for normal paravirtualised guests. It represents what the guest thinks are MFNs.
Conversions between these (shown as a diagram in the slide):
- virt <-> maddr: __pa / virt_to_maddr, __va / maddr_to_virt
- virt <-> mfn: virt_to_mfn, mfn_to_virt, map_domain_page
- paddr <-> pfn: paddr_to_pfn, pfn_to_paddr
5
Agenda
Xen’s Architecture Terminology & Concepts Bootstrap Process Physical Memory Management Virtual Memory Management Domains Virtualization
6
Segmentation
gdt_descr / nopaging_gdt_descr:
- Base: gdt_table - FIRST_RESERVED_GDT_BYTE
- Limit: 15*4096-1 = 0xEFFF
Segment descriptors:
- DPL0: hypervisor space; DPL1: kernel space; DPL3: user space
- Type A: code segment; Type 2: data segment
- Base: 0x00000000 / Limit: 0xFFFFF (G=1): 4GB flat segmentation
Segment selectors:
- RPL: equal to DPL
- Index: offset from gdt_descr.base
[Figure: GDT descriptor layout from gdt_descr.base — code (type A) and data (type 2) descriptors at DPL 0, 1 and 3, each with base 0x00000000 and 4GB limit; the first two GDT entries (0x0000000000000000) are unused.]
GDTR: base = gdt_descr.base, limit = 15*4096-1.
FIRST_RESERVED_GDT_BYTE = FIRST_RESERVED_GDT_PAGE(14) * PAGE_SIZE(4096)
NR_RESERVED_GDT_PAGES(1): the reserved page holds 4*NR_CPUS*8 bytes (space for TSS and LDT per CPU).
Reserved selectors (selector = index*8 + RPL):
  __HYPERVISOR_CS                  RPL0  index 14*(4096/8)+1  = 0xe008
  __HYPERVISOR_DS                  RPL0  index 14*(4096/8)+2  = 0xe010
  FLAT_KERNEL_CS                   RPL1  index 14*(4096/8)+3  = 0xe019
  FLAT_KERNEL_DS / FLAT_KERNEL_SS  RPL1  index 14*(4096/8)+4  = 0xe021
  FLAT_USER_CS                     RPL3  index 14*(4096/8)+5  = 0xe02b
  FLAT_USER_DS / FLAT_USER_SS      RPL3  index 14*(4096/8)+6  = 0xe033
7
Early Paging
PS: Page Size extension (also known as PSE); if set, the page size is 4MB.
0x3FC = PAGE_OFFSET >> 22; 0x3F0 = HYPERVISOR_VIRT_START >> 22; Xen is linked at 0xFF000000 + 0x100000 (xen.lds.S).
Low mapping (4GB-64MB): lower virtual address = lower physical address; maps up to HYPERVISOR_VIRT_START, since the maximum physical address is not yet known.
High mapping (12MB): Xen code's virtual address = lower physical address (direct map of Xen code/data/heap).
(In the figure, pink marks mapped areas; yellow marks unmapped areas.)
[Figure: idle_pg_table loaded into CR3, with CR0.PG/PE and CR4.PSE set. 4MB page directory entries: 0x000... identity low mapping (0x000 = 0x000), 0x3FC... high mapping onto physical 0x000 (12MB direct map of Xen code/data/heap); a 4MB area at 0x3FF and 48MB below it remain unmapped. Physical range covered: 0x00000000 up to HYPERVISOR_VIRT_START (0xFC000000).]
8
Agenda
Xen’s Architecture Terminology & Concepts Bootstrap Process Physical Memory Management Virtual Memory Management Domains Virtualization
9
Xen Memory Layout
Overview:
- Xen sits at the top 64MB of every OS's address space, to save a TLB flush.
- Xen's address space: entry point 0xFF000000 + 0x100000; PAGE_OFFSET: 0xFF000000.
- Conversions: virt_to_maddr / __pa, maddr_to_virt / __va.
Memory Regions:
- Direct Mapping: during the early startup phase, Xen initializes this area for its own code/data addressing.
- Frame-info table: holds the frame tables for tracking machine pages.
- Shadow & Guest linear page tables: for guest linear address space virtualization. Available in guests.
- Machine-to-physical mapping: holds the MFN -> GPFN translation.
- IO Remapping: reserved area for ioremap.
- Per-domain mapping: reserved area for per-domain mappings:
  - GDT/LDT tables: if a guest wants segmentation, it can register its own GDT/LDT tables here.
  - Mapping cache: maps a page into Xen's directly addressable area.
Layout (0xFC000000 - 0xFFFFFFFF):
  0xFFC00000  IO Remapping (4MB)
  0xFF000000  Direct Mapping [Xen code/data/heap] (12MB)
  0xFE800000  Per-domain mapping (8MB)
  0xFE400000  Shadow linear page tables (4MB)
  0xFE000000  Guest linear page tables (4MB)
  0xFDC00000  Machine-to-physical translation table [writable] (4MB)
  0xFC400000  Frame-info table (24MB)
  0xFC000000  Machine-to-physical translation table [read-only] (4MB)
10
Domain Heap
Allocation Overview:
- The Xen heap extends up to physical address "xenheap_phys_end" (configurable; default 0x00C00000 = 12MB).
- The RAM extent is determined by the e820 map table; xenheap_phys_start is then determined.
- The Xen executable image extends from xenheap_phys_start to "_end".
- Modules (dom0 kernel & initial images) in the multiboot table are copied to the start of the Xen heap.
- init_boot_allocator: initializes the area for early-stage allocation, with a bitmap indicating allocation states.
- init_boot_pages: makes the allocation area allocable.
- memguard_init: maps Xen heap pages at 4KB granularity and protects the Xen heap with overflow detection.
- init_frametable: initializes the frame tables indicating page-level allocation states.
- end_boot_allocator: MEMZONE_DMADOM & MEMZONE_DOM become allocable.
- init_xenheap_pages: MEMZONE_XEN becomes allocable.
[Figure: physical layout from 0x00000000 — xen_image (_start ~ _end) and initial_images inside the 12MB Xen heap (xenheap_phys_start ~ xenheap_phys_end), followed by the 24MB frame-info table (frame_table) mapped at FRAMETABLE_VIRT_START ~ FRAMETABLE_VIRT_END, with alloc_bitmap and pl1e entries for the Xen heap and frame_table.]
11
Boot Allocation
Descriptions:
- Early boot stage allocation.
- Allocation bitmap (alloc_bitmap): located in the Xen heap area; 0 [white] means free, 1 [red] means allocated.
- Boot allocation's allocable area: from initial_images_end (physical address > xenheap_phys_end) to the maximum of accessible physical memory.
Interfaces:
- Allocable area: alloc_boot_pages
- Xen heap area: xenheap_phys_start + allocated, i.e. alloc_xen_pagetable
[Figure: alloc_bitmap tracking 4KB pages from 0x00000000 to the maximum physical address — Xen heap up to xenheap_phys_end, initial images up to initial_images_end, then the allocable area; allocated pages red, free pages white.]
12
Frame Table
- Address: FRAMETABLE_VIRT_START
- Size: max_page * sizeof(page_info)
- Align: PAGE_SIZE
- State (in the figure): red = allocated, white = free
Interfaces:
- alloc_heap_pages / free_heap_pages
- [virt(xenheap) | maddr | mfn]_to_page
- page_to_[virt(xenheap) | maddr | mfn]
[Figure: frame_table at FRAMETABLE_VIRT_START ~ FRAMETABLE_VIRT_END — one page_info per 4KB frame of allocable physical memory, up to max_pages; allocated entries carry (count_info, _domain, type_info, tlbflush_timestamp, list), free entries carry (count_info, order, cpumask, tlbflush_timestamp, list).]
13
Page Info Fields
count_info:
- A (allocated): cleared when the owning guest 'frees' this page.
- OS (out-of-sync): set when fullshadow mode marks a page out-of-sync.
- PT (page-table): set when fullshadow mode is using a page as a page table.
- COUNT: 29-bit count of references to this frame.
type_info:
- TYPE: mutually exclusive types: PGT_l1_page_table, PGT_l2_page_table, PGT_gdt_page, PGT_ldt_page, PGT_writable_page.
- V (validated): has this page been validated for use as its current type?
- P (pinned): has the owning guest pinned this page to its current type?
- VA: the 11 most significant bits of the virtual address, if this is a page table.
- COUNT: 16-bit count of uses of this frame as its current type.
list: each frame can be threaded onto a doubly-linked list:
- Link to xenpage_list of <struct domain>: Xen pages shared with this domain (share_xen_page_with_guest).
- Link to page_list of <struct domain>: pages belonging to this domain (alloc_domain_pages).
_domain: owner of this page.
order: order-size of the free chunk this page is the head of.
[Figure: bit layouts — count_info = [A | OS | PT | COUNT], type_info = [TYPE | P | V | VA | COUNT]; list diagrams show page_info entries threaded onto a domain's page_list and xenpage_list.]
14
Memory Zone
Zones are attached to a global list for allocation: static struct list_head heap[NR_ZONES][MAX_ORDER+1];
MEMZONE_XEN:
- Safe in IRQ context.
- Always freed explicitly by free_xenheap_pages or xfree.
- alloc_xenheap_pages: allocates pages in the Xen heap; no need to map into idle_pg_table_l2.
- free_xenheap_pages
- xmalloc / xfree: O(n) power-of-2 free-list allocator for arbitrary-size chunks.
MEMZONE_DMADOM & MEMZONE_DOM:
- Not safe in IRQ context.
- _DOMF_dying: cares about the secrecy of pages' contents; pages will be scrubbed in softirq context.
- pfn_dom_zone_type: distinguishes between these 2 zones.
- alloc_domheap_pages: tries to allocate pages in MEMZONE_DOM, falling back to MEMZONE_DMADOM if MEMZONE_DOM failed; must be remapped into idle_pg_table_l2.
- free_domheap_pages: checks whether _DOMF_dying is set; if so, the page is added to page_scrub_list and freed later in softirq context.
- avail_domheap_pages: gets the number of pages available for allocation.
[Figure: heap[zone][0..MAX_ORDER] free lists over the whole physical memory — Xen heap up to xenheap_phys_end, DMA-addressable domain area up to DMA_BITS = 31 (0x80000000), domain area up to the maximum physical address.]
15
Memory Guard
Descriptions:
- Memory guard provides overflow detection for Xen heap memory (the region Xen's own code uses).
- If a range is guarded, a memory overflow into it leads to a page fault exception (#PF).
Operations:
- Init: split the 4MB-mapped Xen heap pages into 4KB mappings.
- Guard: mark pages not-present; the page is not in use, so any access faults.
- Unguard: mark pages present; the page is in use.
Interfaces:
- memguard_guard_range, used in: init_xenheap_pages, free_xenheap_pages
- memguard_unguard_range, used in: alloc_xenheap_pages
[Figure: the Xen heap from xenheap_phys_start to xenheap_phys_end, as alternating guarded (free) and unguarded (in-use) 4KB ranges; touching a guarded range raises #PF.]
16
Agenda
Xen’s Architecture Terminology & Concepts Bootstrap Process Physical Memory Management Virtual Memory Management Domains Virtualization
17
Paging Overview
Entries:
- L2_PAGETABLE_ENTRIES: 0x400
- DOMAIN_ENTRIES_PER_L2_PAGETABLE: 0x3F0
- HYPERVISOR_ENTRIES_PER_L2_PAGETABLE: 0x010
Setup sequence:
- start_paging: Xen heap mapping, low mapping
- memguard_init
- init_frametable
- domain_create (idle domain) / arch_domain_create
- paging_init: machine-to-physical mapping, IO remapping, per-domain page table
- construct_dom0
- zap_low_mappings (idle_pg_table_l2)
[Figure: idle_pg_table_l2 after paging_init — domain entries (0x3F0) and hypervisor entries (0x010): 4MB PSE entries for the direct mapping (0x3FC = physical 0x000, 12MB) and MPT mapping (0x3F0, 0x3F7, 4MB), plus 4KB L1 tables for the Xen heap (12KB), frame-info table (4KB * needed, <=24MB), per-domain GDT/LDT (0x3FA), per-domain mapping cache (0x3FB) and IO remap (0x3FF); physical memory runs from xenheap_phys_end to the maximum physical address.]
18
Per-domain Page Table
L1 entries (set up in arch_domain_create):
- Size: 0x0800 entries, 2 pages
- Location: Xen heap
- Contents: entries pointing to pages
L2 entries (set up in construct_dom0):
- Size: 0x0002 entries
- Location: idle_pg_table_l2
- Contents: entries pointing to mm_perdomain_pt
Per domain: mm_perdomain_pt in <struct domain>
[Figure: per-domain mapping (8MB) at PERDOMAIN_VIRT_START ~ PERDOMAIN_VIRT_END — idle_pg_table_l2 entries 0x3FA/0x3FB point to mm_perdomain_pt: one 4KB L1 table for per-VCPU GDT/LDTs (entries 0x000 ~ 0x3FF, 4MB) and one 4KB L1 table for the mapping cache (mapcache.l1tab, entries 0x400 ~ 0x7FF, 4MB).]
19
GDT/LDT Tables
Location (per virtual CPU): perdomain_ptes in <struct vcpu>
Entries:
- PDPT_L2_ENTRIES (construct_dom0): size 0x0002, located in idle_pg_table_l2, entries pointing to mm_perdomain_pt.
- PDPT_L1_ENTRIES (arch_domain_create): size 0x0800, 2 pages, located in the Xen heap, entries pointing to pages.
Interfaces:
- GDT map / unmap: set_gdt, destroy_gdt
- LDT map / unmap: map_ldt_shadow_page, invalidate_shadow_ldt
[Figure: per-domain mapping (8MB) — per-VCPU blocks of 1<<GDT_LDT_VCPU_SHIFT pages from VCPU[0] to VCPU[MAX_VIRT_CPUS], each holding that VCPU's GDT/LDT (FIRST_RESERVED_GDT_PAGE = 14 pages, gdt_table at slot 0x00E) plus a per-domain reservation; mm_perdomain_pt entries 0x000 ~ 0x3FF back this area.]
20
Mapping Cache
Why?
- maddr_to_virt only works for the xenheap region in the idle_pg_table address space, so given an arbitrary mfn or maddr to access, we could hardly translate it to a virtual address.
- CONFIG_DOMAIN_PAGE indicates whether the current arch can use the map_domain_page interfaces.
How?
- Map domain pages into the mapping cache area, which is addressable in the idle_pg_table address space.
- A hash bitmap is used for acceleration; 4MB can cache 1K page mappings.
- Install the mapping into mapcache.l1tab (mm_perdomain_pt + 0x400).
- Hash: not-in-use = MAPHASHENT_NOTINUSE = (u16)~0U; in-use = idx from MAPCACHE_VIRT_START.
Interfaces:
- Per-VCPU mappings: map_domain_page, unmap_domain_page
- Accessible in all address spaces, can also be unmapped from any context: map_domain_page_global, unmap_domain_page_global
[Figure: mapcache.vcpu_maphash per VCPU (0 ~ MAX_VIRT_CPU), each entry holding (pfn, idx, refcount); mapcache.l1tab entries (0x0400 ~ 0x7FF in mm_perdomain_pt) map domain-heap pages (e.g. MFN=0xC03, MFN=0x100E) into the per-domain mapping (8MB) starting at MAPCACHE_VIRT_START.]
21
Machine to Physical Mapping
[Figure: idle_pg_table PSE entries 0x3F0/0x3F7 mapping the writable M2P table (RDWR_MPT_VIRT_START ~ RDWR_MPT_VIRT_END) and read-only M2P table (RO_MPT_VIRT_START ~ RO_MPT_VIRT_END) onto MEMZONE_DOM / MEMZONE_DMADOM pages above xenheap_phys_end.]
Description:
- Records the mapping from machine page frames to pseudo-physical ones.
mpt_size:
- Size: max_page * BYTES_PER_LONG (max 4MB)
- Align: 1 << L2_PAGETABLE_SHIFT (4MB)
- Location: anonymous domain (domain == NULL) heap pages
machine_to_phys_mapping:
- Address: RDWR_MPT_VIRT_START / RO_MPT_VIRT_START
- Stores GPFN: guest pseudo-physical frame number
- INVALID_M2P_ENTRY: ~0UL
- Initialized to 0x55555555
Interfaces: set_gpfn_from_mfn, get_gpfn_from_mfn
22
Paging
- L2 entry: idle_pg_table_l2[l2_linear_offset(virt)], or virt_to_xen_l2e(virt)
- L1 entry: l2e_to_l1e(l2e) + l1_table_offset(virt)
- map_pages_to_xen: maps physical pages into idle_pg_table; specify MAP_SMALL_PAGES for 4KB mappings
[Figure: two-level walk — CR3 points to the L2 page table (idle_pg_table, a 4KB page directory); a page directory entry (pl2e) gives an L1 page table base, a page table entry (pl1e) gives the page frame number; the directory/table/offset fields of the virtual address select each level, and CR4.PSE allows 4MB entries.]
23
Agenda
Xen’s Architecture Terminology & Concepts Bootstrap Process Physical Memory Management Virtual Memory Management Guest Memory Management Interrupt/Exception Handling
24
Reserved Domains
DOMID_IO: owns I/O pages that are within the range of the page_info array.
- The first 1MB of RAM is historically marked as I/O.
- Any areas not specified as RAM by the e820 map are considered I/O.
DOMID_XEN: any Xen-heap pages that we allow to be mapped have their domain field set to this domain.
- The M2P table is mappable read-only by privileged domains.
- The Xen trace buffer is shared for xentrace.
IDLE_DOMAIN_ID: the idle domain performs as an idle task after initialization is done.
- No pages belong to this domain.
[Figure: e820 map from 0x00000000 to the maximum physical address — alternating usable / reserved / ACPI data / ACPI NVS regions, annotated with the historical 1MB I/O area, reserved I/O areas, the M2P table and the trace buffer.]
25
Dom0 Memory Layout
nr_pages:
- 1/16th of available memory (maximum of 128MB) is held back for things like DMA buffers; also specifiable.
- Align: v_end is 4MB-aligned; everything else is 4KB-aligned.
Regions:
- Loaded kernel: parse the ELF image and load it at the correct start address (for Linux, 0xC0000000).
- Initial images: copy initial images after the kernel.
- Physical-to-machine mapping: allocate the pages required for the P2M mapping, for pseudo-physical address virtualization.
- Start info: this page stores start information for the domain.
- Page tables: L2 (1 page) & L1 page tables for this mapping; nr_pages(vpt) > (L1 + L2 page tables) * PAGE_SIZE.
- Boot stack: reserved for the bootup stack; v_end reserves at least 512KB.
Layout from v_start (domain_setup_info.v_start) to v_end, covering nr_pages:
  1. Kernel image
  2. Init. ramdisk
  3. Physical-to-machine mapping
  4. Start info (1 page)
  5. Page tables (nr_pt_pages)
  6. Boot stack
26
Linear Page Table
Steps:
- Allocate the required page table pages:
  - L2: 1 page, set as PGT_l2_page_table
  - L1: as many pages as needed for entries to cover (v_end - v_start), set as PGT_l1_page_table
  - type_info of ordinary page frames is set to PGT_writable_page
- Copy idle_pg_table_l2.
- Set the LINEAR_PT_VIRT_START entry pointing to the page table itself.
- Overwrite the PERDOMAIN_VIRT_START entries.
- Set up the v_start ~ v_end page tables.
Fields:
- All page table pages are marked read-only, to ensure they cannot be modified by the guest OS.
- guest_table: pfn of the L2 page table (entry offset 0x3F8).
[Figure: dom0's page tables (vpt_start ~ vpt_end, nr_pt_pages) — the guest L2 (guest_table) copied from idle_pg_table_l2, entry 0x3F8 pointing back at the L2 itself (linear page table), entries 0x2FC/0x2FD pointing to l1start pages that map the kernel from its Linux entry point 0xC0000000, alongside the hypervisor's direct-mapping (0x3FC), frame-table and per-domain entries.]
27
Physical to Machine Mapping
Description:
- Records the mapping from pseudo-physical frames to machine ones.
- Address: vphysmap_start, within the domain's v_start ~ v_end layout.
construct_dom0:
- Allocate the whole of domain0's reservation.
- Fill the P2M & M2P mappings:
    vphysmap_start[pfn] = mfn;
    set_gpfn_from_mfn(mfn, pfn);
[Figure: dom0's address space (domain_setup_info.v_start ~ v_end, nr_pages) with its startup area, reserved area and physical-to-machine mapping region, alongside the hypervisor's writable M2P table (RDWR_MPT_VIRT_START ~ RDWR_MPT_VIRT_END) and read-only M2P table (RO_MPT_VIRT_START ~ RO_MPT_VIRT_END); the two tables are filled consistently.]
28
Start Info
29
DomU Builder
Similar to the dom0 builder. Can be called from domain 0 (the control domain) via libxc: xc_linux_build.
30
Writable Pages
31
Shadow Page Table
32
Grant Table
33
Agenda
Xen’s Architecture Terminology & Concepts Bootstrap Process Physical Memory Management Virtual Memory Management Guest Memory Management Interrupt/Exception Handling
34
Interrupt Table
Storage:
- Address: idt_table
- Size: IDT_ENTRIES (256)
Fields:
- DPL: privilege level
- Offset: trap function address
- Segment selector: the hypervisor code segment; D = 1 means a 32-bit gate
Entries (colors in the figure):
- Gray: DPL=0 (hypervisor traps)
- Blue: DPL=3 (system-wide gates)
- Green: DPL=1 (allow the kernel to trap into the hypervisor)
- Yellow: DPL=0, type = task gate
- Pink: per-CPU interrupt gates
Interfaces:
- Gray: set_intr_gate; Blue: set_system_gate; Yellow: set_task_gate; others: _set_gate
  0     TRAP_divide_error     divide_error
  1     TRAP_debug            debug
  2     TRAP_nmi              nmi
  3     TRAP_int3             int3
  4     TRAP_overflow         overflow
  5     TRAP_bounds           bounds
  6     TRAP_invalid_op       invalid_op
  7     TRAP_no_device        device_not_available
  8     TRAP_double_fault     __DOUBLEFAULT_TSS_ENTRY
  9     TRAP_copro_seg        coprocessor_segment_overrun
  10    TRAP_invalid_tss      invalid_TSS
  11    TRAP_no_segment       segment_not_present
  12    TRAP_stack_error      stack_segment
  13    TRAP_gp_fault         general_protection
  14    TRAP_page_fault       page_fault
  15    TRAP_spurious_int     spurious_interrupt_bug
  16    TRAP_copro_error      coprocessor_error
  17    TRAP_alignment_check  alignment_check
  18    TRAP_machine_check    machine_check
  19    TRAP_simd_error       simd_coprocessor_error
  31    TRAP_deferred_nmi     deferred_nmi
  0x82  HYPERCALL_VECTOR      hypercall
[Figure: IDTR with base address and limit (256*8-1) pointing at idt_table (IDT_ENTRIES = 256); each gate holds offset low/high, segment selector (__HYPERVISOR_CS << 16), and the P, DPL and D bits.]
35
Exception Handler
- A structure is allocated on the stack; the processor saves specific registers.
- Error code: may be stored by the processor for specific exceptions.
- Entry vector: stored by the exception handler stub.
- The remaining registers are stored by the common 'error_code' code.
- If the saved CS has RPL 3, the exception handler also saves the segment selectors.
- The stack top (the handler functions' parameter) points to cpu_user_regs.
- EBX points to the current vcpu.
[Figure: the stack frame at vcpu[n]'s stack top, laid out as struct cpu_user_regs — ebx, ecx, edx, esi, edi, ebp, eax, error_code/entry_vector, eip, cs/upcall_mask, eflags, esp, ss, es, ds, fs, gs (each selector padded to 32 bits) — with EBX pointing at the current struct vcpu.]
36
Hypercall
37
Event Channel
38
Virtual IDT
Description:
- The virtual IDT propagates exceptions to the guest OS.
Location:
- Stored in the vcpu's guest context field; initial values are in the global trap_table; set into the vcpu's guest context through a hypercall.
Size:
- Up to 256 entries.
Sequence:
- Initialize the CS field for the guest as FLAT_KERNEL_CS.
- construct_dom0: set the initial vcpu's virtual IDT handlers.
- (linux) trap_init -> (linux) HYPERVISOR_set_trap_table -> do_set_trap_table (NULL: clear the entire virtual IDT / non-NULL: set the virtual IDT).
- Initialize trap_ctxt for every virtual CPU: (linux) smp_trap_init -> (linux) HYPERVISOR_vcpu_op(VCPUOP_initialise) -> do_vcpu_op(VCPUOP_initialise): copy the virtual IDT.
- Restrict the code selector for the guest virtual IDT: fixup_guest_code_selector.
- Propagate exceptions: do_xxx (trap handlers) turn a safe exception into a trap_bounce; create_bounce_frame creates a basic exception frame on the guest OS (ring-1) stack.
[Figure: trap_table — 256 entries of (vector, flags, cs, address).]
39
Dynamic IRQ
40
Current Task
41
Range Set
Used for managing the following system resources: interrupts, IO ports, IO memory.