UQC152H3 Advanced OS Memory Management under Linux

Post on 19-Dec-2015

UQC152H3 Advanced OS

Memory Management under Linux

Memory Addressing

• Segmentation and paging background
• Segmentation
  – Intel segmentation hardware
  – Linux use of Intel segmentation
• Paging
  – Intel paging hardware
  – Linux use of Intel paging
• Memory layout
  – Kernel layout in physical memory
  – Process virtual memory layout
  – Kernel virtual memory layout

Segmentation Background

• Motivated by logical use of memory areas
  – Code, heap, stack, etc.
• Base + offset
  – Segment registers contain base
• Segments are variable size (usually large)
• Process as a collection of segments
  – No notion of linear, contiguous memory
  – Similar to "multi-stream" files (Mac)
• Requires a "segment table"
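
The base + offset scheme can be sketched in a few lines. This is only an illustration: the `Segment` type, the table contents, and the `translate` function are hypothetical names and values, not taken from the slides or from any real hardware layout.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    base: int   # starting address of the segment
    limit: int  # size of the segment in bytes

# Hypothetical segment table: one entry per logical memory area.
SEGMENT_TABLE = {
    "code":  Segment(base=0x0000_1000, limit=0x4000),
    "heap":  Segment(base=0x0010_0000, limit=0x8000),
    "stack": Segment(base=0x00F0_0000, limit=0x2000),
}

def translate(segment: str, offset: int) -> int:
    """Return base + offset, rejecting offsets beyond the segment limit."""
    seg = SEGMENT_TABLE[segment]
    if not 0 <= offset < seg.limit:
        raise ValueError(f"offset {offset:#x} outside segment {segment!r}")
    return seg.base + offset
```

An address is named by a (segment, offset) pair, so nothing requires the segments to sit next to each other in memory.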

Paging Background

• Motivated by notion of linear, contiguous, "virtual" memory (space)
  – Every process has its own "zero" address
• Uniform-sized chunks (pages – 4K on Intel)
• Virtually contiguous pages may be physically scattered
• Virtual space may have "holes"
• Page table translates virtual "pages" to physical "page frames"

Intel Segmentation/Paging

• Intel address terminology:
  – Logical – segment + offset
  – Linear (virtual) – 0 .. 4GB
  – Physical – 0 .. 4GB (64GB with PAE)
• Logical -> Linear -> Physical
• Paging can be disabled
• Segmentation required
  – Though you can just have one big segment

Intel Segmentation Hardware

• Segment registers: cs, ss, ds, es, fs, gs
  – Hold indices ("selectors") into "segment descriptor tables"
• Segment descriptor tables: GDT, LDT
  – Global and local (per process, in theory)
  – Each holds up to 8192 descriptors
• Segment descriptors: 8 bytes each
  – Base/limit, code/data privileges, type
  – Cached in read-only registers when segment registers are loaded
• Task State Segment (TSS) descriptor
  – Special segment for holding process context
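
A selector is more than a bare index. The sketch below decodes one using the standard x86 selector layout (13-bit index, one table-indicator bit, 2-bit requested privilege level); that layout is background from the architecture, not something stated on the slide, and `decode_selector` is a hypothetical helper name.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit x86 segment selector into its three fields."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,                      # descriptor number in the table
        "table": "LDT" if selector & 0x4 else "GDT",  # table-indicator bit
        "rpl": selector & 0x3,                       # requested privilege level
    }

# A 13-bit index is what caps each descriptor table at 8192 entries.
MAX_DESCRIPTORS = 1 << 13
```

For example, `decode_selector(0x23)` names descriptor 4 in the GDT with privilege level 3.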

Segmentation in Linux

• History
  – Early versions segmented; now paged
  – Using "shared" segments simplifies things
  – Some RISC chips don't support segmentation
• Linux only uses GDT
  – LDTs allocated for Windows emulation
  – Each process has TSS and LDT descriptors
  – TSS is per-process; LDT is shared

Linux Descriptor Allocation

• GDT holds 8192 descriptors
  – 4 primary, 4 APM, 4 unused
  – 8180 left / 2 -> 4090 processes (limit in 2.2)
• Primary (shared) segments
  – Kernel code, kernel data
  – User code, user data
  – Segments overlap in linear address space
• 2.4 removes 4K process restriction
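
The 2.2 process limit is plain arithmetic on the GDT size. The sketch below just restates the numbers from the slide; the constant and function names are made up for illustration.

```python
GDT_ENTRIES = 8192            # descriptors in the GDT
RESERVED = 4 + 4 + 4          # primary + APM + unused
DESCRIPTORS_PER_PROCESS = 2   # one TSS descriptor + one LDT descriptor

def max_processes() -> int:
    """Number of processes whose descriptors fit in the remaining GDT slots."""
    return (GDT_ENTRIES - RESERVED) // DESCRIPTORS_PER_PROCESS
```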

Intel Paging Hardware

• Page table "maps" linear address space
  – Some pages may be invalid
  – Address space grows/shrinks (mapping)
    • New regions mapped by DLLs, mmap(), brk()
  – Pages (linear) vs. page frames (physical)
    • Page may map to different frame after swap
  – Page tables stored in kernel data segment
• Intel page size: 4K or 4M ("jumbo pages")
  – Jumbo pages reduce table size
• Paging "enabled" by setting the PG bit in the cr0 register

Intel Two-Level Paging

• "Page table" is actually a two-level tree
  – Page Directory (root)
  – Page Tables (children)
  – Pages (grandchildren)
• Linear address carved into three pieces
  – Directory (10), Table (10), Offset (12)
  – Entries: frame # + bookkeeping bits
• Bookkeeping bits:
  – Present, Accessed, Dirty
  – Read/Write, User/Supervisor, Page Size
• Jumbo pages
  – Map entire 4GB address space with just top-level Directory
  – No need for Tables! (Kernel uses this technique)
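
The 10/10/12 split is easy to see with shifts and masks. The following is a minimal sketch; `split_linear` and the constant names are illustrative, not kernel identifiers.

```python
def split_linear(addr: int) -> tuple:
    """Carve a 32-bit linear address into (directory, table, offset)."""
    directory = (addr >> 22) & 0x3FF  # top 10 bits: index into the Page Directory
    table = (addr >> 12) & 0x3FF      # next 10 bits: index into a Page Table
    offset = addr & 0xFFF             # low 12 bits: byte within the 4K page
    return directory, table, offset

# With 4M jumbo pages, each Directory entry maps a whole page directly,
# so one Directory covers the full 4GB space without any Page Tables.
JUMBO_PAGE_SIZE = 4 << 20
DIRECTORY_ENTRIES_FOR_4GB = (1 << 32) // JUMBO_PAGE_SIZE
```

For instance, `split_linear(0xC0101234)` gives directory 768, table 257, and offset 0x234.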

Three-Level Paging

• How big are page tables on a 64-bit arch?
  – Sparc, Alpha, Itanium
  – Assume 16K pages => 32 M per process!
• Better idea: three-level paging trees
  – Page Global Directory (pgd)
  – Page Middle Directory (pmd)
  – Page Table (pt)
• Carve linear address into 4 pieces
• Conceptual paging model
  – On Intel, page middle directory is compiled out
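
The slide does not show where the 32 M figure comes from. One possible reading, assumed here and not confirmed by the source, is a two-level scheme stretched over full 64-bit addresses: 16K pages take 14 offset bits, leaving 50 bits, and an even split gives 25 index bits, or about 32 million entries per table. The helper name is hypothetical.

```python
def entries_per_table(address_bits: int, page_size: int, levels: int) -> int:
    """Entries in each table when the non-offset bits are split evenly.

    Any bits left over by the integer division are simply ignored.
    """
    offset_bits = page_size.bit_length() - 1   # log2 of a power-of-two page size
    index_bits = (address_bits - offset_bits) // levels
    return 1 << index_bits
```

Under the same assumption, adding a third level drops each table from 2^25 entries to 2^16, which is the motivation for the three-level tree.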

Aside: Caching

• Exploit temporal and spatial locality
• L1 and L2 caches (on chip)
• Cache "lines" (32 bytes on Intel)
• Kernel developer goals:
  – Keep frequently used struct fields in the same cache line!
  – Spread lines out for large data structures to keep cache utilized
  – The "Cache Police"
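
Whether two struct fields share a cache line depends only on their byte offsets. The sketch below checks this for 32-byte lines; the struct layout is invented for illustration and does not correspond to any real kernel structure.

```python
CACHE_LINE = 32  # bytes per line, as given on the slide

def lines_touched(offset: int, size: int) -> list:
    """Cache line numbers covered by a field at the given offset and size."""
    first = offset // CACHE_LINE
    last = (offset + size - 1) // CACHE_LINE
    return list(range(first, last + 1))

# Hypothetical struct layout: field name -> (byte offset, size in bytes).
LAYOUT = {"state": (0, 4), "flags": (4, 4), "counter": (28, 8)}

def share_single_line(a: str, b: str) -> bool:
    """True if both fields lie entirely within the same cache line."""
    la, lb = lines_touched(*LAYOUT[a]), lines_touched(*LAYOUT[b])
    return len(la) == 1 and la == lb
```

Here `state` and `flags` share line 0, while `counter` straddles lines 0 and 1 and would cost two line fills.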

TLB

• Translation Look-aside Buffer
  – Cache of virtual-to-physical translations
• Must be flushed (invalidated) when address space mappings change
• A significant cost of context switch
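
Why a flush makes context switches expensive can be shown with a toy model. This is a deliberately simplified sketch: `TinyTLB` is a hypothetical class with unlimited capacity and no replacement policy, which real TLBs do not have.

```python
class TinyTLB:
    """Toy translation cache sitting in front of a page table (a dict)."""

    def __init__(self, page_table: dict):
        self.page_table = page_table
        self.entries = {}   # cached page -> frame translations
        self.misses = 0

    def lookup(self, page: int) -> int:
        if page not in self.entries:       # miss: walk the page table
            self.misses += 1
            self.entries[page] = self.page_table[page]
        return self.entries[page]

    def switch_address_space(self, page_table: dict) -> None:
        """Install a new page table and flush every cached translation."""
        self.page_table = page_table
        self.entries.clear()
```

After `switch_address_space`, every page the new process touches misses once before it is cached again.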

Paging in Linux

• Three-level scheme
  – Middle directories "collapsed" w/ macro tricks
  – No bits assigned for middle directory
• On context switch (address space change)
  – Save/load registers in TSS
  – Install new top-level directory (write cr3)
  – Flush TLB
• Lots of macros for:
  – Allocating/deallocating, altering, querying…
  – Page directories, tables, entries
  – Examples:
    • set_pte(), mk_pte(), pte_young(), new_page_tables()
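
The effect of giving the middle directory no bits can be sketched with a generic address splitter. This is not how the kernel macros are written; `split_address` and both configuration dictionaries are hypothetical, and the three-level bit widths are chosen only as an example.

```python
def split_address(addr, pgd_bits, pmd_bits, pte_bits, offset_bits):
    """Return (pgd, pmd, pte, offset) indices for a linear address."""
    offset = addr & ((1 << offset_bits) - 1)
    addr >>= offset_bits
    pte = addr & ((1 << pte_bits) - 1)
    addr >>= pte_bits
    pmd = addr & ((1 << pmd_bits) - 1)   # mask is 0 when pmd_bits == 0
    addr >>= pmd_bits
    pgd = addr & ((1 << pgd_bits) - 1)
    return pgd, pmd, pte, offset

# Two-level Intel layout expressed in the three-level model: pmd gets no bits.
I386 = dict(pgd_bits=10, pmd_bits=0, pte_bits=10, offset_bits=12)

# Example three-level layout (illustrative widths, 8K pages).
THREE_LEVEL = dict(pgd_bits=10, pmd_bits=10, pte_bits=10, offset_bits=13)
```

With the `I386` configuration the middle index is always 0, so the same walking code serves both two-level and three-level hardware.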

Kernel Physical Layout

• Kernel is "pinned" (non-swappable)
• Usually about 1M; mapped in first 2M
• Some "holes" in low memory
  – Zero frame always invalid (avoids null dereference)
  – ISA "i/o aperture" (640K .. 1M)
• Kernel layout symbols (System.map)
  – _text (code start) (usually at about 1M)
  – _etext (initialized data start)
  – _edata (uninitialized data start)
  – _end (end of kernel data)
• Remaining frames allocated as needed by kernel page allocator

Process Virtual Layout

• 4GB virtual space on a 32-bit system
  – Possible to install Linux on systems with more than 4GB of physical memory, but you must use tricks to access "high" memory
• Kernel macro PAGE_OFFSET
  – Division between user and kernel regions
  – Typically 3GB user, 1GB kernel (adjustable)
• We will look at details of user space later
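
The user/kernel division amounts to a single comparison against PAGE_OFFSET. The sketch assumes the typical 3GB/1GB split, which puts PAGE_OFFSET at 0xC0000000; `is_kernel_address` is a hypothetical helper, not a kernel function.

```python
PAGE_OFFSET = 0xC000_0000        # assumed: typical 3GB user / 1GB kernel split
ADDRESS_SPACE = 1 << 32          # 4GB of virtual space on a 32-bit system

USER_BYTES = PAGE_OFFSET
KERNEL_BYTES = ADDRESS_SPACE - PAGE_OFFSET

def is_kernel_address(vaddr: int) -> bool:
    """True if the virtual address lies in the kernel region."""
    return PAGE_OFFSET <= vaddr < ADDRESS_SPACE
```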

Kernel Virtual Layout

• Kernel page tables initialized in two phases during startup
  – Phase one: only maps 4MB (provisional)
  – Phase two: maps all memory
• Provisional
  – Static PGD, just one PT
  – swapper_pg_dir
  – Low 4M of user & kernel map low 4M physical
  – Allows kernel to be addressed virtually or physically

Final Kernel Virtual Layout

• Kernel code
  – Operates using linear (virtual) addresses
  – Macros to go from virtual to physical and back
    • __pa(virtual), __va(physical)
• Kernel mapped using jumbo pages
• Kernel virtual address space
  – Low physical frames (containing kernel) mapped to virtual addresses starting at PAGE_OFFSET
  – Remaining frames allocated on demand by kernel and user processes and mapped to virtual addresses
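
Because low physical frames are mapped starting at PAGE_OFFSET, converting between the two kinds of address in that region is a constant offset. The sketch below models this in Python; `pa` and `va` are stand-ins for the kernel macros, and PAGE_OFFSET = 0xC0000000 is an assumed value for the typical 3GB/1GB split.

```python
PAGE_OFFSET = 0xC000_0000   # assumed value for the typical 3GB/1GB split

def pa(vaddr: int) -> int:
    """Kernel virtual address -> physical address (directly mapped region only)."""
    return vaddr - PAGE_OFFSET

def va(paddr: int) -> int:
    """Physical address -> kernel virtual address (directly mapped region only)."""
    return paddr + PAGE_OFFSET
```

A kernel loaded at physical 1M therefore appears at virtual `va(0x100000)`, which is 0xC0100000 under this assumption.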

Summary

• Intel hybrid segmented/paged architecture

• Segment support used minimally

• Linux has a conceptual three-level page table

• Kernel is mapped into the high end of every process's virtual address space but is stored in low physical memory