
Lecture Topics: 11/17

• Page tables
  – flat page tables
  – paged page tables
  – inverted page tables
• TLBs
• Virtual memory

Virtual Addresses

• Programs use virtual addresses, which do not correspond directly to physical addresses

• CPU translates all memory references from virtual addresses to physical addresses

• OS still uses physical addresses

[Figure: CPU, translation box, and main memory. In user mode, the CPU issues virtual addresses, which the translation box converts to physical addresses for main memory; in kernel mode, the CPU uses physical addresses.]

Paging

• Divide a process's virtual address space into fixed-size chunks (called pages)
• Divide physical memory into pages of the same size
• Any virtual page can be located at any physical page
• The translation box converts virtual pages to physical pages

[Figure: Two processes' virtual pages (numbered 0-5 and 0-3, covering addresses 0x0000-0x6000 and 0x0000-0x4000) are translated onto arbitrary physical pages (numbered 0-13, covering 0x0000-0xE000).]

Page Tables

• A page table maps virtual page numbers to physical page numbers

• There are lots of different types of page tables
  – arrays, lists, hashes

[Figure: The virtual address is split into a VPN and an offset. The page table, indexed by the VPN (and process ID), supplies a PPN; the physical address is the PPN combined with the unchanged offset.]
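As a concrete sketch of that split (not from the lecture; it assumes 4 KB pages and 32-bit addresses), the VPN/offset arithmetic is just shifts and masks:

    #include <stdint.h>

    #define PAGE_SHIFT  12                        /* assumption: 4 KB pages */
    #define PAGE_SIZE   (1u << PAGE_SHIFT)
    #define OFFSET_MASK (PAGE_SIZE - 1)

    /* Split a virtual address into VPN and offset, then rebuild the
       physical address from the PPN that the page table returned. */
    static inline uint32_t vpn_of(uint32_t vaddr)    { return vaddr >> PAGE_SHIFT; }
    static inline uint32_t offset_of(uint32_t vaddr) { return vaddr & OFFSET_MASK; }
    static inline uint32_t paddr_of(uint32_t ppn, uint32_t offset)
    {
        return (ppn << PAGE_SHIFT) | offset;      /* the offset is not translated */
    }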

Flat Page Table

• A flat page table uses the VPN to index into an array

• What's the problem? (Hint: how many entries are in the table?)

[Figure: A flat page table with one entry per VPN (here mapping VPNs 0-5 onto a 14-page physical memory); a virtual address with VPN 4 and offset 100 selects entry 4 of the table.]
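A minimal sketch of that array lookup, with hypothetical names and assuming 32-bit virtual addresses, 4 KB pages, and a valid bit per entry:

    #include <stdbool.h>
    #include <stdint.h>

    #define PAGE_SHIFT 12                             /* assumption: 4 KB pages */
    #define NUM_VPNS   (1u << (32 - PAGE_SHIFT))      /* one entry per possible VPN */

    typedef struct {
        uint32_t ppn;                                 /* physical page number */
        bool     valid;                               /* is this virtual page mapped? */
    } pte_t;

    /* Flat page table: the VPN indexes directly into one big array. */
    bool flat_translate(const pte_t table[], uint32_t vaddr, uint32_t *paddr)
    {
        uint32_t vpn    = vaddr >> PAGE_SHIFT;
        uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);

        if (vpn >= NUM_VPNS || !table[vpn].valid)
            return false;                             /* unmapped: would fault */
        *paddr = (table[vpn].ppn << PAGE_SHIFT) | offset;
        return true;
    }

Note that the array needs one entry per possible VPN (2^20 of them here), whether or not the page is mapped, which is exactly what the hint above is pointing at.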

Flat Page Table Evaluation

• Very simple to implement
• Flat page tables don't work for sparse address spaces
  – code starts at 0x00400000
  – stack starts at 0x7FFFFFFF
• With 8K pages, this requires 1MB of memory per page table (see the arithmetic sketch below)
  – 1MB per process
  – must be kept in main memory (can't be put on disk)
• 64-bit addresses are a nightmare (4 TB)
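A quick back-of-the-envelope check of the 1MB figure above, assuming 32-bit virtual addresses and 2-byte page table entries (with 4-byte entries the total doubles):

    #include <stdio.h>

    int main(void)
    {
        unsigned long long vaddr_bits = 32;                 /* assumption: 32-bit VAs */
        unsigned long long page_bits  = 13;                 /* 8 KB pages */
        unsigned long long entries    = 1ULL << (vaddr_bits - page_bits);   /* 2^19 */
        unsigned long long entry_size = 2;                  /* assumption: 2-byte PTEs */

        printf("%llu entries x %llu bytes = %llu KB per process\n",
               entries, entry_size, entries * entry_size / 1024);   /* prints 1024 KB */
        return 0;
    }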

Multi-level Page Tables

• Use multiple levels of page tables
  – each page table points to another page table
  – the last page table points to the PPN
• The VPN is divided into
  – an index into the level 1 page table
  – an index into the level 2 page table

Multi-level Page Tables

[Figure: The VPN is split into a level-1 index and a level-2 index; the L1 page table entry either points to an L2 page table or is marked not present, and the L2 entry gives the PPN in physical memory.]
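A sketch of the two-level walk, using a hypothetical layout (32-bit addresses, 4 KB pages, a 10-bit index per level) rather than the exact one in the figure:

    #include <stddef.h>
    #include <stdint.h>

    #define PAGE_SHIFT 12                              /* assumption: 4 KB pages */
    #define L2_BITS    10                              /* low 10 bits of the VPN */

    typedef struct { uint32_t ppn; int valid; } l2_entry_t;
    typedef struct { l2_entry_t *l2; } l1_entry_t;     /* NULL => no L2 table allocated */

    /* Walk level 1, then level 2; a missing table or invalid entry means a fault. */
    int two_level_translate(const l1_entry_t *l1, uint32_t vaddr, uint32_t *paddr)
    {
        uint32_t vpn    = vaddr >> PAGE_SHIFT;
        uint32_t l1_idx = vpn >> L2_BITS;              /* index into the L1 table */
        uint32_t l2_idx = vpn & ((1u << L2_BITS) - 1); /* index into the L2 table */
        uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);

        if (l1[l1_idx].l2 == NULL || !l1[l1_idx].l2[l2_idx].valid)
            return -1;                                 /* page fault */
        *paddr = (l1[l1_idx].l2[l2_idx].ppn << PAGE_SHIFT) | offset;
        return 0;
    }

Unallocated regions simply leave the L1 entry NULL, which is why this layout copes with sparse address spaces.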

Multi-Level Evaluation

• Only allocate as many page tables as we need -- works with sparse address spaces
• Only the top-level page table must be pinned in physical memory
• Each page table usually fills exactly one page, so it can be easily moved to/from disk
• Requires multiple physical memory references for each virtual memory reference

Inverted Page Tables

• Inverted page tables hash the VPN to get the PPN

• Lookup takes O(1) expected time

• Storage is proportional to the number of physical pages in use, not to the size of the virtual address space

[Figure: The VPN is hashed to index the inverted page table, which yields the physical page in memory; the offset is unchanged.]
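A simplified sketch of the hashed lookup (hypothetical structure; a real inverted table must handle hash collisions, done here with a chained list, and typically tags entries with the process ID):

    #include <stddef.h>
    #include <stdint.h>

    #define HASH_BUCKETS 1024                  /* hypothetical table size */

    /* Roughly one entry per physical page in use, found by hashing (pid, VPN). */
    typedef struct ipt_entry {
        uint32_t pid, vpn, ppn;
        struct ipt_entry *next;                /* chain for hash collisions */
    } ipt_entry_t;

    static unsigned ipt_hash(uint32_t pid, uint32_t vpn)
    {
        return (pid * 31u + vpn) % HASH_BUCKETS;
    }

    /* Expected O(1): hash, then scan the (short) chain for an exact match. */
    ipt_entry_t *ipt_lookup(ipt_entry_t *buckets[], uint32_t pid, uint32_t vpn)
    {
        for (ipt_entry_t *e = buckets[ipt_hash(pid, vpn)]; e != NULL; e = e->next)
            if (e->pid == pid && e->vpn == vpn)
                return e;                      /* e->ppn is the translation */
        return NULL;                           /* not mapped: fault */
    }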

Translation Problem

• Each virtual address reference requires multiple accesses to physical memory

• Accessing physical memory is about 50 times slower than accessing the on-chip cache

• If the VPN->PPN translation required a page-table walk on every reference, the computer would run about as fast as a Commodore 64

• Fortunately, locality allows us to cache translations on chip

Translation Lookaside Buffer

• The translation lookaside buffer (TLB) is a small on-chip cache of VPN->PPN translations

• In common case, translation is in the TLB and no need to go through page tables

• Common TLB parameters
  – 64 entries
  – fully associative
  – separate data and instruction TLBs (why?)

Virtual Page #   Physical Page #   Control Info
11               6                 valid, read/write
200              13                valid, read only
--               --                invalid
0                14                valid, read/write

TLB

• On a TLB miss, the CPU asks the OS to add the translation to the TLB
  – OS replacement policies are usually approximations of LRU
• On a context switch, all TLB entries are invalidated because the next process has different translations
• A TLB usually has a high hit rate (99-99.9%)
  – so virtual address translation costs essentially nothing
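A small software model of such a TLB, purely as a sketch (64 fully associative entries as in the parameters above; field names are hypothetical, and real hardware compares all entries in parallel):

    #include <stdbool.h>
    #include <stdint.h>

    #define TLB_ENTRIES 64                          /* as in the parameters above */

    typedef struct {
        uint32_t vpn, ppn;
        bool     valid, writable;
    } tlb_entry_t;

    static tlb_entry_t tlb[TLB_ENTRIES];

    /* Fully associative: compare the VPN against every entry. */
    bool tlb_lookup(uint32_t vpn, uint32_t *ppn)
    {
        for (int i = 0; i < TLB_ENTRIES; i++)
            if (tlb[i].valid && tlb[i].vpn == vpn) {
                *ppn = tlb[i].ppn;
                return true;                        /* hit: no page-table walk */
            }
        return false;                               /* miss: trap to the OS */
    }

    /* Context switch: invalidate everything, since the next process has
       different translations. */
    void tlb_flush(void)
    {
        for (int i = 0; i < TLB_ENTRIES; i++)
            tlb[i].valid = false;
    }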

Virtual Memory

• Virtual memory spills unused memory to disk
  – abstraction: infinite memory
  – reality: finite physical memory
• In computer science, virtual means slow
  – think Java Virtual Machine
• VM was invented when memory was small and expensive
  – needed VM because memories were too small
  – 1965-75: CPU = 1 MIPS, 1 MB = $1000, disk = 30 ms
• Now the relative cost of going to disk is much higher
  – 2000: CPU = 1000 MIPS, 1 MB = $1, disk = 10 ms
  – VM is still convenient for massive multitasking, but few programs need more than 128 MB

Virtual Memory

• Simple idea: a page table entry can point either to a PPN or to a location on disk (an offset into the page file)

• A page on disk is swapped back in when it is referenced
  – this is a page fault

[Figure: Page table entries for VPNs 0-10 point either to frames in physical memory (0-6) or to slots in the page file (0-6).]
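One possible way to encode "PPN or disk location" in a single entry -- a sketch with hypothetical field names, not the lecture's actual format:

    #include <stdbool.h>
    #include <stdint.h>

    /* Each entry either names a physical page (present) or records where the
       page lives in the page file (not present). */
    typedef struct {
        bool     present;     /* true  => ppn is valid                      */
        bool     dirty;       /* modified since it was brought into memory? */
        uint32_t ppn;         /* physical page number, if present           */
        uint32_t file_slot;   /* slot in the page file, if not present      */
    } vm_pte_t;

Touching a page whose present bit is clear is the page fault stepped through below.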

Page Fault Example

[Figure: successive snapshots of the page table (VPNs 0-10), physical memory (frames 0-6), and page file (slots 0-6) as a fault is handled.]

• A reference to VPN 10 causes a page fault because it is on disk.
• VPN 5 has not been used recently; write it out to the page file.
• Read VPN 10 from the page file into physical memory.

Virtual Memory vs. Caches

• Physical memory is a cache of the page file

• Many of the same concepts we learned with caches apply to virtual memory
  – both work because of locality
  – dirty bits keep clean pages from being written back needlessly

• Some concepts don't apply
  – VM is usually fully associative, with complex replacement algorithms, because a page fault is so expensive

Replacement Algorithms

• How do we decide which virtual page to replace in memory?

• FIFO -- throw out the oldest page
  – very bad, because it throws out frequently used pages

• RANDOM -- pick a random page
  – works better than you would guess, but not good enough

• MIN -- pick the page that won't be used for the longest time
  – provably optimal, but impossible because it requires knowledge of the future

• LRU -- approximation of MIN, still impractical
• CLOCK -- practical approximation of LRU

Perfect LRU

• Perfect LRU
  – timestamp each page when it is referenced
  – on a page fault, find the oldest page
  – too much work per memory reference

LRU Approximation: Clock

• Clock algorithm (see the sketch after this list)
  – arrange physical pages in a circle, with a clock hand
  – keep a use bit per physical page
  – the bit is set on each reference
    • if the bit isn't set, the page hasn't been used in a long time
  – on a page fault:
    • advance the clock hand to the next page and check its use bit
      – if the bit is set, clear it and go on to the next page
      – if the bit is not set, replace this page
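A sketch of that sweep in code (hypothetical arrays; in reality the use bit is set by the hardware on each reference, and the OS only reads and clears it):

    #define NUM_FRAMES 8                        /* as in the example that follows */

    static int use_bit[NUM_FRAMES];             /* set by hardware on each reference */
    static int hand = 0;                        /* the clock hand: next frame to check */

    /* Pick a victim frame: clear use bits until one is found already clear. */
    int clock_evict(void)
    {
        for (;;) {
            int frame = hand;
            hand = (hand + 1) % NUM_FRAMES;     /* advance the hand */
            if (use_bit[frame]) {
                use_bit[frame] = 0;             /* recently used: second chance */
            } else {
                use_bit[frame] = 1;             /* not used: replace it; the new page counts as used */
                return frame;
            }
        }
    }

Because a full lap clears every use bit, the loop always stops within one revolution.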

Clock Example

[Figure: eight physical pages (PPNs 0-7) arranged in a circle with their use bits, shown in four snapshots as the clock hand advances.]

• PPN 0 has been used; clear its use bit and advance.
• PPN 1 has been used; clear its use bit and advance.
• PPN 2 has been used; clear its use bit and advance.
• PPN 3 has not been used; replace it and set its use bit.

Clock Questions

• Will Clock always find a page to replace?

• What does it mean if the hand is moving slowly?

• What does it mean if the hand is moving quickly?

Thrashing

• Thrashing occurs when pages are tossed out but are still needed
  – listen to the hard drive crunch
• Example: a program touches 50 pages often, but there are only 40 physical pages
• What happens to performance?
  – enough memory: 2 ns/ref (most refs hit in the cache)
  – not enough memory: 2 ms/ref (page faults every few instructions)
• Very common with shared machines

[Figure: throughput (jobs/sec) versus number of users; throughput climbs, then collapses once the machine starts thrashing.]

Thrashing Solutions

• If one job causes thrashing
  – rewrite the program to have better locality

• If multiple jobs cause thrashing
  – only run the processes that fit in memory

• Big red button

Working Set

• The working set of a process is the set of pages that it is actually using
  – usually much smaller than the amount of memory that is allocated

• As long as a process's working set fits in memory, it won't thrash

• Formally: the set of pages a job has referenced in the last T seconds

• How do we pick T?
  – too big => we could have run more programs
  – too small => thrashing

What happens on a memory reference?

• An instruction refers to memory location X.
• Is X's VPN in the TLB?
  – Yes: get the data from the cache or memory. Done.
  – (Often we don't even look in the TLB if the data is in the L1 cache.)
• No: trap to the OS to load X's translation into the TLB.
• OS: is X's virtual page located in physical memory?
  – Yes: replace a TLB entry with X's translation. Return control to the CPU, which restarts the instruction. Done.
• No: must load X's virtual page from disk
  – pick a page to replace, and write it back to disk if dirty
  – load X's virtual page from disk into physical memory
  – replace a TLB entry with X's translation. Return control to the CPU, which restarts the instruction.
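The same sequence as a rough C sketch, stitched together from the hypothetical helpers in the earlier sketches (tlb_lookup, clock_evict, the vm_pte_t entry) plus a made-up page-file read stub -- not an actual fault handler:

    #include <stdbool.h>
    #include <stdint.h>

    #define PAGE_SHIFT 12
    #define PAGE_SIZE  (1u << PAGE_SHIFT)

    typedef struct { bool present, dirty; uint32_t ppn, file_slot; } vm_pte_t;

    /* Hypothetical helpers (declarations only), as sketched earlier. */
    bool tlb_lookup(uint32_t vpn, uint32_t *ppn);
    void tlb_insert(uint32_t vpn, uint32_t ppn);
    int  clock_evict(void);
    void pagefile_read(uint32_t file_slot, uint32_t ppn);   /* disk -> memory */

    uint32_t reference(vm_pte_t *page_table, uint32_t vaddr)
    {
        uint32_t vpn = vaddr >> PAGE_SHIFT;
        uint32_t off = vaddr & (PAGE_SIZE - 1);
        uint32_t ppn;

        if (tlb_lookup(vpn, &ppn))                  /* common case: TLB hit, done */
            return (ppn << PAGE_SHIFT) | off;

        /* TLB miss: trap to the OS, which consults the page table. */
        if (!page_table[vpn].present) {             /* page fault: the page is on disk */
            uint32_t victim = (uint32_t)clock_evict();  /* pick a frame to reuse; a real
                                                           handler writes it back if dirty */
            pagefile_read(page_table[vpn].file_slot, victim);
            page_table[vpn].ppn     = victim;
            page_table[vpn].present = true;
        }
        tlb_insert(vpn, page_table[vpn].ppn);       /* reload the TLB ...      */
        return (page_table[vpn].ppn << PAGE_SHIFT) | off;   /* ... and restart */
    }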

What is a Trap?

• http://www.cs.wayne.edu/~tom/guide/os2.html

• http://www.cs.nyu.edu/courses/fall99/G22.2250-001/class-notes.html