Post on 19-Jan-2016

Lecture 7: TLB

Virtual Memory Approaches
• Time Sharing
• Static Relocation
• Base
• Base+Bounds
• Segmentation
• Paging

Basic Paging

• Flexible address space
  • don’t need to find contiguous RAM
  • doesn’t waste whole data pages (valid bit)

• Easy to manage
  • fixed-size pages
  • simple free list for unused pages
  • no need to coalesce

• Too slow
• Too big

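Page Mapping with Linear Page Table

[Figure: virtual memory of processes P1, P2, and P3 mapped onto physical memory frames 0 to 11]

Page Tables (VPN → PFN)

VPN | P1 | P2 | P3
0   | 3  | 0  | 8
1   | 1  | 4  | 5
2   | 7  | 2  | 9
3   | 10 | 6  | 11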

Where are Page Tables Stored?
• The size of a typical page table?
  • assume 32-bit address space
  • assume 4 KB pages
  • assume 4-byte entries (or this could be less)
  • 2^(32 - log2(4 KB)) * 4 bytes = 2^20 * 4 bytes = 4 MB

• Store in memory, and CPU finds it via registers
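
The size calculation above can be written as a small C helper. This is only a sketch; the function and parameter names are illustrative and do not come from the slides.

```c
#include <stdint.h>

/* Bytes needed for a linear page table: one entry per virtual page.
 * Function and parameter names are illustrative assumptions. */
uint64_t linear_page_table_bytes(unsigned addr_bits,
                                 unsigned page_offset_bits,
                                 unsigned pte_bytes)
{
    /* Number of virtual pages = 2^(address bits - offset bits). */
    uint64_t num_pages = (uint64_t)1 << (addr_bits - page_offset_bits);
    return num_pages * pte_bytes;
}
```

With the slide's assumptions, `linear_page_table_bytes(32, 12, 4)` is 4 MB, and that cost is paid once per process.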

Memory Accesses

PT, load from 0x5000
Fetch instruction at 0x2010
PT, load from 0x5004
Exec, load from 0x0100
…

0x0010 movl 0x1100, %r8d
0x0014 addl $0x3, %r8d
0x0017 movl %r8d, 0x1100

PT (indexed by VPN): 2, 0, 80, 99

Assume 4 KB pages
Assume PTBR is 0x5000
Assume PTEs are 4 bytes

TOO SLOW
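
To see where the extra loads come from, here is a minimal C sketch that reproduces the addresses in this example. It hard-codes the slide's assumptions (4 KB pages, PTBR 0x5000, 4-byte PTEs, page table 2, 0, 80, 99); the function names are made up for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT  12u        /* 4 KB pages */
#define OFFSET_MASK 0xFFFu
#define PTE_SIZE    4u         /* bytes per PTE */
#define PTBR        0x5000u    /* page-table base from the example */

/* Page table from the example, indexed by VPN; each entry is a PFN. */
static const uint32_t page_table[4] = { 2, 0, 80, 99 };

/* Physical address of the PTE that must be loaded first. */
uint32_t pte_addr(uint32_t vaddr)
{
    return PTBR + (vaddr >> PAGE_SHIFT) * PTE_SIZE;
}

/* Physical address the virtual address translates to. No validity
 * checks: the VPN must be below 4 for this tiny table. */
uint32_t translate(uint32_t vaddr)
{
    uint32_t pfn = page_table[vaddr >> PAGE_SHIFT];
    return (pfn << PAGE_SHIFT) | (vaddr & OFFSET_MASK);
}
```

For the instruction at 0x0010, `pte_addr` gives 0x5000 and `translate` gives 0x2010; for its operand 0x1100, they give 0x5004 and 0x0100. Every virtual reference costs two physical ones.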

Other Information in Page Table
• What other data should go in page table entries besides translation?
  • valid bit
  • protection bits
  • present bit
  • reference bit
  • dirty bit
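
One way to picture these bits is as flags packed next to the PFN in a 32-bit PTE. The layout below is hypothetical (it is not the format of any real architecture), and all names are assumptions made for this sketch.

```c
#include <stdint.h>

/* Hypothetical 32-bit PTE layout (not a real architecture's format):
 * bits 31..12 hold the PFN, low bits hold the flags. */
#define PTE_VALID     (1u << 0)
#define PTE_PRESENT   (1u << 1)
#define PTE_READ      (1u << 2)   /* protection bits: r, w, x */
#define PTE_WRITE     (1u << 3)
#define PTE_EXEC      (1u << 4)
#define PTE_REFERENCE (1u << 5)
#define PTE_DIRTY     (1u << 6)
#define PTE_PFN_SHIFT 12u

/* Build a PTE from a frame number and a set of flags. */
uint32_t pte_make(uint32_t pfn, uint32_t flags)
{
    return (pfn << PTE_PFN_SHIFT) | flags;
}

/* Extract the frame number. */
uint32_t pte_pfn(uint32_t pte) { return pte >> PTE_PFN_SHIFT; }

/* Test a single flag. */
int pte_has(uint32_t pte, uint32_t flag) { return (pte & flag) != 0; }

/* Bookkeeping a store to the page would trigger: the page has been
 * referenced and its contents modified. */
uint32_t pte_mark_written(uint32_t pte)
{
    return pte | PTE_REFERENCE | PTE_DIRTY;
}
```

Because the flags live in the low bits, the translation itself is just a shift, and updating a flag never disturbs the PFN.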

// Extract the VPN from the virtual address
VPN = (VirtualAddress & VPN_MASK) >> SHIFT

// Form the address of the page-table entry (PTE)
PTEAddr = PTBR + (VPN * sizeof(PTE))

// Fetch the PTE
PTE = AccessMemory(PTEAddr)

// Check if process can access the page
if (PTE.Valid == False)
    RaiseException(SEGMENTATION_FAULT)
else if (CanAccess(PTE.ProtectBits) == False)
    RaiseException(PROTECTION_FAULT)
else
    // Access is OK: form physical address and fetch it
    offset = VirtualAddress & OFFSET_MASK
    PhysAddr = (PTE.PFN << PFN_SHIFT) | offset
    Register = AccessMemory(PhysAddr)

Translation Steps

H/W: for each mem reference:
1. extract VPN (virt page num) from VA (virt addr)
2. calculate addr of PTE (page table entry)
3. fetch PTE
4. extract PFN (page frame num)
5. build PA (phys addr)
6. fetch PA to register

A Memory Trace

int array[1000];
...
for (i = 0; i < 1000; i++)
    array[i] = 0;

0x1024 movl $0x0,(%edi,%eax,4)
0x1028 incl %eax
0x102c cmpl $0x03e8,%eax
0x1030 jne 0x1024

Array Iterator

int sum = 0;
for (i = 0; i < 10; i++) {
    sum += a[i];
}

• What is the memory trace?

Basic strategy

• Take advantage of repetition.
• Use a CPU cache.

[Figure: CPU containing the TLB, connected to RAM containing the page table (PT)]

TLB Cache Type

• Fully-Associative: entries can go anywhere
  • most common for TLBs
  • must store whole key/value in cache
  • search all in parallel

• There are other general cache types

TLB Contents

VPN | PFN | other bits
• TLB valid bit
  • whether the entry has a valid translation
• TLB protection bits
  • rwx
• Address Space Identifier
• TLB dirty bit
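
A minimal C sketch of such an entry and of a fully associative lookup follows. Field widths, `TLB_SIZE`, and the function name are assumptions chosen for illustration. Hardware compares all entries in parallel; the loop here only models the result.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_SIZE 4   /* illustrative; real TLBs are larger */

struct tlb_entry {
    bool     valid;   /* entry holds a valid translation */
    uint32_t vpn;     /* key */
    uint32_t pfn;     /* value */
    uint8_t  prot;    /* rwx protection bits */
    uint8_t  asid;    /* address space identifier */
    bool     dirty;
};

/* Fully associative: any slot may hold any VPN, so every slot is
 * compared. Returns the matching entry, or NULL on a TLB miss. */
const struct tlb_entry *tlb_lookup(const struct tlb_entry *tlb, size_t n,
                                   uint32_t vpn)
{
    for (size_t i = 0; i < n; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn)
            return &tlb[i];
    }
    return NULL;
}
```

The valid bit matters even in this toy version: a zero-initialized slot has `vpn == 0`, and without the check it would falsely match a lookup for VPN 0.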

A MIPS TLB Entry

VPN = (VirtualAddress & VPN_MASK) >> SHIFT
(Success, TlbEntry) = TLB_Lookup(VPN)
if (Success == True) // TLB Hit
    if (CanAccess(TlbEntry.ProtectBits) == True)
        Offset = VirtualAddress & OFFSET_MASK
        PhysAddr = (TlbEntry.PFN << SHIFT) | Offset
        AccessMemory(PhysAddr)
    else
        RaiseException(PROTECTION_FAULT)
else // TLB Miss
    PTEAddr = PTBR + (VPN * sizeof(PTE))
    PTE = AccessMemory(PTEAddr)
    if (PTE.Valid == False)
        RaiseException(SEGMENTATION_FAULT)
    else if (CanAccess(PTE.ProtectBits) == False)
        RaiseException(PROTECTION_FAULT)
    else
        TLB_Insert(VPN, PTE.PFN, PTE.ProtectBits)
        RetryInstruction()

Array Iterator with TLB

int sum = 0;
for (i = 0; i < 10; i++) {
    sum += a[i];
}

How many TLB hits?
How many TLB misses?
Hit rate?
Miss rate?
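
The answers depend on where `a` sits in memory and on the page size, neither of which the slide states. The C sketch below counts hits and misses for the array loads only, assuming a cold TLB that never evicts; the function name and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct tlb_stats { unsigned hits; unsigned misses; };

/* Count TLB hits and misses for the data accesses a[0..n-1], visited in
 * order. Assumes a cold TLB large enough never to evict, and counts only
 * the array loads (not instruction fetches or accesses to sum and i). */
struct tlb_stats array_scan_stats(uint32_t base_vaddr, size_t n,
                                  uint32_t elem_size, uint32_t page_size)
{
    struct tlb_stats s = { 0, 0 };
    uint32_t last_vpn = 0;
    int have_last = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t vpn = (base_vaddr + (uint32_t)i * elem_size) / page_size;
        /* In a sequential scan VPNs never decrease, so a page is new
         * exactly when its VPN differs from the previous access. */
        if (have_last && vpn == last_vpn) {
            s.hits++;
        } else {
            s.misses++;
            last_vpn = vpn;
            have_last = 1;
        }
    }
    return s;
}
```

As an assumed example, with 4-byte ints, tiny 16-byte pages, and the array starting at virtual address 100, the ten loads touch three pages: 3 misses and 7 hits, a 70% hit rate. With 4 KB pages and a page-aligned array, the same loop has 1 miss and 9 hits.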

Reasoning about TLB

• Workload: series of loads/stores to addresses

• TLB: chooses entries to store in CPU

• Metric: performance (i.e., hit rate)

TLB Workloads

• Spatial locality
  • Sequential array accesses can almost always hit in the TLB, and so are very fast!

• Temporal locality

• What pattern would be slow?
  • highly random, with no repeat accesses

TLB Replacement Policies

• LRU: evict the least-recently-used entry when a TLB slot is needed

• Random: randomly choose entries to evict

• When is each better?
  • Sometimes random is better than a “smart” policy!
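
One workload where random wins is a loop over one more page than the TLB can hold. The C sketch below replays such a loop under both policies. The 4-slot TLB, the small deterministic generator, and all names are assumptions made for illustration.

```c
#include <stdint.h>

#define TLB_SLOTS 4

/* Small deterministic pseudo-random generator so runs are repeatable
 * (an illustrative LCG, not a recommendation). */
static uint32_t rng_state = 1;
static uint32_t next_random(void)
{
    rng_state = rng_state * 1103515245u + 12345u;
    return (rng_state >> 16) & 0x7FFFu;
}

/* Replay `accesses` references that cycle over VPNs 0..num_pages-1 and
 * return the number of TLB hits. use_lru selects LRU; otherwise random. */
unsigned simulate_loop(unsigned num_pages, unsigned accesses, int use_lru)
{
    uint32_t vpn[TLB_SLOTS] = {0};
    unsigned last_used[TLB_SLOTS] = {0};
    unsigned filled = 0, hits = 0;

    rng_state = 1;
    for (unsigned t = 0; t < accesses; t++) {
        uint32_t want = t % num_pages;
        unsigned slot = TLB_SLOTS;          /* TLB_SLOTS means "not found" */

        for (unsigned i = 0; i < filled; i++)
            if (vpn[i] == want)
                slot = i;

        if (slot != TLB_SLOTS) {
            hits++;
        } else if (filled < TLB_SLOTS) {
            slot = filled++;                /* use an empty slot first */
        } else if (use_lru) {
            slot = 0;                       /* find least-recently used */
            for (unsigned i = 1; i < TLB_SLOTS; i++)
                if (last_used[i] < last_used[slot])
                    slot = i;
        } else {
            slot = next_random() % TLB_SLOTS;
        }
        vpn[slot] = want;
        last_used[slot] = t;
    }
    return hits;
}
```

Looping over 5 pages with 4 slots, LRU always evicts the page that is needed next, so `simulate_loop(5, 1000, 1)` returns 0 hits, while the random policy keeps some useful entries and gets hits. When the loop fits (4 pages), both policies miss only on the first pass.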

Who Handles The TLB Miss?
• H/W or OS?

• H/W: CPU must know where page tables are
  • CR3 on x86
  • Page table structure not flexible

• OS: CPU traps into OS upon TLB miss

VPN = (VirtualAddress & VPN_MASK) >> SHIFT
(Success, TlbEntry) = TLB_Lookup(VPN)
if (Success == True) // TLB Hit
    if (CanAccess(TlbEntry.ProtectBits) == True)
        Offset = VirtualAddress & OFFSET_MASK
        PhysAddr = (TlbEntry.PFN << SHIFT) | Offset
        AccessMemory(PhysAddr)
    else
        RaiseException(PROTECTION_FAULT)
else // TLB Miss
    RaiseException(TLB_MISS)

OS TLB Miss Handler

• OS: CPU traps into OS upon TLB miss
  1. check page table for page table entry
  2. if valid, extract PFN and update TLB with a special instruction
  3. return from trap

• Where to resume execution?
  • The instruction that caused the trap

• How to avoid double traps?
  • keep TLB miss handlers in physical memory
  • reserve some entries in the TLB for permanently-valid translations

• Modifying TLB entries is privileged
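
The three handler steps might look like the C sketch below. The `struct machine` state, the round-robin slot choice, and every name are hypothetical; a real OS would reach the page table and TLB through privileged registers and instructions.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_PAGES 4
#define TLB_SIZE  2

struct pte       { bool valid; uint32_t pfn; };
struct tlb_entry { bool valid; uint32_t vpn; uint32_t pfn; };

/* Hypothetical machine state, modeled as plain memory for this sketch. */
struct machine {
    struct pte       page_table[NUM_PAGES];
    struct tlb_entry tlb[TLB_SIZE];
    unsigned         next_slot;   /* trivial round-robin replacement */
};

enum trap_result { RETRY_INSTRUCTION, SEGMENTATION_FAULT };

/* Runs in the OS after the hardware raises TLB_MISS for `vpn`. */
enum trap_result tlb_miss_handler(struct machine *m, uint32_t vpn)
{
    /* 1. check the page table for the entry */
    if (vpn >= NUM_PAGES || !m->page_table[vpn].valid)
        return SEGMENTATION_FAULT;

    /* 2. extract the PFN and update the TLB (stands in for the
     *    privileged TLB-write instruction) */
    struct tlb_entry *e = &m->tlb[m->next_slot];
    e->valid = true;
    e->vpn = vpn;
    e->pfn = m->page_table[vpn].pfn;
    m->next_slot = (m->next_slot + 1) % TLB_SIZE;

    /* 3. return from trap; hardware retries the faulting instruction */
    return RETRY_INSTRUCTION;
}
```

Returning `RETRY_INSTRUCTION` models resuming at the instruction that caused the trap: on the retry the lookup hits, because the handler has just installed the translation.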

Context Switches

• What happens if a process uses the cached TLB entries from another process?

• Solutions?
  • Flush TLB on each switch
  • Remember which entries are for each process
    • Address Space Identifier

Address Space Identifier

• You can think of the ASID as a process identifier (PID), but usually it has fewer bits

Page tables (VPN → PFN):

VPN | P1 (ASID 11) | P2 (ASID 12)
0   | 3            | 0
1   | 1            | 4
2   | 7            | 2
3   | 10           | 6

TLB:

valid | VPN | PFN | ASID
0     | -   | -   | -
1     | 1   | 1   | ?
1     | 1   | 4   | ?
1     | 0   | 3   | ?
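
A rough C sketch of an ASID-tagged lookup, next to the flush-on-switch alternative, is given below. Struct fields and function names are assumptions. With ASIDs, the two entries for VPN 1 can coexist because a hit requires the current ASID to match as well.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct tlb_entry { bool valid; uint32_t vpn; uint32_t pfn; uint8_t asid; };

/* A hit requires both the VPN and the current ASID to match, so entries
 * from different processes can coexist without a flush. */
const struct tlb_entry *tlb_lookup_asid(const struct tlb_entry *tlb, size_t n,
                                        uint32_t vpn, uint8_t current_asid)
{
    for (size_t i = 0; i < n; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn && tlb[i].asid == current_asid)
            return &tlb[i];
    }
    return NULL;  /* TLB miss */
}

/* The alternative without ASIDs: invalidate everything on a switch. */
void tlb_flush(struct tlb_entry *tlb, size_t n)
{
    for (size_t i = 0; i < n; i++)
        tlb[i].valid = false;
}
```

Flushing is simpler but makes every process start with a cold TLB after each switch; tagging keeps entries warm at the cost of a few extra bits per entry.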

Next time: solving the “too big” problem
