Caching and TLBs
Andy Wang
Operating Systems
COP 4610 / CGS 5765
Caching
- Stores copies of data at places that can be accessed more quickly than accessing the original
- Speeds up access to frequently used data
- At a cost: slows down access to infrequently used data
Caching in Memory Hierarchy
Provides the illusion of GB-scale storage with register access time

| | | Access time | Size | Cost |
|---|---|---|---|---|
| Primary memory | Registers | 1 clock cycle | ~500 bytes | On chip |
| | Cache | 1-2 clock cycles | <10 MB | |
| | Main memory | 1-4 clock cycles | <8 GB | $10/GB |
| Secondary memory | Disk | 5-50 msec | <5 TB | $50/TB |
Caching in Memory Hierarchy

Exploits two hardware characteristics:
- Smaller memory provides faster access times
- Larger memory provides cheaper storage per byte

Puts frequently accessed data in small, fast, expensive memory
- Assumption: non-random program access behaviors
Locality in Access Patterns
- Temporal locality: recently referenced locations are more likely to be referenced soon (e.g., files)
- Spatial locality: referenced locations tend to be clustered (e.g., listing all files under a directory)
Caching
Does not work well for programs with little locality (e.g., scanning the entire disk)
- Leaves behind cache content with no locality (cache pollution)
Generic Issues in Caching
Effectiveness metrics:
- Cache hit: a lookup resolved by the content stored in the cache
- Cache miss: a lookup resolved elsewhere

Effective access time = P(hit) * hit_cost + P(miss) * miss_cost
Effective Access Time
- Cache hit rate: 99%; cost of checking the cache: 2 clock cycles
- Cache miss rate: 1%; cost of going elsewhere: 4 clock cycles

Effective access time: 99% * 2 + 1% * (2 + 4) = 1.98 + 0.06 = 2.04 clock cycles
Reasons for Cache Misses
- Compulsory misses: data brought into the cache for the first time (e.g., booting)
- Capacity misses: caused by the limited size of a cache
  - A program may require a hash table that exceeds the cache capacity
  - Random access pattern: no caching policy can be effective
Reasons for Cache Misses
- Misses due to competing cache entries: a cache entry is assigned to two pieces of data; when both are active, each will preempt the other
- Policy misses: caused by the cache replacement policy, which chooses which cache entry to replace when the cache is full
Design Issues of Caching
- How is a cache entry lookup performed?
- Which cache entry should be replaced when the cache is full?
- How is consistency maintained between the cached copy and the real data?
Caching Applied to Address Translation
A process references the same page repeatedly; translating each virtual address to a physical address is wasteful.

Translation lookaside buffer (TLB):
- Tracks frequently used translations
- Avoids translation in the common case
Caching Applied to Address Translation
[Figure: virtual addresses are looked up in the TLB; if the translation is in the TLB, the physical address is produced directly, otherwise the translation table is consulted; data reads or writes pass through untranslated.]
Example of the TLB Content
| Virtual page number (VPN) | Physical page number (PPN) | Control bits |
|---|---|---|
| 2 | 1 | Valid, rw |
| - | - | Invalid |
| 0 | 4 | Valid, rw |
TLB Lookups
- Sequential search of the TLB table
- Direct mapping: assigns each virtual page to a specific slot in the TLB (e.g., use the upper bits of the VPN to index the TLB)
Direct Mapping
```c
/* Direct-mapped lookup: the upper bits of the VPN select the only
   slot this page may occupy. */
if (TLB[UpperBits(vpn)].vpn == vpn) {
    return TLB[UpperBits(vpn)].ppn;           /* TLB hit */
} else {
    ppn = PageTable[vpn];                     /* TLB miss: consult the page table */
    TLB[UpperBits(vpn)].control = INVALID;    /* invalidate the slot while updating */
    TLB[UpperBits(vpn)].vpn = vpn;
    TLB[UpperBits(vpn)].ppn = ppn;
    TLB[UpperBits(vpn)].control = VALID | RW;
    return ppn;
}
```
Direct Mapping
- Using only high-order bits: two pages may compete for the same TLB entry, tossing out needed TLB entries
- Using only low-order bits: TLB references will be clustered, failing to use the full range of TLB entries
- Common approach: combine both
TLB Lookups
- Sequential search of the TLB table
- Direct mapping: assigns each virtual page to a specific slot in the TLB (e.g., use the upper bits of the VPN to index the TLB)
- Set associativity: uses N TLB banks to perform lookups in parallel
Two-Way Associative Cache

[Figure: the VPN is hashed to select a set of two (VPN, PPN) entries; both entries' VPN fields are compared with the lookup VPN in parallel.]

If the lookup misses, translate and replace one of the two entries.
TLB Lookups
- Direct mapping: assigns each virtual page to a specific slot in the TLB (e.g., use the upper bits of the VPN to index the TLB)
- Set associativity: uses N TLB banks to perform lookups in parallel
- Fully associative cache: allows looking up all TLB entries in parallel
Fully Associative Cache

[Figure: the lookup VPN is compared against the VPN field of every (VPN, PPN) entry in parallel.]

If the lookup misses, translate and replace one of the entries.
TLB Lookups
Typically:
- TLBs are small and fully associative
- Hardware caches use direct-mapped or set-associative designs
Replacement of TLB Entries
- Direct mapping: an entry is replaced whenever its VPN mismatches
- Associative caches: random replacement, LRU (least recently used), or MRU (most recently used), depending on reference patterns
Replacement of TLB Entries
- Hardware level: TLB replacement is mostly random (simple and fast)
- Software level: memory-page replacement is more sophisticated (a tradeoff of CPU cycles vs. cache hit rate)
Consistency between TLB and Page Tables
- Different processes have different page tables; TLB entries need to be invalidated on context switches
- Alternative: tag TLB entries with process IDs, at the additional cost of hardware and comparisons per lookup
Relationship Between TLB and HW Memory Caches
We can extend the principle of the TLB:
- Virtually addressed cache: between the CPU and the translation tables
- Physically addressed cache: between the translation tables and the main memory
Relationship Between TLB and HW Memory Caches
[Figure: untranslated data reads or writes first check the virtually addressed cache (VA, data); on a miss, the TLB and translation tables produce a physical address, which is checked against the physically addressed cache (PA, data) before reaching main memory.]
Two Ways to Commit Data Changes

- Write-through: immediately propagates an update through the various levels of caching; used for critical data
- Write-back: delays the propagation until the cached item is replaced; the goal is to spread the cost of update propagation over multiple updates, making it less costly