On the Meltdown & Spectre Design Flaws
Slides taken from Dr. Mark Hill
With some small changes.
Any errors introduced are my own.
Executive Summary
Architecture 1.0: the timing-independent functional behavior of a computer
Micro-architecture: the implementation techniques to improve performance
Question: What if a computer that is completely correct by Architecture 1.0
can be made to leak protected information via timing, a.k.a., Micro-Architecture?
Implication: The definition of Architecture 1.0 is inadequate to protect information
Meltdown leaks kernel memory, but software & hardware fixes exist
Spectre leaks memory outside of bounds checks or sandboxes, and is scary
Outline
Computer Architecture & Micro-Architecture Review
Timing Side-Channel Attack
Virtual Memory Stuff
Meltdown
Spectre
Wrap-Up
Computer Architecture 0.0 -- Pre-1964
Software Lagged Hardware
● Each new machine design was different
● Software needed to be rewritten in assembly/machine language
● Unimaginable today
Going forward: Need to separate HW interface from implementation
Each Computer was New
● Implemented machine (has mass) → hardware
● Instructions for hardware (no mass) → software
Computer Architecture & Micro-Architecture Review
Computer Architecture 1.0 -- Born 1964
IBM System 360 defined an instruction set architecture
● Stable interface across a family of implementations
● Software did NOT have to be rewritten
Architecture 1.0: the timing-independent functional behavior of a computer
Micro-architecture: implementation techniques that change timing to go fast
Note: The code is not IBM 360 assembly, but is the example used later.
branch (R1 >= bound) goto error
load R2 ← memory[train+R1]
and R3 ← R2 & 0xffff
load R4 ← memory[save+SIZE+R3]
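As a rough guide to what these four instructions compute architecturally (ignoring timing), here is a minimal Python sketch. It reads the `and` as a bitwise AND that keeps the low 16 bits. The flat `memory` list, the function name `gadget`, and the concrete values chosen for `train`, `save`, `SIZE`, and `bound` are assumptions made for illustration; none of them come from the slides.

```python
def gadget(memory, r1, bound, train, save, size):
    """Architectural (timing-independent) behavior of the four instructions."""
    if r1 >= bound:                  # branch (R1 >= bound) goto error
        raise IndexError("R1 out of bounds")
    r2 = memory[train + r1]          # load R2 <- memory[train+R1]
    r3 = r2 & 0xFFFF                 # and  R3 <- low 16 bits of R2
    r4 = memory[save + size + r3]    # load R4 <- memory[save+SIZE+R3]
    return r4


# Hypothetical layout: 'train' array at 0, 'save' region at 8, SIZE = 4.
memory = list(range(100, 164))       # 64 words holding 100..163
memory[2] = 5                        # train[2] = 5
result = gadget(memory, r1=2, bound=4, train=0, save=8, size=4)
print(result)                        # 117, i.e. memory[8 + 4 + 5]
```

By Architecture 1.0, an out-of-bounds `r1` simply takes the error path and neither load ever happens.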
Micro-architecture Harvested Moore’s Law Bounty
For decades, every ~2 years: 2x transistors, 1.4x faster & 1x chip power possible;
2300 transistors for Intel 4004 → millions per core & billions for caches
(Micro-)architects took this ever-doubling budget to make each processor core
execute > 100x faster than it would otherwise.
Key techniques w/ tutorial next:
● Instruction Speculation
● Hardware Caching
Hidden by Architecture 1.0: timing-independent functional behavior unchanged
Instruction Speculation Review
[Figure: instruction timing diagrams]
Multi-cycle: add
Pipelining, branch prediction, & instruction speculation: add, load, branch, and (Speculate!), store (Speculate more!), load
Branch: predict direction, target or fall thru
Speculation correct: Commit the architectural changes of the and instruction (register) & the store instruction (memory); go fast!
Mis-speculate: Abort architectural changes (registers, memory); go in other branch direction
Hardware Caching Review
Main Memory (DRAM) too slow
Add Hardware Cache(s): small, transparent hardware memory
E.g., 4-entry direct-mapped cache (entries shown for index 0, 1, 2, 3; -- means empty)
Initially: --, --, --, --
12? Miss, insert 12: 12, --, --, --
07? Miss, insert 07: 12, --, --, 07
12? HIT! No changes: 12, --, --, 07
16? Miss, victim 12, insert 16: 16, --, --, 07
Note 12 victimized “early” due to “alias”
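The access sequence on this slide can be replayed with a small simulation. This is a minimal Python sketch, assuming the entry index is the address modulo the number of entries; the class and method names are made up for illustration.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that stores one address per entry."""

    def __init__(self, num_entries=4):
        self.entries = [None] * num_entries

    def access(self, addr):
        """Return 'hit' or 'miss'; on a miss, replace whatever was in the entry."""
        index = addr % len(self.entries)   # assumed mapping: address mod entries
        if self.entries[index] == addr:
            return "hit"
        self.entries[index] = addr         # victimizes any previous occupant
        return "miss"


cache = DirectMappedCache(4)
results = [cache.access(a) for a in (12, 7, 12, 16)]
print(results)          # ['miss', 'miss', 'hit', 'miss']
print(cache.entries)    # [16, None, None, 7]
```

Addresses 12 and 16 alias to entry 0, so a later access to 12 would miss again even though three entries were never filled.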
Micro-architecture Harvested Moore’s Law Bounty
For decades, every ~2 years: 2x transistors, 1.4x faster & 1x chip power possible;
2300 transistors for Intel 4004 → millions per core & billions for caches
(Micro-)architects took this ever-doubling budget to make each processor core
execute > 100x faster than it would otherwise
Hidden by Architecture 1.0: timing-independent functional behavior unchanged
branch (R1 >= bound) goto error ; Speculate branch not taken
load R2 ← memory[train+R1] ; Speculate load & speculate cache hit
and R3 ← R2 & 0xffff ; Speculate AND
load R4 ← memory[save+SIZE+R3] ; Speculate load & speculate cache hit
Whither Computer Architecture 1.0?
Architecture 1.0: timing-independent functional behavior
Question: What if a computer that is completely correct by Architecture 1.0
can be made to leak protected information via timing, a.k.a., micro-architecture?
Implication: The definition of Architecture 1.0 is inadequate to protect information
This is what Meltdown and Spectre do. Let's see why and explore implications.
Side channel attack
● A side-channel attack is any attack based on information gained from the
implementation of a computer system (Wikipedia)
● Example:
○ Encryption algorithms like AES do a lot of XOR operations against a secret key.
○ If you can get them to XOR against values you control AND if XOR takes more power if the
result is “1” than if it is “0”
○ Then you can just feed the device values to XOR the key with (say 0x01, then 0x02, then
0x04, etc.) and watch the power consumption (really closely)
○ Can figure out the key!
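The power example can be sketched as a toy model. This is a minimal Python illustration, assuming the measured power is exactly the number of 1 bits in the XOR result with no noise; the function names and the 8-bit key are hypothetical, and a real attack would need statistics over many noisy measurements.

```python
def power_of_xor(key, chosen):
    """Toy leakage model: 'power' equals the number of 1 bits in key XOR chosen."""
    return bin(key ^ chosen).count("1")


def recover_key(measure, num_bits):
    """Recover the key from power readings for chosen inputs 0x01, 0x02, 0x04, ..."""
    baseline = measure(0)              # power when the result is the key itself
    key = 0
    for i in range(num_bits):
        # Flipping bit i lowers the 1-count only if that key bit was 1.
        if measure(1 << i) < baseline:
            key |= 1 << i
    return key


SECRET = 0xA5                          # hypothetical 8-bit key inside the device
recovered = recover_key(lambda chosen: power_of_xor(SECRET, chosen), 8)
print(hex(recovered))                  # 0xa5
```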
Timing Side-Channel Attack
Other weird side-channel attacks
● Time to do a computation might vary depending on values.
○ There was one attack that watched a blinking LED on the device to measure delays and thus
get secret information (with a really high-speed photodetector).
○ Old Unix machines took a variable amount of time to check to see if a password was correct
based on the password tried and the actual password.
■ So timing attack could reveal actual password.
● Often not too hard to fix
○ “Just” make everything take the same amount of time (Slower, but safer).
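The password case and its fix can be illustrated in a few lines. This is a sketch, not a description of what any particular Unix actually did: `leaky_equal` is a hypothetical early-exit comparison, and `constant_time_equal` uses Python's standard `hmac.compare_digest`, which is designed so that its running time does not depend on where the inputs differ.

```python
import hmac


def leaky_equal(guess: bytes, actual: bytes) -> bool:
    """Early-exit comparison: running time depends on the first mismatch."""
    if len(guess) != len(actual):
        return False
    for g, a in zip(guess, actual):
        if g != a:
            return False        # returns sooner when an early byte is wrong
    return True


def constant_time_equal(guess: bytes, actual: bytes) -> bool:
    """Comparison whose time does not depend on where the inputs differ."""
    return hmac.compare_digest(guess, actual)


print(leaky_equal(b"hunter2", b"hunter2"), constant_time_equal(b"hunter2", b"hunter2"))
print(leaky_equal(b"hunter1", b"hunter2"), constant_time_equal(b"hunter1", b"hunter2"))
```

Both functions return the same answers; only the timing behavior differs, which is exactly what Architecture 1.0 does not describe.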
Basic idea
1. Load two different memory locations
○ Data gets in cache (assume direct-mapped) in two different lines (call them line “A” and line “B”).
2. Load data you aren’t allowed to access.
○ Call that data the secret
○ Will cause a fault
○ But only once the load hits the head of the ROB (reorder buffer)!
3. Load a memory location based on one bit of that secret
○ Do it so the data gets placed in either “A” or “B”. Will kick out the data in that line.
4. Fault happens. Recover as normal
○ All instructions after the fault are nuked. Almost as if they had never happened.
○ But cache has changed!
5. Now load original data
○ One of the two will be slow.
○ Now you know one bit of the secret.
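The five steps can be walked through with a toy model. This is a minimal Python sketch under loud assumptions: the cache is a 4-line direct-mapped structure, hit and miss costs are arbitrary numbers, and the transient execution of steps 2 to 4 is modeled as a single cache access whose side effect survives the fault. Nothing here touches real hardware; an actual exploit needs native code and a high-resolution timer.

```python
HIT_COST, MISS_COST = 1, 100           # arbitrary cycle counts for the toy model


class ToyCache:
    """Direct-mapped cache that reports a fake access time."""

    def __init__(self, num_lines=4):
        self.lines = [None] * num_lines

    def load(self, addr):
        index = addr % len(self.lines)
        cost = HIT_COST if self.lines[index] == addr else MISS_COST
        self.lines[index] = addr       # a miss evicts the previous occupant
        return cost


def leak_one_bit(secret_bit):
    cache = ToyCache()
    addr_a, addr_b = 0, 1              # attacker data mapping to lines "A" and "B"
    cache.load(addr_a)                 # step 1: bring both lines into the cache
    cache.load(addr_b)
    # Steps 2-4: the faulting load and the dependent load run transiently.
    # Registers are rolled back, but the eviction below survives.
    alias = (addr_a if secret_bit == 0 else addr_b) + len(cache.lines)
    cache.load(alias)
    # Step 5: time the original loads; the slow one reveals the bit.
    time_a, time_b = cache.load(addr_a), cache.load(addr_b)
    return 0 if time_a > time_b else 1


print([leak_one_bit(b) for b in (0, 1, 1, 0)])   # [0, 1, 1, 0]
```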
But if you aren’t allowed to access the data…
● There are a bunch of fun things going on at once.
○ The processor (at least on Intel machines and maybe some ARM machines) will grab the data
even if you don’t have permission to do so.
○ The page table for a user process generally includes mappings to a bunch of things that user
process isn’t allowed to touch
■ All of Kernel stuff
■ All of physical memory
● So you can go hunting for data.
Meltdown (https://meltdownattack.com/meltdown.pdf)
Can leak the contents of kernel memory at up to 500KB/s
Meltdown
Meltdown & Hardware
Demonstrated for many Intel x86-64 cores; NOT demonstrated for AMD
Key: When to suppress load with protection violation (user load to kernel memory)
● EARLY: AMD appears to suppress early, e.g., at TLB access
● LATE: Intel appears to suppress at end after micro-arch state changes
A SWAG (Scientific Wild A** Guess) why
● Both are correct by Architecture 1.0
● Performance shouldn’t matter as this case is supposed to be rare
● Do what’s easiest & have luck that is good (AMD) or bad (Intel)
Meltdown & Software
Bad: Meltdown operates with bug-free OS software (by Architecture 1.0)
Good: Major commercial OSs patched for Meltdown ~January 2018
Idea: Don’t map (much of) the protected kernel address space in user processes
● Offending load now fails address translation & does nothing
● Patches quickly derived from KAISER, developed to counter side-channel attacks on
Kernel Address Space Layout Randomization (KASLR)
● Performance impact 0-30%, depending on syscall frequency & core model.
Future hardware can fix Meltdown (like AMD) so maybe we dodged a bullet
Need Computer Architecture 2.0?
With Meltdown & Spectre, Architecture 1.0 is inadequate to protect information
Augment Architecture 1.0 with Architecture 2.0 specification of
● (Abstraction of) time-visible micro-architecture?
● Bandwidth of known (unknown?) timing channels?
Change Microarchitecture to mitigate timing channel bandwidth
● Suppress some speculation
● Undo most changes on mis-speculation
Can this be (formally) solved or must it be managed like crime?
Wrap up
Need Computer Architecture 2.0?
More generally, can we reduce our dependence on SPECULATION?
Executive Summary
Architecture 1.0: the timing-independent functional behavior of a computer
Micro-architecture: the implementation techniques to improve performance
Question: What if a computer that is completely correct by Architecture 1.0
can be made to leak protected information via timing, a.k.a., Micro-Architecture?
Implication: The definition of Architecture 1.0 is inadequate to protect information
Meltdown leaks kernel memory, but software & hardware fixes exist
Spectre leaks memory outside of bounds checks or sandboxes, and is scary
Some References
New York Times: https://www.nytimes.com/2018/01/03/business/computer-flaws.html
Meltdown paper: https://meltdownattack.com/meltdown.pdf
Spectre paper: https://spectreattack.com/spectre.pdf
A blog separating the two bugs: https://danielmiessler.com/blog/simple-explanation-difference-meltdown-spectre/
Google Blog: https://security.googleblog.com/2018/01/todays-cpu-vulnerability-what-you-need.html and
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
Industry News Sources: https://arstechnica.com/gadgets/2018/01/whats-behind-the-intel-design-flaw-forcing-numerous-patches/ and
https://www.theregister.co.uk/2018/01/02/intel_cpu_design_flaw/
Final Exam
And other closing stuff
● First line of the TCL script:
○ set hdlin_ff_always_sync_set_reset true
○ Helps deal with reset problems.
● What is it doing?
○ http://www.sunburst-design.com/papers/CummingsSNUG2003Boston_Resets.pdf page 9
covers this.
○ By the way, Cliff Cummings (first author on that report) is the best resource for Verilog advice
and the like. I rely on his stuff for nearly any deep problem I find.
● The inputs to both legs of the MUX can be forced to 0 by holding rst_n
asserted low; however, if ld is unknown (X) and the MUX model is pessimistic,
then the flops will stay unknown (X) rather than being reset.
● Why would the model be “pessimistic”?
○ Hey, another Sutherland paper. http://www.sutherland-hdl.com/papers/2013-DVCon_In-love-with-my-X_paper.pdf
○ I’m not sure I get why you’d want to be pessimistic in this case, but sure, I can see the tool
thinking the value out of the MUX might be X if ld is X.
Review/Q&A stuff
● Wednesday 4/18 9:30-11:30am.
○ Room TBA
● Sunday 4/22
○ Time and date TBA.
Coverage.
● Expect a Tomasulo’s “fill in the boxes” question at the end.
○ Be sure to look over the algorithms you aren’t using
○ And even review the one you are (you may have made some changes)
● Expect a focus on:
○ Power
○ Multi-core cache coherence
○ Static optimizations
Class summary
• Major topics
– ILP in hardware (Out-of-order processors)
• How they work AND why we use them
– Caches and Virtual Memory
– Multi-processor
– ILP in software (Compiler, IA-64)
– Power
• Less major topics
– Memory disambiguation
– Branch prediction
• Direction and target
– Advanced OoO issues
• Superscalar, instruction scheduling, multi-threading, etc.
The big questions
• What is computer architecture?
• What are the metrics of performance?
• What are the techniques we use to maximize these
metrics?
ILP in hardware (1/2)
• ILP definitions
– Hazards vs dependencies
• Data, Name and Control dependencies
– What ILP means and finding it.
• Dynamic Scheduling
– Tomasulo’s (three versions!)
• You can be promised a question on this!
• Branch Prediction
– Local, global, hybrid/correlating
• Tournament and gshare
– BTBs
ILP in hardware (2/2)
• Multiple Issue
– Static
• Static Superscalar
• VLIW
– Dynamic superscalar
• Speculation
– Branch, data
• ILP limit studies
ILP in hardware: Questions
• True or False
1. The original T-algorithm only allows reordering within basic blocks
2. In P6, if it weren’t for precise interrupts, it would be okay to retire
instructions out-of-order as long as they had finished executing and a
branch isn’t skipped over.
3. ILP in hardware is limited in scope due to the “instruction window” which
is basically the size of the RS.
Quick idea: SMT
• One processor, two threads.
Caching (1/2)
• There is a huge amount of stuff associated with caching. The important stuff:
– Locality
• Temporal/Spatial
• 3 C’s model
• Stack distance model
– Nuts-and-bolts
• Replacement policies (LRU, pseudo-LRU)
• Performance (hit rate, Thit, Tmiss, average access time)
• Write back/Write thru
• Block size
– Basic improvement
• Multi-level cache
• Critical word first
• Write buffers
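The performance terms in this list combine into an average access time. Here is a minimal Python sketch, assuming the common convention that the miss cost is a penalty added on top of the hit time (so average = Thit + miss rate × miss penalty); all numbers below are hypothetical, and the two-level example simply feeds the L2 result in as the L1 miss penalty.

```python
def average_access_time(hit_rate, t_hit, miss_penalty):
    """Average access time = Thit + miss rate * miss penalty."""
    return t_hit + (1.0 - hit_rate) * miss_penalty


# Hypothetical numbers: L1 hits in 1 cycle 95% of the time,
# L2 hits in 10 cycles 80% of the time, main memory costs 100 cycles.
l2_time = average_access_time(hit_rate=0.80, t_hit=10, miss_penalty=100)
l1_time = average_access_time(hit_rate=0.95, t_hit=1, miss_penalty=l2_time)
print(round(l2_time, 2), round(l1_time, 2))
```

With these made-up numbers the multi-level hierarchy averages only a few cycles per access even though main memory is 100 cycles away.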
Caching (2/2)
• Non-standard caches
– Hash
– Victim
– Skew
• Misc.
– Virtual addresses and caching
– Impact of prefetching
– Latency hiding with OO execution
Cache: Questions (1/2)
• Changing __________ has an impact on compulsory misses.
• A victim cache is more likely to help with ________ than ________, though it can help both (3 C’s)
• At least _____ bits are required to keep exact track of LRU in a 5-way associative cache.
Cache question (2/2)
• A ____________ cache has a number of sets equal to
the number of lines in the cache.
• A fully-associative cache with N lines will miss an access
that has a stack distance of ________ (state the largest
range you can).
Multi-processor
• Amdahl’s law as it applies to MP.
• Bus-based multi-processor
– Snooping
– MESI
– Bus transaction types (BRL etc.)
• Distributed-shared
– Directory schemes
• Synchronization
– Critical sections
– Spin-locks
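For the Amdahl's law item at the top of this list, a short worked example may help. This is a minimal Python sketch of the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction and n the number of processors; the 90% figure is a made-up input.

```python
def amdahl_speedup(parallel_fraction, num_processors):
    """Speedup = 1 / ((1 - p) + p / n)."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / num_processors)


# Hypothetical program that is 90% parallelizable.
for n in (2, 4, 16, 1024):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

However many processors are added, the speedup for this input stays below 10x because the 10% serial part never shrinks.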
Multi-processor: Question
• Under the MESI protocol what is the
advantage of having a distinct clean and
dirty exclusive state?
Software techniques for ILP (1/2)
• Pipeline scheduling
– Reordering instructions in a basic block to remove pipe stalls
– Loop unrolling
• Static information passed to processor
– Static branch prediction
– Static dependence information
• Loop issues
– Detecting loop dependencies
– Software pipelining
Software techniques for ILP (2/2)
• Global code scheduling
– Predicated instruction and CMOV
– Memory reference speculation
– Issues with preserving exception behavior
• IA-64 as a case study of hardware support for software ILP
techniques
– Speculative loads
– Advanced loads
– Software pipelining optimizations
Software techniques for ILP: Questions
• What is the most significant disadvantage of
loop unrolling?
• Using CMOV, rewrite the following code
snippet, removing the branch. Don’t change
exception behavior, and assume DIV only
causes an exception if R3=0
BNE R1 R2 skip
R1=R2/R3
skip: nop
Power
• Understand why it’s important
• Power vs. Energy
• How it’s related to the existence of multi-core
• Understand voltage scaling issues
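A small calculation can make the power, energy, and voltage-scaling items concrete. This is a minimal Python sketch using the standard dynamic power approximation, P ≈ activity × C × V² × f; the capacitance, voltages, and frequencies are hypothetical, and scaling frequency in proportion to voltage is a simplifying assumption.

```python
def dynamic_power(capacitance, voltage, frequency, activity=1.0):
    """Dynamic switching power: activity * C * V^2 * f."""
    return activity * capacitance * voltage ** 2 * frequency


def energy(power, seconds):
    """Energy is power multiplied by time (constant power assumed)."""
    return power * seconds


# Hypothetical core: C = 1 nF at 1.0 V and 2 GHz, versus 0.8 V and 1.6 GHz.
fast = dynamic_power(1e-9, 1.0, 2.0e9)
slow = dynamic_power(1e-9, 0.8, 1.6e9)
print(fast, slow, slow / fast)
```

In this toy example, giving up 20% of the frequency cuts dynamic power roughly in half, which is one common argument for using several slower cores instead of one fast core.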