-
COMP25212 CPU Multi Threading

Learning Outcomes: to be able to:
- Describe the motivation for multithreading support in CPU hardware
- Distinguish the benefits and implementations of coarse-grain, fine-grain and simultaneous multithreading
- Explain when multithreading is inappropriate
- Describe a multithreading implementation
- Estimate the performance of these implementations
- State the important assumptions of this performance model
-
Revision: Increasing CPU Performance

[Pipeline diagram: Fetch Logic, Decode Logic, Exec Logic, Mem Logic and Write Logic stages (labelled a-f), fed by an Inst Cache and a Data Cache, driven by a common Clock]

How can throughput be increased?
-
Increasing CPU Performance
- By increasing clock frequency
- By increasing Instructions per Clock:
  - Minimising memory access impact: data cache
  - Maximising inst issue rate: branch prediction
  - Maximising inst issue rate: superscalar
  - Maximising pipeline utilisation: avoid instruction dependencies, out-of-order execution
(What does lengthening the pipeline do?)
-
Increasing Program Parallelism
- Keep issuing instructions after a branch?
- Keep processing instructions after a cache miss?
- Process instructions in parallel?
- Write a register while a previous write is pending?

Where can we find additional independent instructions? In a different program!
-
Revision: Process States
States: New, Ready (waiting for a CPU), Running (on a CPU), Blocked (waiting for an event), Terminated
Transitions:
- Dispatch (scheduler): Ready -> Running
- Needs to wait (e.g. I/O): Running -> Blocked
- I/O occurs: Blocked -> Ready
- Pre-empted (e.g. timer): Running -> Ready
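The state diagram above is a small state machine, which can be sketched as a transition table. This is an illustrative sketch only: the state names follow the slide, but the event labels (e.g. "admit") are assumptions, not any real OS API.

```python
# Hypothetical sketch of the process state machine on the slide.
# States follow the slide; event names are illustrative assumptions.
TRANSITIONS = {
    ("New", "admit"): "Ready",                # new process becomes schedulable
    ("Ready", "dispatch"): "Running",         # scheduler picks a ready process
    ("Running", "needs_to_wait"): "Blocked",  # e.g. starts an I/O request
    ("Blocked", "io_occurs"): "Ready",        # I/O completes
    ("Running", "preempted"): "Ready",        # e.g. timer interrupt
    ("Running", "exit"): "Terminated",
}

def step(state, event):
    """Return the next state, or raise if the transition is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event} in state {state}")

# A process that is admitted, dispatched, blocks on I/O, then becomes ready:
s = "New"
for e in ("admit", "dispatch", "needs_to_wait", "io_occurs"):
    s = step(s, e)
print(s)  # Ready
```

Note that there is no transition from Blocked straight back to Running: a process whose I/O completes must wait to be dispatched again.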
-
Revision: Process Control Block
- Process ID
- Process State
- PC
- Stack Pointer
- General Registers
- Memory Management Info
- Open File List, with positions
- Network Connections
- CPU time used
- Parent Process ID
-
Revision: CPU Switch
Process P0 runs; the Operating System saves state into PCB0 and loads state from PCB1; Process P1 runs; later the OS saves state into PCB1 and loads state from PCB0, and P0 resumes.
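The save/restore sequence above can be sketched with a toy PCB holding just the hardware-visible context. All names here are illustrative assumptions, not any real kernel's data structures.

```python
from dataclasses import dataclass, field

# Toy CPU context and PCB; fields follow the slide, names are illustrative.
@dataclass
class Context:
    pc: int = 0
    stack_pointer: int = 0
    registers: list = field(default_factory=lambda: [0] * 16)

@dataclass
class PCB:
    pid: int
    state: str = "Ready"
    context: Context = field(default_factory=Context)

def switch(cpu: Context, old: PCB, new: PCB) -> None:
    """Save the running context into old's PCB, then load new's context."""
    # Save state into the outgoing process's PCB
    old.context = Context(cpu.pc, cpu.stack_pointer, list(cpu.registers))
    old.state = "Ready"
    # Load state from the incoming process's PCB
    cpu.pc = new.context.pc
    cpu.stack_pointer = new.context.stack_pointer
    cpu.registers = list(new.context.registers)
    new.state = "Running"

cpu = Context(pc=100, stack_pointer=0x8000)
p0 = PCB(0, "Running")
p1 = PCB(1, "Ready", Context(pc=200, stack_pointer=0x9000))
switch(cpu, p0, p1)
print(cpu.pc, p0.state, p1.state)  # 200 Ready Running
```

The point of the next two slides is which of the PCB's fields this routine actually has to touch: only the hardware context, not bookkeeping such as the open file list.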
-
What does the CPU load on dispatch?
- Process ID
- Process State
- PC
- Stack Pointer
- General Registers
- Memory Management Info
- Open File List, with positions
- Network Connections
- CPU time used
- Parent Process ID
-
What does the CPU need to store on deschedule?
- Process ID
- Process State
- PC
- Stack Pointer
- General Registers
- Memory Management Info
- Open File List, with positions
- Network Connections
- CPU time used
- Parent Process ID
-
CPU Support for Multithreading
-
How Should the OS View an Extra Hardware Thread?
- A variety of solutions
- Simplest is probably to declare an extra CPU
- Need a multiprocessor-aware OS
-
CPU Support for Multithreading
Design issue: when to switch threads
-
Coarse-Grain Multithreading
Switch thread on an expensive operation, e.g.:
- I-cache miss
- D-cache miss
Some are easier than others!
-
Switch Threads on I-cache Miss

[Pipeline diagram, cycles 1-7: instructions a and b flow through the five stages IF ID EX MEM WB; instruction c misses in the I-cache at its IF stage; instructions X, Y, Z from another thread are fetched into the pipeline while the miss is serviced, instead of stalling behind c, d, e, f]
-
Performance of Coarse Grain
Assume (conservatively): 1 GHz clock (1 ns clock tick!), 20 ns memory (= 20 clocks); 1 I-cache miss per 100 instructions; 1 instruction per clock otherwise.
Without multithreading, time to execute 100 instructions: 100 + 20 clock cycles. Instructions per Clock = 100 / 120 = 0.83.
With multithreading, time to execute 100 instructions: 100 [+ 1 for the thread switch]. Instructions per Clock = 100 / 101 = 0.99.
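The same estimate as straight-line Python, using exactly the slide's assumptions. The 1-cycle SWITCH_COST term is the thread-switch cost implied by the slide's "[+ 1]".

```python
# Coarse-grain multithreading estimate, per the slide's assumptions:
# 1 GHz clock, 20-cycle memory latency, 1 I-cache miss per 100
# instructions, 1 instruction per clock otherwise.
INSTRUCTIONS = 100
MISS_PENALTY = 20   # cycles (= 20 ns at 1 GHz)
SWITCH_COST = 1     # cycles to switch threads on a miss (the slide's "[+ 1]")

# Without multithreading: the one miss stalls the pipeline for the
# full memory latency.
cycles_single = INSTRUCTIONS + MISS_PENALTY
ipc_single = INSTRUCTIONS / cycles_single

# With coarse-grain multithreading: the memory latency is hidden by
# running another thread; this thread pays only the switch cost.
cycles_mt = INSTRUCTIONS + SWITCH_COST
ipc_mt = INSTRUCTIONS / cycles_mt

print(f"IPC without MT: {ipc_single:.2f}")  # 0.83
print(f"IPC with MT:    {ipc_mt:.2f}")      # 0.99
```

Note the key modelling assumption: there is always another ready thread to switch to, and its instructions never miss while the first thread's miss is outstanding.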
-
Switch Threads on D-cache Miss
Performance: similar calculation (STATE ASSUMPTIONS!)
Where to restart after the memory cycle? I suggest instruction a. Why? The younger instructions already in the pipeline behind it must be aborted.

[Pipeline diagram, cycles 1-7: instruction a reaches its MEM stage and misses in the D-cache; because the miss is detected this late, the in-flight instructions b-f are aborted; the CPU switches thread and issues instructions X and Y, and the first thread later restarts from instruction a]
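The "similar calculation" for the D-cache case, with the assumptions stated explicitly as the slide demands. The clock, memory latency and miss rate reuse the I-cache slide's figures; the switch cost and the number of aborted in-flight instructions are illustrative assumptions, not values from the slide.

```python
# D-cache version of the coarse-grain estimate. Stated assumptions:
# 1 GHz clock, 20-cycle memory latency, 1 D-cache miss per 100
# instructions, 1 instruction per clock otherwise (as on the I-cache
# slide). Unlike an I-cache miss, the miss is detected at the MEM stage,
# so the younger in-flight instructions are aborted and must be
# re-executed when the thread restarts from the missing instruction.
INSTRUCTIONS = 100
MISS_PENALTY = 20   # cycles of memory latency
ABORTED = 4         # assumed: in-flight instructions behind the load (IF..EX)
SWITCH_COST = 1     # assumed: cycles to switch threads

# Without multithreading: stall for the full memory latency.
ipc_single = INSTRUCTIONS / (INSTRUCTIONS + MISS_PENALTY)

# With multithreading: the latency is hidden by another thread, but this
# thread pays the switch cost plus re-execution of the aborted work.
ipc_mt = INSTRUCTIONS / (INSTRUCTIONS + SWITCH_COST + ABORTED)

print(f"IPC without MT: {ipc_single:.2f}")  # 0.83
print(f"IPC with MT:    {ipc_mt:.2f}")      # 0.95
```

The extra ABORTED term is why switching on a D-cache miss is harder than on an I-cache miss: the later the miss is detected, the more issued work must be thrown away and redone.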