Institut für Technik der Informationsverarbeitung (ITIV)
KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association www.kit.edu
Improving Parallel MPSoC Simulation Performance by Exploiting Dynamic Routing Delay Prediction
Christoph Roth, Harald Bucher, Simon Reder, Oliver Sander, Jürgen Becker
01.08.2013
Outline
Motivation
Baseline Modeling and Simulation Methodology
Dynamic Routing Delay Prediction
  Overview
  Router Shortest Path Graph
  Local Quantum Calculation
  Execution Semantics of the Lightweight Scheduler
Case Study
Conclusion and Outlook
Motivation I
Evaluation of Network-on-Chip (NoC) based Multi-Processor System-on-Chip (MPSoC) platforms regarding throughput, latency or jitter demands for …
  rendering of effects like internal network congestion and buffer contention
  detailed cycle-timed modeling of mechanisms like routing algorithm or flow control in order to measure their impact
  consideration of the whole NoC and not only a single router
  utilization of realistic application-generated traffic
Full system simulation is a commonly used approach for performing such evaluations
Motivation II
The gap between MPSoC complexity and design tool productivity keeps widening
Simulation-based evaluation and verification consumes much of the MPSoC design effort
Source: http://www.imd.uni-rostock.de/noc/index.php?id=36
Source: http://www.intel.com/pressroom/archive/releases/2009/20091202comp_sm.htm
Possible Solution I: Raising Abstraction Level
Abstraction of communication and computation in transaction level modeling (TLM)
Source: L. Cai and D. Gajski, “Transaction level modeling: an overview,” in Hardware/Software Codesign and System Synthesis, 2003. First IEEE/ACM/IFIP International Conference on, 2003, pp. 19-24.
Possible Solution II: Parallel Simulation
Parallel SystemC framework: Runtime system for simulating multi-cores on multi-cores
Support for different execution strategies like logical process (LP) based parallel discrete event simulation (PDES)
C. Roth et al., “Asynchronous parallel MPSoC simulation on the Single-Chip Cloud Computer,” in System on Chip (SoC), 2012 International Symposium on, 2012.
Baseline TLM Method for NoC-based MPSoCs
[Figure: two modules connected by socket links. Each module contains a lightweight scheduler, lightweight processes, a local clock (lclock) and the buffers vbuf, ibuf and nbuf; the modules exchange control transactions (CtrlTrans) and data transactions (DataTrans).]
Inherently parallelizable transaction level modeling method
Intra-module behavior (e.g. router):
  Lightweight scheduler and processes implement synchronous execution semantics
  Maintenance of a local time in order to be able to run ahead of the global time
Inter-module behavior (inspired by *):
  Transaction-based communication
  Adjacent modules regularly synchronize via control transactions after a predefined global time quantum
  Exploitation of buffer synchronicity for temporal independency
  Better performance but loss of timing accuracy if time quantum is chosen larger than one clock cycle
* A. Mello et al., “Parallel simulation of SystemC TLM 2.0 compliant MPSoC on SMP workstations,” in Design, Automation Test in Europe Conference Exhibition (DATE), 2010.
Goal: Dynamically adapt temporal decoupling locally to avoid loss of accuracy
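The fixed-quantum baseline described above can be illustrated with a minimal sketch. It is not the authors' SystemC implementation: the function name run_with_fixed_quantum and the idea of counting one control transaction per quantum boundary are assumptions made only to show why a larger quantum reduces synchronization effort.

```python
def run_with_fixed_quantum(sim_end, quantum):
    """Count synchronizations of one module connection up to sim_end (in clock cycles).

    Hypothetical model: the module runs ahead on its local time until the next
    quantum boundary, then issues one control transaction to its neighbor.
    """
    if quantum < 1:
        raise ValueError("quantum must be at least one clock cycle")
    local_time = 0
    ctrl_transactions = 0
    while local_time < sim_end:
        # Run ahead of global time, but never past the end of the simulation.
        local_time = min(local_time + quantum, sim_end)
        ctrl_transactions += 1
    return ctrl_transactions


cycle_accurate = run_with_fixed_quantum(100, 1)   # synchronize every cycle
decoupled = run_with_fixed_quantum(100, 4)        # synchronize every 4 cycles
```

With a quantum of one cycle the sketch synchronizes in every cycle; a quantum of four cycles cuts the number of control transactions to a quarter, which is the performance gain that the baseline trades against timing accuracy.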
Proposed Extension within this Work
[Figure labels: Delay Annotations / Prediction Rules; Fixed Quantum]
Dynamic Routing Delay Prediction
[Figure label: Crossbar]
Example: TL model of the Hermes* router
ISO/OSI reference model analogy:
  Buffers = Physical Layer (Layer 1)
  Input Controllers = Data Link Layer (Layer 2)
  Switch Controller = Network Layer (Layer 3)
Idea:
Exploit higher layer delays (beyond layer 1) as guarantee for temporal independency
* F. Moraes et al., “HERMES: an infrastructure for low area overhead packet-switching networks on chip,” Integr. VLSI J., vol. 38, 2004.
Router Shortest Path Graph (RSPG)
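[Figure: RSPG of router $r_{i,j}$ with vertices $ibuf^{east}$, $ibuf^{west}$, $ibuf^{local}$, $vbuf^{east}$, $vbuf^{west}$, $vbuf^{local}$. Edges inside the router carry weights such as $d(ibuf^{east}, vbuf^{west})$ and $d(ibuf^{east}, vbuf^{local})$; edges to and from the neighboring routers and the local module carry weights of the form $d(vbuf, ibuf)$, e.g. $d(vbuf^{local}_{i,j}, ibuf^{local}_{i,j})$. The exact index offsets of the neighbor edges are not recoverable from the transcript.]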
A Router Shortest Path Graph $RSPG = (V, E, D)$ is a directed graph with vertices $v \in V$, edges $e \in E$ and edge weights $d \in D$. Two vertices $v_x$ and $v_y$ are connected by a directed edge $e_{x,y} = (v_x, v_y)$ if there exists a causal dependency in terms of a minimum delay bound after which $v_x$ may affect $v_y$. The weight of the edge $e_{x,y}$ is the minimum delay bound $d(v_x, v_y)$.
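To make the graph definition concrete, the following sketch stores an RSPG as a nested dictionary and accumulates minimum delay bounds along paths. The vertex names, the edge weights and the use of Dijkstra's algorithm are assumptions chosen for illustration; the slides do not state how the shortest paths are computed in the actual model.

```python
import heapq


def min_delay_bounds(rspg, source):
    """Smallest accumulated delay bound from source to every reachable vertex."""
    best = {source: 0}
    queue = [(0, source)]
    while queue:
        delay, vertex = heapq.heappop(queue)
        if delay > best[vertex]:
            continue  # stale queue entry, a shorter path was already found
        for successor, weight in rspg.get(vertex, {}).items():
            candidate = delay + weight
            if candidate < best.get(successor, float("inf")):
                best[successor] = candidate
                heapq.heappush(queue, (candidate, successor))
    return best


# Hypothetical RSPG fragment: edge weights are minimum delay bounds in clock cycles.
EXAMPLE_RSPG = {
    "ibuf_east": {"vbuf_west": 5, "vbuf_local": 7},
    "vbuf_west": {"nbr_ibuf_east": 1},  # buffer hop to the neighboring router
    "vbuf_local": {},
}

bounds = min_delay_bounds(EXAMPLE_RSPG, "ibuf_east")
```

In this example, data entering at ibuf_east cannot affect the neighbor's input buffer earlier than bounds["nbr_ibuf_east"] cycles later, which is the kind of guarantee a lookahead calculation can build on.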
Calculation of Local Time Quanta and Next Synchronization Times between Modules
Step I: Calculate Dynamic Lookahead from RSPG (delays are annotated to behavioral model)
Step II: Calculate Local Time Quantum and Next Synchronization Time of module connection
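The formulas for both steps are not part of this transcript. As a rough sketch, assuming the usual conservative rule from parallel discrete event simulation (a module at local time t with lookahead L cannot affect its neighbor before t + L), Step II could look as follows. The function names and the min-based rule are assumptions, not the authors' definitions.

```python
def next_sync_time(local_time_a, lookahead_a, local_time_b, lookahead_b):
    """Earliest time at which either endpoint of a connection may affect the other."""
    return min(local_time_a + lookahead_a, local_time_b + lookahead_b)


def local_quantum(local_time, sync_time):
    """Cycles a module may still run ahead before it has to synchronize."""
    return max(sync_time - local_time, 0)


# Hypothetical connection: module A at cycle 10 with lookahead 5,
# module B at cycle 12 with lookahead 2.
sync = next_sync_time(10, 5, 12, 2)
quantum_a = local_quantum(10, sync)
quantum_b = local_quantum(12, sync)
```

Under these assumptions both modules may advance to cycle 14 without exchanging a control transaction, so each connection gets its own quantum instead of one fixed global value.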
Appropriate Lightweight Scheduler
Case Study: Hermes Multiprocessor System
Layer 1: Buffer delay is still a constant of one clock cycle
Layer 2 & 3: Annotation of state-dependent delays to Input Control (left) and Switch Control (right) state machines
From these the dynamic lookahead can be calculated within the sync() action of the lightweight scheduler by calling calc_lookahead()
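[Figure: two state machines with delay annotations. One has the states S_INIT, S_IDLE, S_ARBITRATE, S_ROUTE, S_CONNECT, S_END; the other has the states S_IDLE, S_WAIT, S_HEAD, S_PL, S_END, S_END2. Each state carries a delay value in clock cycles; the assignment of values to states is not recoverable from the transcript.]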
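A state-dependent lookahead of this kind can be sketched as a table lookup. The delay values below are placeholders, not the annotations from the slide, and adding the layer 2 and layer 3 delays to the constant buffer delay is an assumption about how calc_lookahead() combines them.

```python
BUFFER_DELAY = 1  # layer 1: constant delay of one clock cycle

# Hypothetical per-state delay annotations in clock cycles (placeholder values).
INPUT_CTRL_DELAY = {"S_IDLE": 3, "S_WAIT": 2, "S_HEAD": 1}
SWITCH_CTRL_DELAY = {"S_IDLE": 4, "S_ARBITRATE": 3, "S_ROUTE": 2, "S_CONNECT": 1}


def calc_lookahead_sketch(input_state, switch_state):
    """Lookahead = buffer delay plus the delays annotated to the current states.

    States without an annotation contribute 0, so the result never drops
    below the one-cycle buffer delay of the baseline method.
    """
    return (BUFFER_DELAY
            + INPUT_CTRL_DELAY.get(input_state, 0)
            + SWITCH_CTRL_DELAY.get(switch_state, 0))


idle_lookahead = calc_lookahead_sketch("S_IDLE", "S_IDLE")
busy_lookahead = calc_lookahead_sketch("S_PL", "S_END")
```

An idle router then yields a large lookahead, while a router that is forwarding payload falls back to the one-cycle buffer delay and synchronizes as often as the cycle-accurate baseline.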
Implementation of Dynamic Lookahead Calculation
Evaluation: Host Independent Characteristics
Evaluation: Host Dependent Characteristics
[Figure panels: Core i7 930; Single-chip Cloud Computer]
Conclusion & Outlook
Presentation of an extension for a modeling strategy for NoC-based MPSoCs using SystemC/TLM
Maintenance of cycle accuracy despite temporal decoupling is possible by dynamically predicting and adapting the degree of temporal decoupling
The number of control transactions is significantly reduced compared to the baseline strategy. This improves performance on different types of hosts.
Future work comprises:
  Application of the method to other types of modules like network interface of processor
  Deeper investigation of scalability and application dependency
  Automation of model partitioning and host mapping
Thank you very much for your attention!