Post on 14-Dec-2015
Parallel Simulation: Past, Present and Future
C.D. Pham
Laboratoire RESAM, Université Claude Bernard Lyon 1
cpham@resam.univ-lyon1.fr
Past
  Introduction
  Discrete Event Simulation (DES)
  Parallel DES and the synchronization problems
  Chandy-Misra-Bryant rules
    Architecture of a conservative LP
    The « Safe is better » approach
    The lookahead ability
  Jefferson's point of view
    Architecture of an optimistic LP
    Time Warp
  Mixed/adaptive approaches
Present
  Ongoing projects
    SSF, TeD/GTW, GloMoSim, CSAM
Future
  Challenges & Perspectives
    Ultra-large scale simulations
    Wide-area federation-based simulations
    WEB-based simulations
PAST: the algorithms, only the algorithms!
Simulation
To simulate is to reproduce the behavior of a physical system with a model
In practice, computers are used to numerically simulate a logical model
Simulations are used for performance evaluation and prediction of complex systems
  fluid dynamics, chemical reactions (continuous)
  communication network models: routing, congestion avoidance, mobile (discrete)
Simulation is more flexible than analytical methods
Discrete Event Simulation (DES)
assumption that a system changes its state at discrete points in simulation time
[Figure: events a1 to a4 and d1 to d3 move the system between states S1, S2 and S3; the time axis is marked in time steps 0, Dt, 2Dt, ..., 6Dt]
DES concepts
fundamental concepts:
  system state (variables)
  state transitions (events)
  simulation time: totally ordered set of values representing time in the system being modeled
the system state can only be modified upon reception of an event
modeling can be
  event-oriented
  process-oriented
Life cycle of a DES
a DES system can be viewed as a collection of simulated objects and a sequence of event computations
each event computation contains a time stamp indicating when that event occurs in the physical system
each event computation may:
  modify state variables
  schedule new events into the simulated future
events are stored in a local event list
events are processed in time stamp order
usually, no more events = termination
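The life cycle above can be sketched as a small sequential event loop. This is only an illustration, not the code of any simulator mentioned in these slides: the `Simulator` class, the `arrival` and `departure` handlers and the fixed delay of 5 are assumptions chosen for the example.

```python
import heapq
import itertools


class Simulator:
    """Minimal sequential DES kernel (illustrative sketch)."""

    def __init__(self):
        self.now = 0                     # current simulation time
        self._events = []                # local event list, kept as a heap
        self._seq = itertools.count()    # tie-breaker for equal timestamps

    def schedule(self, delay, handler, *args):
        # An event can only be scheduled into the simulated future.
        if delay < 0:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._events, (self.now + delay, next(self._seq), handler, args))

    def run(self):
        # Process events in time stamp order; no more events = termination.
        while self._events:
            self.now, _, handler, args = heapq.heappop(self._events)
            handler(self, *args)


log = []


def arrival(sim, packet):
    log.append((sim.now, "arrival", packet))      # modify state variables
    sim.schedule(5, departure, packet)            # schedule a new future event


def departure(sim, packet):
    log.append((sim.now, "departure", packet))


sim = Simulator()
sim.schedule(5, arrival, "P1")
sim.schedule(12, arrival, "P2")
sim.run()
```

After `sim.run()`, `log` holds the four events in time stamp order, even though the departure of P1 was scheduled after the arrival of P2 had already been put in the event list.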
A simple DES model
[Figure: nodes A and B connected by a link, with the local event list shown along a time axis]
link model delay = 5
send processing time = 5
receive processing time = 1
packet arrival: P1 at 5, P2 at 12, P3 at 22
local event list:
  <e1,5> A receives packet P1
  <e2,10> A sends P1 to B
  <e3,12> A receives packet P2
  <e4,15> B receives P1 from A
  <e5,16> B sends ACK(P1) to A
  <e6,17> A sends P2 to B
  <e7,21> A receives ACK(P1)
  <e8,23> B receives P2 from A
  <e9,22> A receives packet P3
Why does it work?
events are processed in time stamp order
an event at time t can only generate future events with a timestamp greater than or equal to t (no event in the past)
generated events are inserted into the event list, sorted by timestamp
the event with the smallest timestamp is always processed first
causality constraints are implicitly maintained
Why change? It's so simple!
models become larger and larger
the simulation time is overwhelming
or the simulation is just intractable
examples:
  parallel programs with millions of lines of code,
  mobile networks with millions of mobile hosts,
  ATM networks with hundreds of complex switches,
  multicast models with thousands of sources,
  ever-growing Internet,
  and much more...
Some figures to convince...
ATM network models
  simulation at the cell level,
  200 switches
  1000 traffic sources, 50 Mbits/s
  155 Mbits/s links,
  1 simulation event per cell arrival.
simulation time increases as link speed increases,
usually more than 1 event per cell arrival,
how scalable is traditional simulation?
More than 26 billion events to simulate 1 second!
30 hours if 1 event is processed in 1 µs
Parallel simulation - principles
execution of a discrete event simulation on a parallel or distributed system with several physical processors.
the simulation model is decomposed into several sub-models that can be executed in parallel
  spatial partitioning,
  temporal partitioning,
radically different from simple simulation replications.
Parallel simulation - pros & cons
pros
  reduction of the simulation time,
  increase of the model size,
cons
  causality constraints are difficult to maintain,
  need for special mechanisms to synchronize the different processors,
  increase of both the model and the simulation kernel complexity.
challenges
  ease of use, transparency.
Parallel simulation - example
[Figure: the model is split into logical processes (LPs) that run in parallel; labels: logical process (LP), packet, event, t, parallel]
A simple PDES model
[Figure: nodes A and B connected by a link, each with a local event list along a time axis t; the figure is annotated "causality error, violation"]
link model delay = 5
send processing time = 5
receive processing time = 1
packet arrival: P1 at 5, P2 at 12, P3 at 22
events:
  <e1,5> A rec. packet P1
  <e2,10> A sends P1 to B
  <e3,12> A rec. packet P2
  <e4,15> B rec. P1 from A
  <e5,16> B sends ACK(P1)
  <e6,17> A sends P2 to B
  <e7,21> A rec. ACK(P1)
  <e8,23> B rec. P2 from A
  <e9,22> A rec. packet P3
Synchronization problems
fundamental concepts
  each Logical Process (LP) can be at a different simulation time
  local causality constraints: events in each LP must be executed in time stamp order
synchronization algorithms
  Conservative: avoids local causality violations by waiting until it's safe
  Optimistic: allows local causality violations, but provisions are made to recover from them at runtime
Chandy-Misra-Bryant rules
Architecture of a conservative LP
The « Safe is better » approach
The lookahead ability
Architecture of a conservative LP
LPs communicate by sending non-decreasing timestamped messages
each LP keeps a static FIFO channel for each LP with incoming communication
each FIFO channel (input channel, IC) has a clock ci that ticks according to the timestamp of the topmost message, if any, otherwise it keeps the timestamp of the last message
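[Figure: LP A receives timestamped messages from LP B, LP C and LP D over three FIFO input channels (messages tB1, tB2; tC3, tC4, tC5; tD4); the channel clocks are c1 = tB1, c2 = tC3, c3 = tD3]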
A simple conservative algorithm
each LP has to process events in time-stamp order to avoid local causality violations
The Chandy-Misra-Bryant algorithm
while (simulation is not over) {
    determine the ICi with the smallest Ci
    if (ICi empty)
        wait for a message
    else {
        remove topmost event from ICi
        process event
    }
}
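The loop above, together with the channel clock rule of the conservative LP, can be sketched as follows. This is a minimal single-process illustration under assumptions: the names `InputChannel` and `step`, and the message timestamps used in the demo, are invented for the example, and "wait for a message" is modeled by simply returning `None`.

```python
from collections import deque


class InputChannel:
    """FIFO input channel with a clock (illustrative sketch)."""

    def __init__(self):
        self.queue = deque()
        self.clock = 0   # timestamp of the topmost message, else of the last one removed

    def put(self, timestamp, payload):
        last = self.queue[-1][0] if self.queue else self.clock
        if timestamp < last:
            raise ValueError("timestamps on a channel must be non-decreasing")
        self.queue.append((timestamp, payload))
        if len(self.queue) == 1:
            self.clock = timestamp

    def pop(self):
        timestamp, payload = self.queue.popleft()
        # Clock follows the new topmost message, or keeps the last timestamp.
        self.clock = self.queue[0][0] if self.queue else timestamp
        return timestamp, payload


def step(channels):
    """One iteration of the conservative loop; None means the LP must block."""
    ic = min(channels, key=lambda channel: channel.clock)
    if not ic.queue:
        return None          # the LP would wait for a message on this channel
    return ic.pop()


ic1, ic2, ic3 = InputChannel(), InputChannel(), InputChannel()
for ts in (3, 6):
    ic1.put(ts, "from LP B")
for ts in (1, 4, 7):
    ic2.put(ts, "from LP C")
ic3.put(5, "from LP D")

channels = [ic1, ic2, ic3]
processed = []
while True:
    event = step(channels)
    if event is None:
        break
    processed.append(event)
```

In this demo the LP processes the events stamped 1, 3, 4 and 5 and then blocks: the third channel is empty with clock 5, so the events stamped 6 and 7 on the other channels cannot yet be declared safe.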
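Safe but has to block
[Figure: LP A merges input channels IC1, IC2 and IC3 fed by LP B, LP C and LP D, and at each step processes the event of the channel with the minimum clock (min IC event); at one step the selected channel is empty and the LP must BLOCK]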
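Blocks and even deadlocks!
[Figure: LPs S, A, B and M, where M is a merge point; S sends all messages to B, so M is BLOCKED waiting on its other input; a cycle of blocked LPs leads to a deadlock]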
How to solve deadlock: null-messages
null-messages for artificial propagation of simulation time
[Figure: LPs S, A, B and M again; null-messages propagate the time stamp 10 along the idle path, so the merge point M is UNBLOCKED]
What frequency?
How to solve deadlock: null-messages
a null-message indicates a Lower Bound Time Stamp
minimum delay between links is 4
LP C initially at simulation time 0
[Figure: LPs A, B and C connected in a cycle, with event time stamps 11, 9, 10 and 7]
LP C sends a null-message with time stamp 4
LP A sends a null-message with time stamp 8
LP B sends a null-message with time stamp 12
LP C can process event with time stamp 7
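The null-message exchange of this example can be traced with a few lines of code. This is a simplified sketch under assumptions: the function name `null_message_trace` is invented, every LP is assumed to add the same lookahead (the minimum link delay) to the bound it received, and the other pending events of A and B are ignored.

```python
def null_message_trace(ring, start, lookahead, target_ts):
    """Trace null-messages around a cycle of LPs (illustrative sketch).

    ring      : LP names in the order messages travel, e.g. ["C", "A", "B"]
    start     : LP that is initially at simulation time 0 and is blocked
    lookahead : minimum delay added by each LP to the lower bound it knows
    target_ts : time stamp of the event that `start` wants to process
    Returns a list of (sender, receiver, time stamp) tuples.
    """
    if lookahead <= 0:
        raise ValueError("lookahead must be positive, otherwise time never advances")
    trace = []
    index = ring.index(start)
    bound = 0                       # lower bound known by the current sender
    while True:
        bound += lookahead          # a null-message promises: nothing earlier will follow
        sender = ring[index]
        index = (index + 1) % len(ring)
        receiver = ring[index]
        trace.append((sender, receiver, bound))
        # The blocked LP is released once its input clock reaches the event.
        if receiver == start and bound >= target_ts:
            return trace


trace = null_message_trace(["C", "A", "B"], start="C", lookahead=4, target_ts=7)
```

With a lookahead of 4, three null-messages (time stamps 4, 8 and 12) are enough for LP C to process its event at time stamp 7. Calling the same function with a smaller `lookahead` makes the null-messages go around the cycle more times before LP C is released.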
The lookahead ability
null-messages are sent by an LP to indicate a lower bound time stamp on the future messages that will be sent
null-messages rely on the « lookahead » ability
  communication link delays
  server processing time (FIFO)
lookahead is very application-model dependent and needs to be explicitly identified
Lookahead for concurrent processing
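[Figure: time lines of LP A, LP B, LP C and LP D; LP A is at time TA with lookahead LA, and the window from TA to TA+LA separates safe events (marked s) from unsafe events]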
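One way to read the safe/unsafe distinction is through a lower bound on the time stamp of any message an LP can still receive. The sketch below is an assumption-laden illustration, not the slides' algorithm: the names `lower_bound_time_stamp` and `split_safe`, the clock and lookahead values, and the choice of a strict comparison (an event stamped exactly at the bound is treated as unsafe) are all hypothetical.

```python
def lower_bound_time_stamp(neighbours):
    """Smallest time stamp that any neighbour can still send.

    neighbours maps an LP name to (current simulation time, lookahead).
    """
    return min(clock + lookahead for clock, lookahead in neighbours.values())


def split_safe(pending, bound):
    """Split pending event time stamps into (safe, unsafe) lists."""
    safe = sorted(ts for ts in pending if ts < bound)
    unsafe = sorted(ts for ts in pending if ts >= bound)
    return safe, unsafe


# Hypothetical situation: two neighbours feed this LP.
neighbours = {"LP A": (10, 5), "LP B": (12, 2)}
bound = lower_bound_time_stamp(neighbours)      # min(10 + 5, 12 + 2)
safe, unsafe = split_safe([11, 13, 14, 20], bound)
```

Here the bound is 14, so the events stamped 11 and 13 can be processed concurrently with the neighbours, while 14 and 20 have to wait. A larger lookahead pushes the bound further into the future and exposes more safe events.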
What if lookahead is small?
a null-message indicates a Lower Bound Time Stamp
minimum delay between links is 4
LP C initially at simulation time 0
[Figure: LPs A, B and C connected in a cycle, with event time stamps 11, 9, 10 and 7]
LP C sends a null-message with time stamp 1
LP A sends a null-message with time stamp 2
LP B sends a null-message with time stamp 3
the null-messages keep circulating: then 5, then 6, then 7
LP C can process event with time stamp 7
Conservative: pros & cons
pros
  simple, easy to implement
  good performance when lookahead is large (communication networks, FIFO queue)
cons
  pessimistic in many cases
  large lookahead is essential for performance
  no transparent exploitation of parallelism
  performance may drop even with small changes in the model (adding preemption, adding one small lookahead link)
Jefferson's point of view
Architecture of an optimistic LP
Time Warp
Architecture of an optimistic LP
LPs send timestamped messages, not necessarily in non-decreasing time stamp order
no static communication channels between LPs, dynamic creation of LPs is easy
each LP processes events as they are received, no need to wait for safe events
local causality violations are detected and corrected at runtime
[Figure: LP A receives timestamped messages from LP B, LP C and LP D (tB1, tB2, tC3, tC4, tC5, tD4), with no static input channels]
Processing events as they arrive
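[Figure: LP A processes events as they arrive from LP B, LP C and LP D, in this order: 11 (LPB), 13 (LPD), 18 (LPB), 22 (LPC), 25 (LPD), 28 (LPC), 36 (LPB), all already processed; the event 32 (LPD) then arrives late]
what to do with late messages?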
Time Warp. Rollback? How?
Late messages are handled with a rollback mechanism
  undo false/incorrect local computations,
    state saving: save the state variables of an LP
    reverse computation
  undo false/incorrect remote computations,
    anti-messages: anti-messages and (real) messages annihilate each other
  process late messages
  re-process previous messages: processed events are NOT discarded!
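The rollback steps above can be illustrated with a toy optimistic LP. This is a sketch under assumptions, not Jefferson's Time Warp in full: the class name `TimeWarpLP`, the one-variable state, the rule that each event sends one output message stamped `ts + 2`, and saving the state before every single event are all choices made for the example.

```python
import copy
import heapq


class TimeWarpLP:
    """Toy optimistic LP with state saving and anti-messages (illustrative sketch)."""

    def __init__(self):
        self.state = {"total": 0}
        self.lvt = 0          # local virtual time
        self.history = []     # (event, state saved BEFORE the event, output time stamp)
        self.pending = []     # unprocessed events, kept as a heap
        self.sent = []        # log of ("msg", ts) and ("anti", ts)

    def receive(self, ts, value):
        if ts < self.lvt:     # late message: undo what was computed too early
            self._rollback(ts)
        heapq.heappush(self.pending, (ts, value))
        self._run()

    def _rollback(self, ts):
        while self.history and self.history[-1][0][0] > ts:
            event, saved_state, output = self.history.pop()
            self.state = saved_state             # undo the local computation
            self.sent.append(("anti", output))   # undo the remote computation
            heapq.heappush(self.pending, event)  # the event is re-processed, not discarded
        self.lvt = self.history[-1][0][0] if self.history else 0

    def _run(self):
        while self.pending:
            event = heapq.heappop(self.pending)
            saved_state = copy.deepcopy(self.state)
            ts, value = event
            self.state["total"] += value
            self.lvt = ts
            output = ts + 2
            self.sent.append(("msg", output))
            self.history.append((event, saved_state, output))


lp = TimeWarpLP()
for ts in (11, 13, 18, 22, 25, 28, 36):
    lp.receive(ts, 1)
lp.receive(32, 1)             # late message: only event 36 has to be undone
```

After the late message, the LP has cancelled the output of event 36 with an anti-message, processed 32, and re-processed 36, so its history is again in time stamp order.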
A pictured-view of a rollback
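[Figure: an LP has processed events 11, 13, 18, 22, 25, 28 and 36 (43 and 45 are still unprocessed) and has state points at 13 and 25; a message with time stamp 32 arrives, the LP rolls back to a state point, sends anti-messages for the output it cancels, and then processes 28, 32 and 36 again]
The real rollback distance depends on the state saving period: a short period reduces rollback overhead but increases state saving overhead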
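The effect of the state saving period on the rollback distance can be made concrete with a small helper. This is an illustrative sketch: the function name `rollback_plan` is invented, and a state point at time stamp t is assumed to hold the state right after the event stamped t was processed.

```python
def rollback_plan(processed, state_points, late_ts):
    """Return (state point to restore, events to execute again).

    processed    : time stamps of the events already processed
    state_points : time stamps after which the state was saved
    late_ts      : time stamp of the late message
    """
    # Restore the most recent state point that precedes the late message.
    restore = max((t for t in state_points if t < late_ts), default=None)
    # Everything processed after that state point is executed again,
    # together with the late message itself.
    redo = [t for t in processed if restore is None or t > restore]
    return restore, sorted(redo + [late_ts])


processed = [11, 13, 18, 22, 25, 28, 36]
frequent = rollback_plan(processed, state_points=[13, 25], late_ts=32)
sparse = rollback_plan(processed, state_points=[13], late_ts=32)
```

With state points at 13 and 25, the LP restores the state at 25 and executes 28, 32 and 36. With a single state point at 13, it has to go back to 13 and execute six events, although only 36 was actually processed too early.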
Reception of an anti-message
may initiate a rollback if the corresponding positive message has already been processed,
may annihilate the corresponding positive message if it is still unprocessed,
may wait in the input queue if the corresponding positive message has not been received yet.
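[Figure: an input queue holding events 22, 25, 28, 36, 43 and 45 in the three cases: an anti-message for 43 annihilates the unprocessed message, an anti-message for 25 triggers a rollback, and an anti-message for 48 waits because message 48 has not been received yet]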
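The three cases for an incoming anti-message can be sketched as a single function. This is a simplification under assumptions: the name `receive_anti_message` is invented, messages are identified by their time stamp only, and the `processed` list is assumed to be sorted.

```python
def receive_anti_message(ts, processed, unprocessed, waiting_anti):
    """Handle an anti-message for the message stamped `ts` (illustrative sketch).

    The three lists are modified in place; the return value names the case.
    """
    if ts in unprocessed:
        unprocessed.remove(ts)               # message and anti-message annihilate
        return "annihilated"
    if ts in processed:
        # Roll back: events from ts onward are undone; all of them except the
        # cancelled one go back to the unprocessed list to be executed again.
        undone = [t for t in processed if t >= ts]
        processed[:] = [t for t in processed if t < ts]
        unprocessed.extend(t for t in undone if t != ts)
        unprocessed.sort()
        return "rollback"
    waiting_anti.append(ts)                  # positive message not received yet
    return "waiting"


processed = [22, 25, 28, 36]
unprocessed = [43, 45]
waiting_anti = []

case_43 = receive_anti_message(43, processed, unprocessed, waiting_anti)
case_25 = receive_anti_message(25, processed, unprocessed, waiting_anti)
case_48 = receive_anti_message(48, processed, unprocessed, waiting_anti)
```

After these three calls only event 22 remains processed, events 28, 36 and 45 are waiting to be executed, and the anti-message for 48 stays in its own list until the matching positive message shows up.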
Need for a Global Virtual Time
Motivations
  an indicator that the simulation time advances
  reclaim memory (fossil collection)
Basically, GVT is the minimum of
  all LPs' logical simulation times
  the timestamps of all messages in transit
GVT guarantees that
  events below GVT are definitive events (I/O)
  no rollback can occur before the GVT
  state points before GVT can be reclaimed
  anti-messages before GVT can be reclaimed
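The definition of GVT and its use for fossil collection can be sketched in a few lines. This is a centralized illustration under assumptions: the function names and sample values are invented, all clocks and in-transit messages are assumed to be known at one instant (which is exactly what is hard in a distributed setting), and the newest state point older than GVT is kept as a restore base, a conservative choice made for this example.

```python
def global_virtual_time(lp_clocks, in_transit):
    """GVT = minimum over all LP clocks and all in-transit message time stamps."""
    return min(list(lp_clocks.values()) + list(in_transit))


def fossil_collect(state_points, gvt):
    """Split state point time stamps into (reclaimed, kept).

    The newest state point strictly below GVT is kept so that a rollback
    to GVT still has a state to restore; older ones are reclaimed.
    """
    older = [t for t in state_points if t < gvt]
    if not older:
        return [], sorted(state_points)
    base = max(older)
    reclaimed = sorted(t for t in state_points if t < base)
    kept = sorted(t for t in state_points if t >= base)
    return reclaimed, kept


lp_clocks = {"LP A": 36, "LP B": 30, "LP C": 41}
in_transit = [28, 50]
gvt = global_virtual_time(lp_clocks, in_transit)
reclaimed, kept = fossil_collect([11, 13, 25, 28, 36], gvt)
```

In this example the message in transit with time stamp 28 holds GVT at 28 even though every LP clock is further ahead, and the state points at 11 and 13 can be reclaimed.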
A pictured-view of the GVT
[Figure: time lines of LP A, LP B, LP C and LP D with events marked c (conditional event) or D (definitive event); when GVT advances from the old GVT to the new GVT, the events below the new GVT become definitive]
Optimistic overheads
Periodic state savings
  states may be large, very large!
  copies are very costly
Periodic GVT computations
  difficult in a distributed architecture,
  may block computations,
Rollback thrashing
  cascaded rollback, no simulation progress!
Memory!
  memory is THE limitation
Optimistic: pros & cons
pros
  exploits all the parallelism in the model,
  lookahead is less important,
  transparent to the end-user,
  interactive simulations possible,
  can be general-purpose.
cons
  very complex,
  needs lots of memory,
  large overheads (state saving, GVT, rollbacks)
Mixed/adaptive approaches
General framework that (automatically) switches to conservative or optimistic
Adaptive approaches may determine at runtime the amount of conservatism or optimism
[Figure: mixed approaches placed between conservative and optimistic; plot with labels messages, performance, optimistic, conservative]
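As one possible illustration of deciding the amount of optimism at runtime, the sketch below adjusts a time window beyond which an LP is not allowed to run ahead. It is not taken from these slides: the function name `adapt_window`, the thresholds and the halving/doubling rule are assumptions chosen for the example.

```python
def adapt_window(window, rollbacks, events, low=0.05, high=0.2,
                 min_window=1.0, max_window=1024.0):
    """Return the next optimism window (illustrative sketch).

    window    : how far ahead of the global time an LP may currently execute
    rollbacks : number of events rolled back in the last period
    events    : number of events executed in the last period
    A small window behaves conservatively, a large one optimistically.
    """
    ratio = rollbacks / events if events else 0.0
    if ratio > high:                       # too much wasted work: be more conservative
        return max(min_window, window / 2)
    if ratio < low:                        # hardly any rollback: be more optimistic
        return min(max_window, window * 2)
    return window                          # in between: keep the current setting


shrunk = adapt_window(16.0, rollbacks=30, events=100)
grown = adapt_window(16.0, rollbacks=1, events=100)
unchanged = adapt_window(16.0, rollbacks=10, events=100)
```

With these hypothetical thresholds, a rollback ratio of 30% halves the window, a ratio of 1% doubles it, and a ratio of 10% leaves it as it is.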
PRESENT: how to survive? And how to get money?
Parallel simulation today
Lots of algorithms have been proposed
  variations on conservative and optimistic
  adaptive approaches
Paradoxically, few end-users
  impossible to compete with sequential simulators in terms of user interface, generality, ease of use...
Ongoing research mainly focuses on
  ultra-large scale simulations of networks,
  tools and execution environments
  composability and interoperability issues
Ongoing projects
DOMAINS/GloMoSim
SSF
TeD/GTW
CSAM
DOMAINS/GloMoSim project
Design of Mobile Adaptive Networks, DARPA/DAAB07-97-C-D321
Provides a library for simulating millions of mobile nodes
Proves the efficiency of parallel simulation for scalability issues
Based on the PARSEC simulation language
Conservative or optimistic execution
Glomo objectives
Glomo libraries
SSF-Scalable Simulation Framework
DARPA/ITO (Next Generation Internet Program) and NSF/ANIR (Special Projects in Networking)
SSF proposes discrete event simulations of large complex systems, with serial and scalable parallel implementations
SSFNet is a collection of SSF-based models for simulating Internet protocols and networks
Based on YAWNS, a conservative kernel
SSFNet, modeling the Internet
TeD/GTW
TeD (Telecommunications Description Language) is a language for modeling telecommunication network elements and protocols (PNNI).
GTW is a general purpose parallel discrete event simulation executive using optimistic synchronization techniques.
The TeD compiler translates TeD models into C++ code which uses GTW for parallel simulation
Modeling with TeD
CSAM
CSAM: Conservative Simulator for ATM network Model
Simulation at the cell level
C++ programming style, predefined generic models of sources, switches, links
Test-bed for parallel simulations on high-performance clusters.
Test case: 78-switch ATM network
Distance-Vector Routing with dynamic link cost functions
Connection setup, admission control protocols
CSAM - Some results...
Routing protocols reconfiguration time
CSAM - visualization tool
FUTURE: the great challenges!
Ultra-large scale simulations
Wide-area federation-based simulations
WEB-based simulations
Ultra-large scale simulations
Millions of mobile nodes,
Thousands of multicast connections,
Full Internet simulation,
Ultra-large scale simulations require
  lots of memory! lots of CPU!
  new modeling techniques: reuse of model description, decoupling state from model description;
  advanced memory management schemes: shared events, application memory regulation.
« Out of core » simulations.
Federation-based simulations
The cost of developing models is increasing at a high rate.
Reuse and interoperability are key issues in the development phase of new models.
Need for a unified framework so that independent simulators can run together and achieve a given goal.
The DoD has proposed the High-Level Architecture framework for federation-based simulations
The HLA framework
The High Level Architecture calls for a federation of simulations to achieve interoperability and reuse of software.
Without HLA, simulations are mostly independent and interoperability is not easy.
HLA comprises
  10 rules for the federation and the federates' behavior
  an Object Model Template to describe the simulation objects
  an Interface Specification
Runtime Infrastructure (RTI) services
  Federation Management
  Declaration Management
  Object Management
  Ownership Management
  Time Management
  Data Distribution Management
[Figure: a Federation built on the RTI; labels: logical simulations; hardware, human-in-the-loop real-time simulators; display, statistics...; simulator; real-time players; tools]
Wide-area interactive simulations
[Figure: simulators connected through the INTERNET: human-in-the-loop flight simulator, battle field simulation, display, computer-based submarine simulator]
WEB-Based simulation
Users build models and submit them on the web (meta-computing)
Hides the complexity of parallel simulation techniques
Provides computing resources for the users
[Figure: computing resources behind the web: ASCI Red (1st rank, Top 500 list), Cplant, ?]
JTeD project, the Java-based TeD
Summary
Parallel simulation is a mature field
Applications, especially communication network models, are the centre of interest
The challenges are for very-large scale simulations and re-usability of models
Real-time interactive simulations are desirable on a wide-area interconnection
As-fast-as-possible simulations will likely remain « indoor » (cluster, SMP)
Requirements put on networking
In wide-area simulation, data distribution relies mainly on multicast and broadcast operations
Near real-time behaviors are desirable for interactive simulation
References
Parallel simulation
  K. M. Chandy and J. Misra, Distributed Simulation: A Case Study in Design and Verification of Distributed Programs, IEEE Trans. on Soft. Eng., 1979, pp. 440-452
  R. Fujimoto, Parallel Discrete Event Simulation, Comm. of the ACM, Vol. 33(10), Oct. 90, pp. 31-53
HLA
  http://hla.dmso.mil
Projects
  GloMoSim - http://pcl.cs.ucla.edu/projects/glomosim
  SSF - http://www.ssfnet.org/homePage.html
  TeD/GTW -
  CSAM - http://resam.univ-lyon1.fr/CSAM