Page 1: Conservative Simulation using Distributed-Shared Memory

Conservative Simulation using Distributed-Shared Memory

Teo, Y. M., Ng, Y. K. and Onggo, B. S. S.

Department of Computer Science, National University of Singapore

PADS 2002

Page 2: Conservative Simulation using Distributed-Shared Memory

Objectives

Improve the performance of SPaDES/Java by reducing overhead in:
- Synchronization of events
- Distributed communications

Study the memory requirements of parallel simulations.

Page 3: Conservative Simulation using Distributed-Shared Memory

Presentation Outline

- Parallel Simulation
- Null Message Protocol
- Performance Improvement
- Memory Requirement
- Conclusion

Page 4: Conservative Simulation using Distributed-Shared Memory

Parallel Simulation

Sequential simulations execute on a single thread on one processor.

Ideally, parallelizing a simulation should improve its wall-clock performance, since the workload is distributed.

However, causality must be maintained throughout a parallel simulation, which requires event synchronization protocols.

=> Synchronization adds to inter-process communication.

=> New bottleneck!

Page 5: Conservative Simulation using Distributed-Shared Memory

Null Message Protocol

- First designed by Chandy and Misra (1979).
- Prevents deadlock between logical processes (LPs).
- At the end of every simulation pass, LPi sends a null message to each of its neighbours, with timestamp = local virtual time of LPi.
- The timestamp T on a null message indicates that the source LP will not send any message to other LPs before T.
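The protocol described above can be sketched in a few lines of Python. This is a minimal illustration, not SPaDES/Java code: the class and method names (`LogicalProcess`, `connect`, `send_null_messages`, `safe_time`) are hypothetical, and the `safe_time` rule (an LP may only process events up to the smallest timestamp seen on its input channels) is the usual conservative rule, assumed here rather than taken from the slides.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Message:
    timestamp: float
    is_null: bool = False


@dataclass
class LogicalProcess:
    name: str
    clock: float = 0.0                                  # local virtual time
    neighbours: list = field(default_factory=list)      # LPs this LP sends to
    inbox: dict = field(default_factory=dict)           # sender name -> queued messages
    channel_clock: dict = field(default_factory=dict)   # sender name -> last timestamp seen

    def connect(self, other):
        """Open a channel from this LP to `other`."""
        self.neighbours.append(other)
        other.inbox[self.name] = deque()
        other.channel_clock[self.name] = 0.0

    def send_null_messages(self):
        """End of a simulation pass: promise every neighbour that nothing
        earlier than the current local virtual time will be sent."""
        for lp in self.neighbours:
            lp.receive(self.name, Message(self.clock, is_null=True))

    def receive(self, sender, msg):
        self.inbox[sender].append(msg)
        self.channel_clock[sender] = msg.timestamp

    def safe_time(self):
        """Assumed conservative rule: events up to the minimum channel
        clock can be processed without violating causality."""
        return min(self.channel_clock.values(), default=float("inf"))


a, b, c = LogicalProcess("A"), LogicalProcess("B"), LogicalProcess("C")
a.connect(b)
c.connect(b)
a.clock = 4.0
a.send_null_messages()   # B still waits on C, so its safe time stays at 0
c.clock = 7.0
c.send_null_messages()   # now B may advance to min(4, 7)
print(b.safe_time())
```

Because every LP sends a null message to every neighbour on every pass, the number of null messages grows with both the number of channels and the number of passes, which is the overhead the later slides target.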

Page 6: Conservative Simulation using Distributed-Shared Memory

Null Message Protocol

[Figure: an LP with Clock = 4 and an FEL holding timestamps 4 and 7 sends null messages with timestamp 4 to its neighbouring LPs.]

Page 7: Conservative Simulation using Distributed-Shared Memory

Performance Improvement

The Chandy-Misra-Bryant (CMB) protocol performs poorly because of high null message overhead:
- It transmits null messages on every simulation pass.
- NMR approaches 1 for nearly all of [0, T).

Optimizations incorporated:
- Carrier-null message scheme
- Flushing mechanism
- Demand-driven null message algorithm
- Remote communications using JavaSpace

Page 8: Conservative Simulation using Distributed-Shared Memory

Carrier-Null Message Algorithm

- Problem with cyclic topologies.
- Use the carrier-null message algorithm (Wood, Turner, 1996).
- Avoids transmission of redundant null messages in such cycles.

Page 9: Conservative Simulation using Distributed-Shared Memory

Performance Improvement

Demand-driven null messaging + flushing

[Figure: Logical Process (A) with Output Channel (A) and a Flusher; Logical Process (B) with an FEL and Request Channel (B), which carries a REQ to A. The channel and the FEL hold timestamped entries with values between 18 and 35.]
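The slide names the two mechanisms without spelling them out, so the following Python sketch rests on two stated assumptions: "demand-driven" means a null message is queued only after the neighbour has sent a request (REQ), and "flushing" means a newly queued null message supersedes any older null message still waiting in the output channel. The class `OutputChannel` and its methods are hypothetical names, not the SPaDES/Java API.

```python
from collections import deque


class OutputChannel:
    """Outgoing queue of (timestamp, is_null) pairs for one neighbour."""

    def __init__(self):
        self.queue = deque()
        self.null_requested = False

    def request_null(self):
        # The neighbour has run out of safe events and asks for a time guarantee.
        self.null_requested = True

    def put_event(self, timestamp):
        self.queue.append((timestamp, False))

    def put_null(self, timestamp):
        # Demand-driven (assumed): skip the null message unless it was requested.
        if not self.null_requested:
            return False
        # Flushing (assumed): older null messages still queued are obsolete.
        self.queue = deque(m for m in self.queue if not m[1])
        self.queue.append((timestamp, True))
        self.null_requested = False
        return True


channel = OutputChannel()
channel.put_null(18)      # no request pending, so nothing is queued
channel.put_event(20)
channel.request_null()
channel.put_null(25)      # queued on demand
channel.request_null()
channel.put_null(30)      # flushes the null message with timestamp 25
print(list(channel.queue))
```

Under these assumptions the channel never carries more than one pending null message, and none at all while the neighbour still has safe work to do.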

Page 10: Conservative Simulation using Distributed-Shared Memory

Performance Evaluation

Experiments were conducted on a PC cluster of 8 nodes running RedHat Linux version 7.0. Each node has a 400 MHz Pentium II processor and 256 MB of memory; the nodes are connected through a 100 Mbps switch.

Two benchmark programs:
- PHOLD system
- Linear Pipeline

Page 11: Conservative Simulation using Distributed-Shared Memory

PHOLD (3x3, m)

[Figure: nine nodes arranged in a 3 x 3 grid.]

Closed system

Page 12: Conservative Simulation using Distributed-Shared Memory

Linear Pipeline (4, p)

[Figure: a customer population feeding four service centers in series, followed by Depart.]

Open system

Page 13: Conservative Simulation using Distributed-Shared Memory

PHOLD (n x n, m)

[Chart: NMR (y-axis, 0.2 to 1) against problem size n x n (x-axis: 4 x 4, 8 x 8, 16 x 16). Series: CMB, Carrier-null, Flushing and Demand-driven, each for m = 1, 8 and 16. The curve groups are annotated CMB, + Carrier-Null, + Flushing and + Demand-driven null messaging.]

Page 14: Conservative Simulation using Distributed-Shared Memory

Linear Pipeline (n, p)

[Chart: NMR (y-axis, 0.4 to 1) against problem size n (x-axis: 4, 8, 12, 16). Series: CMB / Carrier-null, Flushing and Demand-driven, each for p = 0.2, 0.4, 0.6 and 0.8. The curve groups are annotated CMB + Carrier-Null, + Flushing and + Demand-driven null messaging.]

Page 15: Conservative Simulation using Distributed-Shared Memory

Performance Summary

Percentage reduction in NMR:

PHOLD system
- CMB + carrier-null: 30%
- Flushing incorporated: 42%
- Demand-driven null messages: 55%

Linear Pipeline
- CMB + carrier-null: 0%
- Flushing incorporated: 23%
- Demand-driven null messages: 35%
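The slides do not define NMR or the reduction formula, so the short Python sketch below states its assumptions explicitly: NMR is taken to be the null message ratio, that is, null messages divided by all messages sent (consistent with the charts, where it never exceeds 1), and the percentage reduction is taken relative to a baseline NMR. The function names and the message counts in the example are illustrative, not measured results.

```python
def nmr(null_messages, event_messages):
    """Assumed definition: share of all transmitted messages that are null."""
    total = null_messages + event_messages
    return null_messages / total if total else 0.0


def percentage_reduction(baseline_nmr, improved_nmr):
    """Assumed definition: relative drop in NMR against the baseline, in percent."""
    return 100.0 * (baseline_nmr - improved_nmr) / baseline_nmr


# Hypothetical counts, chosen only to show the arithmetic.
baseline = nmr(null_messages=900, event_messages=100)
improved = nmr(null_messages=450, event_messages=550)
print(baseline, improved, percentage_reduction(baseline, improved))
```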

Page 16: Conservative Simulation using Distributed-Shared Memory

Distributed Communications

Originally, SPaDES/Java used the RMI library to transmit messages between remote LPs, but the serialization phase presents a bottleneck.

Previous performance optimization effort: message deflation.

The only way to overcome remote communication overhead is to send fewer messages. How?

Target the null messages.

Page 17: Conservative Simulation using Distributed-Shared Memory

JavaSpaces

- A special Java-Jini service developed by Sun Microsystems, Inc., built on top of Java’s RMI and mimicking a tuple space.
- Abstract platform for developing complex distributed applications.
- Distributed data persistence.
- Holds objects, known as entries, with variable attribute types.
- Key concept: matching of attribute types/values.

Page 18: Conservative Simulation using Distributed-Shared Memory

JavaSpaces

[Figure: two clients and a notifier interacting with the space through write, read, take and notify.]

4 generic operations: write, read, take and notify.
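To make the four operations and the matching idea concrete, here is a toy in-memory tuple space in Python. It is only an illustration under stated assumptions: it is not the JavaSpaces API (whose read and take can block and whose notify delivers remote events), the names `Entry`, `ToySpace` and `matches` are hypothetical, and a `None` field in a template is assumed to act as a wildcard.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Entry:
    kind: Optional[str] = None
    sender: Optional[int] = None
    receiver: Optional[int] = None
    timestamp: Optional[float] = None


def matches(template, entry):
    """A template matches when every non-None field equals the entry's field."""
    return all(
        getattr(template, f.name) is None
        or getattr(template, f.name) == getattr(entry, f.name)
        for f in fields(template)
    )


class ToySpace:
    def __init__(self):
        self.entries = []
        self.listeners = []   # (template, callback) pairs

    def write(self, entry):
        """Store an entry and call every listener whose template matches it."""
        self.entries.append(entry)
        for template, callback in self.listeners:
            if matches(template, entry):
                callback(entry)

    def read(self, template):
        """Return the first matching entry without removing it (None if absent)."""
        return next((e for e in self.entries if matches(template, e)), None)

    def take(self, template):
        """Return the first matching entry and remove it from the space."""
        entry = self.read(template)
        if entry is not None:
            self.entries.remove(entry)
        return entry

    def notify(self, template, callback):
        """Register interest in future writes that match the template."""
        self.listeners.append((template, callback))


space = ToySpace()
space.notify(Entry(kind="NullMsg"), lambda e: print("null message from LP", e.sender))
space.write(Entry(kind="SProcess", sender=2, receiver=1, timestamp=5.0))
space.write(Entry(kind="NullMsg", sender=2, timestamp=7.0))
print(space.take(Entry(receiver=1)))
```

An LP that wants only its own traffic would take with a template whose `receiver` field is set to its own identifier, leaving every other entry in the space untouched.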

Page 19: Conservative Simulation using Distributed-Shared Memory

Distributed Communications

- Replace the RMI communication module in SPaDES/Java with one running on a single JavaSpace.
- Use a FrontEndSpace, which permits crash recovery of entries in the space.
- Processes and null messages sent between remote hosts go through the FrontEndSpace as space entries.

Page 20: Conservative Simulation using Distributed-Shared Memory

Space Communications: Processes

[Figure: LP1 and LP2 communicating through the space at Time = 0 and Time = t > 0. Entries shown: SProcess (receiver = 1), SProcess (sender = 2, receiver = 1) and SProcess (receiver = 2).]

Page 21: Conservative Simulation using Distributed-Shared Memory

Space Communications: Null Messages

[Figure: LP1, LP2, LP3 and LP4 communicating through the space. Entries shown: NullMsg (sender = 2) and two Req entries (sender = 2).]

Page 22: Conservative Simulation using Distributed-Shared Memory

Performance Evaluation: PHOLD (n x n, m)

[Chart: NMR (y-axis, 0.2 to 0.55) against problem size n x n (x-axis: 4 x 4, 8 x 8, 16 x 16). Series: RMI/JavaSpace on 1 processor, RMI on 4 and 8 processors, and JavaSpace on 4 and 8 processors, each for m = 1, 8 and 16. The curve groups are annotated RMI, JavaSpace (4 procs) and JavaSpace (8 procs).]

Page 23: Conservative Simulation using Distributed-Shared Memory

Overall Performance Evaluation: PHOLD (n x n, m)

[Chart: NMR (y-axis, 0.2 to 1) against problem size n x n (x-axis: 4 x 4, 8 x 8, 16 x 16). Series: CMB, Carrier-null, Flushing, Demand-driven, JavaSpace [4 procs] and JavaSpace [8 procs], each for m = 1, 8 and 16. The curve groups are annotated CMB, + Carrier-Null, + Flushing, + Demand-driven null messaging, JavaSpace (4 procs) and JavaSpace (8 procs).]

Page 24: Conservative Simulation using Distributed-Shared Memory

Performance Summary

Percentage reduction in NMR:
- CMB + carrier-null: 30%
- Flushing incorporated: 42%
- Demand-driven null messages: 55%
- JavaSpace (4 processors): 63%
- JavaSpace (8 processors): 74%

Page 25: Conservative Simulation using Distributed-Shared Memory

Memory Requirement

M_{prob} = \sum_{i=1}^{n} \mathrm{MaxQueueSize}(LP_i)

M_{ord} = \sum_{i=1}^{n} \mathrm{MaxFELSize}(LP_i)

M_{sync} = \sum_{i=1}^{n} \mathrm{MaxNullMsgBufferSize}(LP_i)
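One way to read these three sums is that each LP records the largest size its queue, FEL and null message buffer ever reach, and the totals add those per-LP maxima. The Python sketch below follows that reading; `LPStats`, `observe` and `memory_requirement` are hypothetical names and the sampled sizes are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LPStats:
    max_queue_size: int = 0
    max_fel_size: int = 0
    max_null_buffer_size: int = 0

    def observe(self, queue_size, fel_size, null_buffer_size):
        """Record one sample of the LP's current structure sizes."""
        self.max_queue_size = max(self.max_queue_size, queue_size)
        self.max_fel_size = max(self.max_fel_size, fel_size)
        self.max_null_buffer_size = max(self.max_null_buffer_size, null_buffer_size)


def memory_requirement(all_stats):
    """Return (Mprob, Mord, Msync) as sums of the per-LP maxima."""
    m_prob = sum(s.max_queue_size for s in all_stats)
    m_ord = sum(s.max_fel_size for s in all_stats)
    m_sync = sum(s.max_null_buffer_size for s in all_stats)
    return m_prob, m_ord, m_sync


lp1, lp2 = LPStats(), LPStats()
lp1.observe(queue_size=3, fel_size=2, null_buffer_size=5)
lp1.observe(queue_size=1, fel_size=4, null_buffer_size=2)
lp2.observe(queue_size=6, fel_size=1, null_buffer_size=1)
print(memory_requirement([lp1, lp2]))
```

Summing maxima gives an upper bound on simultaneous usage, since the LPs need not reach their peaks at the same moment.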

Page 26: Conservative Simulation using Distributed-Shared Memory

Memory Requirement

Space usage for PIPELINE (16, p) and PHOLD (16x16, m):

| Space usage        | p = 0.2 | p = 0.4 | p = 0.6 | p = 0.8 | m = 1 | m = 8 | m = 16 |
| Mprob              | 98      | 192     | 320     | 740     | 256   | 2048  | 4096   |
| Mord               | 50      | 52      | 54      | 56      | -     | -     | -      |
| Msync (RMI)        | 331     | 341     | 348     | 352     | 665   | 651   | 638    |
| Msync (JavaSpaces) | 305     | 308     | 311     | 312     | 347   | 332   | 317    |
| M (RMI)            | 479     | 585     | 722     | 1148    | 921   | 2699  | 4734   |
| M (JavaSpaces)     | 453     | 552     | 685     | 1108    | 603   | 2380  | 4413   |
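The M rows in this table appear to be the sum Mprob + Mord + Msync for each column. The Python check below transcribes the table values and recomputes the totals; the only assumption is that the missing Mord cells for PHOLD count as 0, which is what the printed totals imply.

```python
# Columns: pipeline p = 0.2, 0.4, 0.6, 0.8, then PHOLD m = 1, 8, 16.
M_PROB = [98, 192, 320, 740, 256, 2048, 4096]
M_ORD = [50, 52, 54, 56, 0, 0, 0]   # assumption: blank PHOLD cells treated as 0
M_SYNC = {
    "RMI": [331, 341, 348, 352, 665, 651, 638],
    "JavaSpaces": [305, 308, 311, 312, 347, 332, 317],
}


def total_memory(transport):
    """Recompute the M row for one transport as Mprob + Mord + Msync."""
    return [p + o + s for p, o, s in zip(M_PROB, M_ORD, M_SYNC[transport])]


print(total_memory("RMI"))
print(total_memory("JavaSpaces"))
```

Since Mprob and Mord are the same for both transports, the entire difference between M (RMI) and M (JavaSpaces) comes from the Msync row.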

Page 27: Conservative Simulation using Distributed-Shared Memory

Achievements & Conclusion

Enhanced the performance of SPaDES/Java through various synchronization protocols, achieving an excellent NMR of < 30%.

Implemented a brand new discrete-event simulation library based on the concept of shared memory in a JavaSpace.

Implemented a TSA into SPaDES/Java that can be used as a bench for memory usage studies in parallel simulations.

Page 28: Conservative Simulation using Distributed-Shared Memory

Acknowledgments

- Port of Singapore Authority (PSA)
- Ministry of Education, Singapore
- Constructive feedback from referees

Page 29: Conservative Simulation using Distributed-Shared Memory

References

SPaDES/Java homepage: http://www.comp.nus.edu.sg/~pasta/spades-java/spadesJava.html

Current project webpage: http://www.comp.nus.edu.sg/~ngyewkwo/HYP.html

MSG homepage: http://www.comp.nus.edu.sg/~rpsim/MSG