
Post on 18-Jan-2018

Lecture 3:

Designing Parallel Programs

Methodological Design

Designing and Building Parallel Programs

by Ian Foster

www-unix.mcs.anl.gov/dbpp

Methodological Design

Partitioning, Communication, Agglomeration, Mapping

Methodological Design: Partitioning

The computation that is to be performed and the data operated on by this computation are decomposed into small tasks.

Practical issues such as the number of processors in the target computer are ignored, and attention is focused on recognizing opportunities for parallel execution.

[Figure: design stages so far: PROBLEM, Partitioning]

Partitioning

Functional Decomposition

Data Decomposition

Methodological Design: Single Program Multiple Data (SPMD) programming model

A fixed number of identical tasks are created at program startup. Each task executes the same program but operates on different data.
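A minimal sketch of the SPMD idea in Python. The names `spmd_task`, `run_spmd`, `rank`, and `num_tasks` are illustrative assumptions, not part of the lecture. The tasks are run one after another here only to keep the example self-contained; a real SPMD program would start them concurrently, for example as separate processes.

```python
def spmd_task(rank, num_tasks, data):
    """Same program for every task; `rank` selects which data it works on."""
    # Block distribution: task `rank` owns one contiguous slice of the data.
    chunk = (len(data) + num_tasks - 1) // num_tasks
    local = data[rank * chunk:(rank + 1) * chunk]
    return sum(x * x for x in local)  # local work: sum of squares


def run_spmd(num_tasks, data):
    # A fixed number of identical tasks is created at startup.
    # Assumption: executed sequentially here purely for illustration.
    return [spmd_task(rank, num_tasks, data) for rank in range(num_tasks)]


partial_sums = run_spmd(4, list(range(8)))
total = sum(partial_sums)
```

Every task runs the same function; only the slice of `data` it touches differs.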

Methodological Design: Communication

The communication required to coordinate task execution is determined, and appropriate communication structures and algorithms are defined.

[Figure: design stages so far: PROBLEM, Partitioning, Communication]

Communication

Latency (L): How long does it take to start sending a "message"? (in microseconds)

Bandwidth (B): What data rate can be sustained once the message is started? (in Mbytes/sec)

Transfer time = L + (1/B) * data size
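The formula can be tried out with a small helper. This is a sketch only: the function name and the link parameters (50 µs latency, 100 Mbytes/sec) are made-up values for illustration, and 1 Mbyte is assumed to mean 10^6 bytes.

```python
def transfer_time_us(latency_us, bandwidth_mbytes_per_s, size_bytes):
    """Transfer time = L + (1/B) * data size, returned in microseconds."""
    # Assumes 1 Mbyte = 10**6 bytes, so 1 Mbyte/s equals 1 byte per microsecond.
    return latency_us + size_bytes / bandwidth_mbytes_per_s


# Hypothetical link: 50 us latency, 100 Mbytes/s bandwidth.
small = transfer_time_us(50.0, 100.0, 100)        # latency-dominated
large = transfer_time_us(50.0, 100.0, 1_000_000)  # bandwidth-dominated
```

With these numbers the 100-byte message costs about 51 µs, almost all of it latency, while the 1-Mbyte message costs about 10,050 µs, almost all of it bandwidth. This is one reason a few large messages are usually cheaper than many small ones.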

Communication: Local Communication

$X_{i,j} = (4X_{i,j} + X_{i-1,j} + X_{i+1,j} + X_{i,j-1} + X_{i,j+1}) / 8$

[Figure: 5-point stencil. $X_{i,j}$ with its four neighbors $X_{i-1,j}$, $X_{i+1,j}$, $X_{i,j-1}$, $X_{i,j+1}$]
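A minimal sketch of one sweep of this update over a small grid. Two things are assumptions rather than something the slide states: boundary points are held fixed, and new values are written to a copy so that every point reads the old values of its neighbors.

```python
def stencil_sweep(x):
    """Apply one 5-point update to every interior point of a 2-D grid."""
    rows, cols = len(x), len(x[0])
    new = [row[:] for row in x]  # read old values, write new ones into a copy
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = (4 * x[i][j] + x[i - 1][j] + x[i + 1][j]
                         + x[i][j - 1] + x[i][j + 1]) / 8
    return new


grid = [[0.0, 0.0, 0.0],
        [0.0, 8.0, 0.0],
        [0.0, 0.0, 0.0]]
updated = stencil_sweep(grid)
```

Each point needs only its four nearest neighbors. If one task owns each point, every task therefore talks to a small, fixed set of other tasks, which is what makes the communication local.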

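Communication: Global Communication

$\sum_{i=0}^{n} X_i$

[Figure: tree summation of X0 ... X7. Pairs are summed first ($\sum_0^1$, $\sum_2^3$, $\sum_4^5$, $\sum_6^7$), then $\sum_0^3$ and $\sum_4^7$, then $\sum_0^7$]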
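The tree-shaped summation can be sketched as follows. The function name is an assumption, the input is assumed to be non-empty, and the additions within one level run sequentially here; in a parallel program each pair in a level could be added by a different task at the same time.

```python
def tree_sum(values):
    """Sum values pairwise, level by level; return (total, number of levels)."""
    level = list(values)
    steps = 0
    while len(level) > 1:
        # All pairs in one level are independent and could be added in parallel.
        paired = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an unpaired last element moves up unchanged
            paired.append(level[-1])
        level = paired
        steps += 1
    return level[0], steps


total, steps = tree_sum([1, 2, 3, 4, 5, 6, 7, 8])
```

For eight values the sum is complete after three levels, compared with seven additions in sequence.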

Communication

Static Communication: Communicating tasks do not change over time.

Dynamic Communication: Communicating tasks are determined by data computed at run time.

Communication: Dynamic Communication

B[i] = A[k[i]]

k:   1   2   1   3   4
A:  10  15  20  25  30
B:  10  15  10  20  25
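The table above can be reproduced with a short sketch. The function name `gather` is an assumption, and the 1-based indexing is inferred from the slide's values (k = 1 selects the first element of A).

```python
def gather(a, k):
    """Compute B[i] = A[k[i]] with 1-based indices, as in the slide's table."""
    # Which element of A each B[i] needs is known only once k is known, so in
    # a distributed setting the communication partners depend on run-time data.
    return [a[index - 1] for index in k]


A = [10, 15, 20, 25, 30]
k = [1, 2, 1, 3, 4]
B = gather(A, k)
```

If A were spread across tasks, the task computing B[i] could not know which task to contact until k[i] had been computed.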

Methodological Design: Agglomeration

The task and communication structures defined in the first two stages of a design are evaluated with respect to performance requirements and implementation costs.

If necessary, tasks are combined into larger tasks to improve performance or to reduce development costs.

[Figure: design stages so far: PROBLEM, Partitioning, Communication, Agglomeration]

Agglomeration

Goals are:

Increasing granularity (fine grain / coarse grain)

Decreasing communication/computation ratio
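A rough sketch of why larger tasks lower the communication/computation ratio. It assumes a 2-D grid with a 5-point stencil, split into square blocks of `block_side` x `block_side` points, and ignores the edges of the domain. The function name and the block sizes are illustrative assumptions.

```python
def comm_to_comp_ratio(block_side):
    """Values exchanged per point computed for a square block of grid points."""
    computed = block_side * block_side  # one update per point in the block
    exchanged = 4 * block_side          # one edge of values per neighbor (N, S, E, W)
    return exchanged / computed


fine = comm_to_comp_ratio(1)     # one point per task
coarse = comm_to_comp_ratio(16)  # 16 x 16 points per task
```

Under these assumptions the ratio is 4 / `block_side`: computation grows with the area of the block, communication only with its perimeter.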

Agglomeration

$X_{i,j} = (4X_{i,j} + X_{i-1,j} + X_{i+1,j} + X_{i,j-1} + X_{i,j+1}) / 8$

Methodological Design: Mapping

Each task is assigned to a processor in a manner that minimizes the total execution time.

Mapping can be specified statically or determined at runtime by load-balancing algorithms.

[Figure: all design stages: PROBLEM, Partitioning, Communication, Agglomeration, Mapping]

Mapping

If each task performs the same amount of computation and communicates:

Cyclic Mapping
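The slide names cyclic mapping without defining it. In the usual definition, task t is assigned to processor t mod P. The sketch below compares this with a block mapping; the function names and the example workload are assumptions made for illustration.

```python
def block_mapping(num_tasks, num_procs):
    """Assign contiguous runs of tasks to each processor."""
    per_proc = (num_tasks + num_procs - 1) // num_procs
    return [t // per_proc for t in range(num_tasks)]


def cyclic_mapping(num_tasks, num_procs):
    """Deal tasks out round-robin: task t goes to processor t mod P."""
    return [t % num_procs for t in range(num_tasks)]


def load_per_proc(mapping, cost, num_procs):
    """Total work on each processor for a given task-to-processor mapping."""
    load = [0] * num_procs
    for task, proc in enumerate(mapping):
        load[proc] += cost[task]
    return load


# Hypothetical workload whose cost grows with the task index.
cost = [1, 2, 3, 4, 5, 6, 7, 8]
block_load = load_per_proc(block_mapping(8, 2), cost, 2)
cyclic_load = load_per_proc(cyclic_mapping(8, 2), cost, 2)
```

For this uneven workload the block mapping gives loads of 10 and 26, while the cyclic mapping gives 16 and 20. The price is that neighboring tasks end up on different processors, which tends to increase communication.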

Mapping

The goal of mapping algorithms is to minimize total execution time.

Mapping algorithms attempt to satisfy the competing goals of:

maximizing processor utilization

minimizing communication costs

The general mapping problem is NP-complete.

Mapping

Two strategies to achieve this goal:

Place tasks that are able to execute concurrently on different processors

Place tasks that communicate frequently on the same processor

Mapping

Consider the topology

Methodological Design

Case Study: Atmosphere Model

Case Study: Atmosphere Model

Partitioning

Nx × Ny × Nz tasks

Case Study: Atmosphere Model

Communication

9-point stencil in horizontal dimension

3-point stencil in vertical dimension

Case Study: Atmosphere Model

Communication

Computations

1. Global operations

2. Physics calculations

[Equation not preserved in transcript] where [symbol not preserved] denotes the mass at grid point (i,j,k).

The total clear sky (TCS) at level k ≥ 1 is defined as [equation not preserved in transcript], where level 0 is the top of the atmosphere and cld_i is the cloud fraction at level i.

In total, the physics component of the model requires on the order of 30 communications per grid point per time step.

Case Study: Atmosphere Model

Communication

Each task obtains data from eight other tasks.

Case Study: Atmosphere Model

Communication

Each task obtains data from eight other tasks.

Only 4 communications are required per task.

Case Study: Atmosphere Model

Agglomeration

The vertical dimension requires communication for the stencil (2 messages) and also for various "physics" computations (30 messages).

These communications can be avoided by agglomerating tasks within each vertical column.

Therefore [number not preserved in transcript] tasks will be created.

Case Study: Atmosphere Model

Mapping

Methodological Design

Agglomeration

Increase granularity

Decrease communication/computation ratio
