pram (parallel random access machine)

22
PRAM PRAM (Parallel Random Access (Parallel Random Access Machine) Machine) David Rodriguez-Velazquez Spring -09 CS-6260 Dr. Elise de Doncker

Upload: urban

Post on 13-Jan-2016

62 views

Category:

Documents


0 download

DESCRIPTION

PRAM (Parallel Random Access Machine). David Rodriguez-Velazquez Spring -09 CS-6260 Dr. Elise de Doncker. Overview. What is a machine model? Why do we need a model? RAM PRAM Steps in computation Write conflict Examples. A parallel Machine Model. What is a machine model? - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: PRAM (Parallel Random Access Machine)

PRAMPRAM(Parallel Random Access (Parallel Random Access Machine)Machine)

David Rodriguez-VelazquezSpring -09 CS-6260

Dr. Elise de Doncker

Page 2: PRAM (Parallel Random Access Machine)

OverviewOverviewWhat is a machine model?Why do we need a model?RAMPRAM

◦Steps in computation◦Write conflict◦Examples

Page 3: PRAM (Parallel Random Access Machine)

A parallel Machine ModelA parallel Machine ModelWhat is a machine model?

◦Describes a “machine”◦Puts a value to the operations on the

machineWhy do we need a model?

◦Makes it easy to reason algorithms◦Achieve complexity bounds◦Analyzes maximum parallelism

Page 4: PRAM (Parallel Random Access Machine)

RAM (Random Access RAM (Random Access Machine)Machine)Unbounded number of local memory

cellsEach memory cell can hold an integer

of unbounded sizeInstruction set included –simple

operations, data operations, comparator, branches

All operations take unit timeTime complexity = number of

instructions executedSpace complexity = number of

memory cells used

Page 5: PRAM (Parallel Random Access Machine)

PRAMPRAM (Parallel Random Access (Parallel Random Access Machine)Machine)Definition:

◦Is an abstract machine for designing the algorithms applicable to parallel computers

◦M’ is a system <M, X, Y, A> of infinitely many RAM’s M1, M2, …, each Mi is called a processor

of M’. All the processors are assumed to be identical. Each has ability to recognize its own index i

Input cells X(1), X(2),…, Output cells Y(1), Y(2),…, Shared memory cells A(1), A(2),…,

Page 6: PRAM (Parallel Random Access Machine)

PRAM (Parallel RAM)PRAM (Parallel RAM)Unbounded collection of RAM

processors P0, P1, …,Processors don’t have tapeEach processor has unbounded

registersUnbounded collection of share

memory cellsAll processors can access all

memory cells in unit timeAll communication via shared

memory

Page 7: PRAM (Parallel Random Access Machine)

PRAM (step in a PRAM (step in a computation)computation)

Consist of 5 phases (carried in parallel by all the processors) each processor:◦Reads a value from one of the cells x(1),…, x(N)◦Reads one of the shared memory cells A(1),

A(2),…◦Performs some internal computation◦May write into one of the output cells y(1), y(2),

…◦May write into one of the shared memory cells

A(1), A(2),…e.g. for all i, do A[i] = A[i-1] + 1;

Read A[i-1] , compute add 1, write A[i]happened synchronously

Page 8: PRAM (Parallel Random Access Machine)

PRAM (Parallel RAM) PRAM (Parallel RAM) Some subset of the processors

can remain idle

P0 P1 P2 PN

Shared Memory Cells

Two or more processors may read simultaneously from the same cell

A write conflict occurs when two or more processors try to write simultaneously into the same cell

Page 9: PRAM (Parallel Random Access Machine)

Share Memory Access Share Memory Access ConflictsConflictsPRAM are classified based on their

Read/Write abilities (realistic and useful)◦Exclusive Read(ER) : all processors can

simultaneously read from distinct memory locations

◦Exclusive Write(EW) : all processors can simultaneously write to distinct memory locations

◦Concurrent Read(CR) : all processors can simultaneously read from any memory location

◦Concurrent Write(CW) : all processors can write to any memory location

◦EREW, CREW, CRCW

Page 10: PRAM (Parallel Random Access Machine)

Concurrent Write (CW)Concurrent Write (CW)What value gets written finally?

◦Priority CW: processors have priority based on which value is decided, the highest priority is allowed to complete WRITE

◦Common CW: all processors are allowed to complete WRITE iff all the values to be written are equal.

◦Arbitrary/Random CW: one randomly chosen processor is allowed to complete WRITE

Page 11: PRAM (Parallel Random Access Machine)

Strengths of PRAMStrengths of PRAMPRAM is attractive and important model

for designers of parallel algorithms Why?◦ It is natural: the number of operations

executed per one cycle on p processors is at most p

◦ It is strong: any processor can read/write any shared memory cell in unit time

◦ It is simple: it abstracts from any communication or synchronization overhead, which makes the complexity and correctness of PRAM algorithm easier

◦ It can be used as a benchmark: If a problem has no feasible/efficient solution on PRAM, it has no feasible/efficient solution for any parallel machine

Page 12: PRAM (Parallel Random Access Machine)

Computational powerComputational powerModel A is computationally stronger

than model B (A>=B) iff any algorithm written for B will run unchanged on A in the same parallel time and same basic properties.

Priority >= Arbitrary >= Common >=CREW >= EREW

Most Leastpowerful powerful

Least Mostrealistic realistic

Page 13: PRAM (Parallel Random Access Machine)

An initial exampleAn initial exampleHow do you add N numbers

residing in memory location M[0, 1, …, N]

Serial Algorithm = O(N)

PRAM Algorithm using N processors P0, P1, P2, …, PN ?

Page 14: PRAM (Parallel Random Access Machine)

PRAM Algorithm PRAM Algorithm (Parallel (Parallel Addition)Addition)

P0 + P1 + P2 + P3 +

P2 +P0 +

P0 + Step 3

Step 2

Step 1

Page 15: PRAM (Parallel Random Access Machine)

PRAM Algorithm PRAM Algorithm (Parallel (Parallel Addition)Addition)Log (n) steps = time neededn / 2 processors neededSpeed-up = n / log(n)Efficiency = 1 / log(n)Applicable for other operations

◦+, *, <, >, etc.

Page 16: PRAM (Parallel Random Access Machine)

Example 2Example 2p processor PRAM with n numbers

(p<=n)Does x exist within the n numbers?P0 contains x and finally P0 has to

know Algorithm

◦Inform everyone what x is◦Every processor checks [n/p] numbers

and sets a flag◦Check if any of the flags are set to 1

Page 17: PRAM (Parallel Random Access Machine)

Example 2Example 2

EREW CREWCRCW

(common)

Inform everyone what x is

log(p) 1 1

Every processor checks [n/p] numbers and sets a flag

n/p n/p n/p

Check if any of the flag are set to 1

log(p) log(p) 1

Page 18: PRAM (Parallel Random Access Machine)

Some variants of PRAMSome variants of PRAMBounded number of shared memory

cells. Small memory PRAM (input data set exceeds capacity of the share memory i/o values can be distributed evenly among the processors)

Bounded number of processor Small PRAM. If # of threads of execution is higher, processors may interleave several threads.

Bounded size of a machine word. Word size of PRAM

Handling access conflicts. Constraints on simultaneous access to share memory cells

Page 19: PRAM (Parallel Random Access Machine)

LemmaLemmaAssume p’<p. Any problem that can be

solved for a p processor PRAM in t steps can be solved in a p’ processor PRAM in t’ = O(tp/p’) steps (assuming same size of shared memory)

Proof: Partition p is simulated processors into p’ groups

of size p/p’ each Associate each of the p’ simulating processors

with one of these groups Each of the simulating processors simulates one

step of its group of processors by:◦ executing all their READ and local computation

substeps first◦ executing their WRITE substeps then

Page 20: PRAM (Parallel Random Access Machine)

LemmaLemma Assume m’<m. Any problem that can be solved for a p processor

and m-cell PRAM in t steps can be solved on a max(p,m’)-processors m’-cell PRAM in O(tm/m’) steps

Proof: Partition m simulated shared memory cells into m’ continuous segments Si of size

m/m’ each Each simulating processor P’i 1<=i<=p, will simulate processor Pi of the original

PRAM Each simulating processor P’i 1<=i<=m’, stores the initial contents of Si into its local

memory and will use M’[i] as an auxiliary memory cell for simulation of accesses to cell of Si

Simulation of one original READ operationEach P’i i=1,…,max(p,m’) repeats for k=1,…,m/m’1. write the value of the k-th cell of Si into M’[i] i=1…,m’,2. read the value which the simulated processor Pi i=1,…,,p, would

read in this simulated substep, if it appeared in the shared memory

The local computation substep of Pi i=1..,p is simulated in one step by P’i

Simulation of one original WRITE operation is analogous to that of READ

Page 21: PRAM (Parallel Random Access Machine)

ConclusionsConclusionsWe need some model to reason,

compare, analyze and design algorithmsPRAM is simple and easy to understandRich set of theoretical resultsOver-simplistic and often not realisticThe programs written on these

machines are, in general, of type MIMD. Certain special cases such as SIMD may also be handled in such a framework

Page 22: PRAM (Parallel Random Access Machine)

QuestionQuestionWhy is PRAM attractive and important

model for designers of parallel algorithms ?◦ It is natural: the number of operations

executed per one cycle on p processors is at most p

◦ It is strong: any processor can read/write any shared memory cell in unit time

◦ It is simple: it abstracts from any communication or synchronization overhead, which makes the complexity and correctness of PRAM algorithm easier

◦ It can be used as a benchmark: If a problem has no feasible/efficient solution on PRAM, it has no feasible/efficient solution for any parallel machine