Message Passing Interface (MPI) 2
Amit Majumdar, Scientific Computing Applications Group, San Diego Supercomputer Center
Tim Kaiser (now at Colorado School of Mines)


Page 1: Message Passing Interface (MPI) 2

Amit Majumdar, Scientific Computing Applications Group, San Diego Supercomputer Center
Tim Kaiser (now at Colorado School of Mines)

Page 2: MPI 2 Lecture Overview

• Review
  – 6 Basic MPI Calls
  – Data Types
  – Wildcards
  – Using Status
• Probing
• Asynchronous Communication

Page 3: Review: Six Basic MPI Calls

• MPI_Init – Initialize MPI
• MPI_Comm_rank – Get the processor rank
• MPI_Comm_size – Get the number of processors
• MPI_Send – Send data to another processor
• MPI_Recv – Get data from another processor
• MPI_Finalize – Finish MPI

Page 4: Simple Send and Receive Program

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int myid, numprocs, tag, source, destination, count, buffer;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    tag = 1234;
    source = 0;
    destination = 1;
    count = 1;
    if (myid == source) {
        buffer = 5678;
        MPI_Send(&buffer, count, MPI_INT, destination, tag, MPI_COMM_WORLD);
        printf("processor %d sent %d\n", myid, buffer);
    }
    if (myid == destination) {
        MPI_Recv(&buffer, count, MPI_INT, source, tag, MPI_COMM_WORLD, &status);
        printf("processor %d got %d\n", myid, buffer);
    }
    MPI_Finalize();
    return 0;
}

Page 5: MPI Data Types

• MPI has many different predefined data types
  – All your favorite C data types are included
• MPI data types can be used in any communication operation

Page 6: MPI Predefined Data Types in C

C MPI Types           C Types
MPI_CHAR              signed char
MPI_SHORT             signed short int
MPI_INT               signed int
MPI_LONG              signed long int
MPI_UNSIGNED_CHAR     unsigned char
MPI_UNSIGNED_SHORT    unsigned short int
MPI_UNSIGNED          unsigned int
MPI_UNSIGNED_LONG     unsigned long int
MPI_FLOAT             float
MPI_DOUBLE            double
MPI_LONG_DOUBLE       long double
MPI_BYTE              -
MPI_PACKED            -

Page 7: MPI Predefined Data Types in Fortran

Fortran MPI Types       Fortran Types
MPI_REAL                REAL
MPI_INTEGER             INTEGER
MPI_CHARACTER           CHARACTER
MPI_COMPLEX             COMPLEX
MPI_DOUBLE_PRECISION    DOUBLE PRECISION
MPI_LOGICAL             LOGICAL
MPI_PACKED              -

Page 8: Wildcards

• Wildcards enable the programmer to avoid specifying a tag and/or source.
• MPI_ANY_SOURCE and MPI_ANY_TAG are wildcards.
• The status structure is used to retrieve the actual wildcard values.
• Example:

  MPI_Status status;
  int buffer[5];
  int error;
  error = MPI_Recv(&buffer[0], 5, MPI_INT, MPI_ANY_SOURCE,
                   MPI_ANY_TAG, MPI_COMM_WORLD, &status);

Page 9: Status

• The status parameter returns additional information from some MPI routines
  – additional error status information
  – the actual values of wildcard parameters
• C declaration: a predefined struct
  MPI_Status status;
• Fortran declaration: an integer array is used instead
  INTEGER STATUS(MPI_STATUS_SIZE)

Page 10: Accessing Status Information

• The tag of a received message
  – C: status.MPI_TAG
  – Fortran: STATUS(MPI_TAG)
• The source of a received message
  – C: status.MPI_SOURCE
  – Fortran: STATUS(MPI_SOURCE)
• The error code of the MPI call
  – C: status.MPI_ERROR
  – Fortran: STATUS(MPI_ERROR)
• Other uses...
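The C snippet below is a minimal sketch (not from the original slides) showing these fields in action after a wildcard receive; the function name and buffer size are illustrative, and MPI_Get_count is used to recover the message length:

  #include <stdio.h>
  #include <mpi.h>

  /* Sketch: receive with wildcards, then inspect the status object.
     Assumes MPI has already been initialized. */
  void receive_and_inspect(void)
  {
      int buffer[100], count;
      MPI_Status status;

      MPI_Recv(buffer, 100, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
               MPI_COMM_WORLD, &status);

      /* The wildcards are resolved in the status fields */
      printf("message came from rank %d with tag %d\n",
             status.MPI_SOURCE, status.MPI_TAG);

      /* MPI_Get_count reports how many MPI_INT elements actually arrived */
      MPI_Get_count(&status, MPI_INT, &count);
      printf("it contained %d integers\n", count);
  }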

Page 11: MPI_Probe

• MPI_Probe allows incoming messages to be checked without actually receiving them
  – The user can then decide how to receive the data
  – Useful when different action needs to be taken depending on the "who, what, and how much" information of the message

Page 12: MPI_Probe

• C
  MPI_Probe(source, tag, COMM, &status)
• Fortran
  MPI_Probe(source, tag, COMM, status, ierror)
• Parameters
  – source: source rank, or MPI_ANY_SOURCE
  – tag: tag value, or MPI_ANY_TAG
  – COMM: communicator
  – status: status object

Page 13: MPI_Probe Example

#include <stdio.h>
#include <mpi.h>
#include <math.h>

/* Program shows how to use probe and get_count to find the
   size of an incoming message */
int main(int argc, char *argv[])
{
    int myid, numprocs;
    MPI_Status status;
    int i, mytag, ierr, icount;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    /* print out my rank and this run's PE size */
    printf("Hello from %d\n", myid);
    printf("Numprocs is %d\n", numprocs);

Page 14: MPI_Probe Example (cont.)

    mytag = 123;
    i = 0;
    icount = 0;
    if (myid == 0) {
        i = 100;
        icount = 1;
        ierr = MPI_Send(&i, icount, MPI_INT, 1, mytag, MPI_COMM_WORLD);
    }
    if (myid == 1) {
        ierr = MPI_Probe(0, mytag, MPI_COMM_WORLD, &status);
        ierr = MPI_Get_count(&status, MPI_INT, &icount);
        printf("getting %d\n", icount);
        ierr = MPI_Recv(&i, icount, MPI_INT, 0, mytag, MPI_COMM_WORLD, &status);
        printf("i= %d\n", i);
    }
    MPI_Finalize();
}

Page 15: MPI_Barrier

• Blocks the caller until all members of the communicator have called it.
• Used as a synchronization tool.
• C
  MPI_Barrier(COMM)
• Fortran
  call MPI_Barrier(COMM, ierror)
• Parameter
  – COMM: communicator (often MPI_COMM_WORLD)
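As a minimal illustration (this program is not from the original slides), every rank prints, synchronizes at the barrier, then prints again; no "after" line can appear until all ranks have reached the barrier:

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char *argv[])
  {
      int myid;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &myid);

      printf("rank %d: before the barrier\n", myid);

      /* No rank passes this point until every rank has reached it */
      MPI_Barrier(MPI_COMM_WORLD);

      printf("rank %d: after the barrier\n", myid);
      MPI_Finalize();
      return 0;
  }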

Page 16: Asynchronous (non-blocking) Communication

• Asynchronous send: the send call returns immediately; the send actually occurs later
• Asynchronous receive: the receive call returns immediately; when the received data is needed, call a wait subroutine
• Asynchronous communication is used to overlap communication with computation
• Can help prevent deadlock (but beware!)

Page 17: Asynchronous Send with MPI_Isend

• C
  MPI_Request request;
  MPI_Isend(&buffer, count, datatype, dest, tag, COMM, &request)
• Fortran
  INTEGER request
  MPI_Isend(buffer, count, datatype, dest, tag, COMM, request, ierror)
• request is a new output parameter
• Don't change the data until the communication is complete

Page 18: Asynchronous Receive with MPI_Irecv

• C
  MPI_Request request;
  MPI_Irecv(&buffer, count, datatype, source, tag, COMM, &request)
• Fortran
  INTEGER request
  MPI_Irecv(buffer, count, datatype, source, tag, COMM, request, ierror)
• Parameter changes
  – request: communication request
  – the status parameter is missing
• Don't use the data until the communication is complete

Page 19: MPI_Wait Used to Complete Communication

• The request from Isend or Irecv is input
  – The completion of a send operation indicates that the sender is now free to update the data in the send buffer
  – The completion of a receive operation indicates that the receive buffer contains the received message
• MPI_Wait blocks until the message specified by "request" completes

Page 20: MPI_Wait Usage

• C
  MPI_Request request;
  MPI_Status status;
  MPI_Wait(&request, &status)
• Fortran
  INTEGER request
  INTEGER status(MPI_STATUS_SIZE)
  MPI_Wait(request, status, ierror)
• MPI_Wait blocks until the message specified by "request" completes

Page 21: MPI_Test

• Similar to MPI_Wait, but does not block
• The value of flag signifies whether a message has been delivered
• C
  int flag;
  MPI_Test(&request, &flag, &status)
• Fortran
  LOGICAL flag
  MPI_Test(request, flag, status, ierror)

Page 22: Non-Blocking Send Example

      call MPI_Isend(buffer, count, datatype, dest, tag, comm, request, ierr)
10    continue
      ! Do other work ...
      call MPI_Test(request, flag, status, ierr)
      if (.not. flag) goto 10
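For comparison, here is a C sketch of the same poll-while-working loop; do_other_work() is a hypothetical placeholder for the application's computation, not part of the original slides:

  #include <mpi.h>

  /* Hypothetical placeholder for useful computation done while waiting */
  static void do_other_work(void) { /* ... */ }

  void send_and_overlap(int *buffer, int count, int dest, int tag)
  {
      MPI_Request request;
      MPI_Status status;
      int flag = 0;

      MPI_Isend(buffer, count, MPI_INT, dest, tag, MPI_COMM_WORLD, &request);
      while (!flag) {
          do_other_work();                     /* overlap computation */
          MPI_Test(&request, &flag, &status);  /* poll for completion */
      }
      /* Send is complete; buffer may now be reused safely */
  }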

Page 23: Asynchronous Send and Receive

• Write a parallel program to send and receive data using MPI_Isend and MPI_Irecv
  – Initialize MPI
  – Have processor 0 send an integer to processor 1
  – Have processor 1 receive an integer from processor 0
  – Both processors check on message completion
  – Quit MPI

Page 24:

#include <stdio.h>#include "mpi.h"#include <math.h> /************************************************************This is a simple isend/ireceive program in MPI************************************************************/

int main(argc,argv)int argc;char *argv[];{ int myid, numprocs; int tag,source,destination,count; int buffer; MPI_Status status; MPI_Request request;

Page 25:

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    tag = 1234;
    source = 0;
    destination = 1;
    count = 1;
    request = MPI_REQUEST_NULL;

Page 26:

    if (myid == source) {
        buffer = 5678;
        MPI_Isend(&buffer, count, MPI_INT, destination, tag,
                  MPI_COMM_WORLD, &request);
    }
    if (myid == destination) {
        MPI_Irecv(&buffer, count, MPI_INT, source, tag,
                  MPI_COMM_WORLD, &request);
    }
    MPI_Wait(&request, &status);
    if (myid == source) {
        printf("processor %d sent %d\n", myid, buffer);
    }
    if (myid == destination) {
        printf("processor %d got %d\n", myid, buffer);
    }
    MPI_Finalize();
}

Page 27: Point to Point Communications in MPI

• Basic operations of Point to Point (PtoP) communication and issues of deadlock
• Several steps are involved in PtoP communication
• Sending process
  – Data is copied to the user buffer by the user
  – User calls one of the MPI send routines
  – System copies the data from the user buffer to the system buffer
  – System sends the data from the system buffer to the destination processor

Page 28: Point to Point Communications in MPI (cont.)

• Receiving process
  – User calls one of the MPI receive routines
  – System receives the data from the source process and copies it to the system buffer
  – System copies the data from the system buffer to the user buffer
  – User uses the data in the user buffer

Page 29:

[Diagram: message flow through system buffers]

Process 0 (user mode): call the send routine with sendbuf.
Process 0 (kernel mode): copy the data from sendbuf to sysbuf; sendbuf can now be reused; send the data from sysbuf to the destination.
Process 1 (kernel mode): receive the data from the source into sysbuf.
Process 1 (user mode): call the receive routine; the data is copied from sysbuf to recvbuf; recvbuf now contains valid data.

Page 30: Unidirectional communication

• Blocking send and blocking receive

  if (myrank == 0) then
     call MPI_Send(…)
  elseif (myrank == 1) then
     call MPI_Recv(…)
  endif

• Non-blocking send and blocking receive

  if (myrank == 0) then
     call MPI_Isend(…)
     …
     call MPI_Wait(…)
  elseif (myrank == 1) then
     call MPI_Recv(…)
  endif

Page 31: Unidirectional communication (cont.)

• Blocking send and non-blocking receive

  if (myrank == 0) then
     call MPI_Send(…)
  elseif (myrank == 1) then
     call MPI_Irecv(…)
     …
     call MPI_Wait(…)
  endif

• Non-blocking send and non-blocking receive

  if (myrank == 0) then
     call MPI_Isend(…)
     …
     call MPI_Wait(…)
  elseif (myrank == 1) then
     call MPI_Irecv(…)
     …
     call MPI_Wait(…)
  endif

Page 32: Bidirectional communication

• Need to be careful about deadlock when two processes exchange data with each other
• Deadlock can occur due to an incorrect order of send and recv, or due to the limited size of the system buffer

[Diagram: Rank 0 and Rank 1 each send from their sendbuf into the other's recvbuf]

Page 33: Bidirectional communication

• Case 1: both processes call send first, then recv

  if (myrank == 0) then
     call MPI_Send(…)
     call MPI_Recv(…)
  elseif (myrank == 1) then
     call MPI_Send(…)
     call MPI_Recv(…)
  endif

• No deadlock as long as the system buffer is larger than the send buffer
• Deadlock if the system buffer is smaller than the send buffer
• Replacing MPI_Send with MPI_Isend followed immediately by MPI_Wait behaves the same way
• Moral: there may be an error in the code that only shows up at larger problem sizes

Page 34: Bidirectional communication

• The following is free from deadlock:

  if (myrank == 0) then
     call MPI_Isend(…)
     call MPI_Recv(…)
     call MPI_Wait(…)
  elseif (myrank == 1) then
     call MPI_Isend(…)
     call MPI_Recv(…)
     call MPI_Wait(…)
  endif

Page 35: Bidirectional communication

• Case 2: both processes call recv first, then send

  if (myrank == 0) then
     call MPI_Recv(…)
     call MPI_Send(…)
  elseif (myrank == 1) then
     call MPI_Recv(…)
     call MPI_Send(…)
  endif

• The above always deadlocks: each blocking receive waits for a message the other process has not yet sent (replacing MPI_Send with MPI_Isend and MPI_Wait does not help, because the receives block first)

Page 36: Bidirectional communication

• The following code can be safely executed:

  if (myrank == 0) then
     call MPI_Irecv(…)
     call MPI_Send(…)
     call MPI_Wait(…)
  elseif (myrank == 1) then
     call MPI_Irecv(…)
     call MPI_Send(…)
     call MPI_Wait(…)
  endif
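To make the pattern concrete, here is a self-contained C sketch of the same deadlock-free exchange for exactly two ranks (the payload values are illustrative, not from the original slides):

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char *argv[])
  {
      int myid, other, sendbuf, recvbuf;
      MPI_Request request;
      MPI_Status status;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &myid);
      other = 1 - myid;       /* assumes exactly two ranks */
      sendbuf = 100 + myid;   /* illustrative payload */

      /* Post the receive first, so the matching send can always proceed
         regardless of system buffer size */
      MPI_Irecv(&recvbuf, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &request);
      MPI_Send(&sendbuf, 1, MPI_INT, other, 0, MPI_COMM_WORLD);
      MPI_Wait(&request, &status);

      printf("processor %d got %d\n", myid, recvbuf);
      MPI_Finalize();
      return 0;
  }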

Page 37: Bidirectional communication

• Case 3: one process calls send then recv, and the other calls them in the opposite order

  if (myrank == 0) then
     call MPI_Send(…)
     call MPI_Recv(…)
  elseif (myrank == 1) then
     call MPI_Recv(…)
     call MPI_Send(…)
  endif

• The above is always safe
• You can replace both send and recv on both processors with Isend and Irecv