
Page 1: Scheduling Considerations for building  Dynamic Verification Tools for MPI


Scheduling Considerations for building Dynamic Verification Tools for MPI

Sarvani Vakkalanka, Michael DeLisi, Ganesh Gopalakrishnan, Robert M. Kirby

School of Computing, University of Utah, Salt Lake City

Supported by Microsoft HPC Institutes,

NSF CNS-0509379

http://www.cs.utah.edu/formal_verification

Page 2: Scheduling Considerations for building  Dynamic Verification Tools for MPI

Background

The scientific community is increasingly employing expensive supercomputers built around distributed programming libraries…

…to program large-scale simulations in all walks of science, engineering, math, economics, etc.

(BlueGene/L image courtesy of IBM / LLNL; second image courtesy of Steve Parker, CSAFE, Utah)

Page 3: Scheduling Considerations for building  Dynamic Verification Tools for MPI

Current Programming Realities

Code written using mature libraries (MPI, OpenMP, PThreads, …)

API calls made from real programming languages (C, Fortran, C++)

Runtime semantics determined by realistic compilers and runtimes

How best to verify codes that will run on actual platforms?

Page 4: Scheduling Considerations for building  Dynamic Verification Tools for MPI

Classical Model Checking

Finite-state model of concurrent program → check properties

Extraction of finite-state models for realistic programs is difficult.

Page 5: Scheduling Considerations for building  Dynamic Verification Tools for MPI

Dynamic Verification

Actual concurrent program → check properties

Avoid model extraction, which can be tedious and imprecise

The program serves as its own model

Reduce complexity through reduction of interleavings (and other methods)

Page 6: Scheduling Considerations for building  Dynamic Verification Tools for MPI

Dynamic Verification

Actual concurrent program + one specific test harness → check properties

Need a test harness in order to run the code.

Will explore ONLY RELEVANT INTERLEAVINGS (all Mazurkiewicz traces) for the given test harness

Conventional testing tools cannot do this!!

E.g., 5 threads with 5 instructions each: ~10^10 interleavings!!
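For reference (this formula is not on the slide), the exact count for n threads of k instructions each is the multinomial coefficient, so the quoted 10^10 is if anything conservative:

    \[
      \#\,\text{interleavings} \;=\; \frac{(nk)!}{(k!)^{n}},
      \qquad
      n = k = 5:\quad \frac{25!}{(5!)^{5}} \;\approx\; 6.2 \times 10^{14}.
    \]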

Page 7: Scheduling Considerations for building  Dynamic Verification Tools for MPI

Dynamic Verification

Actual concurrent program + one specific test harness → check properties

Need to consider all test harnesses.

FOR MANY PROGRAMS, this number seems small (e.g., a hypergraph partitioner).

Page 8: Scheduling Considerations for building  Dynamic Verification Tools for MPI

Related Work

• Dynamic verification tools:
  – CHESS
  – VeriSoft (POPL ’97)
  – DPOR (POPL ’05)
  – JPF

• ISP is similar to CHESS and DPOR

Page 9: Scheduling Considerations for building  Dynamic Verification Tools for MPI

Dynamic Partial Order Reduction (DPOR)

Each of P0, P1, P2 executes:

lock(x)
  …
unlock(x)

(Diagram: writing Li/Ui for Pi’s lock/unlock events, DPOR explores only interleavings that differ in the order of lock acquisition, e.g. L0 U0 L1 U1 L2 U2 versus L0 U0 L2 U2 L1 U1, rather than all instruction-level interleavings.)
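A minimal C/pthreads rendering of this example (a sketch: only the lock/unlock structure comes from the slide; the thread body is filler). A DPOR-based tool need only explore the orders in which the three threads acquire x:

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t x = PTHREAD_MUTEX_INITIALIZER;

    /* Each Pi runs lock(x) ... unlock(x); Li/Ui in the diagram
       are the lock and unlock events of Pi. */
    static void *proc(void *arg) {
        long id = (long)arg;
        pthread_mutex_lock(&x);                        /* L_id */
        printf("P%ld in critical section\n", id);      /* placeholder body */
        pthread_mutex_unlock(&x);                      /* U_id */
        return NULL;
    }

    int main(void) {
        pthread_t t[3];
        for (long i = 0; i < 3; i++)
            pthread_create(&t[i], NULL, proc, (void *)i);
        for (int i = 0; i < 3; i++)
            pthread_join(t[i], NULL);
        return 0;
    }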

Page 10: Scheduling Considerations for building  Dynamic Verification Tools for MPI

ISP

(Diagram: the MPI program is instrumented with a profiler and compiled into an executable; the ISP scheduler runs Proc1, Proc2, …, Procn against the MPI runtime.)

• Manifest only/all relevant interleavings (DPOR)

• Manifest ALL relevant interleavings of the MPI progress engine: done by DYNAMIC REWRITING of WILDCARD receives

Page 11: Scheduling Considerations for building  Dynamic Verification Tools for MPI

Using PMPI

(Diagram: on P0’s call stack, User_Function calls MPI_Send; the profiler’s MPI_Send sends the envelope “P0: MPI_Send” to the scheduler over a TCP socket, then calls PMPI_Send, which performs the actual send inside the MPI runtime.)
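The interception uses the standard PMPI profiling interface: the tool defines MPI_Send itself, shadowing the library version, and reaches the real implementation through the PMPI_Send entry point. A minimal sketch; the two helper functions are hypothetical stand-ins for ISP’s TCP-socket protocol:

    #include <mpi.h>

    /* Hypothetical helpers standing in for ISP's socket protocol:
       ship the call's envelope to the scheduler, then block until
       the scheduler permits this call to proceed. */
    void send_envelope_to_scheduler(const char *op, int dest, int tag);
    void wait_for_go(void);

    /* The profiler's MPI_Send shadows the library's version. */
    int MPI_Send(const void *buf, int count, MPI_Datatype type,
                 int dest, int tag, MPI_Comm comm)
    {
        send_envelope_to_scheduler("MPI_Send", dest, tag);
        wait_for_go();  /* the scheduler decides when the send fires */
        return PMPI_Send(buf, count, type, dest, tag, comm);  /* real send */
    }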

Page 12: Scheduling Considerations for building  Dynamic Verification Tools for MPI

DPOR and MPI

• Implemented an implicit deadlock detection technique from a single program trace.

• Issues with the MPI progress engine for wildcard receives could not be resolved.

• More details can be found in our CAV 2008 paper: “Dynamic Verification of MPI Programs with Reductions in Presence of Split Operations and Relaxed Orderings”

Page 13: Scheduling Considerations for building  Dynamic Verification Tools for MPI

POE

P0: Isend(1, req); Barrier; Wait(req)
P1: Irecv(*, req); Barrier; Recv(2); Wait(req)
P2: Barrier; Isend(1, req); Wait(req)

(Diagram: the scheduler intercepts P0’s Isend(1) and replies sendNext; P0 then hands over its Barrier. Nothing has been issued to the MPI runtime yet.)

Page 14: Scheduling Considerations for building  Dynamic Verification Tools for MPI

POE (continued)

Same program as before. (Diagram: the scheduler has now also collected P1’s Irecv(*) and Barrier, again replying sendNext; the collected calls are still held back from the MPI runtime.)

Page 15: Scheduling Considerations for building  Dynamic Verification Tools for MPI

POE (continued)

(Diagram: once a Barrier has been collected from all three processes, the barrier forms a complete match set and the scheduler issues all three Barriers into the MPI runtime at once.)

Page 16: Scheduling Considerations for building  Dynamic Verification Tools for MPI

POE (continued)

(Diagram: after the barrier, the scheduler collects P1’s Recv(2), P2’s Isend(1), and the Waits. It dynamically rewrites the wildcard Irecv(*) to Irecv(2) and matches it with P2’s Isend. In that interleaving, P0’s Isend(1) and P1’s Recv(2) are left with no possible partners: no match set remains. Deadlock!)
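Written out as a compilable C program (a reconstruction from the slide’s columns; buffers, tags, and the three-process guard are filler): if the wildcard Irecv matches P2’s send, P1’s Recv(2) and P0’s Isend(1) are left unmatched and the run deadlocks.

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, val = 0, a = 0, b = 0;
        MPI_Request req;
        MPI_Status st;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {                                  /* P0 */
            MPI_Isend(&val, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
            MPI_Barrier(MPI_COMM_WORLD);
            MPI_Wait(&req, &st);
        } else if (rank == 1) {                           /* P1 */
            MPI_Irecv(&a, 1, MPI_INT, MPI_ANY_SOURCE, 0,  /* Irecv(*) */
                      MPI_COMM_WORLD, &req);
            MPI_Barrier(MPI_COMM_WORLD);
            MPI_Recv(&b, 1, MPI_INT, 2, 0, MPI_COMM_WORLD, &st);
            MPI_Wait(&req, &st);
        } else if (rank == 2) {                           /* P2 */
            MPI_Barrier(MPI_COMM_WORLD);
            MPI_Isend(&val, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
            MPI_Wait(&req, &st);
        }

        MPI_Finalize();
        return 0;
    }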

Page 17: Scheduling Considerations for building  Dynamic Verification Tools for MPI

MPI_Waitany + POE

P0: Isend(1, req[0]); Isend(2, req[1]); Waitany(2, req); Barrier
P1: Recv(0); Barrier
P2: Recv(0); Barrier

(Diagram: the scheduler collects P0’s Isend(1, req[0]) and Isend(2, req[1]), replying sendNext after each, then P0’s Waitany(2, req), the two Recv(0)s, and the Barriers.)

Page 18: Scheduling Considerations for building  Dynamic Verification Tools for MPI

MPI_Waitany + POE (continued)

(Diagram: the scheduler issues Isend(1, req[0]), the Recv, and the Barrier into the runtime. Waitany completes one request and sets its slot to MPI_REQUEST_NULL, so ISP must track handle validity: here req[0] is valid, req[1] is MPI_REQUEST_NULL, and a later use of req[1] is flagged: “Error! req[1] invalid”.)
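MPI_Waitany completes exactly one of the supplied requests and sets that slot to MPI_REQUEST_NULL, which is why ISP must track per-handle validity. A sketch of P0’s side (a minimal reconstruction; buffers and tags are filler):

    #include <mpi.h>

    /* P0 posts two sends; MPI_Waitany completes exactly one of
       them and overwrites that slot with MPI_REQUEST_NULL. ISP
       must record which handles remain live when replaying. */
    static void p0_side(void)
    {
        int a = 0, b = 1, index;
        MPI_Request req[2];
        MPI_Status st;

        MPI_Isend(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req[0]);
        MPI_Isend(&b, 1, MPI_INT, 2, 0, MPI_COMM_WORLD, &req[1]);
        MPI_Waitany(2, req, &index, &st); /* req[index] -> MPI_REQUEST_NULL */
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Wait(&req[1 - index], &st);   /* the other request is still live */
    }

Replaying a trace that treats the already-completed slot as a live request is what the slide flags as “Error! req[1] invalid”.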

Page 19: Scheduling Considerations for building  Dynamic Verification Tools for MPI

MPI Progress Engine Issues

P0: Irecv(1, req); Barrier; Wait(req)
P1: Isend(0, req); Wait(req); Barrier

(Diagram: the scheduler issues P0’s Irecv(1, req) and Barrier via PMPI calls, replying sendNext, and then forwards P0’s Wait as PMPI_Wait before P1’s Isend(0, req) and Barrier have been issued into the runtime. PMPI_Wait does not return, and the scheduler hangs. The culprit pair is PMPI_Irecv + PMPI_Wait.)
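A two-process reconstruction of the pattern (the hang is in the tool, not the program): if the scheduler forwards P0’s Wait as PMPI_Wait before it has issued P1’s Isend into the runtime, the Irecv can never complete, PMPI_Wait never returns, and the single-threaded scheduler is stuck.

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, val = 0;
        MPI_Request req;
        MPI_Status st;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {                                   /* P0 */
            MPI_Irecv(&val, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
            MPI_Barrier(MPI_COMM_WORLD);
            MPI_Wait(&req, &st);   /* PMPI_Wait blocks until P1's send
                                      reaches the MPI runtime */
        } else if (rank == 1) {                            /* P1 */
            MPI_Isend(&val, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
            MPI_Wait(&req, &st);
            MPI_Barrier(MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }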

Page 20: Scheduling Considerations for building  Dynamic Verification Tools for MPI

Experiments

• ISP was run on 69 examples of the Umpire test suite. It detected deadlocks in these examples that tools like Marmot cannot detect, and produced a far smaller number of interleavings than runs without reduction.

• ISP was run on Game of Life (~500 lines of code).

• ISP was run on ParMETIS (~14K lines of code), widely used for parallel partitioning of large hypergraphs.

• ISP was run on MADRE (the memory-aware data redistribution engine by Siegel and Siegel, EuroPVM/MPI ’08): it found a previously KNOWN deadlock, but AUTOMATICALLY, within one second!

• Results available at: http://www.cs.utah.edu/formal_verification/ISP_Tests

Page 21: Scheduling Considerations for building  Dynamic Verification Tools for MPI

Concluding Remarks

• Tool available (download and try!)

• Future work:
  – Distributed ISP scheduler
  – Handle MPI + threads
  – Do a large-scale bug hunt now that ISP can execute large-scale codes

Page 22: Scheduling Considerations for building  Dynamic Verification Tools for MPI

Implicit Deadlock Detection

P0: Irecv(*, req); Recv(2); Wait(req)
P1: Isend(0, req); Wait(req)
P2: Isend(0, req); Wait(req)

(Diagram: the scheduler’s trace is P0: Irecv(*); P1: Isend(P0); P2: Isend(P0); P0: Recv(P2); P1: Wait(req); P2: Wait(req); P0: Wait(req). In the alternative match where the wildcard receive takes P2’s send, P0’s Recv(2) is left with no matching send: Deadlock! ISP detects this implicitly from the single trace.)
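The same program in compilable form (a reconstruction; buffers and tags are filler). In the observed run the wildcard receive matches P1 and Recv(2) matches P2, so nothing hangs; ISP nevertheless reports the alternative match in which the wildcard takes P2’s send and Recv(2) starves:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, a = 0, b = 0;
        MPI_Request req;
        MPI_Status st;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {                                  /* P0 */
            MPI_Irecv(&a, 1, MPI_INT, MPI_ANY_SOURCE, 0,  /* Irecv(*) */
                      MPI_COMM_WORLD, &req);
            MPI_Recv(&b, 1, MPI_INT, 2, 0, MPI_COMM_WORLD, &st);
            MPI_Wait(&req, &st);
        } else {                                          /* P1 and P2 */
            MPI_Isend(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
            MPI_Wait(&req, &st);
        }

        MPI_Finalize();
        return 0;
    }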