Source Level Debugging of Parallel Programs

Roland Wismüller, LRR-TUM, TU München, Germany

Post on 14-Jan-2016

Page 1: Source Level Debugging of Parallel Programs

Roland Wismüller

LRR-TUM, TU München

Germany

Page 2: Outline

• Introduction: source level debuggers

• Debuggers for parallel programs

• Current / future work at LRR-TUM

Page 3: What is a Debugger?

• A tool to remove bugs?
  – No!

• A tool to find bugs?
  – No!

• A tool to examine program executions?
  – Yes!

Page 4: Source Level Debugging

Page 5: Compilation Example

Page 6: Setting a Breakpoint

Page 7: Setting a Breakpoint

Page 8: Printing a Variable

Page 9: Continue Execution

4) cont must execute original instruction

    call foo
    trap
    move r0,r1

replace trap with original instruction

Page 10: Continue Execution

4) cont must execute original instruction

    call foo
    add #4,sp
    move r0,r1

execute a single step

Page 11: Continue Execution

4) cont must execute original instruction

    call foo
    add #4,sp
    move r0,r1

insert trap again

Page 12: Continue Execution

4) cont must execute original instruction

    call foo
    trap
    move r0,r1

continue execution
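The four steps on pages 9 to 12 can be sketched on a toy machine. This is only an illustration under stated assumptions: ToyCPU, Debugger and their methods are hypothetical names, the instruction set is limited to the three instructions shown on the slides, and the toy CPU leaves the pc at the trap address (on some real architectures the debugger has to adjust the pc after a trap).

```python
TRAP = "trap"

class ToyCPU:
    """A tiny interpreter standing in for the debugged process (hypothetical)."""
    def __init__(self, code):
        self.code = list(code)          # instruction memory, patchable by the debugger
        self.pc = 0
        self.regs = {"r0": 7, "r1": 0, "sp": 100}
        self.calls = []                 # record of executed calls

    def step(self):
        """Execute one instruction; return 'trap', 'halt' or 'ok'."""
        if self.pc >= len(self.code):
            return "halt"
        insn = self.code[self.pc]
        if insn == TRAP:
            return "trap"               # pc stays at the trap address
        op, _, args = insn.partition(" ")
        if op == "call":
            self.calls.append(args)
        elif op == "add":               # add #imm,reg
            imm, reg = args.split(",")
            self.regs[reg] += int(imm.lstrip("#"))
        elif op == "move":              # move src,dst
            src, dst = args.split(",")
            self.regs[dst] = self.regs[src]
        self.pc += 1
        return "ok"

    def run(self):
        """Run until a trap is hit or the code ends."""
        while True:
            status = self.step()
            if status != "ok":
                return status

class Debugger:
    def __init__(self, cpu):
        self.cpu = cpu
        self.saved = {}                 # breakpoint address -> original instruction

    def set_breakpoint(self, addr):
        self.saved[addr] = self.cpu.code[addr]
        self.cpu.code[addr] = TRAP

    def cont(self):
        addr = self.cpu.pc
        if addr in self.saved and self.cpu.code[addr] == TRAP:
            self.cpu.code[addr] = self.saved[addr]   # replace trap with original instruction
            self.cpu.step()                          # execute a single step
            self.cpu.code[addr] = TRAP               # insert trap again
        return self.cpu.run()                        # continue execution

cpu = ToyCPU(["call foo", "add #4,sp", "move r0,r1"])
dbg = Debugger(cpu)
dbg.set_breakpoint(1)
first_stop = cpu.run()      # stops at the trap that replaced "add #4,sp"
second_stop = dbg.cont()    # steps over the breakpoint and runs to the end
```

After cont, the breakpoint is still armed for the next time the address is reached, and the original instruction has taken effect exactly once.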

Page 13: Continue Execution

4) cont must execute original instruction

    call foo
    add #4,sp
    move r0,r1

execute a single step

Problem:
• there may be no support for single stepping

Page 14: Continue Execution

4) cont must execute original instruction

    call foo
    add #4,sp
    move r0,r1

execute a single step

replace next instruction with a trap

Page 15: Continue Execution

4) cont must execute original instruction

    call foo
    add #4,sp
    trap

continue execution

Page 16: Continue Execution

4) cont must execute original instruction

    call foo
    add #4,sp
    trap

insert original trap & instruction

Page 17: Continue Execution

4) cont must execute original instruction

    call foo
    trap
    move r0,r1

continue execution

Still a problem:
• original instruction may be a jump / call / ret
• we have to emulate these instructions!
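When single stepping is not available, the sequence on pages 14 to 17 emulates it with a temporary trap on the following instruction. The sketch below assumes straight-line code, so the next instruction is simply at bp_addr + 1; as the slide notes, a jump, call or ret would have to be emulated instead. The function names and the list-based code memory are assumptions made for this illustration.

```python
TRAP = "trap"

def run(code, pc, log):
    """Run until a trap or the end of the code; return the pc where execution stopped.
    Executed instructions are appended to log (a stand-in for real side effects)."""
    while pc < len(code) and code[pc] != TRAP:
        log.append(code[pc])
        pc += 1
    return pc

def cont_without_single_step(code, bp_addr, original, log):
    """Continue from a breakpoint at bp_addr using only traps."""
    next_addr = bp_addr + 1               # valid only for straight-line instructions
    code[bp_addr] = original              # put the original instruction back
    if next_addr >= len(code):            # nothing follows: run the last instruction
        pc = run(code, bp_addr, log)
        code[bp_addr] = TRAP
        return pc
    saved_next = code[next_addr]
    code[next_addr] = TRAP                # replace next instruction with a trap
    pc = run(code, bp_addr, log)          # executes exactly one instruction
    assert pc == next_addr
    code[next_addr] = saved_next          # insert the original instruction again ...
    code[bp_addr] = TRAP                  # ... and the original breakpoint trap
    return run(code, pc, log)             # continue execution

# The breakpoint has already replaced "add #4,sp" at address 1.
code = ["call foo", TRAP, "move r0,r1"]
log = []
stopped_at = run(code, 0, log)
finished_at = cont_without_single_step(code, stopped_at, "add #4,sp", log)
```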

Page 18: Continue Execution

4) cont must execute original instruction

    call foo
    add #4,sp
    move r0,r1

execute a single step

A different problem:
• multithreading: another thread may bypass our breakpoint

Page 22: Continue Execution

4) cont must execute original instruction

    call foo
    trap
    move r0,r1

Solution:
• don’t remove the trap
• execute original instruction somewhere else

    add #4,sp

Page 26: Continue Execution

4) cont must execute original instruction

Still a problem:
• instruction may depend on the PC value
• we have to emulate these instructions!

    call foo
    trap
    move r0,r1

    add #4,sp
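The idea of leaving the trap in place and executing the original instruction somewhere else can be sketched as follows. This is a toy model, not a description of any particular debugger: Thread, execute, step_over_out_of_line and the scratch list are hypothetical, and only the add instruction from the slides is modelled. Instructions that depend on the PC value would not work when copied like this and would have to be emulated.

```python
TRAP = "trap"

def execute(insn, regs):
    """Apply the effect of one toy instruction (only 'add #imm,reg' is modelled)."""
    op, _, args = insn.partition(" ")
    if op == "add":
        imm, reg = args.split(",")
        regs[reg] += int(imm.lstrip("#"))

class Thread:
    def __init__(self, name):
        self.name = name
        self.pc = 0
        self.regs = {"sp": 100}

def step_over_out_of_line(code, thread, originals, scratch):
    """Step `thread` past the breakpoint at its pc without touching the trap."""
    addr = thread.pc
    scratch.append(originals[addr])     # copy of the original instruction, kept elsewhere
    execute(scratch[-1], thread.regs)   # run the copy instead of the patched location
    thread.pc = addr + 1                # resume behind the breakpoint

code = ["call foo", TRAP, "move r0,r1"]
originals = {1: "add #4,sp"}            # breakpoint address -> original instruction
scratch = []
t1, t2 = Thread("t1"), Thread("t2")
t1.pc = t2.pc = 1                       # both threads have reached the breakpoint

step_over_out_of_line(code, t1, originals, scratch)
t2_sees = code[t2.pc]                   # the trap is still in place for every other thread
```

Because the code memory is never modified while t1 steps, there is no window in which t2 could run past the breakpoint unnoticed.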

Page 27: Optimization Effects

Page 28: Optimization Effects

[Figure: the command "print i" consults the variable table (entry: i, short, register i5), reads register i5, and prints z!]
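The effect can be reproduced in a few lines. The sketch assumes that the optimizer reuses register i5 for z once i is no longer needed, which is one plausible reading of the slide. The second half shows one possible remedy, a variable table that records where each mapping is valid; the live_ranges structure and the pc values are assumptions of this sketch, not something the slides describe.

```python
registers = {"i5": 0}

# Naive variable table: one location per variable, assumed valid everywhere.
variable_table = {"i": "i5"}

def debugger_print(name):
    """Look up the variable's location and read it, as 'print i' would."""
    return registers[variable_table[name]]

registers["i5"] = 3                 # the program stores i = 3 in register i5
before = debugger_print("i")

# After the last use of i, the optimizer reuses register i5 for z.
registers["i5"] = 42                # the program stores z = 42 in register i5
after = debugger_print("i")         # the debugger still reads i5 and reports z's value

# Range-aware table (an assumption of this sketch): i lives in i5 for pc in [0, 10).
live_ranges = {"i": [(0, 10, "i5")]}

def debugger_print_at(name, pc):
    """Return the value only if the variable has a valid location at this pc."""
    for lo, hi, reg in live_ranges[name]:
        if lo <= pc < hi:
            return registers[reg]
    return None                     # report "not available" instead of a wrong value

outside_range = debugger_print_at("i", 12)
```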

Page 29: Parallel Debugging

• Additional properties of parallel programs

• Requirements for parallel debuggers

• Problems and solution techniques

Page 30: Parallel Programs

• Multiple processes and/or threads
  – created dynamically
  – many of them

• Program distributed across several hosts

• Additional state components:
  – communication subsystem

Page 31: Multiple Processes / Threads

• Naming processes / threads
  – system IDs
    • may not be unique, not persistent
    • not user friendly
  – debugger-generated IDs
    • usually: small integers
    • selection based on additional information
  – naming processes / threads that do not exist yet
    • DETOP: pattern matching

Page 32: Thread Selection in DETOP

[Figure: a DETOP selection pattern with its parts labelled: function, executable, system id, debugger id, node list]

Page 33: Scalability

• Input: use process / thread sets
  – commands are executed for each member
  – e.g. [1,2,3] print i or [2,7] break 123
  – sometimes: named sets
  – problems:
    • command semantics may differ for the processes, e.g. different executables / call stacks
    • when to evaluate named sets?
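A minimal sketch of set-prefixed commands, modelled on the two examples above. The parsing rules are assumptions: the slides only show comma-separated ids, and the range form such as 3-5 is added here by analogy with the aggregated output format. parse_set_command, run_for_each and fake_print are hypothetical names.

```python
import re

def parse_set_command(line):
    """Split '[1,2,3] print i' into ([1, 2, 3], 'print i')."""
    m = re.fullmatch(r"\s*\[([^\]]*)\]\s*(.+)", line)
    if m is None:
        raise ValueError(f"expected '[ids] command', got {line!r}")
    ids = []
    for part in m.group(1).split(","):
        part = part.strip()
        if "-" in part:                       # range such as 3-5 (an assumption)
            lo, hi = part.split("-")
            ids.extend(range(int(lo), int(hi) + 1))
        else:
            ids.append(int(part))
    return ids, m.group(2).strip()

def run_for_each(line, execute):
    """Execute the command once per set member and collect the results by id."""
    ids, command = parse_set_command(line)
    return {pid: execute(pid, command) for pid in ids}

# Hypothetical per-process backend: pretends to evaluate 'print i' in each process.
values = {1: 12.3, 2: 4.1, 3: 12.3}

def fake_print(pid, command):
    return values[pid]

results = run_for_each("[1,2,3] print i", fake_print)
parsed_break = parse_set_command("[2,7] break 123")
```

The sketch evaluates the set when the command is entered. For named sets whose membership changes over time, the question of when to evaluate them remains open, as the slide points out.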

Page 34: DETOP User Interface

Page 35: Scalability

• Output: aggregation
  – simple case: aggregate identical results

        [1]: 12.3
        [2]: 4.1
        [3]: 12.3
        [4]: 12.3
        [5]: 12.3

    becomes

        [1,3-5]: 12.3
        [2]: 4.1

  – complex case: aggregate partially identical results
  – impossible cases: asynchronous events
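The simple case, aggregating identical results, can be sketched as grouping by value and compressing runs of consecutive ids. The function names are hypothetical, and ordering the groups by their smallest id is a choice made for this sketch.

```python
def compress(ids):
    """Turn a list of ids into a compact string, e.g. [1, 3, 4, 5] into '1,3-5'."""
    ids = sorted(ids)
    parts = []
    start = prev = ids[0]
    for n in ids[1:]:
        if n == prev + 1:               # still inside a run of consecutive ids
            prev = n
            continue
        parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = n
    parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)

def aggregate(results):
    """Group processes that produced the same value; one output line per group."""
    groups = {}
    for pid, value in results.items():
        groups.setdefault(value, []).append(pid)
    ordered = sorted(groups.items(), key=lambda kv: min(kv[1]))
    return [f"[{compress(ids)}]: {value}" for value, ids in ordered]

lines = aggregate({1: 12.3, 2: 4.1, 3: 12.3, 4: 12.3, 5: 12.3})
```

With the five results from the slide, lines holds the two aggregated lines shown above.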

Page 36: Aggregating Stacks: Call Tree
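One way to aggregate the call stacks of many threads into a call tree is to merge them as a prefix tree, where each node lists the threads whose stack passes through it. The slide's figure is not preserved, so the data layout and the text rendering below are assumptions for illustration only.

```python
def build_call_tree(stacks):
    """stacks maps a thread id to its frames, outermost frame first."""
    root = {"threads": [], "children": {}}
    for tid, frames in sorted(stacks.items()):
        node = root
        node["threads"].append(tid)
        for frame in frames:
            node = node["children"].setdefault(
                frame, {"threads": [], "children": {}})
            node["threads"].append(tid)   # this thread's stack passes through here
    return root

def render(node, depth=0, lines=None):
    """Return one indented line per tree node, with the ids of its threads."""
    lines = [] if lines is None else lines
    for name, child in node["children"].items():
        lines.append("  " * depth + f"{name} {child['threads']}")
        render(child, depth + 1, lines)
    return lines

# Hypothetical stacks of three threads.
stacks = {
    1: ["main", "solve", "recv"],
    2: ["main", "solve", "recv"],
    3: ["main", "io", "write"],
}
tree_lines = render(build_call_tree(stacks))
```

Threads 1 and 2 share one branch of the tree, so their identical stacks are shown once.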

Page 37: Concurrency Issues

• What happens if a thread stops?
  – stop all threads in all processes
  – stop all threads in the same process
  – stop only that thread

• What happens if I continue a thread?
  – start all threads in all processes
  – start all threads in the same process
  – start only that thread

• When does the debugger accept input?
  – only when all processes are stopped
  – always

Page 38: Concurrency Issues

• What happens if a thread stops?
  – stop all threads in all processes (BP option)
  – stop all threads in the same process (BP option)
  – stop only that thread

• What happens if I continue a thread?
  – start all threads in all processes (separate command)
  – start all threads in the same process (use pattern)
  – start only that thread

• When does the debugger accept input?
  – only when all processes are stopped
  – always
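The three stop policies can be expressed as a per-breakpoint option that selects which threads to stop when one thread hits the breakpoint. StopScope and threads_to_stop are hypothetical names, and threads are modelled simply as (process id, thread id) pairs.

```python
from enum import Enum

class StopScope(Enum):
    THREAD = "thread"      # stop only the thread that hit the breakpoint
    PROCESS = "process"    # stop all threads in the same process
    ALL = "all"            # stop all threads in all processes

def threads_to_stop(hit, all_threads, scope):
    """hit and the entries of all_threads are (process_id, thread_id) pairs."""
    if scope is StopScope.THREAD:
        return [hit]
    if scope is StopScope.PROCESS:
        return [t for t in all_threads if t[0] == hit[0]]
    return list(all_threads)

all_threads = [(1, 1), (1, 2), (2, 1)]
hit = (1, 2)
only_thread = threads_to_stop(hit, all_threads, StopScope.THREAD)
same_process = threads_to_stop(hit, all_threads, StopScope.PROCESS)
everything = threads_to_stop(hit, all_threads, StopScope.ALL)
```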

Page 39: Additional State Components

• E.g. message buffers, blocked processes

• Usually no support from debuggers
  – additional dependency on programming library implementation

• Often other tools (visualizers) will show this information
  – use them together with the debugger (?)

→ interoperable tools

Page 40: Interoperable Tools

• Multiple, loosely coupled tools are used on the same program

• Concrete scenario:
  – debugger that allows the user to ‘time-warp’
  – i.e. return to previous program states without rerunning the program
  – speed up debugging cycle of long running programs

Page 41: ‘Time-Warp’ Debugger

• Tools that need to interoperate:
  – parallel debugger (DETOP)
  – checkpointing system for parallel programs (CoCheck, based on Condor)
  – deterministic execution controller (codex)
  – means to specify the state to return to (VISTOP: state based program flow visualizer)
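The interplay of checkpointing and deterministic re-execution can be illustrated on a single toy program: to return to an earlier state, restore the nearest earlier checkpoint and replay only the remaining steps. This is a sketch of the general idea only; it does not model DETOP, CoCheck, codex or VISTOP, and all names are hypothetical.

```python
import copy

def run_with_checkpoints(state, step, n_steps, every):
    """Run a deterministic step function, saving a checkpoint every `every` steps."""
    checkpoints = {0: copy.deepcopy(state)}
    for i in range(1, n_steps + 1):
        step(state)
        if i % every == 0:
            checkpoints[i] = copy.deepcopy(state)
    return checkpoints

def time_warp(checkpoints, step, target):
    """Return the state after `target` steps without rerunning from the start."""
    base = max(i for i in checkpoints if i <= target)   # nearest earlier checkpoint
    state = copy.deepcopy(checkpoints[base])
    for _ in range(target - base):
        step(state)                  # replay must be deterministic to be faithful
    return state, target - base

def step(state):
    """A deterministic toy computation standing in for the program."""
    state["x"] = (state["x"] * 3 + 1) % 1000
    state["n"] += 1

checkpoints = run_with_checkpoints({"x": 1, "n": 0}, step, n_steps=35, every=10)
warped, replayed = time_warp(checkpoints, step, target=27)

# Reference: the same state obtained the slow way, by rerunning from the start.
reference = {"x": 1, "n": 0}
for _ in range(27):
    step(reference)
```

Here the state after step 27 is reached by replaying 7 steps from the checkpoint at step 20 instead of 27 steps from the beginning, which is where the speed-up for long running programs comes from.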

Page 42: Preconditions for Interoperability

• Common monitoring infrastructure
  – OMIS / OCM

• Mechanisms for informing tools of state modifications made by other tools
  – e.g. VISTOP must know when DETOP stops a process, as event buffer must be read

• Mechanisms for direct tool interaction
  – e.g. VISTOP to CoCheck: ‘restart from checkpoint’

Page 43: OMIS

• Basis:
  – objects + services
  – event / action paradigm
  – scalability by using object sets
  – location transparency

• Example:

    thread_creates_proc([t_1,t_2]):
        thread_stop([$proc, $new_proc])
        thread_get_backtrace([$thread],0)
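The event / action paradigm with object sets can be mimicked by a small dispatcher: a rule names an event, a set of objects to watch, and the actions to run when the event occurs on one of them. This is a toy model of the paradigm only and not the OMIS interface; ToyMonitor, on, fire and the action functions are hypothetical names.

```python
class ToyMonitor:
    """Event/action dispatcher illustrating the paradigm (NOT the OMIS API)."""
    def __init__(self):
        self.rules = []      # (event name, watched object set, list of actions)
        self.trace = []      # record of what the actions did

    def on(self, event, objects, actions):
        """Register actions for an event on a set of objects."""
        self.rules.append((event, set(objects), actions))

    def fire(self, event, thread, **params):
        """Report an event; run the actions of every matching rule."""
        for name, objects, actions in self.rules:
            if name == event and thread in objects:
                for action in actions:
                    action(self, thread=thread, **params)

def stop(monitor, proc, new_proc, **_):
    # Stand-in for stopping both the creating and the newly created process.
    monitor.trace.append(("stop", (proc, new_proc)))

def backtrace(monitor, thread, **_):
    # Stand-in for fetching the backtrace of the thread that caused the event.
    monitor.trace.append(("backtrace", thread))

monitor = ToyMonitor()
monitor.on("thread_creates_proc", ["t_1", "t_2"], [stop, backtrace])

monitor.fire("thread_creates_proc", "t_1", proc="p_1", new_proc="p_2")
monitor.fire("thread_creates_proc", "t_3", proc="p_1", new_proc="p_3")  # not watched
trace = monitor.trace
```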

Page 44: Interoperability Problems

• A tool may violate preconditions of another tool
  – DETOP can stop a process
  – checkpointing is initiated by sending a signal
  – stopped process won’t handle signal!
  – we cannot hide the state change from the checkpointer

→ this case cannot be handled easily

Page 45: The End

• Debuggers are far from trivial

• Parallel debuggers are even more complex

• Lots of open (maybe unsolvable) research issues

• Interoperability may ease implementation of enhanced functionality