Page 1: Query-Based Debugging

Query-Based Debugging

Raimondas Lencevicius

Department of Computer Science, UCSB

Page 2: Query-Based Debugging

2

Debugging of OO Programs

• Symbolic debugging
– Control flow debugging
– Object state monitoring
– Data breakpoints
– Conditional breakpoints

• Debugging of abstract relationships?
– Complex object relationships

Page 3: Query-Based Debugging

3

Debugging Object Relationships

• Programmers need to find objects violating relationships
– “Are there any windows that do not reference some child widget?”

• Current debuggers provide only low-level views

• Programmers have to write special testing code

Page 4: Query-Based Debugging

4

Goals of Query-Based Debugging

• Make debugging of data structures easier by answering questions about object relationships

• Explore unfamiliar programs

• Find data structure errors as soon as they occur

Page 5: Query-Based Debugging

5

Query-Based Debugging

• Ask common questions about program state

• Quickly access sets of interesting objects

• Check properties of large groups of objects using single query

• Answer queries while program is running

• Provide functionality efficiently

Page 6: Query-Based Debugging

6

Windows and Widgets

[Diagram: a graphical user interface (Window, Widgets) shown next to the program's objects: window, widget collection, widget1, widget2, parent window.]

Page 7: Query-Based Debugging

7

Query Example

• “Are there any windows that do not reference some child widget?”

[Diagram: window, widget collection, widget1, parent window.]

Page 8: Query-Based Debugging

8

Talk Overview

• Query case study

• Query model

• Implementation of debugger

• Dynamic queries

• Experimental results

• Future work

• Conclusions

Page 9: Query-Based Debugging

9

Java Compiler - Case Study

• Goal: understand and debug Java subset compiler written for UCSB compiler course

• Variety of queries
– “Can the current lexer token refer to an uninitialized token?”
– “Can identifiers declared in the same scope have the same name and type?”
– “Can methods have the same name?”

Page 10: Query-Based Debugging

10

Java Compiler - Case Study

• “Can methods have the same name?”
• Experiment with input file containing such methods:

    …
    static int isOne(int c)
    { return 0; }

    static int isOne(int c)
    { return 1; }

Page 11: Query-Based Debugging

11

Java Compiler - Case Study

• “Can methods have the same name?”
• Debugger gives positive answer

• But not a program error
– Compiler finds duplicate methods in later phase

SemanticException: The name `isOne' at line 27 chars 14 to 20 was already declared.

MethodDeclaration
public Id name >> “isOne”…
Code >>…(ReturnStmt,Num"0")...

MethodDeclaration
public Id name >> “isOne”…
Code >>…(ReturnStmt,Num"1")...

Page 12: Query-Based Debugging

12

Java Compiler Example Summary

• Explore unfamiliar program

• Find a possible error
– Further program investigation shows that there is no error

• Use query as invariant to verify program’s execution
– Dynamic query

Page 13: Query-Based Debugging

13

Talk Overview

• Query case study

• Query model

• Implementation of debugger

• Dynamic queries

• Experimental results

• Future work

• Conclusions

Page 14: Query-Based Debugging

14

Query Model

• Widget wid; Window win. (wid.window == win) && (! win.widgetCollection.contains(wid))
– Search domain: Widget wid; Window win.
– Constraint expression in conjunctive form: (wid.window == win) && (! win.widgetCollection.contains(wid))

• Arbitrary boolean constraint expression
• Assumption: side-effect free methods

• Selection and join queries
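As a rough illustration of what answering the window/widget query involves, the sketch below enumerates both domains and applies the constraint to every pair. The Window and Widget classes and the orphanedWidgets helper are hypothetical stand-ins named after the fields in the query; this is a naive nested loop, not the debugger's actual evaluation strategy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical classes mirroring the fields named in the query.
class Window {
    final List<Widget> widgetCollection = new ArrayList<>();
}

class Widget {
    final Window window; // parent window
    Widget(Window window) { this.window = window; }
}

class WidgetQuery {
    // Naive evaluation: enumerate the search domain (all Widget x Window pairs)
    // and keep the widgets whose pair satisfies the constraint expression.
    static List<Widget> orphanedWidgets(List<Widget> widgets, List<Window> windows) {
        List<Widget> result = new ArrayList<>();
        for (Widget wid : widgets) {
            for (Window win : windows) {
                if (wid.window == win && !win.widgetCollection.contains(wid)) {
                    result.add(wid);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Window win = new Window();
        Widget registered = new Widget(win);
        Widget orphan = new Widget(win);
        win.widgetCollection.add(registered); // orphan is never added to its parent
        List<Widget> found = orphanedWidgets(Arrays.asList(registered, orphan), Arrays.asList(win));
        System.out.println(found.size());
    }
}
```

A query with one domain variable would be a selection; this one has two, so it is a join.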

Page 15: Query-Based Debugging

15

Java Compiler Example

• “Can methods have the same name?”

MethodDecl x y. (x.name.spelling == y.name.spelling) && (x != y)

Page 16: Query-Based Debugging

16

Talk Overview

• Query case study

• Query model

• Implementation of debugger

• Dynamic queries

• Experimental results

• Future work

• Conclusions

Page 17: Query-Based Debugging

17

Static Query Implementation

[Diagram: processing pipeline for a static query. Components: Parser, Optimizer, Domain collector, Code generator, Execution module. Data: user input, query string, intermediate form, optimized form, generated code, variable types, domain sizes, domain collections, GUI output.]

Page 18: Query-Based Debugging

18

Overview of Implementation

• Enumeration primitive: finds all instances of domain

• Join ordering: finds good order to evaluate query

• Hash joins: speed up equality constraints

• Incremental delivery: shows first result early

Page 19: Query-Based Debugging

19

Query Execution

“Find all declared methods returning integers and called at least once”

Declaration d; Method m; CallExpression ce. (d.contains(m)) && (ce.decl == m) && (m.typeName != “int”)

[Diagram: Declaration d and Method m are joined on (d.contains(m)); the resulting tuples are then joined with CallExpression ce on (ce.decl == m).]

Page 20: Query-Based Debugging

20

Join Ordering

[Figure: two join trees over the same three domains (sizes 200, 10, and 100; selectivities 10% and 1%), one labeled inefficient ordering and one labeled efficient ordering.]

Page 21: Query-Based Debugging

21

Join Ordering

• Join execution order significantly influences performance

cecil_method a b; cecil_formal c d. (a.formals.includes(c)) && (b.formals.includes(d)) && (c.name == d.name) && (a != c) && (b != d)

– Naïve evaluation of Cartesian product is slow
– Straightforward order takes 37 seconds
– Optimized order takes 6 seconds

• Problem is NP-complete

• System uses heuristics
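To see why the order matters, the sketch below counts the pair checks of a nested-loop plan for two orderings of the same three-way chain join. The sizes (200, 10, 100) and pass rates (10%, 1%) come from the labels on the join ordering figure; pairing them up this way is an assumption, and the JoinOrderCost class is a hypothetical cost estimate, not the heuristic the system uses.

```java
class JoinOrderCost {
    // Counts pair checks for a left-deep nested-loop plan over a chain query.
    // sizes[i]: size of the i-th domain in evaluation order.
    // percentPassing[i]: percentage of checked pairs that survive the i-th join.
    static long pairChecks(long[] sizes, int[] percentPassing) {
        long intermediate = sizes[0];
        long total = 0;
        for (int i = 1; i < sizes.length; i++) {
            long pairs = intermediate * sizes[i]; // every intermediate tuple meets every object
            total += pairs;
            intermediate = pairs * percentPassing[i - 1] / 100;
        }
        return total;
    }

    public static void main(String[] args) {
        // Join 200 x 10 first (10% pass), then with 100 (1% pass): 22000 checks.
        System.out.println(pairChecks(new long[] {200, 10, 100}, new int[] {10, 1}));
        // Join 10 x 100 first (1% pass), then with 200 (10% pass): 3000 checks.
        System.out.println(pairChecks(new long[] {10, 100, 200}, new int[] {1, 10}));
    }
}
```

Under these assumed numbers both orders yield the same final result size, but starting with the most selective join keeps the intermediate result small.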

Page 22: Query-Based Debugging

22

Hash Joins

[Figure: joining domains of 200 and 100 objects on X = Y. Nested-loop join: 20,000 operations. Hash join: 300 operations.]
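The difference between the two strategies for an equality constraint can be sketched as follows. The EqualityJoin class and its integer domains are hypothetical; the point is only that the nested loop does |X| * |Y| comparisons while the hash version does one table insert per Y element and one lookup per X element.

```java
import java.util.HashMap;
import java.util.Map;

class EqualityJoin {
    // Nested-loop join: compares every x with every y, |X| * |Y| comparisons.
    static long nestedLoopMatches(int[] xs, int[] ys) {
        long matches = 0;
        for (int x : xs) {
            for (int y : ys) {
                if (x == y) matches++;
            }
        }
        return matches;
    }

    // Hash join: one insert per y, then one lookup per x, |X| + |Y| table operations.
    static long hashJoinMatches(int[] xs, int[] ys) {
        Map<Integer, Integer> countByValue = new HashMap<>();
        for (int y : ys) {
            countByValue.merge(y, 1, Integer::sum);
        }
        long matches = 0;
        for (int x : xs) {
            matches += countByValue.getOrDefault(x, 0);
        }
        return matches;
    }
}
```

Both methods return the same number of matching pairs; only the amount of work differs.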

Page 23: Query-Based Debugging

23

Incremental Delivery

• Show first result early by pushing intermediate results through pipeline

[Diagram: pipeline joining Declaration d and Method m on (d.contains(m)), then CallExpression ce on (ce.decl == m), with intermediate tuples passed along as they are produced.]

Page 24: Query-Based Debugging

24

Incremental Delivery

• Goal: fast response for most queries

• Pipelining
– Joins are separate threads connected in pipeline by limited-size buffers
– Thread blocks on empty input or full output
– Scheduler prefers threads closer to the end of pipeline

• Time-slicing
– Interrupt “slow” threads and reschedule
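The pipelining idea can be sketched with two stages running as threads and a small bounded buffer between them. PipelineSketch, the integer stream, and the END sentinel are assumptions made for illustration; the scheduler preference and time-slicing described above are not modeled, since the example relies on the JVM's default thread scheduling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.IntPredicate;

class PipelineSketch {
    // Sentinel marking the end of the stream; inputs must not contain this value.
    private static final int END = Integer.MIN_VALUE;

    // Runs two filter stages as separate threads joined by a limited-size buffer.
    static List<Integer> run(int[] input, IntPredicate first, IntPredicate second) {
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(4);
        List<Integer> results = new ArrayList<>();

        Thread stage1 = new Thread(() -> {
            try {
                for (int v : input) {
                    if (first.test(v)) buffer.put(v); // blocks while the buffer is full
                }
                buffer.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread stage2 = new Thread(() -> {
            try {
                // take() blocks while the buffer is empty
                for (int v = buffer.take(); v != END; v = buffer.take()) {
                    if (second.test(v)) results.add(v); // a result is available immediately
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        stage1.start();
        stage2.start();
        try {
            stage1.join();
            stage2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }
}
```

Because the second stage consumes tuples as soon as the first stage produces them, the first result can appear long before the first stage has scanned its whole input.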

Page 25: Query-Based Debugging

25

Talk Overview

• Query case study

• Query model

• Implementation of debugger

• Dynamic queries

• Experimental results

• Future work

• Conclusions

Page 26: Query-Based Debugging

26

Gas Tank - Case Study

• Goal: to debug a gas tank simulation applet

• Inter-object constraints
– Molecules should stay inside the gas tank
– Molecules should not occupy the same position

Page 27: Query-Based Debugging

27

Gas Tank - Case Study

• Detecting an error is not enough

• What code led to this error?

• Need dynamic queries!

Blue molecule: x = 20, y = 25
Red molecule: x = 20, y = 25

Page 28: Query-Based Debugging

28

Gas Tank - Case Study

• Dynamic query finds error in move method

    public void move() {
        …
        x += (int)(v*Math.cos(dtor(dir)));
        y += (int)(v*Math.sin(dtor(dir)));
        …

• Fix the error

        y += (int)(v*Math.sin(dtor(dir)));
        if (collided()) handleCollision();

• But debugger still shows an error
• Exclude “atomic” regions

Page 29: Query-Based Debugging

29

Motivation of Dynamic Queries

• Close cause-effect gap between error and its discovery
– Errors are reported as soon as they occur

• Display dynamics of objects’ relationships - visualization

• Perform continuous invariant or assertion checks

Page 30: Query-Based Debugging

30

Dynamic Query Implementation

[Diagram: components of the dynamic query debugger: Java Program, Query String and Change Set, Custom Class Loader, Instrumented Java Program, Custom Debugger Code, Debugger Library Code, Standard Java Virtual Machine, Query Results.]

Page 31: Query-Based Debugging

31

Implementation of Dynamic Queries

• Monitor changes that affect query result

• Invoke debugger when change occurs

• Reevaluate query efficiently - incrementally

Page 32: Query-Based Debugging

32

Change Monitoring

Molecule m1, m2. (m1.x == m2.x) && (m1.y == m2.y) && (m1 != m2)

• When to reevaluate?
– What to monitor?

• Change set - objects and fields affecting result of query
– Domain objects
– Referenced fields
– Objects and fields referenced in methods

Molecule <init>, x, y

Page 33: Query-Based Debugging

33

Instrumentation

Molecule m1, m2. (m1.x == m2.x) && (m1.y == m2.y) && (m1 != m2)

    … x += … ; …

Compile:

    22: iadd
    23: putfield 37
    26: aload_0

Load and Instrument:

    22: iadd
    23: invokestatic debug
    26: aload_0

    public final class DebuggingCode implements RunTimeCode {
        public static void debug(Molecule updatedObject, int newValue) {
            …
            updatedObject.x = newValue;        // replaces putfield 37
            QueryTool.runTool(updatedObject);  // invokes query evaluator
        }
    }

Page 34: Query-Based Debugging

34

Implementation of Monitoring

• Java bytecode instrumented during load time
– Custom class loader
– Uses modified class file handling tools from BCA library

• Creation and deletion of domain objects
– Creation monitored by instrumenting constructors
– Deletion handled by GC - not implemented yet

• Modification of change set fields
– Instrumentation of field assignments
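At source level, the effect of this instrumentation might look roughly like the sketch below: the constructor registers each new domain object, and each assignment to a change set field notifies the debugger. TrackedMolecule, DOMAIN, and the debuggerInvocations counter are hypothetical names; the real system rewrites bytecode at load time rather than editing source, and its debugger interface is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

class TrackedMolecule {
    // Domain collection: every instance created so far (deletion is not tracked here).
    static final List<TrackedMolecule> DOMAIN = new ArrayList<>();
    // Hypothetical stand-in for calls into the query evaluator.
    static int debuggerInvocations = 0;

    private int x;

    TrackedMolecule(int x) {
        this.x = x;
        DOMAIN.add(this);       // what constructor instrumentation would add
        debuggerInvocations++;  // a new domain object may change the query result
    }

    void setX(int newValue) {
        this.x = newValue;      // the original field assignment
        debuggerInvocations++;  // what field-assignment instrumentation would add
    }

    int getX() {
        return x;
    }
}
```

Only fields in the change set need this treatment; assignments to other fields run unmodified.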

Page 35: Query-Based Debugging

35

Efficient Query Reevaluation

• Same techniques as static queries
– Join ordering
– Hash joins

• Incremental reevaluation

• Custom code generation for selection queries

Page 36: Query-Based Debugging

36

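Incremental Reevaluation

Original query: A * B * C

Incremental query: A * B * C

[Figure: join trees for the original and incremental queries, annotated with domain sizes and selectivities (10% and 1%); the incremental query reuses old results.]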
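For the gas tank query, incremental reevaluation can be pictured as checking only the pairs that involve the changed molecule instead of all pairs. The Molecule class and IncrementalCollisionQuery helper below are hypothetical and cover only this two-variable case; they do not reproduce the debugger's general algorithm or its reuse of old results.

```java
import java.util.ArrayList;
import java.util.List;

class Molecule {
    int x, y;
    Molecule(int x, int y) { this.x = x; this.y = y; }
}

class IncrementalCollisionQuery {
    // Constraint from the query: same position, different objects.
    static boolean collide(Molecule m1, Molecule m2) {
        return m1.x == m2.x && m1.y == m2.y && m1 != m2;
    }

    // Full evaluation: checks every ordered pair, |domain|^2 checks.
    static int countAll(List<Molecule> domain) {
        int count = 0;
        for (Molecule m1 : domain) {
            for (Molecule m2 : domain) {
                if (collide(m1, m2)) count++;
            }
        }
        return count;
    }

    // Incremental evaluation: only the changed molecule against the domain, |domain| checks.
    static List<Molecule> collidingWith(Molecule changed, List<Molecule> domain) {
        List<Molecule> result = new ArrayList<>();
        for (Molecule other : domain) {
            if (collide(changed, other)) result.add(other);
        }
        return result;
    }
}
```

After a single field assignment, only tuples containing the updated object can enter or leave the result, which is what makes the smaller check sufficient.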

Page 37: Query-Based Debugging

37

Query Reevaluation Optimizations

Molecule m1, m2. (m1.x == m2.x) && (m1.y == m2.y) && (m1 != m2)

• Same value assignments
– Do not change result - no reevaluation required

• Fast selection queries
– Lean custom code

[Diagram: the assignment x = 5 applied to a Molecule m whose x is already 5.]
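The same-value optimization amounts to a guard in front of the debugger call, as in this sketch. MonitoredMolecule and its reevaluations counter are hypothetical stand-ins for the instrumented class and the query evaluator.

```java
class MonitoredMolecule {
    // Hypothetical stand-in for the number of times the query evaluator is invoked.
    static int reevaluations = 0;

    private int x;

    void setX(int newValue) {
        if (newValue == x) {
            return;          // same value: the query result cannot change, skip reevaluation
        }
        x = newValue;
        reevaluations++;     // value changed: the query must be reevaluated
    }

    int getX() {
        return x;
    }
}
```

The comparison costs far less than a reevaluation, so the guard pays off whenever a program repeatedly stores an unchanged value.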

Page 38: Query-Based Debugging

38

Talk Overview

• Query case study

• Query model

• Implementation of debugger

• Dynamic queries

• Experimental results

• Future work

• Conclusions

Page 39: Query-Based Debugging

39

Static Query Experiments

• Setup: Sun Ultra 2/200 (200 MHz UltraSparc) running modified Self 4.0

• Queries
– Self GUI
– Cecil compiler
– Synthetic stress tests

• Different query structures

Page 40: Query-Based Debugging

40

Static Query Evaluation Time

[Bar chart: time in seconds (axis 0 to 3.5) per query number (1 to 19), broken down into completion time, response time, translation time, and primitive time, for Self GUI, Cecil compiler, and points-and-rectangles queries. Two bars exceed the axis, at 20.7 and 5.9. Annotations: 12 x 146 x 370; 11K x 4.5K hash join; 4.5K x 4.5K; 1804 join; costly selection.]

Page 41: Query-Based Debugging

41

Discussion of Static Query Experiments

• Most queries take less than a second to execute

• Join ordering heuristic performs well

• Hash joins can speed up execution

• Incremental delivery decreases response time

Page 42: Query-Based Debugging

42

Discussion of Results

• Query 17
– 5,000x5,000 = 25,000,000 checks

• Query 18
– Complex, large intermediate results

Page 43: Query-Based Debugging

43

Dynamic Query Experiments

• Implemented in fully portable Java 1.2
• Setup: Sun Ultra 2/2300 (300 MHz UltraSparc II) running Sun Solaris Java 1.2 with JIT compiler

• Queries
– Gas tank
– Decaf compiler
– SPECjvm98 applications: Jess expert system, compress, ray tracer
– Synthetic stress test microbenchmarks

Page 44: Query-Based Debugging

44

Program Slowdown - Selections

• Overhead does not depend on domain size

• Query 4: z.OutCnt < 0; Queries 5-6: z.count() < 0; Query 7: z.costlyMathCount(0)

• Query 12: point.radialDistanceGreaterThan(100M)

[Bar chart: slowdown (axis 0 to 3.5) per query number (1 to 12) for Decaf, Gas tank, Jess, Compress, and Ray tracer. One bar exceeds the axis, at 5.83. Invocation frequency annotations: 1.9M/s and 2.3M/s.]

Page 45: Query-Based Debugging

45

Program Slowdown - Joins

• Practical for infrequent invocations

Program          Size                    Slowdown   Invocation frequency
Gas tank         33x33 hash join         2.13       54K
Decaf            120Kx600 hash join      3.43       25K
Ray tracer       85Kx8K hash join        229        350K
Compress         1x1 hash join           157        1.5M
Compress         1x1 join                77         2.6M
Microbenchmark   1x20 hash join          228        40M
Microbenchmark   1x20 join               930        42M

Page 46: Query-Based Debugging

46

Discussion of Dynamic Query Experiments

• Selections are efficient

• Join queries practical for infrequent evaluations and small query domains

• Can we predict debugger performance for wide class of queries?
– Query execution model

Page 47: Query-Based Debugging

47

Performance Model

Tinstrumented = Toriginal (1 + Tevaluate * Fevaluate)

• Slowdown depends on
– Frequency of debugger invocations
– Selections: Tevaluate = 131 ns - 4.26 µs
– Joins: Tevaluate = 5.7 µs - 546 µs
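The model is simple enough to evaluate directly. The sketch below computes the slowdown factor Tinstrumented / Toriginal for a given evaluation cost and invocation frequency. SlowdownModel is a hypothetical helper, and reading the upper selection cost as 4.26 microseconds is an assumption about the slide's units.

```java
class SlowdownModel {
    // Slowdown factor from the performance model:
    // Tinstrumented / Toriginal = 1 + Tevaluate * Fevaluate
    static double slowdown(double evaluateNanos, double invocationsPerSecond) {
        return 1.0 + evaluateNanos * 1e-9 * invocationsPerSecond;
    }

    public static void main(String[] args) {
        // Cheap selection (130 ns) at 500K invocations per second.
        System.out.printf("%.3f%n", slowdown(130, 500_000));
        // Costly selection (assumed 4.26 microseconds) at 100K invocations per second.
        System.out.printf("%.3f%n", slowdown(4_260, 100_000));
        // Costly selection (assumed 4.26 microseconds) at 500K invocations per second.
        System.out.printf("%.3f%n", slowdown(4_260, 500_000));
    }
}
```

Under these assumptions the three cases come out to factors of about 1.065, 1.426, and 3.13, that is, overheads of roughly 6.5%, 43%, and 213%.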

Page 48: Query-Based Debugging

48

Field Assignment Frequencies

• Microbenchmark: 40M assignments per second
• SPECjvm98 suite
– Max frequency: 1.9M assignments per second in compress
– 95% of fields have < 100K assignments per second

[Figure: two charts over field assignment frequency (0.1 to 2M): cumulative percentage of fields (0 to 100) and number of fields (0 to 250).]

Page 49: Query-Based Debugging

49

Selection Slowdown Estimates

• 500K assignments per second

– 6.5% overhead for Tevaluate = 130 ns

– 213% overhead (3.13x slowdown) for Tevaluate = 4.26 µs

• 95% fields have < 100K assignments per second

– 43% overhead for 4.26 µs selection constraints

[Chart: projected slowdown (0 to 10) versus field assignment frequency (0.1 to 2M) for low-cost and high-cost selection constraints.]

Page 50: Query-Based Debugging

50

Summary of Dynamic Queries

• Selection queries are efficient
– Less than factor 2 slowdown in experiments including stress tests
– Projected less than 43% overhead for most selection queries

• Join queries are efficient for infrequent evaluations
– 2-930 factor slowdown on join queries

Page 51: Query-Based Debugging

51

Related Work

• Extensions to symbolic debuggers

– Limited queries on objects [Sefika et al., Hart et al.]

– Script based visualization of data structures [Duel]

– Data structure animation [HotWire]

– Instance filtering and reference visualization [Look!, DDD]

– Method call visualization [Program Explorer, Object Visualizer]

• Rule-based extensions of OO languages [R++]

• Software visualization [Balsa-Zeus, Tango-Polka, Pavane]

• Database query optimization [Ibaraki and Kameda, Krishnamurthy et al., Swami and Iyer]

Page 52: Query-Based Debugging

52

Future Work

• Functionality extensions
– Support for projection, arbitrary computations
– Supporting on-the-fly debugging
– Distributed query-based debugging
– Safe update points

• Execution optimizations
– Delaying monotonic updates
– Lookup caches

Page 53: Query-Based Debugging

53

Conclusions

• New approach to debugging

– Quick access to sets of interesting objects

– Efficient way to check properties of large groups of objects using single query

– Instant error alert with dynamic queries

• Good performance
– Most static queries execute in one or two seconds

– Most dynamic selection queries slow down programs less than 43%

Page 54: Query-Based Debugging

54

Further Information

• Query-Based Debugging
http://www.cs.ucsb.edu/~raimisl/DQBD.html

OOPSLA’97 and ECOOP’99 papers

• Research
http://www.cs.ucsb.edu/~raimisl/Research.html

[email protected]

Page 55: Query-Based Debugging

55

Static Query Evaluation Time

[Bar chart: time in seconds (axis 0 to 3.5) per query number (1 to 19), broken down into completion time, response time, translation time, and primitive time. Two bars exceed the axis, at 20.7 and 5.9.]

Page 56: Query-Based Debugging

56

Program Slowdown

• Other join queries - 77-229 slowdown

• Microbenchmark

– Selection - 6.4 slowdown

– Hash join - 228 slowdown

– Nested join - 930 slowdown

[Bar chart: slowdown (axis 0 to 3.5) per query number (1 to 14) for Decaf, Gas tank, Jess, Compress, and Ray tracer. One bar exceeds the axis, at 5.83.]

Page 57: Query-Based Debugging

57

Breakdown of Query Overhead

• 76% Evaluation time

• 17% Loading

• 7% Garbage collection (128M heap)

[Stacked bar chart: overhead percentage (0 to 100) per query number (1 to 20), split into loading, GC, first evaluation, and evaluation.]