aspect-based introspection and change analysis for evolving programs [ramse @ ecoop07]

Aspect-based Introspection and Change Analysis for Evolving Programs

Kevin Hoffman, Murali Krishna Ramanathan, Patrick Eugster, Suresh Jagannathan

Kevin Hoffman et al., RAMSE-07 @ ECOOP-07

Outline

Problem overview

Review of dynamic impact analysis

New approach overview

Implementation details

Illustration

Performance measurements

Conclusions


Problem Overview

Evolvable systems should be

Able to easily change their behavior over time

Free from errors introduced by such change

Quick to detect and recover from any such errors


Problem Overview

Quick to detect and recover from errors…

What can you do ahead of time to facilitate analysis when changes introduce errors?

How can you safely analyze a live system when errors have been introduced?

What can be done to make the amount of information that must be understood tractable for those performing root-cause analysis?


Our Contribution

A tool that compares the runs of two different AspectJ 5 programs (or two versions of the same program) and determines with high precision where the computations diverged and converged, while keeping overhead small in the common case



Target Applications

Runtime Component Testing

New component versions can be loaded into the program environment and have a set of test cases run on them. Tool output can be compared to known good traces.

Root-cause analysis of new bugs

After new component versions are introduced and something goes wrong, tool output can be analyzed for discrepancies


Dynamic Impact Analysis

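[Figure: dynamic impact analysis pipeline. For each version (V1, V2), the boxes shown are Source Code, Instrumentation, Compiling/Linking, Instrumentation, and Instrumented Program. Case 1, Case 2, and Case 3 each yield outputs o1 and o2; Post-mortem Analysis of those outputs produces the Differences.]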


Challenges w.r.t. Evolvability

Endpointing

Evolvable programs run for very long periods of time, so they need a robust and flexible way to indicate when a "test case" should begin and end

Efficiency

Runtime overhead of instrumentation may be unacceptable for production use


How Aspects & Reflection Help

Endpointing

Leverage powerful pointcut language of AspectJ to model endpoints

Efficiency

Leverage work put into optimizing AspectJ to minimize runtime overhead

Use pointcuts to precisely instrument only where needed and to avoid "hotspots"


Evolvable-Friendly Tracing

[Figure: tracing pipeline. For each version: Source Code -> Java Compiler -> Java Program -> Load-time Weaver -> Instrumented Program. Inputs shown alongside the weaver: DynTracing Aspect Library and Configuration XML (Instrumentation Pointcuts, Endpointing Pointcuts, Event Pointcuts).]

Evolvable-Friendly Tracing

[Figure: traces collected at Time1 and Time2. Endpoint 1, Endpoint 2, and Endpoint 3 each yield outputs o1 and o2; Post-mortem Analysis of those outputs produces the Differences.]


Implementation—Terminology

Program Trace

Record of all execution events within one thread

Trace Segment

A portion of a program trace denoted by endpoints

Trace Point

An approximation of a trace segment

Represents all pieces of a trace segment for each unique (method call, call stack) pair

A list of events in the order they were first encountered, with a count of how many times each event was encountered
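A trace point as defined above can be sketched as an insertion-ordered map from event to hit count. This is only an illustration in plain Java: the class name TracePoint and its methods are hypothetical and are not taken from the tool.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical trace point: events in first-encounter order, each with a hit counter. */
final class TracePoint {
    // LinkedHashMap keeps insertion order, which here is the order of first encounter.
    private final Map<String, Integer> hits = new LinkedHashMap<>();

    /** Records one occurrence: appends a new event or increments an existing one. */
    void record(String event) {
        hits.merge(event, 1, Integer::sum);
    }

    /** Number of times the event was encountered (0 if never seen). */
    int count(String event) {
        return hits.getOrDefault(event, 0);
    }

    /** Events in the order they were first encountered. */
    List<String> eventsInOrder() {
        return new ArrayList<>(hits.keySet());
    }
}

class TracePointDemo {
    public static void main(String[] args) {
        TracePoint m2 = new TracePoint();
        m2.record("println p3");
        m2.record("m3");
        m2.record("println p4");
        m2.record("m3"); // a repeated event only bumps its counter
        System.out.println(m2.eventsInOrder() + " m3 hits=" + m2.count("m3"));
    }
}
```

Because repeats are folded into a counter, the structure stays small for loops and recursion, which is why a trace point is only an approximation of the trace segment.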



Trace Generation Process

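[Figure: trace generation flow. The DynTracer Aspect contains advice on Instrumentation Pointcuts, advice on Endpointing Pointcuts, and advice on Event Pointcuts, and works with the Callstack Aspect. For an event: Trace active? If so, Find/Create Trace Point. Event already in Trace Point? If so, increment its counter; otherwise, append the event to the TP. The result is the Trace Data for Trace Points.]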


Implementation—Aspects

Callstack Aspect

Uses before/after advice on method execution to adjust a per-thread call stack representation

Uses thisJoinPoint to obtain source location and method signature

DynTracer Aspect

Monitors entrance to and exit from endpoints: cflow(pointcutEP) && !cflowbelow(pointcutEP)

On endpoint entrance, activates tracing for the current thread and prepares data structures

On endpoint exit, writes trace data and cleans up
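The bookkeeping behind these two aspects can be sketched in plain Java. The sketch below is not AspectJ and not the tool's code: enter and exit stand in for before and after advice, one ThreadTraceState instance stands in for one thread's state, and all names are hypothetical. The depth counter mimics the effect of cflow(pointcutEP) && !cflowbelow(pointcutEP), which selects only the outermost endpoint join point.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical per-thread state; a real aspect would keep one of these per thread. */
final class ThreadTraceState {
    private final Deque<String> callStack = new ArrayDeque<>(); // Callstack Aspect role
    private int endpointDepth = 0;                              // DynTracer Aspect role
    private boolean tracing = false;
    final List<String> log = new ArrayList<>();

    /** Stand-in for before advice on method execution. */
    void enter(String signature, boolean isEndpoint) {
        callStack.push(signature);
        if (isEndpoint && endpointDepth++ == 0) {
            tracing = true;                  // outermost entry: activate tracing
            log.add("start " + signature);
        }
    }

    /** Stand-in for after advice on method execution. */
    void exit(boolean isEndpoint) {
        String signature = callStack.pop();
        if (isEndpoint && --endpointDepth == 0) {
            tracing = false;                 // outermost exit: write trace data, clean up
            log.add("end " + signature);
        }
    }

    boolean isTracing() { return tracing; }

    int stackDepth() { return callStack.size(); }
}

class EndpointDemo {
    public static void main(String[] args) {
        ThreadTraceState state = new ThreadTraceState();
        state.enter("handleRequest", true);  // outermost endpoint entry
        state.enter("handleRequest", true);  // nested entry: no new segment
        state.exit(true);
        state.exit(true);                    // outermost exit ends the segment
        System.out.println(state.log);
    }
}
```

Nested or recursive entries into the same endpoint therefore produce a single trace segment rather than one per entry.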



Implementation—Aspects

Advice on event pointcuts

Event is ignored if tracing is not active

Builds trace point identity using context information and top k entries in call stack

Since we’re programmatically gathering context information, context can easily be customized

Finds event in trace point

If not in TP, append event to end of TP

If in TP already, increment counter for event in TP
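The event-handling steps above can be sketched as follows. This is a plain-Java illustration under assumptions: the class EventRecorder, the string form of the identity, and the choice of "current method" as the context information are all hypothetical; the slides only say that the identity combines context information with the top k call stack entries.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical recorder that plays the role of the advice on event pointcuts. */
final class EventRecorder {
    private final int k;             // number of call-stack entries used in the identity
    private boolean active = false;  // set by the endpoint advice
    // trace point identity -> (event -> hit count), both in first-encounter order
    private final Map<String, Map<String, Integer>> tracePoints = new LinkedHashMap<>();

    EventRecorder(int k) { this.k = k; }

    void setActive(boolean active) { this.active = active; }

    /** Identity = context (assumed here: the current method) plus the top k stack entries. */
    static String identity(String method, List<String> callStackTopFirst, int k) {
        int n = Math.min(k, callStackTopFirst.size());
        return method + "@" + String.join("<", callStackTopFirst.subList(0, n));
    }

    /** Stand-in for advice on an event pointcut. */
    void onEvent(String method, List<String> callStackTopFirst, String event) {
        if (!active) return;                            // event ignored when tracing is off
        String id = identity(method, callStackTopFirst, k);
        tracePoints.computeIfAbsent(id, x -> new LinkedHashMap<>())
                   .merge(event, 1, Integer::sum);      // append new event or increment
    }

    Map<String, Map<String, Integer>> tracePoints() { return tracePoints; }
}

class EventRecorderDemo {
    public static void main(String[] args) {
        EventRecorder r = new EventRecorder(1);
        r.setActive(true);
        r.onEvent("m2", List.of("m1"), "println p3");
        r.onEvent("m2", List.of("m1"), "m3");
        r.onEvent("m2", List.of("m1"), "println p4");
        r.onEvent("m2", List.of("m1"), "m3");
        System.out.println(r.tracePoints());
    }
}
```

A larger k separates the same method reached through different callers into distinct trace points, at the cost of more trace points to store and match.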



Implementation—Analysis

Inputs: Version 1 Trace Points and Version 2 Trace Points

Trace Point Matching

Match trace points and for each matched pair:

Compute Longest Common Subsequence (each Trace Point forms one sequence; an element in a sequence is one event)

Write Output for Human Analysis
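The longest-common-subsequence step can be sketched with the textbook dynamic-programming table. This is a generic LCS in plain Java, not the tool's implementation; the class TraceDiff and the output tags ("=" common, "<" only in the first sequence, ">" only in the second) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: LCS-based comparison of two matched trace points. */
final class TraceDiff {
    /** len[i][j] = LCS length of a[i..] and b[j..]. */
    private static int[][] table(List<String> a, List<String> b) {
        int[][] len = new int[a.size() + 1][b.size() + 1];
        for (int i = a.size() - 1; i >= 0; i--) {
            for (int j = b.size() - 1; j >= 0; j--) {
                len[i][j] = a.get(i).equals(b.get(j))
                        ? 1 + len[i + 1][j + 1]
                        : Math.max(len[i + 1][j], len[i][j + 1]);
            }
        }
        return len;
    }

    static int lcsLength(List<String> a, List<String> b) {
        return table(a, b)[0][0];
    }

    /** Tags every event as common ("="), only in a ("<"), or only in b (">"). */
    static List<String> diff(List<String> a, List<String> b) {
        int[][] len = table(a, b);
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() || j < b.size()) {
            if (i < a.size() && j < b.size() && a.get(i).equals(b.get(j))) {
                out.add("= " + a.get(i)); i++; j++;
            } else if (j == b.size()
                    || (i < a.size() && len[i + 1][j] >= len[i][j + 1])) {
                out.add("< " + a.get(i)); i++;
            } else {
                out.add("> " + b.get(j)); j++;
            }
        }
        return out;
    }
}

class TraceDiffDemo {
    public static void main(String[] args) {
        List<String> v1 = List.of("println p3", "m3", "println p4");
        List<String> v2 = List.of("println p3", "println p4", "m3");
        TraceDiff.diff(v1, v2).forEach(System.out::println);
    }
}
```

Events outside the common subsequence mark where the two runs diverged; the next common event marks where they converged again.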


Illustration—Trace Points (A)


Version A Source

1) void m1(){
2)   out.println("p1");
3)   m2(true);
4)   out.println("p2");
5) }
6) void m2(boolean b){
7)   out.println("p3");
8)   if (b) m3();
9)   out.println("p4");
10)  m3();
11) }
12) void m3(){
13)  out.println("p5");
14) }

Version A Trace Points

Trace Point m1

81a9db.1: 2: println p1
fe3511.1: 3: m2 true
d8247b.1: 4: println p2

Trace Point m2

56b1b1.1: 7: println p3
5b0e30.2: 8: m3
7f4bdb.1: 9: println p4

Trace Point m3

55cc39.2: 13: println p5

Reading an event line such as 5b0e30.2: 8: m3 (Version A, Trace Point m2):

5b0e30 is the Sha1 hash (truncated)

.2 is the hit counter for the event

8 is the line # where the event was first encountered

m3 is the event’s data (method name and parameter values)

Illustration—Trace Points (B)


Version B Source

1) void m1(){
3)   m2(true);
5) }
8)   if (!b) m3();
10)  m3();
11) }
12) void m3(){
14) }

Version B Trace Points

Trace Point m1

fe3511.1: 3: m2 true

Trace Point m2

5b0e30.1: 10: m3

Trace Point m3


Illustration—Trace Points (A vs B)


[Figure: Version A and Version B trace points (m1, m2, m3) shown side by side, with a column of diff markers (|, X, <, >) between them.]

Version A Trace Points

Trace Point m1: fe3511.1: 3: m2 true

Trace Point m2: 5b0e30.2: 8: m3

Version B Trace Points

Trace Point m1: fe3511.1: 3: m2 true

Trace Point m2: 5b0e30.1: 10: m3

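The A-versus-B comparison of individual events can be reproduced from the event lines alone. The sketch below parses the hash.count: line: data layout shown in the illustration and reports what changed between two lines with the same hash. The class TraceEvent, the regular expression, and the wording of the report are assumptions for this example, not the tool's output format.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parsed form of one event line, e.g. "5b0e30.2: 8: m3". */
final class TraceEvent {
    private static final Pattern LINE =
            Pattern.compile("([0-9a-f]+)\\.(\\d+): (\\d+): (.*)");

    final String hash;    // truncated hash
    final int hits;       // hit counter
    final int firstLine;  // line # when first encountered
    final String data;    // method name and parameter values

    private TraceEvent(String hash, int hits, int firstLine, String data) {
        this.hash = hash;
        this.hits = hits;
        this.firstLine = firstLine;
        this.data = data;
    }

    static TraceEvent parse(String text) {
        Matcher m = LINE.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("not an event line: " + text);
        }
        return new TraceEvent(m.group(1), Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)), m.group(4));
    }

    /** Summarizes how event b differs from event a. */
    static String describe(TraceEvent a, TraceEvent b) {
        if (!a.hash.equals(b.hash)) return "different events";
        StringBuilder sb = new StringBuilder(a.hash + " (" + a.data + "):");
        if (a.hits != b.hits) {
            sb.append(" hits ").append(a.hits).append(" -> ").append(b.hits);
        }
        if (a.firstLine != b.firstLine) {
            sb.append(" first line ").append(a.firstLine).append(" -> ").append(b.firstLine);
        }
        if (a.hits == b.hits && a.firstLine == b.firstLine) sb.append(" unchanged");
        return sb.toString();
    }
}

class TraceEventDemo {
    public static void main(String[] args) {
        TraceEvent a = TraceEvent.parse("5b0e30.2: 8: m3");   // Version A, Trace Point m2
        TraceEvent b = TraceEvent.parse("5b0e30.1: 10: m3");  // Version B, Trace Point m2
        System.out.println(TraceEvent.describe(a, b));
    }
}
```

For the m3 event this reports that the hit count dropped from 2 to 1 and that the first encounter moved from line 8 to line 10, which is the effect of changing if (b) to if (!b) in m2.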

Efficiency Considerations

Overhead when…

… code is instrumented, but no tracing is being done?

… code is instrumented, but tracing is being done on other threads?

… code is instrumented and tracing is active on the current thread?
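The three cases can be told apart by a layered check, cheapest test first. The slides do not say how the tool implements this, so the sketch below is purely an assumption: the class TraceGuard, the global counter, and the per-thread flag are hypothetical and only illustrate why the three cases can have different costs.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical guard that instrumented code could consult before recording an event. */
final class TraceGuard {
    // Number of threads currently inside an endpoint; 0 means nobody is tracing.
    private static final AtomicInteger ACTIVE_THREADS = new AtomicInteger();
    // Whether the current thread is inside an endpoint.
    private static final ThreadLocal<Boolean> ACTIVE_HERE =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    /** Called on endpoint entrance for the current thread. */
    static void begin() {
        if (!ACTIVE_HERE.get()) {
            ACTIVE_HERE.set(Boolean.TRUE);
            ACTIVE_THREADS.incrementAndGet();
        }
    }

    /** Called on endpoint exit for the current thread. */
    static void end() {
        if (ACTIVE_HERE.get()) {
            ACTIVE_HERE.set(Boolean.FALSE);
            ACTIVE_THREADS.decrementAndGet();
        }
    }

    /** Classifies the current thread into one of the three overhead cases. */
    static String caseForCurrentThread() {
        if (ACTIVE_THREADS.get() == 0) return "no tracing";          // one shared read
        if (!ACTIVE_HERE.get()) return "tracing on other threads";   // plus a thread-local read
        return "tracing on this thread";                             // full event recording follows
    }
}

class TraceGuardDemo {
    public static void main(String[] args) throws InterruptedException {
        System.out.println(TraceGuard.caseForCurrentThread());
        TraceGuard.begin();
        System.out.println(TraceGuard.caseForCurrentThread());
        Thread other = new Thread(
                () -> System.out.println(TraceGuard.caseForCurrentThread()));
        other.start();
        other.join();
        TraceGuard.end();
    }
}
```

Under this assumed design, the first case costs a single read of a shared counter, the second adds a thread-local lookup, and only the third pays for call stack inspection and trace point updates.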



Benchmark #1

Microbenchmark designed to understand actual time overhead per method call


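public void benchmark1(long calls) {
    // Actual benchmark has 2 loops and loop unrolling.
    // The total number of calls (millions to billions) is varied
    // to make the running time similar across configurations.
    long start = System.nanoTime();
    for (long a = 0; a < calls; a++) {
        // do_work takes ints, so the long loop counter is narrowed explicitly
        do_work((int) a, (int) a, (int) a);
    }
    long elapsed = System.nanoTime() - start;
    // Avg. time for method call is (wall clock time) / (# calls)
    System.out.println((double) elapsed / calls + " ns per call");
}

public int do_work(int a, int b, int c) {
    return a + b + c;
}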


Benchmark #1 Results


[Chart: Benchmark #1 results. Labeled values: ~14 cycles, ~49 cycles, ~1145 cycles, ~3336 cycles.]


Benchmark #2

Linpack benchmark used to understand actual impact on application performance

Benchmark is computationally intensive and makes over 100 million method calls

Models worst-case scenario for performance by having calls to simple methods within tight inner loops



Benchmark #2 Results



Conclusions

Java programs transparently instrumented

Overhead is acceptable in many cases

Behavioral differences between two pieces of code can be precisely observed

Endpointing via pointcuts is flexible and uses familiar syntax and semantics

Efficiency can be improved via selective instrumentation, specified via pointcuts


http://www.kevinjhoffman.com/


Future Directions

Accuracy improvements

Use of timestamps

First-N, Last-M trace points

Dynamic pointcut change analysis

Integrating with dynamic AOP

IDE integration (Eclipse plugin)

Case study on large AspectJ programs

Formalism of technique and insights that can be gained therefrom
