Aspect-based Introspection and Change Analysis for Evolving Programs [RAMSE @ ECOOP07]
DESCRIPTION
Software breaks as programs change. We introduce a tracing and analysis framework that leverages aspect-oriented programming and reflection to (a) capture partial or full execution traces with high-level semantics, and (b) compare and analyze traces across program versions to aid in understanding the effects of code changes on program execution.
TRANSCRIPT
Aspect-based Introspection and Change Analysis for Evolving Programs
Kevin Hoffman, Murali Krishna Ramanathan, Patrick Eugster, Suresh Jagannathan
-2-Kevin Hoffman et al, RAMSE-07 @ ECOOP-07
Outline
Problem overview
Review of dynamic impact analysis
New approach overview
Implementation details
Illustration
Performance measurements
Conclusions
Problem Overview
Evolvable systems should be
Able to easily change their behavior over time
Free from errors introduced by such change
Quick to detect and recover from any such errors
Problem Overview
Quick to detect and recover from errors…
What can you do ahead of time to facilitate analysis when changes introduce errors?
How can you safely analyze a live system when errors have been introduced?
What can be done to make the amount of information that must be understood tractable for those performing root-cause analysis?
Our Contribution
A tool that compares the runs of two different AspectJ 5 programs (or two versions of the same program) and determines with high precision where the computations diverged and converged, with small overhead in the common case
Target Applications
Runtime Component Testing
New component versions can be loaded into the program environment and have a set of test cases run on them. Tool output can be compared to known good traces.
Root-cause analysis of new bugs
After new component versions are introduced and something goes wrong, tool output can be analyzed for discrepancies
Dynamic Impact Analysis
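[Diagram: pipeline for each version: Source Code, Instrumentation, Compiling/Linking, Instrumentation, Instrumented Program. Versions V1 and V2 are each run on Case 1, Case 2, and Case 3, giving outputs o1 and o2 per case; Post-mortem Analysis of the outputs yields the Differences.]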
Challenges w.r.t. Evolvability
Endpointing
Evolvable programs run for very long periods of time – need a robust and flexible way to indicate when a "test case" should begin/end
Efficiency
Runtime overhead of instrumentation may be unacceptable for production use
How Aspects & Reflection Help
Endpointing
Leverage powerful pointcut language of AspectJ to model endpoints
Efficiency
Leverage work put into optimizing AspectJ to minimize runtime overhead
Use pointcuts to precisely instrument only where needed and also to avoid "hotspots"
Evolvable-Friendly Tracing
[Diagram: pipeline for each version: Source Code, Java Compiler, Java Program, Load-time Weaver, Instrumented Program. Inputs to the weaver: DynTracing Aspect Library and Configuration XML (Instrumentation Pointcuts, Endpointing Pointcuts, Event Pointcuts).]
Evolvable-Friendly Tracing
[Diagram: at Time1 and Time2, Endpoint 1, Endpoint 2, and Endpoint 3 each yield outputs o1 and o2; Post-mortem Analysis of the outputs yields the Differences.]
Implementation—Terminology
Program Trace
Record of all execution events within one thread
Trace Segment
A portion of a program trace denoted by endpoints
Trace Point
An approximation of a trace segment
Represents all pieces of a trace segment for each unique (method call, call stack) pair
Holds a list of events in order of first encounter, and records the number of times each event was encountered
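The trace point described above (events kept in first-encounter order, each with a hit counter) can be sketched in plain Java. This is only an illustration under assumptions: the class name `TracePoint`, its methods, and the use of plain strings for events are hypothetical and not taken from the tool itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical trace point: events in first-encounter order, each with a hit counter. */
class TracePoint {
    // LinkedHashMap keeps insertion order, i.e. the order of first encounter.
    private final Map<String, Integer> hits = new LinkedHashMap<>();

    /** Appends the event on first encounter, otherwise increments its counter. */
    void record(String event) {
        hits.merge(event, 1, Integer::sum);
    }

    /** Events in the order they were first encountered. */
    List<String> events() {
        return new ArrayList<>(hits.keySet());
    }

    /** Number of times the event was recorded (0 if never seen). */
    int hitCount(String event) {
        return hits.getOrDefault(event, 0);
    }
}

class TracePointDemo {
    public static void main(String[] args) {
        TracePoint m2 = new TracePoint();
        // Events raised while m2(true) runs in Version A of the illustration slides.
        m2.record("println p3");
        m2.record("m3");
        m2.record("println p4");
        m2.record("m3");
        System.out.println(m2.events() + " m3 hits=" + m2.hitCount("m3"));
    }
}
```

The second call to m3 does not add a new entry; it only raises the counter, which is why a trace point is an approximation of the trace segment rather than a full record.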
Trace Generation Process
[Flowchart: the Callstack Aspect holds the advice on Instrumentation Pcuts. The DynTracer Aspect holds the advice on Endpointing Pointcuts and the advice on Event Pointcuts. For each event: Trace active? If so, Find/Create Trace Point, then Event already in Trace Pt? If yes, Incr. Counter; if no, Append Event to TP. The result is the Trace Data for Trace Points.]
Implementation—Aspects
Callstack Aspect
Uses before/after advice on method execution to adjust a per-thread call stack representation
Uses thisJoinPoint to obtain source location and method signature
DynTracer Aspect
Monitors entrance to and exit from endpoints: cflow(pointcutEP) && !cflowbelow(pointcutEP)
On endpoint entrance, activates tracing for the current thread and prepares data structures
On endpoint exit, writes trace data and cleans up
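The pointcut cflow(pointcutEP) && !cflowbelow(pointcutEP) selects only the outermost endpoint execution on a thread, so nested or recursive endpoint entries do not restart tracing. The following plain-Java stand-in imitates that behaviour with a per-thread depth counter. It is a sketch under assumptions, not the DynTracer aspect: `EndpointTracker`, `enter`, `exit`, and the log strings are hypothetical names, and in the real tool the weaver would invoke the equivalent advice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Plain-Java stand-in for endpoint advice; all names here are hypothetical. */
class EndpointTracker {
    // Per-thread nesting depth of endpoint executions.
    private static final ThreadLocal<int[]> DEPTH = ThreadLocal.withInitial(() -> new int[1]);
    // Log of activation and flush actions, kept only for this demonstration.
    static final List<String> LOG = Collections.synchronizedList(new ArrayList<>());

    /** Call on endpoint entrance; tracing starts only at the outermost entry. */
    static void enter(String endpoint) {
        if (DEPTH.get()[0]++ == 0) {
            LOG.add("activate:" + endpoint);
        }
    }

    /** Call on endpoint exit; data is written only when the outermost endpoint returns. */
    static void exit(String endpoint) {
        if (--DEPTH.get()[0] == 0) {
            LOG.add("flush:" + endpoint);
        }
    }

    /** True while the current thread is inside at least one endpoint. */
    static boolean tracingActive() {
        return DEPTH.get()[0] > 0;
    }
}

class EndpointTrackerDemo {
    public static void main(String[] args) {
        EndpointTracker.enter("outer");
        EndpointTracker.enter("inner");   // nested: no second activation
        EndpointTracker.exit("inner");    // nested: nothing is flushed yet
        EndpointTracker.exit("outer");
        System.out.println(EndpointTracker.LOG);
    }
}
```

Because the depth counter is thread-local, a thread that is not inside an endpoint sees tracing as inactive even while other threads are being traced.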
Implementation—Aspects
Advice on event pointcuts
Event is ignored if tracing is not active
Builds trace point identity using context information and top k entries in call stack
Since we're programmatically gathering context information, context can easily be customized
Finds event in trace point
If not in TP, append event to end of TP
If in TP already, increment counter for event in TP
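The steps above can be sketched as follows. The identity format (context, "@", then the top k call stack entries joined by "/") and every name in the block are assumptions made for illustration; the slides do not say how the identity is encoded.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical recorder mirroring the advice on event pointcuts. */
class EventRecorder {
    private final int k;      // number of call stack entries used in the identity
    private boolean active;   // set on endpoint entrance, cleared on endpoint exit
    // Trace point identity -> (event -> hit counter), events in first-encounter order.
    private final Map<String, Map<String, Integer>> tracePoints = new HashMap<>();

    EventRecorder(int k) {
        this.k = k;
    }

    void setActive(boolean active) {
        this.active = active;
    }

    /** Identity = context plus the top k call stack entries (innermost entry last). */
    String identity(String context, List<String> callStack) {
        int from = Math.max(0, callStack.size() - k);
        return context + "@" + String.join("/", callStack.subList(from, callStack.size()));
    }

    void onEvent(String context, List<String> callStack, String event) {
        if (!active) {
            return;  // event is ignored if tracing is not active
        }
        tracePoints
            .computeIfAbsent(identity(context, callStack), id -> new LinkedHashMap<>())
            .merge(event, 1, Integer::sum);  // append on first sight, else increment
    }

    /** Events and counters of one trace point, or an empty map if it does not exist. */
    Map<String, Integer> tracePoint(String id) {
        return tracePoints.getOrDefault(id, Collections.emptyMap());
    }
}

class EventRecorderDemo {
    public static void main(String[] args) {
        EventRecorder r = new EventRecorder(1);
        r.setActive(true);
        // With k = 1, m3 reached via m1 or via m2 maps to the same trace point.
        r.onEvent("m3", Arrays.asList("m1", "m3"), "println p5");
        r.onEvent("m3", Arrays.asList("m2", "m3"), "println p5");
        System.out.println(r.tracePoint("m3@m3"));
    }
}
```

A larger k separates events by calling context at the cost of more trace points; k = 1 merges all callers into one trace point per method.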
Implementation—Analysis
Inputs: Version 1 Trace Points, Version 2 Trace Points
Trace Point Matching
Match trace points and for each matched pair:
Compute Longest Common Subsequence (each Trace Point forms one sequence; an element in a sequence is one event)
Write Output for Human Analysis
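For one matched pair of trace points, the longest common subsequence step can be sketched with the textbook dynamic-programming algorithm. This is a generic LCS, not necessarily the variant the tool uses; the class name `TraceDiff` and the "|", "<", ">" output format are assumptions, and the event identifiers are borrowed from the m2 trace points shown on the illustration slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical LCS-based comparison of two trace points (sequences of event ids). */
class TraceDiff {
    /** Returns one line per event: "| x" in both, "< x" only in v1, "> x" only in v2. */
    static List<String> diff(List<String> v1, List<String> v2) {
        int n = v1.size();
        int m = v2.size();
        // lcs[i][j] = length of the LCS of v1[i..] and v2[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = v1.get(i).equals(v2.get(j))
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
        // Walk the table from the start, emitting matches and one-sided events.
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (v1.get(i).equals(v2.get(j))) {
                out.add("| " + v1.get(i));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("< " + v1.get(i++));
            } else {
                out.add("> " + v2.get(j++));
            }
        }
        while (i < n) {
            out.add("< " + v1.get(i++));
        }
        while (j < m) {
            out.add("> " + v2.get(j++));
        }
        return out;
    }
}

class TraceDiffDemo {
    public static void main(String[] args) {
        List<String> versionA = Arrays.asList("56b1b1", "5b0e30", "7f4bdb");
        List<String> versionB = Arrays.asList("56b1b1", "7f4bdb", "5b0e30");
        for (String line : TraceDiff.diff(versionA, versionB)) {
            System.out.println(line);
        }
    }
}
```

Events outside the common subsequence mark where the two runs diverged, and the next common event marks where they converged again. Comparing hit counters of matched events would be a separate pass that this sketch leaves out.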
Illustration—Trace Points (A)
Version A Source
 1) void m1(){
 2)   out.println("p1");
 3)   m2(true);
 4)   out.println("p2");
 5) }
 6) void m2(boolean b){
 7)   out.println("p3");
 8)   if (b) m3();
 9)   out.println("p4");
10)   m3();
11) }
12) void m3(){
13)   out.println("p5");
14) }
Version A Trace Points
Trace Point m1
  81a9db.1: 2: println p1
  fe3511.1: 3: m2 true
  d8247b.1: 4: println p2
Trace Point m2
  56b1b1.1: 7: println p3
  5b0e30.2: 8: m3
  7f4bdb.1: 9: println p4
Trace Point m3
  55cc39.2: 13: println p5
Fields of a trace point entry, e.g. 55cc39.2: 13: println p5
55cc39: Sha1 hash (truncated)
2: Hit counter for event
13: Line # when first encountered event
println p5: Event's data (method name and parameter vals)
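A truncated SHA-1 identifier of this kind can be produced with the standard `java.security.MessageDigest` API. What exactly the tool feeds into the hash is not stated on the slides; the sketch below assumes it is the event's data string (method name and parameter values), which is at least consistent with the m3 call keeping the identifier 5b0e30 whether it is first seen on line 8 or line 10. The class `EventId`, the six-digit length, and the `entry` formatter are hypothetical, and the identifiers it prints are not expected to reproduce the ones on the slides.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that derives short event identifiers from event data. */
class EventId {
    /** First hexChars hex digits of the SHA-1 digest of the event data. */
    static String of(String eventData, int hexChars) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1")
                    .digest(eventData.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.substring(0, hexChars);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Formats an entry in the layout used on the slides: id.hits: line: data */
    static String entry(String eventData, int hits, int firstLine) {
        return of(eventData, 6) + "." + hits + ": " + firstLine + ": " + eventData;
    }
}

class EventIdDemo {
    public static void main(String[] args) {
        System.out.println(EventId.entry("println p5", 2, 13));
    }
}
```

Hashing only the event data makes identifiers stable across versions even when line numbers shift, which is what lets the same event be matched between two traces.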
Illustration—Trace Points (B)
Version B Source
 1) void m1(){
 2)   out.println("p1");
 3)   m2(true);
 4)   out.println("p2");
 5) }
 6) void m2(boolean b){
 7)   out.println("p3");
 8)   if (!b) m3();
 9)   out.println("p4");
10)   m3();
11) }
12) void m3(){
13)   out.println("p5");
14) }
Version B Trace Points
Trace Point m1
  81a9db.1: 2: println p1
  fe3511.1: 3: m2 true
  d8247b.1: 4: println p2
Trace Point m2
  56b1b1.1: 7: println p3
  7f4bdb.1: 9: println p4
  5b0e30.1: 10: m3
Trace Point m3
  55cc39.1: 13: println p5
Illustration—Trace Points (A vs B)
Version A Trace Points          Version B Trace Points
Trace Point m1                  Trace Point m1
81a9db.1: 2: println p1     |   81a9db.1: 2: println p1
fe3511.1: 3: m2 true        |   fe3511.1: 3: m2 true
d8247b.1: 4: println p2     |   d8247b.1: 4: println p2
Trace Point m2                  Trace Point m2
56b1b1.1: 7: println p3     |   56b1b1.1: 7: println p3
5b0e30.2: 8: m3             <
7f4bdb.1: 9: println p4     |   7f4bdb.1: 9: println p4
                            >   5b0e30.1: 10: m3
Trace Point m3                  Trace Point m3
55cc39.2: 13: println p5    X   55cc39.1: 13: println p5
Efficiency Considerations
Overhead when…
… code is instrumented, but no tracing is being done?
… code is instrumented, but tracing is being done on other threads?
… code is instrumented and tracing is active on the current thread?
Benchmark #1
Microbenchmark designed to understand actual time overhead per method call
public void benchmark1(long calls){
    //Actual benchmark has 2 loops and loop unrolling
    //Total number of calls (millions to billions) varied to make running time similar
    for (long a=0; a<calls; a++){
        //do_work takes ints, so the long loop counter is narrowed here
        do_work((int)a,(int)a,(int)a);
    }
    //Avg. time for method call is (wall clock time) / (# calls)
}
public int do_work(int a, int b, int c){
    return a+b+c;
}
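The measurement in the closing comment, wall clock time divided by the number of calls, could be taken roughly as follows. This is a simplified sketch, not the authors' harness: it has a single loop without unrolling, the names `CallOverheadBenchmark`, `doWork`, and `averageNanosPerCall` are hypothetical, and it reports nanoseconds rather than CPU cycles. A JIT compiler may inline or partly optimise away such a trivial call, so a dedicated harness such as JMH would be the safer choice for real numbers.

```java
/** Hypothetical timing harness for the per-call overhead microbenchmark. */
class CallOverheadBenchmark {
    // Results are published here so the compiler cannot discard the calls outright.
    static volatile long sink;

    static int doWork(int a, int b, int c) {
        return a + b + c;
    }

    /** Average wall-clock nanoseconds per call over the given number of calls. */
    static double averageNanosPerCall(long calls) {
        long sum = 0;
        long start = System.nanoTime();
        for (long a = 0; a < calls; a++) {
            int x = (int) a;  // narrow the long loop counter for the int parameters
            sum += doWork(x, x, x);
        }
        long elapsed = System.nanoTime() - start;
        sink = sum;
        return (double) elapsed / calls;
    }

    public static void main(String[] args) {
        averageNanosPerCall(10_000_000L);  // warm-up run so the JIT has compiled the loop
        System.out.println(averageNanosPerCall(100_000_000L) + " ns per call");
    }
}
```

Running the same loop against uninstrumented and instrumented builds and subtracting the averages gives the per-call overhead of the instrumentation.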
Benchmark #1 Results
[Bar chart: time overhead per method call for four configurations, whose labels are not preserved: ~14 cycles, ~49 cycles, ~1145 cycles, ~3336 cycles.]
Benchmark #2
Linpack benchmark used to understand actual impact on application performance
Benchmark is computationally intensive and makes over 100 million method calls
Models worst-case scenario for performance by having calls to simple methods within tight inner loops
Benchmark #2 Results
Conclusions
Java programs transparently instrumented
Overhead is acceptable in many cases
Behavioral differences between two pieces of code can be precisely observed
Endpointing via pointcuts is flexible and uses familiar syntax and semantics
Efficiency can be improved via selective instrumentation, specified via pointcuts
http://www.kevinjhoffman.com/
Future Directions
Accuracy improvements
Use of timestamps
First-N, Last-M trace points
Dynamic pointcut change analysis
Integrating with dynamic AOP
IDE integration (Eclipse plugin)
Case study on large AspectJ programs
Formalism of technique and insights that can be gained therefrom