A UNIFIED FRAMEWORK FOR THE COMPREHENSION OF SOFTWARE’S
TIME DIMENSION
By Omar Benomar
Collaborators: Dr. Houari Sahraoui, Dr. Pierre Poulin, Dr. Hani Abdeen
and Mohamed Aymen Saied
Work to be presented at ICSE-NIER’15
Overview
• Introduction
• Time Representation in Software Comprehension
• Unification Approach
• Application: Entity Collaboration
• Application: Phase Identification
• Conclusion
Introduction
• Software maintenance
• Modification after deployment
• Longest in terms of time
• Resource consuming
• 50-80% of total development cost (Coleman et al. 1994)
• Different types
• Comprehension task
• 50% of maintenance effort (Corbi et al. 1989)
• Several software dimensions
• Mental model of software
www.mashable.com
Intro Time Repr. Unification Entity Collab. Phase Identif. Conclusion
Introduction
• Comprehension even more central in software development
• Software development
• Teams in different locations
• Long period of time: team restructuring
• Rarely from scratch
• Software engineering tools and techniques
• Two research communities
• Evolution comprehension
• Execution comprehension
www.ciklum.com
Time Representation
• Software comprehension involving time
• Study of evolution
• Analysis of execution
• Time representation
• Visualization
• Axis
• Graphical attribute
• Animation
• Automatic approaches
• Sequence of events
• Comprehension models
Wu et al. 2004
Bohnet et al. 2009
Wettel and Lanza 2008
Isaacs et al. 2014
Langelier et al. 2008
Dugerdil and Alam 2008
Xing and Stroulia 2004
Pirzadeh et al. 2010
Girba and Ducasse 2005
Lienhard et al. 2007
Time Representation
• Considered from two different contexts
• Studied in separate manners
• Belong to different research communities
• Different in appearance
• Close examination reveals similarities
• Unifying perspective not studied before
Research proposition
“Software comprehension problems involving time dimension should be analyzed using a unified framework, which enables
easier communication of solutions between evolution comprehension and execution comprehension research”
Approach Overview
Comprehension of Software’s Time Dimension (diagram)
• Execution Comprehension and Evolution Comprehension, joined by Unification
• Common Comprehension Model, used to express two problems
• Entity Collaboration, addressed with Software Visualization
• Heat Maps for Analysis of Classes’ Roles in Use-Case Scenarios
• Heat Maps for Developers’ Contributions
• Phase Identification, addressed with Search-Based Optimization
• Genetic Algorithm for Execution Phases Detection
• Genetic Algorithm for Evolution Phases Identification
Application: Entity Collaboration
Comprehension of Software’s Time Dimension (diagram)
• Execution Comprehension and Evolution Comprehension, joined by Unification
• Common Comprehension Framework, used to express Entity Collaboration
• Software Visualization
• Heat Maps for Analysis of Classes’ Roles in Use-Case Scenarios
• Heat Maps for Developers’ Contributions
Entity Collaboration
Evolution
• Software history
• Developers’ contributions to classes
• Commits
• Source code changes
• Visual analysis
• Changes by each developer
• Combination of multiple developers’ contributions
Execution
• Execution trace of use cases
• Classes’ roles in use cases
• Method executions
• Class activity
• Visual analysis
• Degree of class activity in use case
• Combination of classes’ activities
Problem representation
• Sequence
• Event
• Entity (subject or object)
• Property
• Entity contribution
• EC Aggregation
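The problem representation above can be sketched as a small data model. This is only an illustrative sketch, not the authors' implementation: the names `Event`, `entity_contribution` and `aggregate`, the use of an event count as the contribution measure, and the toy sequence are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    """One element of the sequence (a commit or a method execution)."""
    subject: str            # acting entity: a developer or a use case (assumed naming)
    obj: str                # entity acted upon: a class
    properties: tuple = ()  # optional (name, value) pairs describing the event

def entity_contribution(sequence, subject):
    """Count, for each object, the events in which `subject` is involved."""
    return Counter(e.obj for e in sequence if e.subject == subject)

def aggregate(contributions):
    """EC aggregation, assumed here to be a plain sum of contributions."""
    total = Counter()
    for contribution in contributions:
        total.update(contribution)
    return total

# Toy sequence; the counts are invented for illustration only.
sequence = [
    Event("Bob", "ZoomDrawingView"),
    Event("Bob", "JavaDrawApp"),
    Event("Bob", "JavaDrawApp"),
    Event("Jane", "JavaDrawApp"),
    Event("Jane", "JavaDrawApplet"),
]
bob = entity_contribution(sequence, "Bob")
jane = entity_contribution(sequence, "Jane")
both = aggregate([bob, jane])
```

The same three functions would apply to an execution trace by treating a use case as the subject and the executing class as the object.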
Heat Map in VERSO
• VERSO
• 3D graphical representation: classes, packages
• Heat map
• 2D graphical elements
• Heat distribution metaphor
• Mapping entity contribution to colors
• Heat map color ramp must not interfere with existing colors
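Mapping an entity contribution to a color can be illustrated with a linear color ramp. The sketch below is an assumption made for illustration: the slides do not give VERSO's actual ramp, and the function name `heat_color`, the yellow-to-red endpoints and the normalization by the maximum contribution are hypothetical.

```python
def heat_color(contribution, max_contribution,
               cold=(255, 255, 0), hot=(255, 0, 0)):
    """Map a contribution to an RGB color by linear interpolation.

    Returns None for an entity with no contribution, so that the entity
    keeps its existing color (no heat map color).
    """
    if contribution <= 0 or max_contribution <= 0:
        return None
    t = min(contribution / max_contribution, 1.0)  # position on the ramp, in (0, 1]
    return tuple(round(c + t * (h - c)) for c, h in zip(cold, hot))
```

Returning `None` for zero contribution mirrors the idea that uninvolved classes receive no heat map color at all.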
Heat Map Example
• 1 package
• 3 classes: A, B, C
• Class A not involved in any event (no heat map color)
• Class B contribution larger than that of class C
Heat Map Visualization
• Element placement
• Concrete representations
• Elements have fixed natural positions
• Software is intangible
• Elements do not have predefined positions
• Element placement to ease the visual analysis
• Similar heat map colors closer
Element Placement
• Treemap layout
• Provides 2 degrees of freedom: sibling classes and sibling packages
• Layout optimization by simulated annealing
• Solution = element placement
• Neighbour solution = swapping of sibling packages and classes (exploits the 2 degrees of freedom)
• Fitness function = Manhattan distances between interesting classes
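The placement optimization can be sketched with a generic simulated annealing loop. This is a simplified illustration, not the VERSO implementation: the treemap is replaced by a grid (one row per package, one column per class), and the cooling schedule, the parameter values and all function names are assumptions.

```python
import math
import random
from itertools import combinations

def positions(layout):
    """Grid position (x, y) of each class: row = package, column = class."""
    return {c: (x, y) for y, (_, classes) in enumerate(layout)
            for x, c in enumerate(classes)}

def cost(layout, interesting):
    """Sum of pairwise Manhattan distances between interesting classes."""
    pos = positions(layout)
    return sum(abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1])
               for a, b in combinations(sorted(interesting), 2))

def neighbour(layout, rng):
    """Swap two sibling packages, or two sibling classes of one package."""
    new = [(p, list(cs)) for p, cs in layout]
    if rng.random() < 0.5 and len(new) > 1:
        i, j = rng.sample(range(len(new)), 2)
        new[i], new[j] = new[j], new[i]
    else:
        candidates = [cs for _, cs in new if len(cs) > 1]
        if candidates:
            cs = rng.choice(candidates)
            i, j = rng.sample(range(len(cs)), 2)
            cs[i], cs[j] = cs[j], cs[i]
    return new

def anneal(layout, interesting, steps=2000, t0=5.0, alpha=0.995, seed=0):
    """Minimize `cost`; worse neighbours are accepted with a decaying probability."""
    rng = random.Random(seed)
    current = best = layout
    temp = t0
    for _ in range(steps):
        cand = neighbour(current, rng)
        delta = cost(cand, interesting) - cost(current, interesting)
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = cand
            if cost(current, interesting) < cost(best, interesting):
                best = current
        temp = max(temp * alpha, 1e-3)  # geometric cooling with a floor
    return best
```

Because classes are only swapped with siblings and packages with sibling packages, the containment hierarchy is preserved while interesting classes drift closer together.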
Element Placement Optimization
Initial solution (random) Optimized solution
Visual Analysis
• Heat map
• Entity contribution representation or an aggregation
• Interactive visualization features
• 3D camera: zooming, scene navigation, synchronization
• Visual cluttering options: 3D box properties, invisible boxes
• Color scale manager: color filtering, color re-mapping, histogram
• Complex analysis tasks with heat maps comparison
• Color weaving
• Flipping
• Multiple windows
Filtering Options
Filtering Re-mapping
Case Studies
• Case study 1: JHotDraw GUI framework
• Evolution visualization
• 4 versions: 5.2, 5.3, 5.4.1 and 6.0.1
• 171 to 498 classes, 14 to 36 packages
• 8 developers
• Case study 2: Pooka email client
• Execution visualization
• 301 classes, 32 packages
• 37 execution traces
• 3 use cases: Read email, Inbox actions, Search mail
Visual Analysis Example (Evolution)
• Developers’ questions (Fritz and Murphy 2010):
A. Who is working on what?
B. Who has made changes to my classes?
C. Which classes have been changed most?
• Generate developers’ contribution heat maps, one per developer (questions A and B)
• Generate an aggregate heat map for all developers’ contributions (question C)
Visual Analysis Example (Evolution)
• Bob’s contribution to JHotDraw 5.4.1
• Jane’s contribution to JHotDraw 5.4.1
• Heat map comparison to reveal collaborations
• Examples
• Class ZoomDrawingView in package contrib.zoom (Bob only)
• Class JavaDrawApplet in package applet (Jane only)
• Class JavaDrawApp in package samples.javadraw (Bob more than Jane)
Visual Analysis Example (Execution)
• Identification of classes responsible for email attachment management
• Scenario 1: Open email + close email
• Scenario 2: Open email + open attachment + close email
• 31 classes involved in Scenario 2 out of 301 in Pooka
• Narrowed down to 6 classes
• New classes: Attachment, AttachmentPane and AttachmentBundle
• More active: Pooka, PookaManager, MessageInfo
Application: Phase Identification
Comprehension of Software’s Time Dimension (diagram)
• Execution Comprehension and Evolution Comprehension, joined by Unification
• Common Comprehension Framework, used to express Phase Identification
• Search-Based Optimization
• Genetic Algorithm for Execution Phases Detection
• Genetic Algorithm for Evolution Phases Identification
Phase Identification
Evolution
• Software history
• Development stages of software
• Commits to classes
• Changes to source code over time
• Heuristic search
• Similarity of changes within phase commits
• Dissimilarity of changes between two successive phases
Execution
• Execution trace
• Execution parts relating to features
• Method executions
• Object lifetimes
• Heuristic search
• Similar objects activity within a phase
• Different objects from one phase to another
Problem representation
• Sequence
• Event
• Entity (subject or object)
• Property
• Phase
Optimization Problem
• Phases
• Subsets of consecutive sequence events
• Phase identification
• Search optimal decomposition of sequence events
• Solution
• Partition of sequence events into phases
• Near-optimal solution
• Solution parts relate to meaningful phases
Search Space
• All possible decompositions of sequence of events into phases
• Solutions can have any number of phases
• $C_l^k$ possible solutions:
• $l$ is the number of potential phase switching positions (in the order of the number of events)
• $k$ is the actual number of phase transitions
• Genetic algorithm to find the best solution to phase identification problem
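The size of this search space can be checked numerically. The sketch below counts the ways of choosing k phase transitions among l candidate positions; summing over every k to obtain the total is an inference from the statement that solutions can have any number of phases, and the function names are hypothetical.

```python
from math import comb

def solutions_with_k_transitions(l, k):
    """Number of ways to choose k cut positions among l candidates."""
    return comb(l, k)

def total_solutions(l):
    """All decompositions, for any number of transitions from 0 to l."""
    return sum(comb(l, k) for k in range(l + 1))
```

The total equals 2 to the power l, which grows far too quickly for exhaustive search on traces with tens of thousands of events and motivates a heuristic search.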
Solution Encoding
• Solutions
• Vector of integers
• Event index (cut position)
• Cut position
• Execution context
• Method return followed by method call
• Evolution context
• Last commit of each development day
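Decoding such a vector of cut positions into phases might look like the following sketch. The convention that a cut position c means a new phase starts at event index c is an assumption, as is the function name `phases_from_cuts`.

```python
def phases_from_cuts(events, cuts):
    """Split `events` into consecutive phases at the given cut positions.

    A cut position c is taken to mean that a new phase starts at events[c].
    """
    bounds = [0] + sorted(set(cuts)) + [len(events)]
    return [events[a:b] for a, b in zip(bounds, bounds[1:])]
```

For example, cutting a seven-event sequence at positions 2 and 5 yields three phases of two, three and two events.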
Genetic Algorithm
• Initial population of solutions is created randomly
• Different number of phases
• No maximum number of phases
• Each iteration
• New population
• Elect fittest solution (Elitism)
• Select good parents for reproduction (Selection)
• Roulette-wheel
• Tournament selection
• Generate solutions
• Existing genetic material (Crossover)
• New genetic material (Mutation)
• Fitness based on heuristics
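The iteration described above can be sketched as a generic loop in which the fitness function and the genetic operators are supplied by the caller. This is an illustrative sketch under assumptions: it uses tournament selection only, and the parameter values and the names `tournament` and `evolve` are hypothetical, not taken from the authors' tool.

```python
import random

def tournament(population, fitness, rng, size=2):
    """Tournament selection: the best of `size` randomly drawn solutions."""
    return max(rng.sample(population, size), key=fitness)

def evolve(population, fitness, crossover, mutate,
           generations=50, mutation_rate=0.2, seed=0):
    """Generic GA loop with elitism; returns the fittest solution found."""
    rng = random.Random(seed)
    for _ in range(generations):
        elite = max(population, key=fitness)
        new_population = [elite]                       # elitism
        while len(new_population) < len(population):
            p1 = tournament(population, fitness, rng)  # selection
            p2 = tournament(population, fitness, rng)
            for child in crossover(p1, p2, rng):       # existing genetic material
                if rng.random() < mutation_rate:
                    child = mutate(child, rng)         # new genetic material
                new_population.append(child)
        population = new_population[:len(population)]
    return max(population, key=fitness)
```

Because the fittest solution is always copied into the next population, the best fitness never decreases from one iteration to the next.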
Heuristics for Phase Identification
Evolution
• Based on structural changes
• Two successive phases have minimum common committed classes
• Two successive phases have minimum common types of changes
• Classes undergo similar changes within a phase
Execution
• Based on object lifetimes
• Two successive phases have minimum common active objects
• Few objects come from previous phases
• Objects are created at the beginning of a phase and destroyed before its end
Fitness Function
• Heuristics translated to measurable properties
Evolution
• Entity Coupling: Ratio of common committed entities in two phases (ETCp)
• Change-Type Coupling: Ratio of common change types between two phases (CTCp)
• Change-Importance Cohesion: Variance of change importance in a phase (CICh)
• Development-Rate Cohesion: Variance of mean time between phase commits (DRCh)
Execution
• Object_metric: Score for objects’ lifetimes within two phases
• Phase_coupling: Ratio of common active objects in two phases
• Thin_cut: Ratio of objects traversing the cut position
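The coupling metrics are described as ratios of common entities between two phases. One plausible reading, shown below, is a Jaccard index averaged over successive phases. The exact definitions are not given in the slides, so the formula and the function names are assumptions.

```python
def common_ratio(a, b):
    """Share of entities common to two phases (Jaccard index, assumed)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def mean_successive_coupling(phases):
    """Average common_ratio over each pair of successive phases."""
    pairs = list(zip(phases, phases[1:]))
    if not pairs:
        return 0.0
    return sum(common_ratio(a, b) for a, b in pairs) / len(pairs)
```

Under this reading, a good decomposition has a low coupling value, since successive phases should share few committed classes or active objects.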
Fitness Function (evolution)
• Evaluation of a given solution’s quality
where $sol$ is the solution (history partition) to be evaluated
• Metric values have different ranges and are not normalized
• Use geometric mean to compare solutions’ qualities
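The reason a geometric mean suits metrics with different ranges can be shown with a short sketch. It assumes each metric value is positive and already oriented so that larger is better; how the four evolution metrics are actually combined is not recoverable from the slides, and the sample values are invented.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive metric values."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one positive value")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Two hypothetical solutions scored on two metrics with very different ranges.
sol_a = [2.0, 800.0]
sol_b = [4.0, 300.0]
a_is_better = geometric_mean(sol_a) > geometric_mean(sol_b)

# Rescaling the second metric multiplies both means by the same factor,
# so the ranking of the two solutions is unchanged.
scaled_a = [2.0, 8.0]
scaled_b = [4.0, 3.0]
a_still_better = geometric_mean(scaled_a) > geometric_mean(scaled_b)
```

An arithmetic mean would instead be dominated by whichever metric happens to have the largest range.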
Fitness Function (execution)
• Evaluation of a given solution’s quality
where $sol$ is the solution (trace partition) to be evaluated and $a$, $b$ and $c$ are weights assigned to each component
• All the values are normalized to $[0, 1]$ and the algorithm maximizes the solution’s fitness
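A weighted combination of three normalized components could be sketched as follows. The weighted-average form is an assumption, since the equation itself is missing from the transcript; the sketch also assumes each component has already been normalized to [0, 1] and oriented so that higher is better.

```python
def execution_fitness(object_metric, phase_coupling, thin_cut,
                      a=1.0, b=1.0, c=1.0):
    """Weighted average of three components, each assumed to lie in [0, 1]."""
    total_weight = a + b + c
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return (a * object_metric + b * phase_coupling + c * thin_cut) / total_weight
```

Dividing by the sum of the weights keeps the result in [0, 1], so fitness values remain comparable when the weights are tuned.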
Crossover Operator
• Single point crossover
• Randomly pick a cut position in the two parent solutions
• New cut position produces four parts
• Perform crossover between the two parents
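On solutions encoded as sorted vectors of cut positions, the steps above can be sketched like this. The way the random position is drawn and the function name are assumptions; the slides only state that one position splits the two parents into four parts that are then exchanged.

```python
import random

def single_point_crossover(parent1, parent2, n_events, rng=random):
    """Single-point crossover on sorted cut-position vectors."""
    point = rng.randrange(1, n_events)        # random position in the sequence
    left1 = [c for c in parent1 if c < point]
    right1 = [c for c in parent1 if c >= point]
    left2 = [c for c in parent2 if c < point]
    right2 = [c for c in parent2 if c >= point]
    # Exchange the right-hand parts of the two parents.
    return left1 + right2, left2 + right1
```

Because every left part lies before the crossover point and every right part at or after it, both children remain sorted and can have a different number of phases than their parents.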
Mutation Operator
• Three different mutations with equal probability
• Add a new cut position (A1)
• Remove a cut position (A2)
• Perturb a cut position (A3)
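The three mutations can be sketched on the same cut-position encoding. Two simplifications are assumed here: every index from 1 to n_events - 1 is treated as a valid cut position, and a perturbed cut moves to any free position, since the slides do not say how far a cut may move.

```python
import random

def mutate(cuts, n_events, rng=random):
    """Apply one of three mutations, chosen with equal probability."""
    cuts = sorted(set(cuts))
    free = [p for p in range(1, n_events) if p not in cuts]
    kind = rng.choice(["add", "remove", "perturb"])
    if kind == "add" and free:
        cuts.append(rng.choice(free))       # add a new cut position
    elif kind == "remove" and cuts:
        cuts.remove(rng.choice(cuts))       # remove a cut position
    elif kind == "perturb" and cuts and free:
        cuts.remove(rng.choice(cuts))       # move one cut elsewhere
        cuts.append(rng.choice(free))
    return sorted(cuts)
```

Adding and removing cuts changes the number of phases, while perturbing a cut only shifts a phase boundary.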
Evaluation
Evolution
• Applications: ArgoUML, JFreeChart, ICEFaces
• Data: 2 to 11 years of development, 882 to 9150 commits, 1088 to 2748 classes, 4 to 15 releases
• Tool: ChangeDistiller [1] for structural diff, Subversion
Execution
• Applications: JHotDraw, Pooka
• Data: 7 execution scenarios, 63162 to 105069 execution events (e.g., Initialization, Open File, Draw Circle, Save File)
• Tool: JVMTI [2] API to record method entry/exit
[1] http://www.ifi.uzh.ch/seal/research/tools/changeDistiller.html
[2] http://docs.oracle.com/javase/6/docs/technotes/guides/jvmti
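Evaluation
Evolution
• Reference: Official release dates
• Measures
• Distance: days between computed and reference evolution events
• Stability: similarity between computed solutions
• Recall of discovered release events
Execution
• Reference: Tags representing end of each feature in the trace
• Measures
• Precision and recall of phases
• Precision and recall of events
$\mathit{precision}_{event}(DE, AE) = \frac{|DE \cap AE|}{|DE|}$, $\mathit{recall}_{event}(DE, AE) = \frac{|DE \cap AE|}{|AE|}$
$\mathit{precision}_{phase} = \frac{|\mathit{Detected} \cap \mathit{Actual}|}{|\mathit{Detected}|}$, $\mathit{recall}_{phase} = \frac{|\mathit{Detected} \cap \mathit{Actual}|}{|\mathit{Actual}|}$
where $DE$ and $AE$ denote the detected and actual events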
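Precision and recall over detected and actual items can be computed with set operations, as in the sketch below. It assumes that a detected event or phase counts as correct only when it matches an actual one exactly; the matching rule used in the study is not stated in the slides.

```python
def precision_recall(detected, actual):
    """Precision and recall of `detected` items against `actual` items."""
    detected, actual = set(detected), set(actual)
    common = detected & actual
    precision = len(common) / len(detected) if detected else 0.0
    recall = len(common) / len(actual) if actual else 0.0
    return precision, recall
```

The same function applies to events (cut positions) and to phases, provided each phase is represented by a hashable value such as a (start, end) pair.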
Results (Evolution)
• Different runs of the algorithm generate similar solutions
• Most software releases discovered (45/49)
• Software release events identified with good accuracy (days)
Results (Execution)
• Detected most phases with high precision
• Initialization phase difficult to detect, better results when omitted
Conclusion
• Propose a unified view of software comprehension problems involving time dimension
• Define a common comprehension framework
• Express entity collaboration comprehension with our model in both contexts
• Evolution: Developers’ contributions to software evolution
• Execution: Classes’ roles in executions of use-case scenarios
• Use software visualization with the same metaphor for analysis and comprehension
• Express phase identification problem with our model in both contexts
• Evolution: Phases in terms of development activities
• Execution: Phases relating to high-level software features
• Use genetic algorithm to detect both evolution and execution phases
Future Work
• Specific research directions
• User study to evaluate heat map visualization
• Object profiles in execution traces
• Evolution phases patterns across different software releases
• Future perspectives
• Use of unified framework to express and analyze
• Debugging and profiling in evolution
• ‘Co-change’ in execution
Publications
• O. Benomar, H. Sahraoui, and P. Poulin. A unified framework for the comprehension of software’s time dimension. In International Conference on Software Engineering, (to appear) ICSE-NIER, 2015.
• O. Benomar, H. Sahraoui, and P. Poulin. Visualizing software dynamicities with heat maps. In Working Conference on Software Visualization, VISSOFT, 2013.
• O. Benomar, H. Sahraoui, and P. Poulin. Detecting program execution phases using heuristic search. In International Symposium on Search-Based Software Engineering, SSBSE, 2014.
• O. Benomar, H. Abdeen, H. Sahraoui, P. Poulin and M. A. Saied. Detection of software evolution phases based on development activities. In International Conference on Program Comprehension, (to appear) ICPC, 2015.