How History Justifies System Architecture (or not)
Presented at IWPSE 2003.
1/12
International Workshop on Principles of Software Evolution · Helsinki, Finland, 1 September 2003
How History Justifies System Architecture (or not)
Thomas Zimmermann (with Stephan Diehl and Andreas Zeller)
Lehrstuhl Softwaretechnik, Universität des Saarlandes, Saarbrücken, Germany
2/12
The Problem
Your task: extend the debug component in GCC!
You identify the variable xcoff_debug_hooks.
What else do you need to change?
General issue: only change coupled entities!
You can detect existing coupling by
• Program Analysis—e.g. def-use associations.
• Learning from History—entities changed together.
3/12
Evolutionary Coupling
[Figure: evolutionary coupling graph for GCC. Bracketed numbers and edge labels are as shown on the slide.
Files: gcc/gcc/dbxout.c [134] and gcc/gcc/sdbout.c [74], edge label 34.
Variables: dbx_debug_hooks [12], xcoff_debug_hooks [10], sdb_debug_hooks [12], edge labels 12, 10, 10.
Further entities shown with xcoff_debug_hooks: sdb_global_decl() [4], dbx_functions_end() [7], dbx_symbol_name() [6], edge labels 4, 4, 4, 2.]
Support: How much evidence (= simultaneous changes)?
Confidence: How relevant is coupling for participants?
4/12
What We Do
Our ROSE (Reengineering Of Software Evolution) prototype analyzes the evolution of CVS archives.
[Figure: ROSE pipeline. Input: CVS. Step 1: Restore Transactions from CVS. Step 2: Identify Modified Entities. Outputs: Metrics, Couplings, Graphs.]
ROSE determines entities at different granularities:
• coarse-grained entities: directories, modules, files
• fine-grained entities: methods, variables, sections
5/12
Step 1: Restoring Transactions
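Two atomic changes $\delta_i$ and $\delta_{i+1}$ are part of one transaction $\Delta = (\delta_1, \ldots, \delta_n)$ if:
$$\mathrm{author}(\delta_i) = \mathrm{author}(\delta_{i+1}) \;\wedge\; \text{log message}(\delta_i) = \text{log message}(\delta_{i+1}) \;\wedge\; |\mathrm{time}(\delta_{i+1}) - \mathrm{time}(\delta_i)| < 200\ \text{seconds}$$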
We use a sliding window instead of a fixed one.
GNU C Compiler (GCC):
• The average transaction length is 6.2 seconds.
• The maximal transaction length is 1 hour 32 minutes.
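The grouping rule above can be sketched as follows. This is a minimal illustration, not ROSE's actual code: the `Change` record, the function name `restore_transactions`, and the choice to track one open transaction per (author, log message) pair are assumptions made for this example. The sliding window is modeled by comparing each change with the previous change of the same transaction rather than with the first one.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Change:
    """One atomic change, e.g. a single file revision (hypothetical record)."""
    author: str
    log_message: str
    time: float  # seconds since an arbitrary epoch
    path: str


def restore_transactions(changes: List[Change],
                         window: float = 200.0) -> List[List[Change]]:
    """Group atomic changes into transactions using a sliding time window."""
    transactions: List[List[Change]] = []
    # Assumption: one open transaction per (author, log message) pair.
    open_by_key: Dict[Tuple[str, str], List[Change]] = {}
    for change in sorted(changes, key=lambda c: c.time):
        key = (change.author, change.log_message)
        current = open_by_key.get(key)
        # Sliding window: compare with the last change of the open
        # transaction, so a transaction may span more than `window` seconds.
        if current is not None and change.time - current[-1].time < window:
            current.append(change)
        else:
            current = [change]
            transactions.append(current)
            open_by_key[key] = current
    return transactions


# Hypothetical input: three changes by one author, each within 200 seconds
# of the previous one, end up in one transaction although the first and
# last are 300 seconds apart.
example = [
    Change("alice", "fix hooks", 0, "a.c"),
    Change("alice", "fix hooks", 150, "b.c"),
    Change("alice", "fix hooks", 300, "c.c"),
    Change("bob", "update docs", 100, "README"),
    Change("alice", "fix hooks", 600, "a.c"),
]
for transaction in restore_transactions(example):
    print([c.path for c in transaction])
```

With a fixed window, the change at 300 seconds would be cut off from the transaction that started at 0; the sliding variant keeps it, which is how a transaction can last far longer than 200 seconds.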
6/12
Step 2: Light-Weight Analysis
File: Animals.java
[Figure: listing of Animals.java containing class Cat (with public String[] COLORS and the constructor public Cat()) and class Dog (with public String[] COLORS), with line markers 1, 3, 23, 25, 30, 56, 58, 60, 80, 99 and the number 17 beside the listing.]
Step A: Map to Entities
• Class Cat: lines 1-56
• Cat.COLORS: lines 3-23
• Cat.Cat(): lines 25-30
• Class Dog: lines 58-99
• Dog.COLORS: lines 60-80
Step B: Filter Entities
[Figure: the same entity list after filtering.]
We analyze C/C++, Java, Python, TeX and Texinfo files.
We get the modified methods, variables and subsections.
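The two steps can be sketched with the Animals.java line ranges from the slide. The `Entity` type and both function names are hypothetical. The filter rule used here (drop an entity when it encloses another affected entity, so only the finest-grained ones remain) is an assumption for illustration, since the slide does not spell out the rule.

```python
from typing import Iterable, List, NamedTuple


class Entity(NamedTuple):
    name: str
    first_line: int
    last_line: int


# Line ranges as listed on the slide for Animals.java.
ENTITIES = [
    Entity("Class Cat", 1, 56),
    Entity("Cat.COLORS", 3, 23),
    Entity("Cat.Cat()", 25, 30),
    Entity("Class Dog", 58, 99),
    Entity("Dog.COLORS", 60, 80),
]


def map_to_entities(changed_lines: Iterable[int],
                    entities: List[Entity]) -> List[Entity]:
    """Step A: return every entity whose line range contains a changed line."""
    lines = list(changed_lines)
    return [e for e in entities
            if any(e.first_line <= n <= e.last_line for n in lines)]


def filter_entities(affected: List[Entity]) -> List[Entity]:
    """Step B (assumed rule): keep entities that enclose no other affected entity."""
    def encloses(outer: Entity, inner: Entity) -> bool:
        return (outer != inner
                and outer.first_line <= inner.first_line
                and inner.last_line <= outer.last_line)
    return [e for e in affected
            if not any(encloses(e, other) for other in affected)]


# A change at line 17 falls inside both Class Cat and Cat.COLORS.
affected = map_to_entities([17], ENTITIES)
print([e.name for e in affected])                   # ['Class Cat', 'Cat.COLORS']
print([e.name for e in filter_entities(affected)])  # ['Cat.COLORS']
```

Because only line ranges are compared, this kind of analysis needs no full parser, which fits the "light-weight" label of the step.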
7/12
Example: GCC
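[Figure: coupling graph for GCC.
In i386.c: i386_cost, k6_cost, i486_cost, pentium_cost, pentiumpro_cost, with bracketed numbers [11], [11], [12], [12], [14] and edge label 11.
In i386.h: processor_cost [11], with an edge labeled 9.]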
8/12
Visualizing Coupling
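[Figure: coupling visualization over entities A, B, C, D (axes ordered A, D, C, B). Legend: High Confidence, Low Confidence, No Coupling (No Support).]
Example: A [10], C [4], changed together [3]
A ⇒ C: Confidence 3/10 = 30%
C ⇒ A: Confidence 3/4 = 75%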
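The arithmetic in the A/C example can be reproduced from a list of transactions. The sketch below is an illustration under assumptions: a transaction is modeled as a set of entity names, and the helper names `count_changes` and `confidence` are invented for this example.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, Set, Tuple


def count_changes(transactions: Iterable[Set[str]]) -> Tuple[Counter, Counter]:
    """Count changes per entity and simultaneous changes (support) per pair."""
    changes: Counter = Counter()
    support: Counter = Counter()
    for transaction in transactions:
        entities = set(transaction)
        changes.update(entities)
        # Pairs are stored in sorted order so (A, C) and (C, A) share one count.
        for pair in combinations(sorted(entities), 2):
            support[pair] += 1
    return changes, support


def confidence(changes: Counter, support: Counter, a: str, b: str) -> float:
    """Confidence of a => b: the fraction of changes to a that also changed b."""
    if changes[a] == 0:
        return 0.0
    return support[tuple(sorted((a, b)))] / changes[a]


# Hypothetical history matching the slide: A changed 10 times, C changed
# 4 times, and 3 of those changes happened in the same transaction.
history = [{"A", "C"}] * 3 + [{"A"}] * 7 + [{"C"}] * 1
changes, support = count_changes(history)
print(support[("A", "C")])                    # 3
print(confidence(changes, support, "A", "C"))  # 0.3
print(confidence(changes, support, "C", "A"))  # 0.75
```

Support is symmetric while confidence is not, which is why A ⇒ C and C ⇒ A receive different values from the same three simultaneous changes.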
9/12
Comparing Architecture with Evolution
[Figure: coupling visualization for the directory ddd/ with regions DDD Source, Libraries, Pics, Icons, Patches, Tests; annotated "Bad architecture" and "Better architecture".]
10/12
Measuring Evolutionary Coupling
Evolutionary Coupling Index (ECI).
Different levels: entity/file or file/directory level.
$$\mathrm{ECI} = \frac{\#\,\text{external couplings}}{\#\,\text{internal couplings}}$$
The lower the ECI, the better the modularity.

          File/Directory   Entity/File   Entity/File
          ECI              ECI           ECI (filtered)
GCC       5.757            3.615         1.504
DDD       0.250            4.462         1.922
APACHE    2.827            11.815        0.675
OPENSSL   8.665            101.053       7.859

Comparing only one level may be misleading (DDD).
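The ECI ratio can be sketched as below. The function name `eci`, the `container_of` mapping, and the example entities are hypothetical. The slides do not say whether couplings are weighted by support, so this sketch assumes each coupled pair counts once, and its handling of the case with no internal couplings is an arbitrary choice.

```python
from typing import Dict, Iterable, Tuple


def eci(couplings: Iterable[Tuple[str, str]],
        container_of: Dict[str, str]) -> float:
    """Ratio of external to internal couplings.

    A coupling is internal when both entities live in the same container
    (a file at the entity/file level, a directory at the file/directory
    level) and external otherwise.
    """
    external = 0
    internal = 0
    for a, b in couplings:
        if container_of[a] == container_of[b]:
            internal += 1
        else:
            external += 1
    if internal == 0:
        # Assumption: report infinity when only external couplings exist,
        # and 0.0 when there are no couplings at all.
        return float("inf") if external else 0.0
    return external / internal


# Hypothetical entity/file example: two files with two entities each.
container_of = {"f1": "x.c", "f2": "x.c", "g1": "y.c", "g2": "y.c"}
couplings = [("f1", "f2"), ("g1", "g2"), ("f1", "g1")]
print(eci(couplings, container_of))  # 0.5
```

The same function covers both levels in the table: pass entities mapped to files for the entity/file ECI, or files mapped to directories for the file/directory ECI.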
11/12
Guiding the Programmer
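The conclusion describes this guidance as "Programmers who changed this function also changed...". A rough sketch of such a suggestion step is given below. It is not the ROSE implementation: the function name `suggest`, the threshold parameters and their defaults, and the entity names in the example are all assumptions.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def suggest(transactions: Iterable[Set[str]], changed: str,
            min_support: int = 1,
            min_confidence: float = 0.5) -> List[Tuple[str, int, float]]:
    """Rank entities that were changed together with `changed` in the past.

    Returns (entity, support, confidence) tuples, best suggestion first.
    """
    occurrences = 0
    together: Counter = Counter()
    for transaction in transactions:
        entities = set(transaction)
        if changed in entities:
            occurrences += 1
            together.update(entities - {changed})
    suggestions = []
    for other, support in together.items():
        conf = support / occurrences
        if support >= min_support and conf >= min_confidence:
            suggestions.append((other, support, conf))
    # Highest confidence first, then highest support, then name for stability.
    return sorted(suggestions, key=lambda s: (-s[2], -s[1], s[0]))


# Hypothetical history: f() was changed four times, twice together with g().
past = [{"f()", "g()"}, {"f()", "g()"}, {"f()", "h()"}, {"f()"}, {"g()"}]
print(suggest(past, "f()"))  # [('g()', 2, 0.5)]
```

Raising `min_support` suppresses suggestions backed by only a few simultaneous changes, and raising `min_confidence` suppresses entities that are changed together with the given one only occasionally.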
12/12
Conclusion
Fine-grained evolutionary coupling...
• detects coupling between non-program entities.
  e.g. coupling between a function and a database schema
• guides developers while making changes.
  Programmers who changed this function also changed...
• gives better(?) results than coarse-grained coupling.
  Coupling between files doesn't tell you that much
• can be compared with given coupling (= architecture).
  Results are mixed: what is coupling, anyway?
Those who cannot learn from history are doomed to repeat it. (George Santayana)