7. Metrics in Reengineering Context
Outline
• Introduction
• Metrics and Measurements
• Some metrics
• Conclusion
The Reengineering Life-Cycle
[Figure: the reengineering life-cycle across Requirements, Designs, and Code: (0) requirement analysis, (1) model capture, (2) problem detection, with (2a) problem detection proper and (2b) reverse engineering, (3) problem resolution, (4) code transformation. Issues in problem detection: academic vs. practical.]
Road Map
• Some definitions
• Metrics and Measurement
• Metrics for reverse engineering
• Selection of OO metrics
• Metrics for trend analysis
• Step back and look
Why Metrics in OO Reengineering?
• Estimating Cost
Is it worthwhile to reengineer, or is it better to start from scratch?
Not covered in this lecture, but… it is difficult:
the company should keep historical data, which is a heavy process
the data must be comparable: Cobol figures say little about a C# project
Why Metrics in OO Reengineering (ii)?
• Assessing Software Quality
Which components have poor quality? (Hence could be reengineered)
Which components have good quality? (Hence should be reverse engineered)
Metrics as a reengineering tool!
• Controlling the Reengineering Process
Trend analysis: which components did change?
Which refactorings have been applied?
Metrics as a reverse engineering tool!
Quantitative Quality Model (i)
• Quality according to the ISO 9126 standard
• Divide-and-conquer approach via a “hierarchical quality model”
• Leaves are simple metrics measuring basic attributes
• Not really useful in practice, but worth knowing
Quantitative Quality Model (ii)
[Figure: the ISO 9126 hierarchical quality model, read as Factor → Characteristic → Metric]
Factor: Software Quality, divided into Functionality, Reliability, Efficiency, Usability, Maintainability, Portability.
Characteristics (shown for Reliability): Error tolerance, Accuracy, Simplicity, Modularity, Consistency.
Metrics: defect density = #defects / size; correction impact = #components changed; correction time.
Product & Process Attributes
• Product Attribute
Definition: measures aspects of the artifacts delivered to the customer
Example: number of system defects perceived, time to learn the system
• Process Attribute
Definition: measures aspects of the process which produces a product
Example: time to correct a defect, number of components changed per correction
External & Internal Attributes
• External Attribute
Definition: measures how the product/process behaves in its environment
Example: mean time between failures, #components changed
• Internal Attribute
Definition: measured purely in terms of the product, separate from its behaviour in context
Example: class coupling and cohesion, method size
External vs. Internal Product Attributes
External
Advantage:
• close relationship with quality factors
Disadvantages:
• measurable only after the product is used or the process took place
• data collection is difficult; often involves human intervention/interpretation
• relating external effect to internal cause is difficult

Internal
Advantages:
• can be measured at any time
• data collection is quite easy and can be automated
• direct relationship between measured attribute and cause
Disadvantage:
• relationship with quality factors is not empirically validated
Metrics and Measurements
• [Wey88] defined nine properties that a software metric should hold; read [Fenton] for critiques.
• For OO, only 6 properties are really interesting [Chid94, Fenton]:
1. Noncoarseness
• Given a class P and a metric m, another class Q can always be found such that m(P) ≠ m(Q)
• not every class has the same value for a metric
2. Nonuniqueness
• There can exist distinct classes P and Q such that m(P) = m(Q)
• two classes can have the same metric value
3. Monotonicity
• m(P) ≤ m(P+Q) and m(Q) ≤ m(P+Q), where P+Q is the “combination” of the classes P and Q
Metrics and Measurements (ii)
4. Design Details are Important
• The specifics of a class must influence the metric value: even if two classes perform the same actions, their design details should have an impact on the metric value.
5. Nonequivalence of Interaction
• m(P) = m(Q) does not imply m(P+R) = m(Q+R), where R is an interaction with the class
6. Interaction Increases Complexity
• m(P) + m(Q) < m(P+Q)
• when two classes are combined, the interaction between the two can increase the metric value
Conclusion: Not every measurement is a metric. (A small sketch of these properties follows below.)
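To make the properties concrete, here is a minimal sketch (our illustration, not from [Wey88]): a class is modelled as a set of method names and NOM (# methods) plays the role of the metric m. The class contents are made up for the example.

```python
# Minimal sketch: a class is a set of method names, m = NOM.

def nom(cls: set) -> int:
    """m(P): number of methods of class P."""
    return len(cls)

def combine(p: set, q: set) -> set:
    """P+Q: the 'combination' of two classes (union of their methods)."""
    return p | q

P = {"open", "close", "read"}
Q = {"read", "write"}
R = {"log", "flush", "sync"}

assert nom(P) != nom(Q)   # 1. noncoarseness: some class differs from P
assert nom(P) == nom(R)   # 2. nonuniqueness: distinct classes, same value

# 3. Monotonicity: m(P) <= m(P+Q) and m(Q) <= m(P+Q).
pq = combine(P, Q)
assert nom(P) <= nom(pq) and nom(Q) <= nom(pq)

# 6. With plain union, NOM never satisfies m(P) + m(Q) < m(P+Q):
# combining adds no "glue" methods. This is one reason NOM alone
# says little about interaction complexity.
print(nom(P), nom(Q), nom(pq))  # 3 2 4
```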
Selecting Metrics
• Fast
Scalable: you cannot afford O(n²) algorithms when n >= 1 million LOC
• Precise (e.g., #methods: do you count all methods, only public ones, also inherited ones?)
Reliable: you want to compare apples with apples
• Code-based
Scalable: you want to collect metrics several times
Reliable: you want to avoid human interpretation
• Simple (e.g., average number of arguments vs. locality of data [LD = Σ|Li| / Σ|Ti|])
Reliable: complex metrics are hard to interpret
Metrics for Reverse Engineering
• Size of the system and of its entities
Class size, method size, inheritance
The intuition: a system should not contain too many big entities; really big entities may be problematic, as they can be really difficult and complex to understand
• Cohesion of the entities
Class internals
The intuition: a good system is composed of cohesive entities
• Coupling between entities
Within inheritance: coupling between class and subclass
Outside of inheritance
The intuition: the coupling between entities should be limited
Sample Size and Inheritance Metrics
[Figure: a metamodel with Class, Attribute, and Method entities connected by Access, Invoke, BelongTo, and Inherit relations]
Inheritance Metrics:
• hierarchy nesting level (HNL)
• # immediate children (NOC)
• # inherited methods, unmodified (NMI)
• # overridden methods (NMO)
Class Size Metrics:
• # methods (NOM)
• # instance/class attributes (NIA, NCA)
• sum of method sizes (WMC)
Method Size Metrics:
• # invocations (NOI)
• # statements (NOS)
• # lines of code (LOC)
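As an illustration of how such metrics can be collected, here is a minimal sketch, assuming the system under study is Python code already loaded in memory: HNL/DIT is read off the method resolution order, NOC from the direct subclasses, and NOM by counting the functions a class defines. The Shape/Circle classes are made up for the example.

```python
import inspect

def hnl(cls) -> int:
    """Hierarchy nesting level: distance from cls down to object."""
    return len(cls.__mro__) - 1   # object itself is level 0

def noc(cls) -> int:
    """Number of immediate children."""
    return len(cls.__subclasses__())

def nom(cls) -> int:
    """Number of methods defined in the class itself."""
    return len([m for m, v in vars(cls).items()
                if inspect.isfunction(v)])

class Shape:
    def area(self): ...
    def draw(self): ...

class Circle(Shape):
    def area(self): ...   # overridden method (would count toward NMO)

print(hnl(Circle), noc(Shape), nom(Shape))  # 2 1 2
```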
Sample Class Size Metrics
• (NIV) [Lore94] Number of Instance Variables
• (NCV) [Lore94] Number of Class Variables (static)
• (NOM) [Lore94] Number of Methods (public, private, protected) (E++, S++)
• (LOC) Lines of Code
• (NSC) [Li93] Number of Semicolons, i.e., number of statements
• (WMC) [Chid94] Weighted Method Count
WMC = Σ ci, where ci is the complexity of method i (e.g., number of exit points, or the McCabe Cyclomatic Complexity); a sketch follows below
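A minimal sketch of WMC over Python source, approximating each ci by a simple McCabe count (1 + number of branch points); the chosen branch-node list and the Account example are our own assumptions.

```python
import ast

# Nodes treated as branch points for the simple McCabe approximation.
BRANCHES = (ast.If, ast.For, ast.While, ast.Try,
            ast.BoolOp, ast.ExceptHandler)

def mccabe(func: ast.FunctionDef) -> int:
    """c_i: 1 + number of branch points in the method body."""
    return 1 + sum(isinstance(n, BRANCHES) for n in ast.walk(func))

def wmc(class_node: ast.ClassDef) -> int:
    """WMC = sum of c_i over the methods of the class."""
    return sum(mccabe(n) for n in class_node.body
               if isinstance(n, ast.FunctionDef))

src = """
class Account:
    def deposit(self, x):
        if x > 0:
            self.balance += x
    def balance_ok(self):
        return self.balance >= 0
"""
tree = ast.parse(src)
print(wmc(tree.body[0]))  # deposit: 2, balance_ok: 1 -> WMC = 3
```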
Class Complexity
• (RFC) [Chid94] Response For a Class
The Response Set of a class (RS) is the set of methods that can be executed in response to a message received by an object of the class:
RS = {M} ∪ {Ri}, RFC = |RS|
where {Ri} is the set of methods called by method i and {M} is the set of all methods in the class (a sketch follows below).
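A minimal sketch of RFC over a Python AST: {M} is taken as the methods defined in the class and {Ri} as the names of methods invoked from its body. Matching calls by name over-approximates the true response set, so treat the result as an indication only; the Printer example is made up.

```python
import ast

def rfc(class_node: ast.ClassDef) -> int:
    """RFC = |{M} U (union of {R_i})|, approximated by method names."""
    methods = {n.name for n in class_node.body
               if isinstance(n, ast.FunctionDef)}
    called = set()
    for n in ast.walk(class_node):
        # any attribute call `x.m(...)` contributes the name `m`
        if isinstance(n, ast.Call) and isinstance(n.func, ast.Attribute):
            called.add(n.func.attr)
    return len(methods | called)

src = """
class Printer:
    def render(self):
        self.layout()
        self.buffer.flush()
    def layout(self): ...
"""
print(rfc(ast.parse(src).body[0]))  # {render, layout, flush} -> 3
```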
Hierarchy Layout
• (HNL) [Chid94] Hierarchy Nesting Level, (DIT) [Li93] Depth of Inheritance Tree
HNL, DIT = maximum level in the hierarchy
• (NOC) [Chid94] Number of Children
• (WNOC) Total number of children (all descendants)
• (NMO, NMA, NMI, NME) [Lore94] Number of Methods Overridden, Added, Inherited, Extended (via a super call)
• (SIX) [Lore94] SIX(C) = (NMO * HNL) / NOM
A weighted percentage of overridden methods
Method Size
• (MSG) Number of Message Sends
• (LOC) Lines of Code
• (MCX) Method Complexity
total complexity / total number of methods, with weights such as: API call = 5, assignment = 0.5, arithmetic op = 2, message with parameters = 3, … (a sketch follows below)
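A sketch applying the example weights above; the category names and the idea of supplying pre-extracted counts (rather than parsing code) are assumptions made to keep the example short.

```python
# Example weights taken from the slide above.
WEIGHTS = {"api_call": 5.0, "assignment": 0.5,
           "arith_op": 2.0, "msg_with_params": 3.0}

def method_complexity(counts: dict) -> float:
    """Weighted complexity of one method, from pre-extracted counts."""
    return sum(WEIGHTS[kind] * n for kind, n in counts.items())

def mcx(methods: list) -> float:
    """Class-level MCX: total complexity / number of methods."""
    return sum(method_complexity(c) for c in methods) / len(methods)

# One method with 1 API call, 2 assignments, 1 arithmetic operation:
print(mcx([{"api_call": 1, "assignment": 2, "arith_op": 1}]))  # 8.0
```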
Sample Metrics: Class Cohesion
• (LCOM) Lack of Cohesion in Methods
[Chid94] for the definition, [Hitz95a] for a critique
Ii = set of instance variables used by method Mi
P = { (Ii, Ij) | Ii ∩ Ij = ∅ }
Q = { (Ii, Ij) | Ii ∩ Ij ≠ ∅ }
If all the sets Ii are empty, then P is empty.
LCOM = |P| − |Q| if |P| > |Q|, 0 otherwise (a sketch follows below)
• (TCC) Tight Class Cohesion, (LCC) Loose Class Cohesion
[Biem95a] for the definition; they measure method cohesion across invocations
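A minimal sketch of the LCOM formula above, with the sets Ii supplied directly rather than extracted from code.

```python
from itertools import combinations

def lcom(var_sets: list) -> int:
    """LCOM [Chid94]: |P| - |Q| if |P| > |Q|, else 0.

    var_sets[i] is I_i, the instance variables used by method M_i.
    """
    p = q = 0
    for a, b in combinations(var_sets, 2):
        if a & b:
            q += 1   # pair with shared variables -> Q
        else:
            p += 1   # disjoint pair -> P
    return p - q if p > q else 0

# Two loosely connected method clusters -> poor cohesion:
print(lcom([{"x", "y"}, {"x"}, {"z"}, {"w", "z"}]))
# P = 4 disjoint pairs, Q = 2 sharing pairs -> LCOM = 4 - 2 = 2
```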
Sample Metrics: Class Coupling (i)
• (CBO) Coupling Between Objects
[Chid94a] for the definition, [Hitz95a] for a discussion
The number of other classes to which a class is coupled
• (DAC) Data Abstraction Coupling
[Li93a] for the definition
The number of ADTs defined in a class
• (CDBC) Change Dependency Between Classes
[Hitz96a] for the definition
The impact of changes in a server class (SC) on a client class (CC)
Sample Metrics: Class Coupling (ii)
• (LD) Locality of Data
[Hitz96a] for the definition
LD = Σ|Li| / Σ|Ti|
Li = non-public instance variables + inherited protected variables of superclasses + static variables of the class
Ti = all variables used in Mi, except non-static local variables
Mi = the methods of the class, accessors excluded
(a sketch follows below)
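A minimal sketch of LD under the same simplification as the LCOM sketch: the sets Li and Ti are supplied per method instead of being extracted from source code.

```python
def locality_of_data(methods: list) -> float:
    """LD = sum |L_i| / sum |T_i| over the methods (accessors excluded).

    methods is a list of (L_i, T_i) pairs of variable-name sets.
    """
    total_local = sum(len(li) for li, _ in methods)
    total_used = sum(len(ti) for _, ti in methods)
    return total_local / total_used if total_used else 1.0

# Method 1 uses {_a, _b} locally out of {_a, _b, other.x};
# method 2 uses {_a} out of {_a, param.y}:
print(locality_of_data([({"_a", "_b"}, {"_a", "_b", "other.x"}),
                        ({"_a"}, {"_a", "param.y"})]))  # 3/5 = 0.6
```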
Metrics? Stepping Back
• Beware the impact of how a metric is computed
• Examples:
number of attributes: should we count private attributes in NIV? Why not?
number of methods: private, protected, public, static, instance, operators, constructors, friends?
• What to do?
Try simple metrics first, with simple extraction
Take care with absolute thresholds
Metrics are good as a differential
Metrics should be calibrated
Do not numerically combine them: what is the multiplication of oranges and apples? Jam!
20%/80%
• Take care with thresholds
The average method length in Smalltalk is 7 lines. So what?
• Expect roughly 20% outliers for 80% ok
“Define your own” Quality Model
• Define the quality model with the development team
• The team chooses the characteristics, design principles, internal product metrics...
• ... and the thresholds

Factor: Maintainability
Characteristic: Modularity
Design Principle → Metric (threshold):
• design each class as an abstract data type
• encapsulate all attributes → number of private attributes in ]2, 10[; number of public attributes in ]0, 0[
• avoid complex interfaces → number of public methods in ]5, 30[; average number of arguments in [0, 4[
(a threshold-checking sketch follows below)
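A sketch of how such a team-defined model could be checked automatically; the metric names and the inclusive treatment of the interval bounds are assumptions for the example.

```python
# Team-defined thresholds, from the table above (bounds treated
# inclusively here for simplicity).
THRESHOLDS = {
    "n_private_attributes": (2, 10),
    "n_public_attributes": (0, 0),   # no public attributes allowed
    "n_public_methods": (5, 30),
    "avg_arguments": (0, 4),
}

def violations(metrics: dict) -> list:
    """Report every metric value falling outside its interval."""
    out = []
    for name, value in metrics.items():
        lo, hi = THRESHOLDS[name]
        if not (lo <= value <= hi):
            out.append(f"{name}={value} outside [{lo}, {hi}]")
    return out

print(violations({"n_private_attributes": 1, "n_public_attributes": 0,
                  "n_public_methods": 42, "avg_arguments": 2}))
# ['n_private_attributes=1 outside [2, 10]',
#  'n_public_methods=42 outside [5, 30]']
```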
Conclusion: Metrics for Quality Assessment
• Can internal product metrics reveal which components have good/poor quality?
• Yes, but...
Not reliable:
• false positives: “bad” measurements, yet good quality
• false negatives: “good” measurements, yet poor quality
A heavyweight approach:
• requires the team to develop (or customize) a quantitative quality model
• requires the definition of thresholds (trial and error)
Difficult to interpret:
• requires complex combinations of simple metrics
• However... cheap once you have the quality model and the thresholds, with good focus (± 20% of components are selected for further inspection)
• Note: focus on the most complex components first!
Trend Analysis via Change Metrics
• Change Metric
Definition: the difference between the values of the same metric for the same component in two subsequent releases of the software system
• Examples:
the difference in the number of methods of class “Event” between releases 1.0 and 1.1
the difference in lines of code of method “Event::process()” between releases 1.0 and 1.1
• Change Assumption
Changes in metric values indicate changes in the system (a sketch follows below)
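A minimal sketch of a change metric, assuming per-release metric tables are already available as dictionaries keyed by component name (the release data is made up).

```python
def change_metrics(old: dict, new: dict) -> dict:
    """Delta of every metric present for a component in both releases."""
    deltas = {}
    for comp in old.keys() & new.keys():
        deltas[comp] = {m: new[comp][m] - old[comp][m]
                        for m in old[comp].keys() & new[comp].keys()}
    return deltas

release_1_0 = {"Event": {"NOM": 12, "LOC": 340}}
release_1_1 = {"Event": {"NOM": 15, "LOC": 395}}
print(change_metrics(release_1_0, release_1_1))
# {'Event': {'NOM': 3, 'LOC': 55}} -> class "Event" changed
```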
Conclusion: Metrics for Trend Analysis
• Can internal product metrics reveal which components have been changed?
changes may go unnoticed => false negatives are possible
all detected changes are real => no false positives (but a lot of noise)
• Sometimes the kind of change is revealing!
[Figure: a change at a leaf of the hierarchy shows up as a change in “Hierarchy Nesting Level”; a change in the middle of the hierarchy shows up as a change in “Number of Children”]
Identifying Refactorings via Change Metrics
• Refactoring Assumption
Decreases (or increases) in metric values indicate restructuring
• Basic Principle of the “Identify Refactorings” Heuristics
Use one change metric as an indicator (1)
Complement it with other metrics to make the analysis more precise
Include further metrics for a quicker assessment of the situation before and after
(1) Most often we look for decreases in size, as most refactorings redistribute functionality by splitting components.
• See “Finding Refactorings via Change Metrics” in the OOPSLA 2000 Proceedings [Deme00a]
Move to Superclass, Subclass or Sibling
• Recipe
Use decreases in “# methods” (NOM), “# instance attributes” (NIA) and “# class attributes” (NCA) as the main indicator
Select only the cases where “# immediate children” (NOC) and “Hierarchy Nesting Level” (HNL) remain equal (otherwise it is a case of “split class”)
MOVE: move from B to A’, C’ or D’ when
((delta_NOM(B’) < 0) or (delta_NIA(B’) < 0) or (delta_NCA(B’) < 0))
and (delta_HNL(B’) = 0) and (delta_NOC(B’) = 0)
[Figure: a hierarchy with classes A, B, C, D before the change and A’, B’, C’, D’ after, with members moved out of B]
(a sketch follows below)
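A sketch of this recipe as a predicate over the change metrics computed earlier; the dictionary layout is the assumption carried over from the change-metric sketch.

```python
def is_move_candidate(delta: dict) -> bool:
    """Flag a class whose size metrics shrank while NOC and HNL
    stayed equal, per the 'move to superclass/subclass/sibling' recipe."""
    shrank = (delta["NOM"] < 0 or delta["NIA"] < 0 or delta["NCA"] < 0)
    hierarchy_stable = (delta["HNL"] == 0 and delta["NOC"] == 0)
    return shrank and hierarchy_stable

# B lost two methods while its position in the hierarchy is unchanged:
b_prime = {"NOM": -2, "NIA": 0, "NCA": 0, "HNL": 0, "NOC": 0}
print(is_move_candidate(b_prime))  # True: methods were moved out of B
```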
Conclusion: Identifying Refactorings
• Can internal product metrics reveal which refactorings have been applied?
Weaknesses:
• vulnerable to renaming
• imprecise when there are many changes
• requires experience
• requires considerable resources
=> inherent to reverse engineering based on source code
Strengths:
• good focus (scalability)
• reliable
• reveals class interaction
• unbiased
=> good in the early stages
Conclusion
Can metrics (1) help to answer the following questions?
• Which components have good/poor quality? Not reliably
• Which components did change? Yes
• Which refactorings have been applied? Yes
(1) Metrics = measures of internal product attributes (i.e., size, inheritance, coupling, cohesion, ...)
Avoid Metric Pitfalls
• Complexity
Complex metrics require more computation and more interpretation
Prefer simple metrics that are easy to collect and to interpret
• Avoid thresholds
Thresholds hide the true nature of the system
Prefer browsing/visualisation as a way to filter large amounts of information
• Composition
Composed metrics hide their components and are thus difficult to interpret
Show a composed metric side by side with its components
Visualize metrics