7. Metrics in Reengineering Context
Outline
• Introduction
• Metrics and Measurements
• Some metrics
• Conclusion
The Reengineering Life-Cycle
[Figure: the reengineering life-cycle across Requirements, Designs, and Code: (0) requirement analysis, (1) model capture, (2) problem detection, with (2a) problem detection proper and (2b) reverse engineering, (3) problem resolution, (4) code transformation. Issues in problem detection: academic vs. practical.]
Road Map
• Some definitions
• Metrics and Measurement
• Metrics for reverse engineering
• Selection of OO metrics
• Metrics for trend analysis
• Step back and look
Why Metrics in OO Reengineering?
• Estimating Cost
Is it worthwhile to reengineer, or is it better to start from scratch?
Not covered in this lecture, but… it is difficult:
the company should keep historical data, which is a heavy process
the data must be comparable: Cobol figures say little about a C# project
Why Metrics in OO Reengineering (ii)?
• Assessing Software Quality
Which components have poor quality? (Hence could be reengineered)
Which components have good quality? (Hence should be reverse engineered)
Metrics as a reengineering tool!
• Controlling the Reengineering Process
Trend analysis: which components did change?
Which refactorings have been applied?
Metrics as a reverse engineering tool!
Quantitative Quality Model (i)
• Quality according to the ISO 9126 standard
• Divide-and-conquer approach via a “hierarchical quality model”
• Leaves are simple metrics measuring basic attributes
• Not really useful in practice, but worth knowing
Quantitative Quality Model (ii)
[Figure: the ISO 9126 hierarchical quality model, read as Factor → Characteristic → Metric]
Factor: Software Quality, divided into Functionality, Reliability, Efficiency, Usability, Maintainability, Portability.
Characteristics (shown for Reliability): Error tolerance, Accuracy, Simplicity, Modularity, Consistency.
Metrics: defect density = #defects / size; correction impact = #components changed; correction time.
Product & Process Attributes
• Product Attribute
Definition: measures aspects of the artifacts delivered to the customer
Example: number of system defects perceived, time to learn the system
• Process Attribute
Definition: measures aspects of the process which produces a product
Example: time to correct a defect, number of components changed per correction
External & Internal Attributes
• External Attribute
Definition: measures how the product/process behaves in its environment
Example: mean time between failures, #components changed
• Internal Attribute
Definition: measured purely in terms of the product, separate from its behaviour in context
Example: class coupling and cohesion, method size
External vs. Internal Product Attributes
External
Advantage:
• close relationship with quality factors
Disadvantages:
• measurable only after the product is used or the process took place
• data collection is difficult; often involves human intervention/interpretation
• relating external effect to internal cause is difficult

Internal
Advantages:
• can be measured at any time
• data collection is quite easy and can be automated
• direct relationship between measured attribute and cause
Disadvantage:
• relationship with quality factors is not empirically validated
Metrics and Measurements
• [Wey88] defined nine properties that a software metric should hold; read [Fenton] for critiques.
• For OO, only 6 properties are really interesting [Chid94, Fenton]:
1. Noncoarseness
• Given a class P and a metric m, another class Q can always be found such that m(P) ≠ m(Q)
• not every class has the same value for a metric
2. Nonuniqueness
• There can exist distinct classes P and Q such that m(P) = m(Q)
• two classes can have the same metric value
3. Monotonicity
• m(P) ≤ m(P+Q) and m(Q) ≤ m(P+Q), where P+Q is the “combination” of the classes P and Q
Metrics and Measurements (ii)
4. Design Details are Important
• The specifics of a class must influence the metric value: even if two classes perform the same actions, their design details should have an impact on the metric value.
5. Nonequivalence of Interaction
• m(P) = m(Q) does not imply m(P+R) = m(Q+R), where R is an interaction with the class
6. Interaction Increases Complexity
• m(P) + m(Q) < m(P+Q)
• when two classes are combined, the interaction between the two can increase the metric value
Conclusion: Not every measurement is a metric. (A small sketch of these properties follows below.)
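To make the properties concrete, here is a minimal sketch (our illustration, not from [Wey88]): a class is modelled as a set of method names and NOM (# methods) plays the role of the metric m. The class contents are made up for the example.

```python
# Minimal sketch: a class is a set of method names, m = NOM.

def nom(cls: set) -> int:
    """m(P): number of methods of class P."""
    return len(cls)

def combine(p: set, q: set) -> set:
    """P+Q: the 'combination' of two classes (union of their methods)."""
    return p | q

P = {"open", "close", "read"}
Q = {"read", "write"}
R = {"log", "flush", "sync"}

assert nom(P) != nom(Q)   # 1. noncoarseness: some class differs from P
assert nom(P) == nom(R)   # 2. nonuniqueness: distinct classes, same value

# 3. Monotonicity: m(P) <= m(P+Q) and m(Q) <= m(P+Q).
pq = combine(P, Q)
assert nom(P) <= nom(pq) and nom(Q) <= nom(pq)

# 6. With plain union, NOM never satisfies m(P) + m(Q) < m(P+Q):
# combining adds no "glue" methods. This is one reason NOM alone
# says little about interaction complexity.
print(nom(P), nom(Q), nom(pq))  # 3 2 4
```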
Selecting Metrics
• Fast
Scalable: you cannot afford O(n²) algorithms when n >= 1 million LOC
• Precise (e.g., #methods: do you count all methods, only public ones, also inherited ones?)
Reliable: you want to compare apples with apples
• Code-based
Scalable: you want to collect metrics several times
Reliable: you want to avoid human interpretation
• Simple (e.g., average number of arguments vs. locality of data [LD = Σ|Li| / Σ|Ti|])
Reliable: complex metrics are hard to interpret
Metrics for Reverse Engineering
• Size of the system and of its entities
Class size, method size, inheritance
The intuition: a system should not contain too many big entities; really big entities may be problematic, as they can be really difficult and complex to understand
• Cohesion of the entities
Class internals
The intuition: a good system is composed of cohesive entities
• Coupling between entities
Within inheritance: coupling between class and subclass
Outside of inheritance
The intuition: the coupling between entities should be limited
Sample Size and Inheritance Metrics
[Figure: a metamodel with Class, Attribute, and Method entities connected by Access, Invoke, BelongTo, and Inherit relations]
Inheritance Metrics:
• hierarchy nesting level (HNL)
• # immediate children (NOC)
• # inherited methods, unmodified (NMI)
• # overridden methods (NMO)
Class Size Metrics:
• # methods (NOM)
• # instance/class attributes (NIA, NCA)
• sum of method sizes (WMC)
Method Size Metrics:
• # invocations (NOI)
• # statements (NOS)
• # lines of code (LOC)
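As an illustration of how such metrics can be collected, here is a minimal sketch, assuming the system under study is Python code already loaded in memory: HNL/DIT is read off the method resolution order, NOC from the direct subclasses, and NOM by counting the functions a class defines. The Shape/Circle classes are made up for the example.

```python
import inspect

def hnl(cls) -> int:
    """Hierarchy nesting level: distance from cls down to object."""
    return len(cls.__mro__) - 1   # object itself is level 0

def noc(cls) -> int:
    """Number of immediate children."""
    return len(cls.__subclasses__())

def nom(cls) -> int:
    """Number of methods defined in the class itself."""
    return len([m for m, v in vars(cls).items()
                if inspect.isfunction(v)])

class Shape:
    def area(self): ...
    def draw(self): ...

class Circle(Shape):
    def area(self): ...   # overridden method (would count toward NMO)

print(hnl(Circle), noc(Shape), nom(Shape))  # 2 1 2
```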
Sample Class Size Metrics
• (NIV) [Lore94] Number of Instance Variables
• (NCV) [Lore94] Number of Class Variables (static)
• (NOM) [Lore94] Number of Methods (public, private, protected) (E++, S++)
• (LOC) Lines of Code
• (NSC) [Li93] Number of Semicolons, i.e., number of statements
• (WMC) [Chid94] Weighted Method Count
WMC = Σ ci, where ci is the complexity of method i (e.g., number of exit points, or the McCabe Cyclomatic Complexity); a sketch follows below
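A minimal sketch of WMC over Python source, approximating each ci by a simple McCabe count (1 + number of branch points); the chosen branch-node list and the Account example are our own assumptions.

```python
import ast

# Nodes treated as branch points for the simple McCabe approximation.
BRANCHES = (ast.If, ast.For, ast.While, ast.Try,
            ast.BoolOp, ast.ExceptHandler)

def mccabe(func: ast.FunctionDef) -> int:
    """c_i: 1 + number of branch points in the method body."""
    return 1 + sum(isinstance(n, BRANCHES) for n in ast.walk(func))

def wmc(class_node: ast.ClassDef) -> int:
    """WMC = sum of c_i over the methods of the class."""
    return sum(mccabe(n) for n in class_node.body
               if isinstance(n, ast.FunctionDef))

src = """
class Account:
    def deposit(self, x):
        if x > 0:
            self.balance += x
    def balance_ok(self):
        return self.balance >= 0
"""
tree = ast.parse(src)
print(wmc(tree.body[0]))  # deposit: 2, balance_ok: 1 -> WMC = 3
```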
Class Complexity
• (RFC) [Chid94] Response For a Class
The Response Set of a class (RS) is the set of methods that can be executed in response to a message received by an object of the class:
RS = {M} ∪ {Ri}, RFC = |RS|
where {Ri} is the set of methods called by method i and {M} is the set of all methods in the class (a sketch follows below).
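A minimal sketch of RFC over a Python AST: {M} is taken as the methods defined in the class and {Ri} as the names of methods invoked from its body. Matching calls by name over-approximates the true response set, so treat the result as an indication only; the Printer example is made up.

```python
import ast

def rfc(class_node: ast.ClassDef) -> int:
    """RFC = |{M} U (union of {R_i})|, approximated by method names."""
    methods = {n.name for n in class_node.body
               if isinstance(n, ast.FunctionDef)}
    called = set()
    for n in ast.walk(class_node):
        # any attribute call `x.m(...)` contributes the name `m`
        if isinstance(n, ast.Call) and isinstance(n.func, ast.Attribute):
            called.add(n.func.attr)
    return len(methods | called)

src = """
class Printer:
    def render(self):
        self.layout()
        self.buffer.flush()
    def layout(self): ...
"""
print(rfc(ast.parse(src).body[0]))  # {render, layout, flush} -> 3
```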
Hierarchy Layout
• (HNL) [Chid94] Hierarchy Nesting Level, (DIT) [Li93] Depth of Inheritance Tree
HNL, DIT = maximum level in the hierarchy
• (NOC) [Chid94] Number of Children
• (WNOC) Total number of children (all descendants)
• (NMO, NMA, NMI, NME) [Lore94] Number of Methods Overridden, Added, Inherited, Extended (via a super call)
• (SIX) [Lore94] SIX(C) = (NMO * HNL) / NOM
A weighted percentage of overridden methods
Method Size
• (MSG) Number of Message Sends
• (LOC) Lines of Code
• (MCX) Method Complexity
total complexity / total number of methods, with weights such as: API call = 5, assignment = 0.5, arithmetic op = 2, message with parameters = 3, … (a sketch follows below)
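A sketch applying the example weights above; the category names and the idea of supplying pre-extracted counts (rather than parsing code) are assumptions made to keep the example short.

```python
# Example weights taken from the slide above.
WEIGHTS = {"api_call": 5.0, "assignment": 0.5,
           "arith_op": 2.0, "msg_with_params": 3.0}

def method_complexity(counts: dict) -> float:
    """Weighted complexity of one method, from pre-extracted counts."""
    return sum(WEIGHTS[kind] * n for kind, n in counts.items())

def mcx(methods: list) -> float:
    """Class-level MCX: total complexity / number of methods."""
    return sum(method_complexity(c) for c in methods) / len(methods)

# One method with 1 API call, 2 assignments, 1 arithmetic operation:
print(mcx([{"api_call": 1, "assignment": 2, "arith_op": 1}]))  # 8.0
```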
Sample Metrics: Class Cohesion
• (LCOM) Lack of Cohesion in Methods
[Chid94] for the definition, [Hitz95a] for a critique
Ii = set of instance variables used by method Mi
P = { (Ii, Ij) | Ii ∩ Ij = ∅ }
Q = { (Ii, Ij) | Ii ∩ Ij ≠ ∅ }
If all the sets Ii are empty, then P is empty.
LCOM = |P| − |Q| if |P| > |Q|, 0 otherwise (a sketch follows below)
• (TCC) Tight Class Cohesion, (LCC) Loose Class Cohesion
[Biem95a] for the definition; they measure method cohesion across invocations
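A minimal sketch of the LCOM formula above, with the sets Ii supplied directly rather than extracted from code.

```python
from itertools import combinations

def lcom(var_sets: list) -> int:
    """LCOM [Chid94]: |P| - |Q| if |P| > |Q|, else 0.

    var_sets[i] is I_i, the instance variables used by method M_i.
    """
    p = q = 0
    for a, b in combinations(var_sets, 2):
        if a & b:
            q += 1   # pair with shared variables -> Q
        else:
            p += 1   # disjoint pair -> P
    return p - q if p > q else 0

# Two loosely connected method clusters -> poor cohesion:
print(lcom([{"x", "y"}, {"x"}, {"z"}, {"w", "z"}]))
# P = 4 disjoint pairs, Q = 2 sharing pairs -> LCOM = 4 - 2 = 2
```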
Sample Metrics: Class Coupling (i)
• (CBO) Coupling Between Objects
[Chid94a] for the definition, [Hitz95a] for a discussion
The number of other classes to which a class is coupled
• (DAC) Data Abstraction Coupling
[Li93a] for the definition
The number of ADTs defined in a class
• (CDBC) Change Dependency Between Classes
[Hitz96a] for the definition
The impact of changes in a server class (SC) on a client class (CC)
Sample Metrics: Class Coupling (ii)
• (LD) Locality of Data
[Hitz96a] for the definition
LD = Σ|Li| / Σ|Ti|
Li = non-public instance variables + inherited protected variables of superclasses + static variables of the class
Ti = all variables used in Mi, except non-static local variables
Mi = the methods of the class, accessors excluded
(a sketch follows below)
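A minimal sketch of LD under the same simplification as the LCOM sketch: the sets Li and Ti are supplied per method instead of being extracted from source code.

```python
def locality_of_data(methods: list) -> float:
    """LD = sum |L_i| / sum |T_i| over the methods (accessors excluded).

    methods is a list of (L_i, T_i) pairs of variable-name sets.
    """
    total_local = sum(len(li) for li, _ in methods)
    total_used = sum(len(ti) for _, ti in methods)
    return total_local / total_used if total_used else 1.0

# Method 1 uses {_a, _b} locally out of {_a, _b, other.x};
# method 2 uses {_a} out of {_a, param.y}:
print(locality_of_data([({"_a", "_b"}, {"_a", "_b", "other.x"}),
                        ({"_a"}, {"_a", "param.y"})]))  # 3/5 = 0.6
```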
Metrics? Stepping Back
• Beware the impact of how a metric is computed
• Examples:
number of attributes: should we count private attributes in NIV? Why not?
number of methods: private, protected, public, static, instance, operators, constructors, friends?
• What to do?
Try simple metrics first, with simple extraction
Take care with absolute thresholds
Metrics are good as a differential
Metrics should be calibrated
Do not numerically combine them: what is the multiplication of oranges and apples? Jam!
20%/80%
• Take care with thresholds
The average method length in Smalltalk is 7 lines. So what?
• Expect roughly 20% outliers for 80% ok
“Define your own” Quality Model
• Define the quality model with the development team
• The team chooses the characteristics, design principles, internal product metrics...
• ... and the thresholds

Factor: Maintainability
Characteristic: Modularity
Design Principle → Metric (threshold):
• design each class as an abstract data type
• encapsulate all attributes → number of private attributes in ]2, 10[; number of public attributes in ]0, 0[
• avoid complex interfaces → number of public methods in ]5, 30[; average number of arguments in [0, 4[
(a threshold-checking sketch follows below)
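A sketch of how such a team-defined model could be checked automatically; the metric names and the inclusive treatment of the interval bounds are assumptions for the example.

```python
# Team-defined thresholds, from the table above (bounds treated
# inclusively here for simplicity).
THRESHOLDS = {
    "n_private_attributes": (2, 10),
    "n_public_attributes": (0, 0),   # no public attributes allowed
    "n_public_methods": (5, 30),
    "avg_arguments": (0, 4),
}

def violations(metrics: dict) -> list:
    """Report every metric value falling outside its interval."""
    out = []
    for name, value in metrics.items():
        lo, hi = THRESHOLDS[name]
        if not (lo <= value <= hi):
            out.append(f"{name}={value} outside [{lo}, {hi}]")
    return out

print(violations({"n_private_attributes": 1, "n_public_attributes": 0,
                  "n_public_methods": 42, "avg_arguments": 2}))
# ['n_private_attributes=1 outside [2, 10]',
#  'n_public_methods=42 outside [5, 30]']
```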
Conclusion: Metrics for Quality Assessment
• Can internal product metrics reveal which components have good/poor quality?
• Yes, but...
Not reliable:
• false positives: “bad” measurements, yet good quality
• false negatives: “good” measurements, yet poor quality
A heavyweight approach:
• requires the team to develop (or customize) a quantitative quality model
• requires the definition of thresholds (trial and error)
Difficult to interpret:
• requires complex combinations of simple metrics
• However... cheap once you have the quality model and the thresholds, with good focus (± 20% of components are selected for further inspection)
• Note: focus on the most complex components first!
Trend Analysis via Change Metrics
• Change Metric
Definition: the difference between the values of the same metric for the same component in two subsequent releases of the software system
• Examples:
the difference in the number of methods of class “Event” between releases 1.0 and 1.1
the difference in lines of code of method “Event::process()” between releases 1.0 and 1.1
• Change Assumption
Changes in metric values indicate changes in the system (a sketch follows below)
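A minimal sketch of a change metric, assuming per-release metric tables are already available as dictionaries keyed by component name (the release data is made up).

```python
def change_metrics(old: dict, new: dict) -> dict:
    """Delta of every metric present for a component in both releases."""
    deltas = {}
    for comp in old.keys() & new.keys():
        deltas[comp] = {m: new[comp][m] - old[comp][m]
                        for m in old[comp].keys() & new[comp].keys()}
    return deltas

release_1_0 = {"Event": {"NOM": 12, "LOC": 340}}
release_1_1 = {"Event": {"NOM": 15, "LOC": 395}}
print(change_metrics(release_1_0, release_1_1))
# {'Event': {'NOM': 3, 'LOC': 55}} -> class "Event" changed
```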
Conclusion: Metrics for Trend Analysis
• Can internal product metrics reveal which components have been changed?
changes may go unnoticed => false negatives are possible
all detected changes are real => no false positives (but a lot of noise)
• Sometimes the kind of change is revealing!
[Figure: a change at a leaf of the hierarchy shows up as a change in “Hierarchy Nesting Level”; a change in the middle of the hierarchy shows up as a change in “Number of Children”]
Identifying Refactorings via Change Metrics
• Refactoring Assumption
Decreases (or increases) in metric values indicate restructuring
• Basic Principle of the “Identify Refactorings” Heuristics
Use one change metric as an indicator (1)
Complement it with other metrics to make the analysis more precise
Include further metrics for a quicker assessment of the situation before and after
(1) Most often we look for decreases in size, as most refactorings redistribute functionality by splitting components.
• See “Finding Refactorings via Change Metrics” in the OOPSLA 2000 Proceedings [Deme00a]
Move to Superclass, Subclass or Sibling
• Recipe
Use decreases in “# methods” (NOM), “# instance attributes” (NIA) and “# class attributes” (NCA) as the main indicator
Select only the cases where “# immediate children” (NOC) and “Hierarchy Nesting Level” (HNL) remain equal (otherwise it is a case of “split class”)
MOVE: move from B to A’, C’ or D’ when
((delta_NOM(B’) < 0) or (delta_NIA(B’) < 0) or (delta_NCA(B’) < 0))
and (delta_HNL(B’) = 0) and (delta_NOC(B’) = 0)
[Figure: a hierarchy with classes A, B, C, D before the change and A’, B’, C’, D’ after, with members moved out of B]
(a sketch follows below)
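A sketch of this recipe as a predicate over the change metrics computed earlier; the dictionary layout is the assumption carried over from the change-metric sketch.

```python
def is_move_candidate(delta: dict) -> bool:
    """Flag a class whose size metrics shrank while NOC and HNL
    stayed equal, per the 'move to superclass/subclass/sibling' recipe."""
    shrank = (delta["NOM"] < 0 or delta["NIA"] < 0 or delta["NCA"] < 0)
    hierarchy_stable = (delta["HNL"] == 0 and delta["NOC"] == 0)
    return shrank and hierarchy_stable

# B lost two methods while its position in the hierarchy is unchanged:
b_prime = {"NOM": -2, "NIA": 0, "NCA": 0, "HNL": 0, "NOC": 0}
print(is_move_candidate(b_prime))  # True: methods were moved out of B
```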
Conclusion: Identifying Refactorings
• Can internal product metrics reveal which refactorings have been applied?
Weaknesses:
• vulnerable to renaming
• imprecise when there are many changes
• requires experience
• requires considerable resources
=> inherent to reverse engineering based on source code
Strengths:
• good focus (scalability)
• reliable
• reveals class interaction
• unbiased
=> good in the early stages
Conclusion
Can metrics (1) help to answer the following questions?
• Which components have good/poor quality? Not reliably
• Which components did change? Yes
• Which refactorings have been applied? Yes
(1) Metrics = measures of internal product attributes (i.e., size, inheritance, coupling, cohesion, ...)
Avoid Metric Pitfalls
• Complexity
Complex metrics require more computation and more interpretation
Prefer simple metrics that are easy to collect and to interpret
• Avoid thresholds
Thresholds hide the true nature of the system
Prefer browsing/visualisation as a way to filter large amounts of information
• Composition
Composed metrics hide their components and are thus difficult to interpret
Show a composed metric side by side with its components
Visualize metrics