metrics and experimentationtddd30/material/03metrics.pdf · 2016. 11. 7. · theoretical validation...

48
Metrics and experimentation Kristian Sandahl

Upload: others

Post on 04-Sep-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Metrics and experimentationKristian Sandahl

Page 2: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

IntroductionMotivation:• Management:

– Appraisal– Assurance– Control– Improvement

• Research:– Cause-effect models

Terms:• Metric• Measurement

2016-11-07Metrics & Experiments/Kristian Sandahl 2

Page 3: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Classification• Product metrics:

– Observable or computed properties of the product– Examples: Lines of code, number of pages

• Process metrics:– Properties of how you are developing the product– Examples: Cycle time for a change request, number of

parallel activities• Resource metrics:

– Properties and volumes of the instruments you are using when developing the product

– Examples: Years of education, amount of memory in testing environment

2016-11-07Metrics & Experiments/Kristian Sandahl 3

Page 4: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

ScalesNominal = , ≠ Categories Type of

software

Ordinal < , > Rankings Skill rating: high, medium, low

Interval + , - Differences Project delay

Ratio / Absolute zero

Lines of code

Examples

2016-11-07Metrics & Experiments/Kristian Sandahl 4

Page 5: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Theoretical validation of metricsRepresentational theory, based on the mapping between

attributes of real-world entities – numerical values and units:

• For an attribute to be measurable, it must allow different entities to be distinguished from one another.

• A valid measure must obey the representational condition.• Each unit of an attribute contributing to a valid measure is

equivalent.• Different entities can have the same attribute value.Property-based theory, based on graph-theoretic models of

software modules:• Examples: Nonnegativity, Null value, Additivity

2016-11-07Metrics & Experiments/Kristian Sandahl 5

Page 6: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Structural model of measurement

2016-11-07Metrics & Experiments/Kristian Sandahl 6

Page 7: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Empirical (external) validation of metrics• Correlation between internal and external attributes

• Cause-effect models

• Handle bias

• Statistical analysis

2016-11-07Metrics & Experiments/Kristian Sandahl 7

Page 8: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Goal-Question-Metric (GQM)

2016-11-07Metrics & Experiments/Kristian Sandahl 8

Page 9: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Halstead’s software science1/2

The measurable and countable properties are :

• n1 = number of unique or distinct operators appearing in that implementation

• n2 = number of unique or distinct operands appearing in that implementation

• N1 = total usage of all of the operators appearing in that implementation

• N2 = total usage of all of the operands appearing in that implementation http://yunus.hacettepe.edu.tr/~sencer/complexity.html

2016-11-07Metrics & Experiments/Kristian Sandahl 9

Page 10: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Halstead’s software science2/2

Equations:

• Vocabulary n = n1 + n2

• Implementation length N = N1 + N2

• Length equation: N ' = n1log2n1 + n2log2n2

• Program Volume V = Nlog2n

• Potential Volume V ' = ( n*1 + n*2 ) log2 ( n*1 + n*2 )

• Program Level L = V ‘/ V

• L ' = n*1n2 / n1N2

• Elementary mental discriminations E = V / L = V2 / V '

• Intelligence Content I = L ' x V = ( 2n2 / n1N2 ) x (N1 + N2)log2(n1 + n2)

• Time T ' = ( n1N2( n1log2n1 + n2log2n2) log2n) / 2n2S

2016-11-07Metrics & Experiments/Kristian Sandahl 10

Page 11: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Code metrics in Visual Studio• Lines Of Code

• Cyclomatic Complexity

• Maintainability Index = 171–5.2*ln(aveV)–0.23*ave(g’)– 16.2*ln(aveLOC)

• Depth Of Inheritance

• Class Coupling

2016-11-07Metrics & Experiments/Kristian Sandahl 11

Page 12: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Function Points - Background• First suggested by Albrecht 1979• Captures complexity and size• Language independent• Can be used before implementation• Used as input for estimation• Common versions IFPUG v 4.x• Competitor MARK II:

– simpler to count– has finer granularity– is a continuous measure

• A “closed community”• Traditionally used for business systems

2016-11-07Metrics & Experiments/Kristian Sandahl 12

Page 13: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

COSMIC-FFP(COmmon Software Measurement International Consortium Full Function Point)

• An ISO-approved method for calculating FP for embedded, real-time systems

• Partitions the system in Functional User Requirements (FUR)

2016-11-07Metrics & Experiments/Kristian Sandahl 13

Page 14: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Example: Change customer data in a warehouse of items

User entry Entry 1Retrieve customer data

Read 1

Display error message

Exit 1

Display customer data

Exit 1

Enter changed data Entry 1

Retrieve item data Read 1

Store item data Write 1

Store modified data Write 1

Total Cfsu 8

2016-11-07Metrics & Experiments/Kristian Sandahl 14

Page 15: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Connections to other methods• Mapping to UML – Use cases as Sequence diagrams,

count messages

• Cfsu = C1 + C2 FP, for less than 100 Cfsu

• C2 1.1-1.2

• C1 varies

2016-11-07Metrics & Experiments/Kristian Sandahl 15

Page 16: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Why do we need experimental studies?Software Engineering has great variation in:• Scale• Domain• Tools• Infrastructure• Human resources• Organization• Locality• Technique• Quality Method

Cause Effect

2016-11-07Metrics & Experiments/Kristian Sandahl 16

Page 17: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

What is an experiment?

treatment

No treatment

Units (subjects)

Control group

Comparison

2016-11-07Metrics & Experiments/Kristian Sandahl 17

Page 18: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Types of experiments• Randomized experiment: Units receiving the

treatment are selected by random

• Quasi-experiment: Units are not selected randomly

• Controlled experiment: Comparison between treatments (Sjøberg et al 2005)

• Correlation study: Observes relationships with variables (empirical evaluation)

• Replication: Repeating the study

• Differentiated replication: Replication with variation of essential conditions

2016-11-07Metrics & Experiments/Kristian Sandahl 18

Page 19: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Variables

AgeSexEducationExperience...

Background variables

Controlled variables

Independentvariables

Time of dayTemperatureAvailable resources...

Method usedTool usedSize of taskGroup size...

Dependent variables

No of errors doneTime to complete taskJudgement of quality...

DifferentNot changeable

SameObserved

ManipulatedObserved

Assumed to changeas effect manipulationof independent variables

2016-11-07Metrics & Experiments/Kristian Sandahl 19

Page 20: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Validity threats• Internal validity: Are differences in dependent

variables really due to changes of independent variables?

• Conclusion validity: Are our measurement and analysis methods appropriate?

• Construct validity: Are we measuring the phenomena we intend to do?

• External validity: To what population can we generalise our results?

2016-11-07Metrics & Experiments/Kristian Sandahl 20

Page 21: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Comparing means

Under certain conditions: Student’s t-test

Significance level: nomally 5%

2016-11-07Metrics & Experiments/Kristian Sandahl 21

Page 22: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Comparing distributions• Are the testers’

methods the same?

• Under certain conditions: use the Chi-square test

• For 2x2 contingency tables other methods apply, for instance Cohen’sKappa

2016-11-07Metrics & Experiments/Kristian Sandahl 22

Page 23: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

The box plot

2016-11-07Metrics & Experiments/Kristian Sandahl 23

Page 24: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Comparing variance

2016-11-07Metrics & Experiments/Kristian Sandahl 24

Page 25: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Linear regression

2016-11-07Metrics & Experiments/Kristian Sandahl 25

Page 26: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

ScopePrediction of:

• Resources

• Calendar time

• Quality (or lack of quality)

• Change impact

• Process performance

• Often confounded with the decision process

2016-11-07Metrics & Experiments/Kristian Sandahl 26

Page 27: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Historical data2016-11-07Metrics & Experiments/Kristian Sandahl 27

Page 28: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Methods for building prediction models• Statistical

– Parametric• Make assumptions about distribution of the variables• Good tools for automation• Linear regression, Variance analysis, ...

– Non-parametric, robust• No assumptions about distribution• Less powerful, low degree of automation• Rank-sum methods, Pareto diagrams, ...

• Causal models– Link elements with semantic links or numerical equations– Simulation models, connectionism models, genetic models, ...

• Judgemental– Organise human expertise– Delphi method, pair-wise comparison, Lichtenberg method

2016-11-07Metrics & Experiments/Kristian Sandahl 28

Page 29: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

The Lichtenbeg method process• Staff the analysis group• Describe the work to be estimated• Define general constraints and assumptions• Define the structure• Individual judgement of MIN, MAX, LIKLEY• Calculate common result• Find workpagages with large variance• Sub-devide them and rework

• 5-20 participants• Never influence each others judgements• MIN and MAX should be extreme – 1% of the cases

2016-11-07Metrics & Experiments/Kristian Sandahl 29

Page 30: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Common SE-predictions

• Detecting fault-prone modules• Project effort estimation• Change Impact Analysis• Ripple effect analysis• Process improvement models• Model checking• Consistency checking

Page 31: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Common metrics

2016-11-07Metrics & Experiments/Kristian Sandahl 31

Page 32: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Introduction• There are many faults in software

• Faults are costly to find and repair

• The later we find faults the more costly they are

• We want to find faults early

• We want to have automated ways of finding faults

• Our approach

– Automatic measurements on models

– Use metrics to predict fault-prone modules

2016-11-07Metrics & Experiments/Kristian Sandahl 32

Page 33: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Approach• Find metrics (independent variables)

– Number of model elements (size)

– Number of changed methods (change)

– Transitions per state (complexity)

– Changed operations * transitions per state (combinations)

– ...

• Use metrics to predict (dependent variable)

– Number of TRs

2016-11-07Metrics & Experiments/Kristian Sandahl 33

Page 34: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Capsules

2016-11-07

Metrics & Experiments/Kristian Sandahl

Page

Page 35: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

State charts

2016-11-07

Metrics & Experiments/Kristian Sandahl

Page

Page 36: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

package

capsule class

attribute operationport protocol

signalState machine

State transition

Data model

2016-11-07Metrics & Experiments/Kristian Sandahl 36

Page 37: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Our project - modelmet• RNC application - Three releases• About 7000 model elements• TR statistics database (2000 TRs)• Find metrics

– Existing metrics (done at standard daily build)– Run scripts on models

• Statistical analysis– Linear regression, principal component analysis,

discriminant analysis, robust methods– Neural networks, Bayesian belief networks

2016-11-07Metrics & Experiments/Kristian Sandahl 37

Page 38: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Size

Change

Complexity

Combined

2016-11-07 38

Page 39: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Metrics based on change, system A

2016-11-07

Metrics & Experiments/Kristian Sandahl

Page

Page 40: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Metrics based on change, system B

2016-11-07

Metrics & Experiments/Kristian Sandahl

Page

Page 41: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Complexity and size metrics, system A

2016-11-07

Metrics & Experiments/Kristian Sandahl

Page

Page 42: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Complexity and Size metrics, system B

2016-11-07

Metrics & Experiments/Kristian Sandahl

Page

Page 43: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Other metrics, system A

TRD = C + 0.034 states – 0.965 protocolsmodelelements

2016-11-07

Metrics & Experiments/Kristian Sandahl

Page

Page 44: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

Other metrics, system B

2016-11-07

Metrics & Experiments/Kristian Sandahl

Page

Page 45: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

2016-11-07Metrics & Experiments/Kristian Sandahl 45

Page 46: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

How to use predictions• Uneven distribution of faults is common – 80/20 rule

• Perform special treatment on selected parts– Select experienced designers– Provide good working conditions– Parallell teams– Inspections– Static and dynamic analysis tools– ...

• Perform root-cause analysis and make corrections

2016-11-07Metrics & Experiments/Kristian Sandahl 46

Page 47: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

ResultsContributions:• Valid statistical material:

– Large models, large number of TRs– Two change projects

• Two highly explanatory predictors were found• State chart metrics are as good as OO metrics

Problems:• Some problems to match modules in models and TRs• Effort to collect change data

2016-11-07Metrics & Experiments/Kristian Sandahl 47

Page 48: Metrics and experimentationTDDD30/material/03Metrics.pdf · 2016. 11. 7. · Theoretical validation of metrics Representational theory, based on the mapping between attributes of

www.liu.se

Metrics and experimentation/Kristian Sandahl