Download - Fse2012 shihab
An Industrial Case Study on the Risk of Software Changes
Emad Shihab, Ahmed E. Hassan, Bram Adams and Zhen Ming Jiang
Prior Approach is Not Adding Value!
Too much to review
About risk, not just bugs
Not assignable
Too late
Overview of Change Integration Process
[Diagram: a change from the Local Repository is checked: "Risky?" Yes: closer review. No: Main Repository.]
Case Study
Commercial mobile system
Dec 2009 – Dec 2010
450+ developers
60+ teams
7000+ changes
Mainly in Java and C/C++
Study Overview
Part 1: Prediction of Risky Changes
Part 2: Understanding Risky Changes
Part 3: Misclassification of Risky Changes
An Example Change
Change 12345 by author@adesk on 2000/03/23 12:47:15
Purpose: Bug fix
Modifies API: Yes
Related Changes: 1234, 3421…
Change description: Changed files A and B to implement new feature and fix bug 123 ...
Files affected:
//root/comp1/subcomp1/A.java (+10, -1, e10)
//root/comp1/subcomp1/B.cpp (+1, -2, e5)
Risky?
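A minimal sketch of how the "Files affected" lines of the example change could be turned into size factors. It assumes that `+N`, `-N` and `eN` stand for lines added, deleted and edited, and that total churn is their sum; neither reading is confirmed by the slides, and the names `FILE_LINE` and `parse_file_line` are hypothetical.

```python
import re

# Assumed reading of the per-file counts: +N added, -N deleted, eN edited lines.
FILE_LINE = re.compile(
    r"^(?P<path>\S+) \(\+(?P<added>\d+), -(?P<deleted>\d+), e(?P<edited>\d+)\)$"
)

def parse_file_line(line):
    """Parse one 'Files affected' entry into a dict of path and line counts."""
    m = FILE_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized file line: {line!r}")
    return {
        "path": m.group("path"),
        "added": int(m.group("added")),
        "deleted": int(m.group("deleted")),
        "edited": int(m.group("edited")),
    }

# The two file entries from the example change.
files = [
    parse_file_line("//root/comp1/subcomp1/A.java (+10, -1, e10)"),
    parse_file_line("//root/comp1/subcomp1/B.cpp (+1, -2, e5)"),
]

# Assumed definition of total churn: added + deleted + edited over all files.
total_churn = sum(f["added"] + f["deleted"] + f["edited"] for f in files)
modifies_java = any(f["path"].endswith(".java") for f in files)
modifies_cpp = any(f["path"].endswith(".cpp") for f in files)
```

Factors such as the number of files (`len(files)`) or the Java and C++ flags fall out of the same parsed records.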
Prediction Using Logistic Regression (simplified)
Outcome = Const + β1 factor 1 + β2 factor 2 + β3 factor 3 + … + βn factor n
Outcome: Risky?
factor 1 … factor n: Input Factors
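A minimal sketch of how the simplified model above could turn factor values into a risk prediction. Logistic regression passes the linear outcome through the logistic (sigmoid) function to obtain a probability. The coefficient values, factor names and the 0.5 cutoff below are illustrative assumptions, not values from the study.

```python
import math

def risk_probability(const, betas, factors):
    """Return P(risky) for one change under a logistic regression model."""
    # Linear part: Outcome = Const + β1·factor 1 + ... + βn·factor n
    outcome = const + sum(betas[name] * value for name, value in factors.items())
    # The logistic function maps the outcome to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-outcome))

# Hypothetical coefficients and factor values, chosen only for illustration.
betas = {"lines_added": 0.02, "file_bugginess": 0.5, "linked_bugs": 0.3}
change = {"lines_added": 10, "file_bugginess": 2, "linked_bugs": 1}

p = risk_probability(const=-2.0, betas=betas, factors=change)
is_risky = p >= 0.5  # assumed cutoff; a real deployment would tune this
```

In practice the coefficients would be fitted from historical changes labelled risky or not risky, rather than set by hand.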
Factors Used to Model Risky Changes
Size: Lines and chunks added, deleted, modified; total churn
Changed files: No. of files, No. of changes, No. of fixes, bugginess, No. of developers
Experience & Defects: Developer experience, Bug fix?, No. of linked bugs
Code: Modifies Java, CPP, other, API
Evaluating Prediction Models
[Venn diagram: the sets "Actually Risky" and "Predicted Risky". Overlap: TP. Predicted Risky only: FP. Actually Risky only: FN.]
Precision: “How small is FP”
Recall: “How small is FN”
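The two measures above follow the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN). A minimal sketch of the computation, using made-up labels rather than data from the study:

```python
def precision_recall(actual, predicted):
    """Compute precision and recall from parallel lists of booleans."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)        # risky, flagged
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)    # not risky, flagged
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)    # risky, missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Made-up labels for six changes (True = risky).
actual = [True, True, True, False, False, False]
predicted = [True, True, False, True, False, False]

precision, recall = precision_recall(actual, predicted)
```

Here two of the three flagged changes are actually risky, and two of the three risky changes are flagged, so both measures come out at two thirds.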
Prediction Results
                          Developer   Team
Precision (improvement)   +37%        +87%
Recall                    67.9%       67.6%
37-87% improvement in precision, 67% recall
Most Important Factors
Developer                     Team
7 X Lines added               10 X Chunks added
7 X File bugginess            6 X File bugginess
None                          3 X Modifies C++
2 X No. linked bugs           4 X No. linked bugs
1 X Developer experience      4 X Developer experience
Code added, file bugginess, No. linked defects and developer experience
When were Developers Wrong?
Compare percentage of correctly and wrongly classified changes:
• Cause: unclear requirements, inadequate testing, coding errors, design flaws
• Related changes?
• Modifies API code?
Changes that have related changes are 10 times more likely to be wrongly classified!
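A minimal sketch of one way to arrive at a "times more likely" figure of this kind: compare the misclassification rate of changes that have a property (here, related changes) with the rate of changes that lack it. This reading of the comparison is an assumption, and the counts below are made up for illustration; they are not the study's data.

```python
def misclassification_rate(wrong, total):
    """Fraction of changes in a group that were wrongly classified."""
    return wrong / total if total else 0.0

# Made-up counts, for illustration only.
with_related = {"wrong": 20, "total": 100}     # changes that have related changes
without_related = {"wrong": 10, "total": 200}  # changes with no related changes

rate_with = misclassification_rate(**with_related)
rate_without = misclassification_rate(**without_related)

# How many times more likely a change with related changes is to be misclassified.
times_more_likely = rate_with / rate_without
```

The same comparison can be repeated for each property on the slide, such as whether the change modifies API code.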
Success Story!
A tool based on this work is being used by RIM’s Handheld Integration Team
The tools team is building a tool to be deployed company-wide