regression testing: theory and practice · 2015-06-16
TRANSCRIPT
REGRESSION TESTING: THEORY AND PRACTICE
Software Bugs Lead to Financial Losses or Loss of Life
• Boeing’s avionics software
• Medical record system (https://issues.openmrs.org/browse/TRUNK-4475)
• Knight’s bug ($440 million)
Software Testing Lifecycle
[Figure: cycle of Add Tests → Run Tests → Assess Tests]
Automated Test Generation
Test Generation [ICST09,ICSE10*,ISSTA11,ECOOP13] [CSTVA10,FASE11] · Run Tests · Assess Tests
*Paper won an award or was invited for a journal publication: ICSE10, ICST10, ICST12, ISSTA13
Bugs Found in Widely Used Projects
Metrics for Assessing Test Quality
Test Quality Assessment [Mutation10,ISSTA13] [FSE11,ISSTA13*,ASE13,TOSEM15]
Significantly Faster Regression Testing
Regression Testing [ICST10*,STVR13] [ASE11,CAV14,ASE14,OOPSLA14,FSE14]
Concurrent Code Analysis
Concurrent Code Analysis [ICSE08,IWMSE10,Scala11,FSE11,ICST12*,TACAS13,Onward!13]
Impact Outside of Academia
Today’s Focus: Regression Testing
Regression Testing
• Executes tests for each new code revision
• Checks if changes broke something
• Widely used in industry
[Figure: original revision, with changes, leading to modified revision; the available tests t1, t2, t3, …, tn run against each revision]
Regression Testing – Costly (1)
[Figure: number of tests and test execution time per revision for several projects (361 to 866,312 tests; ~5min to ~17h); these test suites are run many times each day]
Regression Testing – Costly (2)
• Linear increase in the number of revisions per day
• Linear increase in the number of tests per revision
• => Quadratic increase in test execution time
• 20+ revisions per minute, 75+ million tests run per day*
• Personal experience
*http://google-engtools.blogspot.com/2011/06/testing-at-speed-and-scale-of-google.html
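The quadratic claim above is simple arithmetic; the sketch below uses made-up growth rates (the 10 and 100 are illustrative, not Google’s numbers):

```java
// Toy illustration of the claim above: if both revisions per day and tests
// per revision grow linearly with time t, daily test executions grow with t^2.
// The growth rates (10, 100) are hypothetical, chosen only for illustration.
public class QuadraticGrowth {
    static int executionsPerDay(int t) {
        int revisionsPerDay = 10 * t;   // hypothetical linear growth
        int testsPerRevision = 100 * t; // hypothetical linear growth
        return revisionsPerDay * testsPerRevision;
    }

    public static void main(String[] args) {
        for (int t = 1; t <= 3; t++) {
            System.out.println(executionsPerDay(t)); // 1000, 4000, 9000
        }
    }
}
```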
Regression Test Selection (RTS)
• Speeds up regression testing
  – Without requiring more computers or energy
• Analyzes changes to a codebase
• Runs only tests whose behavior may be affected
all affected tests => safe test selection
[Figure: original revision, with changes, leading to modified revision; rts selects, from the available tests t1, t2, t3, …, tn, only those affected by the changes]
RTS – Example
changes to C2, C3
[Figure: dependency matrices between tests t1–t4 and C1, C2, C3, f for the original and modified revisions; rts(original, modified) selects the tests that depend on the changed C2 or C3]
t1() {
  C1 obj = new C1();
  assert(obj.m() == 1);
}

class C1 {
  int m() { return 1; }
}
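The example above can be sketched as class-level selection; the dependency matrix below is hypothetical, chosen to mirror the slide (a test is selected iff it uses a changed class):

```java
import java.util.*;

// Sketch of class-level test selection for the slide's example. The
// dependency matrix is invented for illustration, not taken from a real tool.
public class RtsExample {
    static Set<String> select(Map<String, Set<String>> deps, Set<String> changed) {
        Set<String> selected = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
            // select the test if it depends on any changed class
            if (!Collections.disjoint(e.getValue(), changed)) {
                selected.add(e.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> deps = new LinkedHashMap<>();
        deps.put("t1", Set.of("C1"));
        deps.put("t2", Set.of("C1", "C2"));
        deps.put("t3", Set.of("C3", "f"));
        deps.put("t4", Set.of("C2", "C3"));
        // the changes touched C2 and C3, as in the slide
        System.out.println(select(deps, Set.of("C2", "C3"))); // [t2, t3, t4]
    }
}
```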
Outline
• Theory: Regression test selection for distributed software history
• Technique: Safe and efficient regression test selection for object-oriented languages
• System: Ekstazi tool for Java
Distributed Software Histories
• Distributed version control systems (e.g., Git)
• Complex DAGs due to branches, merges, etc.
• ~35% of revisions are merges
[Figure: a history DAG with a branch point and a merge point]
Distributed Software History: Explained
• Commit – extends the graph with a new edge
• Merge – joins two or more revisions
• Revert – undoes a prior commit
• Cherry-pick – applies a change from one branch to another
[Figure: example history with revisions 0–6 and a merge revision h; edges are labeled with the changes C, D, and E]
How to do regression test selection (RTS) for all commands in a distributed software history?
RTS for Commit Command
• Based on test selection between two revisions
S_commit(h) = rts(pred(h), h)
[Figure: the example history with tests t1–t4; each edge is annotated with the tests selected for its changes (e.g., t1,t4 for C; t2,t4 for D; t3 for E), and revision 0 runs all of t1,t2,t3,t4]
Merge Command: Option S1 (1/3)
S_merge1(h) = rts(imd(h), h)    (imd – immediate dominator)
[Figure: for merge revision h, selection runs from the immediate dominator 0 over the changes C,D,E and selects t1,t2,t3,t4]
Pro: Runs test selection only once (i.e., relatively fast)
Con: There may be many changes between imd(h) and h => many tests selected to run (i.e., slow)
Merge Command: Option Sk (2/3)
S_mergek(h) = ∪_{n ∈ pred(h)} rts(n, h)    (pred – predecessor nodes)
[Figure: for merge revision h, selection runs once per parent; one parent sees changes C,D => t1,t2,t4 and the other sees C,D,E => t1,t2,t3,t4]
If a test is not affected between a parent and the merge revision, take the result from the parent
Pro: Selects fewer tests than S1
Con: Runs test selection k times (i.e., once for each parent)
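Option Sk can be sketched on a toy model; the history, class versions, and dependency matrix below are invented for illustration (revisions are modeled as class → content-version maps):

```java
import java.util.*;

// Toy sketch of option Sk: run pairwise selection once per merge parent and
// take the union. All data below (history, versions, dependencies) is
// hypothetical, chosen only to illustrate the formula.
public class MergeSk {
    static Set<String> changed(Map<String, Integer> from, Map<String, Integer> to) {
        Set<String> diff = new HashSet<>();
        for (String c : to.keySet()) {
            if (!Objects.equals(from.get(c), to.get(c))) diff.add(c);
        }
        return diff;
    }

    // rts(from, to): select tests depending on any class changed between the two
    static Set<String> rts(Map<String, Integer> from, Map<String, Integer> to,
                           Map<String, Set<String>> deps) {
        Set<String> ch = changed(from, to);
        Set<String> selected = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
            if (!Collections.disjoint(e.getValue(), ch)) selected.add(e.getKey());
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> deps = Map.of(
            "t1", Set.of("C"), "t2", Set.of("D"), "t3", Set.of("E"));
        Map<String, Integer> p1 = Map.of("C", 1, "D", 0, "E", 0); // branch changed C
        Map<String, Integer> p2 = Map.of("C", 0, "D", 0, "E", 1); // branch changed E
        Map<String, Integer> h  = Map.of("C", 1, "D", 0, "E", 1); // merge of both
        Set<String> selected = new TreeSet<>();
        for (Map<String, Integer> p : List.of(p1, p2)) {
            selected.addAll(rts(p, h, deps)); // union over the parents
        }
        System.out.println(selected); // [t1, t3] -- t2 is never selected
    }
}
```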
Merge Command: Option S0 (3/3)
S_merge0(h) = S_aff(h) ∪ (A(h) \ ∪_{p ∈ pred(h)} A(p))
S_aff(h) = ∪_{p,p' ∈ pred(h), p ≠ p', d = dom(p,p')} [ (∪_{n ∈ d≤*p \ {d}} S_sel(n)) ∩ (∪_{n ∈ d≤*p' \ {d}} S_sel(n)) ]
[Figure: for merge revision h, the results recorded on the two branches (e.g., t2,t4 and t1,t2,t4) are combined without running selection at the merge]
If a test is affected on multiple branches, the changes from different branches taken together may lead to a different result
Pro: Does not run test selection, but reuses results recorded in the history
Con: Selects more tests than Sk (e.g., new tests in one of the branches)
Merge Command: Comparison

                    Analysis time   Number of selected tests
S_merge1 (Naïve)    Medium          Large
S_mergek            Slow            Medium
S_merge0            Fast            Small

• The following relations hold:
  – S_mergek ⊆ S_merge0
  – S_mergek and S_merge1 are incomparable
• If there are no new tests and no reverts: S_mergek = S_merge0
• S0 is applicable for automerges (90%) and requires results for all revisions
Safety
• All our test selection algorithms are safe
Theorem 1: S_mergek(h) and S_merge1(h) are safe for every merge revision h
Theorem 2: S_merge0(h) is safe for every automerge revision h
• Proof: S_mergek(h) ⊆ S_merge0(h) follows from
  – rts distributes over changes
  – rts is monotonic with respect to the set of changes
  – properties of automerge
Revert Command
• Undoes a prior commit
S_revert_aff(h) = S_sel(p', n_re) ∩ [ (∪_{n ∈ d≤*p \ {d}} S_sel(n)) ∪ (∪_{n ∈ d≤*p' \ {d}} S_sel(n)) ]
S_revert0(h) = S_revert_aff(h) ∪ (A(p') \ A(n_re)) ∪ (A(p) \ A(d))
[Figure: the example history where revision h reverts an earlier change C (shown as -C)]
Cherry-pick Command
• Applies a change from one branch to another
S_cherry_aff(h) = S_sel(n'_cp, n_cp) ∩ [ (∪_{n ∈ d≤*p \ {d}} S_sel(n)) ∪ (∪_{n ∈ d≤*n'_cp \ {d}} S_sel(n)) ]
S_cherry0(h) = S_cherry_aff(h) ∪ (A(n_cp) \ A(n'_cp)) ∪ (A(p) \ A(d))
[Figure: the example history where revision h cherry-picks change C from another branch]
From Theoretical to Applied RTS
All of the selection formulas above reduce to the pairwise selection rts(original, modified). The remaining question: a SAFE + EFFICIENT rts?
Outline
• Theory: Regression test selection for distributed software history
• Technique: Safe and efficient regression test selection for object-oriented languages
• System: Ekstazi tool for Java
My RTS Technique
• Insight: test -> dynamically used files
• Safe by design for any code change
• Efficient due to these properties
  – A small number of files is modified at each revision
  – A small number of tests depends on each file
  – Changes are localized
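The insight above can be sketched as checksum-based selection: record, for each test, the checksums of the files it dynamically used, and re-run the test only if a checksum no longer matches. The file names, contents, and use of `hashCode` as a stand-in checksum are all assumptions for illustration, not Ekstazi's actual implementation.

```java
import java.util.*;

// Minimal sketch of the insight "test -> dynamically used files": after a run,
// record the checksum of each file the test used; at the next revision, re-run
// the test iff any recorded checksum no longer matches. String hashCode stands
// in for a real checksum; the data is made up for illustration.
public class ChecksumRts {
    static boolean affected(Map<String, Integer> recorded, Map<String, String> files) {
        for (Map.Entry<String, Integer> e : recorded.entrySet()) {
            String content = files.get(e.getKey());
            // a missing or modified file changes the checksum -> re-run the test
            if (content == null || content.hashCode() != e.getValue()) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> oldFiles = Map.of("A.class", "v1", "B.class", "v1");
        // the test dynamically used only A.class at the original revision
        Map<String, Integer> testDeps = Map.of("A.class", oldFiles.get("A.class").hashCode());
        // modified revision: only B.class changed
        Map<String, String> newFiles = Map.of("A.class", "v1", "B.class", "v2");
        System.out.println(affected(testDeps, newFiles)); // false: safely skip the test
    }
}
```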
Fine-grained (Method) Dependencies
changes to p, r
[Figure: dependency matrices between tests t1–t4 and methods m, p, q, r (in classes C, D, E) for the original and modified revisions; rts(original, modified) selects the tests that depend on the changed p or r]
NOT SAFE
Safety Example (1)
revision 0:
class A {
  A() {}
  int m() { return 1; }
}
class B extends A {
  B() {} // calls A()
}

revision 1:
class A {
  A() {}
  int m() { return 1; }
}
class B extends A {
  B() {}
  @Override
  int m() { return 2; }
}

test() {
  B b = new B();
  assert(b.m() == 1);
}

[Figure: at revision 0, the method-level dependencies of test are A(), B(), and A.m(); the file-level dependencies are A and B]
Safety Example (2)
test() {
  Method[] methods = A.class.getDeclaredMethods();
  assert(methods.length == 1);
}

revision 0:
class A {
  A() {}
  public void m() { … }
}

revision 1:
class A {
  A() {}
  public void m() { … }
  public void n() { … }
}

[Figure: at revision 0, the method-level dependencies of test are A() and A.m(); the file-level dependency is A]
Outline
• Theory: Regression test selection for distributed software history
• Technique: Safe and efficient regression test selection for object-oriented languages
• System: Ekstazi tool for Java
From Applied to Practical RTS
• Implemented for JVM languages
ekstazi.org
• Technical challenges
– Monitoring used classes
– Handling jar files
– Parallel execution
– No explicit comparison of two revisions
– Smart hashing
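"Smart hashing" means hashing file content after stripping information that cannot affect test outcomes. Ekstazi does this on classfiles (e.g., debug attributes); as a simplified stand-in, the sketch below strips comment lines from source-like text before hashing, so a cosmetic-only change produces the same hash.

```java
import java.util.*;
import java.util.stream.*;

// Simplified sketch of smart hashing: hash content after stripping parts that
// cannot change behavior. Stripping "//" comment lines here is a stand-in for
// stripping debug attributes from classfiles; it is not Ekstazi's actual code.
public class SmartHash {
    static int smartHash(String content) {
        return Arrays.stream(content.split("\n"))
                .map(String::strip)
                .filter(l -> !l.startsWith("//")) // drop behavior-irrelevant lines
                .collect(Collectors.joining("\n"))
                .hashCode();
    }

    public static void main(String[] args) {
        String rev0 = "class A {\nint m() { return 1; }\n}";
        String rev1 = "// cosmetic edit, no behavioral change\nclass A {\nint m() { return 1; }\n}";
        // identical smart hashes => no test needs to be re-run for this change
        System.out.println(smartHash(rev0) == smartHash(rev1)); // true
    }
}
```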
Evaluation – Summary
• More than 30 projects
• 773,565 tests
• ~5M LOC
• >500 revisions
Evaluation – Apache CXF
• Reduces number of tests: ~15x
• Reduces test execution time: ~8x
• Reduces build+test time: ~3x
My recent work on faster building [OOPSLA’14]
Ekstazi Users
Example from Apache Camel: commit ff94895c
Date: Thu Nov 13 09:17:06 2014 -0600
“Including Ekstazi (www.ekstazi.org) profile to optimize execution of the tests”
• Zed – actuator services platform
• JBoss Fuse examples
• JBoss Operations Network
• Proprietary banking software
Overview of My Research
• Test Generation [ICST09,ICSE10,ISSTA11,ECOOP13,OOPSLA14] [CSTVA10,FASE11]
• Regression Testing [ICST10,STVR13] [ASE11,CAV14,ASE14,FSE14]
• Test Quality Assessment [Mutation10,ISSTA13] [FSE11,ISSTA13,ASE13,TOSEM15]
• Concurrent Code Analysis [ICSE08,IWMSE10,Scala11,FSE11,ICST12,TACAS13,Onward!13]
Test Generation (1/2) [ICST09,CSTVA10,ICSE10,ISSTA11,FASE11,ECOOP13]
• Goal: Automatically generate test inputs
  – Data structures
  – Compilers
  – IDEs
  – DOM parsers
• Challenges
  – How to obtain a large set of complex test inputs
  – How to describe the set of test inputs
  – How to efficiently generate test inputs from the description

class A {
  int f;
}
class B extends A {
  void m() {
    super.f = 0;
  }
}
Test Generation (2/2) [ICST09,CSTVA10,ICSE10,ISSTA11,FASE11,ECOOP13]
• Solution
  – Java-based language with non-deterministic constructs
  – Lightweight symbolic execution engine
• Results: short descriptions; detected many bugs in widely used projects
• Comparison with prior work: 50% shorter descriptions and an order of magnitude faster generation
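The non-deterministic constructs mentioned above can be illustrated with a small sketch of bounded-exhaustive generation driven by a choice operator. This is not the actual tool's language or API; the `enumerate` helper and the array domain are assumptions chosen to show the idea.

```java
import java.util.*;

// Sketch of bounded-exhaustive test-input generation in the spirit of a
// nondeterministic "choose" construct (unrolled here as an explicit loop over
// every choice). Enumerates all int arrays of a given length over 0..max.
// This is an illustrative stand-in, not the actual tool's API.
public class Gen {
    static List<int[]> generate(int length, int max) {
        List<int[]> results = new ArrayList<>();
        enumerate(new int[length], 0, max, results);
        return results;
    }

    // explore every choice at each position (the "choose" operator, unrolled)
    static void enumerate(int[] a, int pos, int max, List<int[]> out) {
        if (pos == a.length) { out.add(a.clone()); return; }
        for (int v = 0; v <= max; v++) {
            a[pos] = v;
            enumerate(a, pos + 1, max, out);
        }
    }

    public static void main(String[] args) {
        System.out.println(generate(2, 2).size()); // 9 inputs: 3 choices x 3 choices
    }
}
```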
Model Checking Database Applications (1/2)
• Goal: Detect concurrency bugs in database (DB) applications (e.g., web servers)
• Challenges
– Explore state space of DB applications
– Avoid state-space explosion
[ICSE08,ICST10,Mutation10,IWMSE10,Scala11,FSE11,ICST12,TACAS13,STVR13,Onward!13,ISSTA13]
Model Checking Database Applications (2/2)
• Solution
– Software model checker for DB applications
– Partial-order reduction at various levels of granularity
• e.g., insert and insert with and without constraints
• Results: Scalable model checking, detected problems in large systems
[ICSE08,ICST10,Mutation10,IWMSE10,Scala11,FSE11,ICST12,TACAS13,STVR13,Onward!13,ISSTA13]
Future Work (1)
• Test input generation for evolving software
• Incremental algorithms in DVCS
• Cross-language regression testing
Future Work (2)
• Remain in software engineering and formal methods
• Testing and verification of emerging platforms
– Scalable model checking
– Performance and resilience testing
– Testing protocols and mocking
• Leverage cloud to speed up testing and verification
– Parallelizing analysis and execution phase
– Prediction models for regression runs
Gul Agha
Amin Alipour
Elton Alves
Andrea Arcuri
Sandro Badame
Farnaz Behrang
Marcelo d'Amorim
Lamyaa Eloussi
Gordon Fraser
Alex Groce
Tihomir Gvero
Alex Gyori
Munawar Hafiz
Daniel Jackson
Vilas Jagannath
Dongyun Jin
Ralph Johnson
Owolabi Legunsen
Sam Kamin
Sarfraz Khurshid
Viktor Kuncak
Steven Lauterburg
Yilong Li
Benjamin Livshits
Qingzhou Luo
Rupak Majumdar
Darko Marinov
Aleksandar Milicevic
Peter C. Mehlitz
Iman Narasamdya
Stas Negara
Jeffrey Overbey
Cristiano Pereira
Gilles Pokam
Chandra Prasad
Grigore Rosu
Wolfram Schulte
Rohan Sharma
Samira Tasharofi
Danny van Velzen
Andrey Zaytsev
Chaoqiang Zhang
Conclusions
• Improving software quality
  – Designed scalable algorithms and techniques with theoretical foundations
    • Improved efficiency for regression testing, test generation, and concurrent code analysis
  – Developed practical tools for the proposed techniques
    • Discovered many previously unknown (concurrency) bugs
    • Adopted outside of academia: Apache, Google, Microsoft
• Today’s talk: regression testing