Upload: thomas-rausch

Post on 22-Jan-2018

Page 1: An Empirical Analysis of Build Failures in the Continuous Integration Workflows of Java-Based Open-Source Software

An Empirical Analysis of Build Failures in the Continuous Integration Workflows of Java-Based Open-Source Software

Thomas Rausch, Waldemar Hummer, Philipp Leitner*, Stefan Schulte

Distributed Systems Group
Vienna University of Technology, Austria
http://dsg.tuwien.ac.at

* Software Evolution and Architecture Lab
University of Zurich, Switzerland
http://www.ifi.uzh.ch/en/seal.html

Page 2

Continuous Integration

[Figure: VCS, CI server build, feedback, logs]

Vasilescu et al. (2015). Quality and Productivity Outcomes Relating to Continuous Integration in GitHub

“Our main finding is that continuous integration improves the productivity of project teams”

Kerzazi et al. (2014). Why do Automated Builds Break? An Empirical Study

“We [...] quantified the cost of such build breakage as more than 336.18 man-hours”

Page 3

Page 4

Related Work

Page 5

Understanding Build Failures

What types of errors cause CI build failures?

Which development practices can be associated with CI build failures?

Page 6

Research Setting

Project Name         Description
Apache Storm         Distributed Computation
Butterknife          Android Dependency Injection
Crate.IO             Scalable SQL database
JabRef               BibTeX management GUI
jcabi-github         Wrapper of GitHub API
Hystrix              Latency and fault tolerance library
Presto               Distributed SQL query engine
Openmicroscopy       Microscopy data environment
RxAndroid            RxJava bindings for Android
Sponge API           Minecraft plugin API
Spring Boot          Java Application Framework
Square OkHttp        HTTP+HTTP/2 client for Android
Square Retrofit      HTTP client for Android
Wordpress-Android    WordPress for Android

Page 7

Data Acquisition

[Figure: topology mapping (nodes a, b, c, d); labels: CI build history, change history]

Page 8

Understanding Build Failures

What types of errors cause CI build failures?

Which development practices can be associated with CI build failures?

Page 9

Error Categorization and Quantification

Goal
○ Categorization of errors
○ Frequency of occurrence of error types

Approach
○ Systematic exploration of ~54 000 logfiles
○ Categorization scheme based on log message patterns

[INFO] Compiling 67 source files to /home/travis/.../target/classes
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /home/travis/.../redis/RedisAutoConfiguration.java:[143,10] cannot find symbol
[INFO] 1 error
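A categorization scheme based on log message patterns could look roughly like the following Python sketch. The labels are taken from the error categories in this talk, but the regular expressions and the function names `categorize` and `count_categories` are illustrative assumptions, not the rules the authors actually used.

```python
import re
from collections import Counter

# Hypothetical pattern table: the labels appear in the talk, the regular
# expressions are illustrative guesses and not the authors' actual rules.
PATTERNS = [
    ("compile", re.compile(r"\[ERROR\] COMPILATION ERROR")),
    ("testfailure", re.compile(r"Tests run: \d+, Failures: [1-9]")),
    ("dependency", re.compile(r"Could not resolve dependencies")),
]


def categorize(log_text):
    """Return the label of the first matching pattern, or 'unknown'."""
    for label, pattern in PATTERNS:
        if pattern.search(log_text):
            return label
    return "unknown"


def count_categories(logs):
    """Count how often each label occurs across an iterable of log texts."""
    return Counter(categorize(text) for text in logs)


# The log excerpt shown on the slide (paths are elided in the original).
SAMPLE_LOG = """\
[INFO] Compiling 67 source files to /home/travis/.../target/classes
[ERROR] COMPILATION ERROR :
[ERROR] /home/travis/.../redis/RedisAutoConfiguration.java:[143,10] cannot find symbol
[INFO] 1 error
"""

if __name__ == "__main__":
    print(categorize(SAMPLE_LOG))
    print(count_categories([SAMPLE_LOG, "Build timed out"]))
```

The first matching pattern wins, so the order of `PATTERNS` acts as a priority list; a real scheme would likely need many more patterns and a manual review of the logs that fall into `unknown`.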

Page 10

Error Categories

Label            Description                                   Occurrences
testfailure      An automated test failed                      12
compile          Compilation error                             12
git              VCS interaction error                         12
buildconfig      Faulty build config                           11
crash            Build environment crash or timeout            11
dependency       Dependency error                              11
quality          Coding-rule violation (e.g., Checkstyle)      10
unknown          Errors without a clearly identifiable cause   9
itestfailure     An automated integration test failed          4
doc              Documentation (e.g., JavaDoc) problem         3
license          License criteria not met (missing header)     3
compatibility    API incompatibility                           2
androidsdk       Android SDK-related error                     1
buildout         Error specific to Crate.IO Python module      1

Page 11

Distribution of Common Error Types

[Figure: distribution across error types (faulty VCS interaction, faulty build configuration, dependency error, compilation error, coding-rule violation, failing test, crash); percentage axis from 0% to 40%]

Page 12

Distribution of Common Error Types

[Figure: percentage (0% to 100%) of error types (testfailure, compile, git, dependency, crash, buildconfig, quality, others) per project: Apache Storm, Butterknife, Crate.IO, Hystrix, JabRef, jcabi-github, Presto, RxAndroid, SpongeAPI, Spring Boot, Square OkHttp, Square Retrofit]

Page 13

Understanding Build Failures

What types of errors cause CI build failures?

Which development practices can be associated with CI build failures?

Page 14

Change Metrics

[Figure labels: .java, .txt, Changes]

Complexity
○ Churn, number of files, ...

File types
○ README.txt vs. IntegrationTest.java

Date and time

Author
○ Experience, commit frequency, ...
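As a rough illustration of the complexity and file-type metrics, the following Python sketch derives churn, the number of files, and a per-extension count from text in the tab-separated layout that `git log --numstat` prints. The function name `change_metrics`, the choice of churn as added plus deleted lines, and the sample rows are assumptions made for this example; the talk does not state how the metrics were computed.

```python
from pathlib import PurePosixPath

# Example input in the tab-separated layout of `git log --numstat`:
# added lines, deleted lines, file path. The rows themselves are made up.
NUMSTAT = "12\t3\tsrc/main/java/App.java\n4\t0\tREADME.txt\n-\t-\tlogo.png\n"


def change_metrics(numstat_text):
    """Compute churn, file count, and file-type counts for one commit."""
    churn = 0
    files = 0
    extensions = {}
    for line in numstat_text.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split("\t", 2)
        files += 1
        # Binary files are reported as "-" and contribute no line churn.
        if added != "-":
            churn += int(added) + int(deleted)
        ext = PurePosixPath(path).suffix or "(none)"
        extensions[ext] = extensions.get(ext, 0) + 1
    return {"churn": churn, "files": files, "extensions": extensions}


if __name__ == "__main__":
    print(change_metrics(NUMSTAT))
```

The per-extension counts make it possible to separate commits that touch only files such as README.txt from commits that touch files such as IntegrationTest.java. Renamed paths are not handled in this sketch.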

Page 15

Process Metrics

[Figure: VCS commit graph (nodes a to g) and CI build information (b1 to b4, time axis t)]

Build History
○ Build climate

Build Type
○ Pull request, merge, ...

Pull Request Scenarios
○ Rebase, squash, ...
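The slide does not define "build climate". One plausible reading, used here purely as an assumption, is the share of failed builds among the previous few builds. The Python sketch below computes that value for each build in a chronologically ordered list; the function name `build_climate`, the `window` parameter, and the outcome strings are hypothetical.

```python
def build_climate(outcomes, window=5):
    """Share of failed builds among the previous `window` builds.

    Assumed definition, not taken from the talk. Returns None for the
    first build, which has no history.
    """
    climate = []
    for i in range(len(outcomes)):
        previous = outcomes[max(0, i - window):i]
        if not previous:
            climate.append(None)
        else:
            failed = sum(1 for outcome in previous if outcome == "failed")
            climate.append(failed / len(previous))
    return climate


if __name__ == "__main__":
    # Made-up build history, oldest build first.
    print(build_climate(["passed", "failed", "failed", "passed"], window=2))
```

A value close to 1 means a build was triggered during a phase in which most recent builds had failed.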

Page 16

Statistical Correlation Analysis

For each project individually

Non-parametric correlation tests
○ Pearson’s chi-square test
○ Mann-Whitney U test

Calculate effect sizes
○ Cramér’s V
○ Rank-biserial correlation
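To make the tests and effect sizes concrete, here is a minimal pure-Python sketch for the simplest case: a 2x2 contingency table (for example, build type against build outcome) and two samples of a numeric metric (for example, churn of failed versus passed builds). It applies no continuity correction and does not compute a p-value for the U statistic; the function names and the example numbers are assumptions, and the talk does not say which tooling was used.

```python
import math


def chi_square_2x2(table):
    """Pearson's chi-square statistic, p-value, and Cramér's V (2x2 table).

    Assumes every expected cell count is greater than zero.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With one degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # For a 2x2 table min(rows - 1, cols - 1) = 1, so V = sqrt(chi2 / n).
    cramers_v = math.sqrt(chi2 / n)
    return chi2, p_value, cramers_v


def rank_biserial(xs, ys):
    """Mann-Whitney U for xs and the rank-biserial correlation."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5  # ties count as half a win
    r = 2 * u / (len(xs) * len(ys)) - 1
    return u, r


if __name__ == "__main__":
    # Made-up counts and samples, for illustration only.
    print(chi_square_2x2([[30, 10], [10, 30]]))
    print(rank_biserial([3, 4, 5], [1, 2, 3]))
```

Cramér's V ranges from 0 to 1 and the rank-biserial correlation from -1 to 1, so both can be compared across projects of different sizes, which matters when each project is analyzed individually.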

Page 17

Findings

Build history
Build failures mostly occur consecutively. Phases of build instability perpetuate failures.

[Figure: percentage of builds by build outcome (passed, failed), grouped by previous build result (failed, passed)]
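The relationship between a build's outcome and the previous build result can be tabulated with a few lines of Python. This is a sketch under the assumption of a single, chronologically ordered build history; the function name `outcome_given_previous` and the sample data are made up.

```python
from collections import Counter


def outcome_given_previous(outcomes):
    """Share of each build outcome, grouped by the previous build's result."""
    # Count (previous, current) pairs of consecutive builds.
    pairs = Counter(zip(outcomes, outcomes[1:]))
    # Total number of pairs per previous result.
    totals = Counter(prev for prev, _ in pairs.elements())
    return {
        (prev, cur): count / totals[prev]
        for (prev, cur), count in pairs.items()
    }


if __name__ == "__main__":
    # Made-up build history, oldest build first.
    history = ["passed", "passed", "failed", "failed", "failed", "passed"]
    for key, share in sorted(outcome_given_previous(history).items()):
        print(key, round(share, 2))
```

If failures tend to occur consecutively, the share for `("failed", "failed")` will be clearly higher than the share for `("passed", "failed")`.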

Page 19

Findings

Build history
Build failures mostly occur consecutively. Phases of build instability perpetuate failures.

Pull request scenarios
No evidence that either history manipulation operations or parallel development to a PR affect the PR’s build outcome.

Page 20

Findings

File types
Even objectively harmless changes can break builds. This indicates unwanted flakiness of tests or the build environment.

577 builds from Spring Boot
Changelog file change only
14% original failures
○ 52% test failures
○ 45% environment crash
○ 3% dependency error

Build history
Build failures mostly occur consecutively. Phases of build instability perpetuate failures.

Pull request scenarios
No evidence that either history manipulation operations or parallel development to a PR affect the PR’s build outcome.

Page 21

Summary

Categorization of error types (beyond failed/errored)
Quantification of error type occurrence
Statistical analysis of impact factors
Uncovered challenges that arise when mining CI data

Page 22

Dipl.-Ing. Thomas Rausch
Research Assistant

TU Wien
Distributed Systems Group
Argentinierstraße 8/184-1, 1040, Vienna, Austria
T: +43 1 58801 184 838
E: [email protected]/staff/trausch