saner presentation nk_march16
Botched Releases: Do We Need to Roll Back?
Noureddine Kerzazi & Bram Adams
Agenda
• Context
• RQs
▸ RQ1. How often do botched releases cause crashes, hangs or performance slowdown?
▸ RQ2. Can we build a good explanatory model of botched releases?
▸ RQ3. What are the most important indicators of botched releases?
• Methodology: Data Collection
• Results
• The Next Step
Context
• When it breaks, you get alerted before you even know it
What Does a Crash Look Like?
You might think that it's easier to roll back the release
How about DB scripts that have been applied?
How about configuration files that have been changed?
How about the cache clear?
Provisioning of prod. servers …
What Does a Hang Look Like?
• How many hangs did we face in 15 months?
What Does a Performance Issue Look Like?
Post-Release Failures Trend
A Botched Release is …
a release experiencing abnormal system behaviour (e.g., crashes, hangs or poor performance) a few minutes after being deployed into the production environment.
Overview of the Release Process
• Worth noting:
▸ We are in the context of parallel development (72+ active branches).
▸ Relatively large-scale deployment (16+ servers).
[Figure: release process diagram; labels: Pre-Release, HotFix, MustFix, Pipeline, Staging Env., Production Env.]
Context
• Web-based financial system
▸ 192 countries and 42 currencies (? laws)
• .NET platform: C#, VB, C, HTML5, Angular, …
• 1.6 million LOC
• 8,524 source code files
• 40 projects within the solution
• 200+ Devs, BA, DBA, Testers, IT, 1 Architect, 1 Integrator
Populating a Release History Database
2. Results
RQ1. How often do botched releases cause crashes, hangs or performance slowdown?
How Often Do Botched Releases Cause Crashes or Performance Issues?
[Figure: RQ1 charts; panels: Crashes, Hang, Performance]
• 22.5% (72 out of 320) of releases were botched.
• ~50% of post-release failures that cause system crashes pertain to the source code.
• ~20% of system crashes and hangs relate to configuration failures.
• ~25% of hangs and ~12% of crashes are due to problems with database scripts.
• ~19% of hangs and ~12% of crashes are due to platform changes.
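As a quick check of the headline figure above, the botched-release rate can be recomputed from the two counts on the slide. This is only a minimal sketch: the function name `botched_rate` and the constant names are illustrative and do not come from the study's tooling.

```python
# Counts taken from the RQ1 slide: 72 botched releases out of 320.
BOTCHED_RELEASES = 72
TOTAL_RELEASES = 320


def botched_rate(botched: int, total: int) -> float:
    """Return the fraction of releases that were botched."""
    if total <= 0:
        raise ValueError("total must be positive")
    return botched / total


rate = botched_rate(BOTCHED_RELEASES, TOTAL_RELEASES)
print(f"Botched release rate: {rate:.1%}")
print(f"Roughly 1 in {TOTAL_RELEASES / BOTCHED_RELEASES:.1f} releases")
```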
RQ1
▹ Botched Releases are not only related to source code issues.
RQ2. Can we build an explanatory model of botched releases?
Populating a Release History Database
RQ3. What are the most important indicators of botched releases?
What are the Most Important Indicators of Botched Releases?
RQ2 & RQ3
[Figure: factors in the explanatory model; labels: #Devs Changing the code, Purp. of the Branch, Modified Subsystem, #Branch syncs, #rev. to work items, Interval Broken Buil., Trust, Magnitude, # sub Systems, Log10 Files, Log10 Churn]
What matters more is not how code flows into a release, but rather the amount of development work that went into it, the kind of release and who was responsible for it.
Some Threats to Validity
• Mapping data
• Metrics chosen to build our explanatory model of botched releases are based on the experience of release engineers.
LET’S REVIEW
• Botched Releases are not only related to source code issues.
• What matters more is not how code flows into a release, but rather the amount of development work that went into it, the kind of release and who was responsible for it.
• Trust is one of the most important factors.
• When it breaks, find out why.
• The random forest model obtains a precision of 88.3% (83/94) and recall of 84.7% (83/98).
• Never compromise in your release practices, no matter how small your release gets.
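The precision and recall quoted in the review can be reproduced from the fractions given there (83/94 and 83/98). The sketch below assumes the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN), so the false positive and false negative counts and the F1 score are derived values, not figures reported on the slides. The variable names are illustrative. The slides do not state how these counts relate to the 72 botched releases reported for RQ1.

```python
# Fractions from the review slide: precision 83/94, recall 83/98.
TRUE_POSITIVES = 83
PREDICTED_POSITIVES = 94  # TP + FP under the standard definition of precision
ACTUAL_POSITIVES = 98     # TP + FN under the standard definition of recall

precision = TRUE_POSITIVES / PREDICTED_POSITIVES
recall = TRUE_POSITIVES / ACTUAL_POSITIVES

# Derived values (assumption: standard confusion-matrix definitions).
false_positives = PREDICTED_POSITIVES - TRUE_POSITIVES
false_negatives = ACTUAL_POSITIVES - TRUE_POSITIVES
f1 = 2 * precision * recall / (precision + recall)

print(f"precision = {precision:.1%}")  # 88.3%
print(f"recall    = {recall:.1%}")     # 84.7%
print(f"FP = {false_positives}, FN = {false_negatives}")
print(f"F1 (derived) = {f1:.1%}")
```

Read this way, the model raised 11 false alarms and missed 15 positives.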
Botched Releases: Do We Need to Roll Back?